Automatic Image Captioning Using Neural Networks
DOI:
https://doi.org/10.3126/jiee.v3i1.34335
Keywords:
CNN, Image Captioning, Image Description, LSTM, RNN
Abstract
Automatically generating a natural language description of an image is a major challenge in the field of artificial intelligence. The task brings together two fields: Natural Language Processing and Computer Vision. There are two types of approaches: top-down and bottom-up. In this paper, we take the top-down approach, which starts from the image and converts it into words. The image is passed to a Convolutional Neural Network (CNN) encoder, and its output is fed to a Recurrent Neural Network (RNN) decoder that generates meaningful captions. We generated image descriptions for real-time images captured with a smartphone camera and also tested the model on test images from the dataset. To evaluate model performance, we used the BLEU (Bilingual Evaluation Understudy) score, which matches the predicted words against the original caption.
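The BLEU evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which BLEU variant or library was used, so the code below is an assumption: a plain sentence-level BLEU with clipped n-gram precision, a brevity penalty, and no smoothing. The function names `ngrams`, `modified_precision`, and `bleu` are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against reference captions."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the highest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it occurs in a reference.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (assumed variant)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)


if __name__ == "__main__":
    predicted = "a dog runs on the grass".split()
    originals = ["a dog is running on the grass".split()]
    print(bleu(predicted, originals, max_n=2))
```

Short captions often share no 4-grams with their reference, which makes the unsmoothed score zero. For that reason caption results are commonly reported as BLEU-1 through BLEU-4 separately, or computed with a smoothing function.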
License
Upon acceptance of an article, the copyright for the published work remains with JIEE, Thapathali Campus, and the authors.