Semantic attention algorithm

Year: 2016
Authors: Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, Jiebo Luo
Journal: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Programming languages: Jupyter Notebook, Lua, Python

Automatically generating a natural language description of an image has attracted interest recently, both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision and natural language processing. Existing approaches are either top-down, which start from a gist of an image and convert it into words, or bottom-up, which come up with words describing various aspects of an image and then combine them. In this paper, we propose a new algorithm that combines both approaches through a model of semantic attention. Our algorithm learns to selectively attend to semantic concept proposals and fuse them into hidden states and outputs of recurrent neural networks. The selection and fusion form a feedback connecting the top-down and bottom-up computation.
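
The "attend, then fuse" idea in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' implementation: in the paper the scoring and fusion functions are learned, whereas here dot-product scoring, the `fuse` function, the `mix` constant, and the toy concept vectors are all hypothetical stand-ins chosen only to show the data flow from concept proposals to an updated hidden state.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(hidden, concept_vectors):
    """Score each concept against the hidden state and return
    (attention weights, weighted sum of concept vectors).
    Dot-product scoring is an assumption made for this sketch."""
    scores = [dot(hidden, c) for c in concept_vectors]
    weights = softmax(scores)
    dim = len(hidden)
    context = [
        sum(w * c[i] for w, c in zip(weights, concept_vectors))
        for i in range(dim)
    ]
    return weights, context


def fuse(hidden, context, mix=0.5):
    """Hypothetical fusion step: blend the attended context into the
    hidden state. `mix` stands in for parameters a real model would learn."""
    return [math.tanh(h + mix * x) for h, x in zip(hidden, context)]


# Toy concept proposals (bottom-up) and a toy RNN hidden state (top-down).
concepts = {
    "dog": [1.0, 0.0, 0.0],
    "frisbee": [0.0, 1.0, 0.0],
    "grass": [0.0, 0.0, 1.0],
}
hidden = [2.0, 0.5, 0.0]

weights, context = attend(hidden, list(concepts.values()))
new_hidden = fuse(hidden, context)
```

In a captioning loop, `new_hidden` would feed the next recurrent step, so the concepts selected at one step influence which concepts are attended to at the next. That is the feedback between top-down and bottom-up computation the abstract describes.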
