Link to the paper: https://arxiv.org/abs/1711.09151

Motivation:

  • LSTM units are complex and inherently sequential across time.
  • Convolutional networks have shown advantages on machine translation and conditional image generation.
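
The contrast between the two points above can be sketched in a few lines. This is a toy illustration, not the paper's model: `causal_conv1d` and `simple_recurrence` are hypothetical names, the sequences are scalars, and the recurrence is a plain linear one rather than an LSTM. The point is only that each convolution output can be computed independently of the others, while each recurrent state needs the previous one.

```python
def causal_conv1d(xs, kernel):
    """Causal 1-D convolution over a scalar sequence.

    Output t only sees xs[t-k+1 .. t]; missing left context is zero-padded,
    so no output ever depends on a future input.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(xs)
    # Every output position is independent of the other outputs,
    # so these could be computed in parallel across time steps.
    return [sum(w * padded[t + j] for j, w in enumerate(kernel))
            for t in range(len(xs))]


def simple_recurrence(xs, w_in, w_rec):
    """Toy linear recurrence: h_t needs h_{t-1}, so steps must run in order."""
    h, out = 0.0, []
    for x in xs:
        h = w_in * x + w_rec * h
        out.append(h)
    return out
```

Changing the last input of a sequence leaves all earlier outputs of `causal_conv1d` unchanged, which is the property a convolutional decoder needs so that a word prediction never sees future words.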

Innovation:

  • The authors develop a convolutional (CNN-based) image captioning method that performs comparably to an LSTM-based method on standard metrics.

[Figures 1 and 2: Paper Reading - Convolutional Image Captioning ( CVPR 2018 )]

  • The authors analyze the characteristics of CNN and LSTM networks and report several insights: CNNs produce more entropy in their output distributions (useful for diverse predictions), achieve better classification accuracy, and do not suffer from vanishing gradients.
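
To make "more entropy" concrete, here is a minimal sketch of how the entropy of a predicted word distribution can be measured. The function names `softmax` and `entropy` and the example logits are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Made-up logits over a 4-word vocabulary.
flat = softmax([0.0, 0.0, 0.0, 0.0])     # model is unsure: uniform distribution
peaked = softmax([8.0, 0.0, 0.0, 0.0])   # model is confident in one word
```

Here `entropy(flat)` is larger than `entropy(peaked)`: a flatter distribution spreads probability over more candidate words, which is why higher entropy is associated with more diverse predictions.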

Improvement:

  • Performance improves further when the CNN model uses an attention mechanism to leverage spatial image features.
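
As a rough sketch of what attending over spatial image features means, the snippet below uses generic dot-product attention. This is an assumption chosen for illustration and not necessarily the exact formulation in the paper; `attend`, `query`, and `regions` are hypothetical names. Each region vector stands in for the feature at one spatial location of the image.

```python
import math


def attend(query, regions):
    """Dot-product attention of one query vector over region feature vectors.

    Returns the attention weights (one per region, summing to 1) and the
    context vector, i.e. the weighted average of the region features.
    """
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(regions[0])
    context = [sum(w * region[d] for w, region in zip(weights, regions))
               for d in range(dim)]
    return weights, context
```

With `regions = [[1.0, 0.0], [0.0, 1.0]]`, a query aligned with the first region, such as `[10.0, 0.0]`, puts almost all weight on that region, so the context vector is dominated by its features.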

General Points:

  • Image captioning is applicable to virtual assistants, editing tools, image indexing, and assistance for people with disabilities.
  • Image captioning is a basic ingredient for more complex tasks such as storytelling and visual summarization.
  • An illustration of a classical RNN architecture for image captioning is provided below.

[Figure 5: classical RNN architecture for image captioning]
