Image Captioning Project

Novel concept-based image captioning models using LSTM and multi-encoder transformer architecture

Captioning an image involves using a combination of vision and language models to describe the image in an expressive and concise sentence. A successful captioning task requires extracting as much ...

Nature

Visual spatial relationship sensitive transformer for image captioning

Image captioning is a cross-modal task that combines computer vision and natural language processing to generate natural language descriptions of visual content. Recent advances have explored the ...

9to5Mac

Apple trained an AI that captions images better than models ten times its size

Apple researchers have developed a new way to train AI models for image captioning that delivers more accurate, detailed descriptions while using far smaller models. Here are the details. In a new ...
