Image Caption Generator

Generate captions that describe the contents of images.

By IBM Developer Staff
Updated September 21, 2018 | Published March 20, 2018

Overview

Note: Model Asset eXchange is moving to Machine Learning Exchange (MLX) – a Linux Foundation AI (LFAI) project. Additional info can be found on the MLX GitHub page.

This model generates captions from a fixed vocabulary that describe the contents of images in the COCO Dataset. The model consists of two parts: an encoder (a deep convolutional net using the Inception-v3 architecture, trained on ImageNet-2012 data) and a decoder (an LSTM network trained conditioned on the encoding produced by the image encoder). The input to the model is an image, and the output is a sentence describing the image content.

The model is based on the Show and Tell Image Caption Generator Model.
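
To make the encoder/decoder split concrete, here is a toy sketch of the control flow only. It is not the model's code: `encode_image`, `decoder_step`, and `greedy_caption` are hypothetical stand-ins, the "encoder" is a mean over pixel values rather than Inception-v3, the "decoder" is a lookup table rather than an LSTM, and the greedy word selection is an assumption made for brevity.

```python
from typing import Dict, List

START, END = "<S>", "</S>"  # hypothetical start/end markers for this toy


def encode_image(pixels: List[float]) -> float:
    """Stand-in encoder: collapse the image to one feature (mean brightness)."""
    return sum(pixels) / len(pixels)


def decoder_step(feature: float, previous_word: str) -> Dict[str, float]:
    """Stand-in decoder: next-word probabilities over a tiny fixed vocabulary,
    conditioned on the image feature and the previously emitted word."""
    adjective = "bright" if feature > 0.5 else "dark"
    transitions = {
        START: {"a": 0.9, "the": 0.1},
        "a": {adjective: 0.8, "scene": 0.2},
        "the": {adjective: 0.8, "scene": 0.2},
        adjective: {"scene": 0.9, END: 0.1},
        "scene": {END: 1.0},
    }
    return transitions[previous_word]


def greedy_caption(pixels: List[float], max_words: int = 10) -> str:
    """Encode the image once, then emit the most likely word at each step
    until the end marker is produced or max_words is reached."""
    feature = encode_image(pixels)
    words, previous = [], START
    for _ in range(max_words):
        probabilities = decoder_step(feature, previous)
        previous = max(probabilities, key=probabilities.get)
        if previous == END:
            break
        words.append(previous)
    return " ".join(words)


print(greedy_caption([0.9, 0.8, 0.7]))
print(greedy_caption([0.1, 0.2]))
```

The point of the sketch is the shape of the pipeline: the image is encoded once, and the caption is then built word by word from a fixed vocabulary.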

Model Metadata

Domain	Application	Industry	Framework	Training Data	Input Data Format
Vision	Image Caption Generator	General	TensorFlow	COCO	Images

References

O. Vinyals, A. Toshev, S. Bengio, D. Erhan, “Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
im2txt TensorFlow Model GitHub Page
COCO Dataset Project Page

Licenses

Component	License	Link
This repository	Apache 2.0	LICENSE
Model Weights	MIT	Pretrained Show and Tell Model
Model Code (3rd party)	Apache 2.0	im2txt
Test assets	Various	Asset README

Options available for deploying this model

This model can be deployed using the following mechanisms:

Deploy from Dockerhub:

docker run -it -p 5000:5000 codait/max-image-caption-generator

Deploy on Red Hat OpenShift:

Follow the instructions for the OpenShift web console or the OpenShift Container Platform CLI in this tutorial and specify codait/max-image-caption-generator as the image name.

Deploy on Kubernetes:

```
kubectl apply -f https://raw.githubusercontent.com/IBM/MAX-Image-Caption-Generator/master/max-image-caption-generator.yaml
```

A more elaborate tutorial on how to deploy this MAX model to production on IBM Cloud can be found here.

Deploy locally:

Follow the instructions in the model README on GitHub.
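
The Docker command above publishes the service on port 5000, and the container may need some time before it accepts connections. The following is a minimal sketch of a wait loop, assuming the service is reachable on 127.0.0.1:5000. The helper name `wait_for_port` and its defaults are hypothetical, and the check only confirms that the TCP port is open, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 5000,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds,
    or False if none succeeds before the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    print("ready" if wait_for_port(timeout=5.0) else "not reachable")
```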

Example Usage

You can test or use this model:

using cURL
in a Node-RED flow
in CodePen
in a serverless app

Test the model using cURL

Once deployed, you can test the model from the command line. For example, if the model is running locally:

curl -F "image=@assets/surfing.jpg" -X POST http://127.0.0.1:5000/model/predict

{
  "status": "ok",
  "predictions": [
    {
      "index": "0",
      "caption": "a man riding a wave on top of a surfboard .",
      "probability": 0.038827644239537
    },
    {
      "index": "1",
      "caption": "a person riding a surf board on a wave",
      "probability": 0.017933410519265
    },
    {
      "index": "2",
      "caption": "a man riding a wave on a surfboard in the ocean .",
      "probability": 0.0056628732021868
    }
  ]
}
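
The same request can be sent from Python. The sketch below assumes the third-party `requests` package is installed and that the model is running locally at the URL used in the cURL example. The helper names `best_caption` and `caption_image` are hypothetical. The response fields (`status`, `predictions`, `caption`, `probability`) are taken from the sample output above.

```python
from typing import Any, Dict


def best_caption(response: Dict[str, Any]) -> str:
    """Return the caption with the highest probability from a
    /model/predict response shaped like the sample output above."""
    if response.get("status") != "ok":
        raise ValueError(f"unexpected status: {response.get('status')!r}")
    predictions = response.get("predictions", [])
    if not predictions:
        raise ValueError("response contains no predictions")
    top = max(predictions, key=lambda p: p["probability"])
    return top["caption"]


def caption_image(path: str,
                  url: str = "http://127.0.0.1:5000/model/predict") -> str:
    """Upload an image as the multipart form field 'image', as
    curl -F "image=@..." does, and return the most probable caption."""
    import requests  # third-party; assumed to be installed

    with open(path, "rb") as image_file:
        reply = requests.post(url, files={"image": image_file}, timeout=60)
    reply.raise_for_status()
    return best_caption(reply.json())


if __name__ == "__main__":
    print(caption_image("assets/surfing.jpg"))
```

Selecting by `probability` rather than by position avoids relying on the order of the `predictions` list.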

Test the model in a Node-RED flow

Complete the node-red-contrib-model-asset-exchange module setup instructions and import the image-caption-generator getting started flow.

Test the model in CodePen

Learn how to send an image to the model and how to render the results in CodePen.

Test the model in a serverless app

You can utilize this model in a serverless application by following the instructions in the Leverage deep learning in IBM Cloud Functions tutorial.

Resources and Contributions

If you are interested in contributing to the Model Asset Exchange project or have any queries, please follow the instructions here.

- Image Caption Generator Web App: A reference application that uses the Image Caption Generator model.
- Model Asset eXchange (MAX): A place for developers to find and use free and open source deep learning models.
- MAX tutorials: Learn how to deploy and use MAX deep learning models.
- Center for Open-Source Data & AI Technologies (CODAIT): Improving the Enterprise AI Lifecycle in Open Source.
