Image Classification

Caption images

This page generates a caption for an image. Upload an image to try it.

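The encoder uses the pretrained InceptionV3 model, pooled to a 2048-dimensional feature vector. The decoder combines those features with an LSTM over the partial caption and predicts the next word: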

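        # Imports assume the Keras API bundled with TensorFlow.
        from tensorflow import keras
        from tensorflow.keras.applications.inception_v3 import InceptionV3
        from tensorflow.keras.layers import Add, Dense, Dropout, Embedding, Input, LSTM
        from tensorflow.keras.models import Model

        # embedding_dim, vocab_size and max_length must be defined before this point.

        # Encoder: InceptionV3 without its classification head, pooled to a 2048-d vector.
        iv3 = InceptionV3(weights='imagenet', include_top=False)
        x = iv3.output
        x = keras.layers.GlobalAveragePooling2D()(x)
        encoder = Model(inputs=iv3.input, outputs=x)

        # Decoder, image branch: project the 2048-d features to embedding_dim.
        features_input = Input(shape=(2048,))
        features_layer = Dropout(0.5)(features_input)
        features_layer = Dense(embedding_dim, activation='relu')(features_layer)

        # Decoder, text branch: embed the partial caption (id 0 is padding) and run an LSTM over it.
        captions_input = Input(shape=(max_length,))
        captions_layer = Embedding(vocab_size, embedding_dim, mask_zero=True)(captions_input)
        captions_layer = Dropout(0.5)(captions_layer)
        captions_layer = LSTM(256)(captions_layer)

        # Merge the branches. Add() needs equal widths, so embedding_dim must be 256 here.
        decoder_output = Add()([features_layer, captions_layer])
        decoder_output = Dense(256, activation='relu')(decoder_output)
        # Probability distribution over the next word in the caption.
        outputs = Dense(vocab_size, activation='softmax')(decoder_output)

        model = Model(inputs=[features_input, captions_input], outputs=outputs)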
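
Because the model predicts only the next word, a caption has to be built one word at a time. The sketch below shows one common way to do that, greedy decoding, and is not necessarily how this page does it. The function name `greedy_caption`, the `predict_next` callable, the `<start>` and `<end>` tokens, the word/id dictionaries, and the left-padding with zeros are all assumptions for illustration; padding and tokens must match whatever was used in training.

```python
import numpy as np


def greedy_caption(predict_next, features, word_to_id, id_to_word,
                   max_length, start_token="<start>", end_token="<end>"):
    """Build a caption word by word, always taking the most likely next word.

    predict_next(features, padded_ids) is a hypothetical callable that returns
    a 1-D array of next-word probabilities over the vocabulary.
    """
    sequence = [word_to_id[start_token]]
    words = []
    # The start token already occupies one of the max_length input slots.
    for _ in range(max_length - 1):
        # Left-pad with zeros (assumed padding id) to the fixed input length.
        padded = np.zeros((1, max_length), dtype="int32")
        padded[0, max_length - len(sequence):] = sequence
        probs = predict_next(features, padded)
        next_id = int(np.argmax(probs))
        word = id_to_word.get(next_id)
        # Stop on the end token or on an id with no known word (e.g. padding).
        if word is None or word == end_token:
            break
        words.append(word)
        sequence.append(next_id)
    return " ".join(words)
```

With the model above, `predict_next` could be something like `lambda f, s: model.predict([f, s], verbose=0)[0]`, where `f` is the `(1, 2048)` output of `encoder.predict` for one preprocessed image. Greedy decoding is the simplest choice; beam search usually gives better captions at a higher cost.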