
I have built a CNN autoencoder using Keras and it worked fine on the MNIST test data set. I am now trying it with a different data set collected from another source. These are raw images that I read in with cv2, which works fine. I then convert them into a NumPy array, which I also think works fine. But when I call the .fit method, it gives me this error:

Error when checking target: expected conv2d_39 to have shape (100, 100, 1) but got array with shape (100, 100, 3)

I tried converting the images to grayscale, but they then get the shape (100, 100) and not the (100, 100, 1) that the model wants. What am I doing wrong here?
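(For reference, the shape mismatch can be sketched with a dummy array standing in for a cv2 grayscale image; adding the missing channel axis is a one-liner:)

```python
import numpy as np

# Stand-in for a single grayscale frame, as returned by cv2.imread(path, 0)
gray = np.zeros((100, 100), dtype=np.uint8)

# Keras expects an explicit channel axis: (100, 100) -> (100, 100, 1)
gray = np.expand_dims(gray, axis=-1)
print(gray.shape)  # (100, 100, 1)
```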

Here is the code that I am using:

import os
import cv2
import numpy as np

def read_in_images(path):
    images = []
    for filename in os.listdir(path):
        img = cv2.imread(os.path.join(path, filename))
        if img is not None:
            images.append(img)
    return images

train_images = read_in_images(train_path)
test_images = read_in_images(test_path)
x_train = np.array(train_images)
x_test = np.array(test_images) # (36, 100, 100, 3)

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.callbacks import TensorBoard

input_img = Input(shape=(100, 100, 3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)


x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(168, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)


autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')


autoencoder.fit(x_train, x_train,
            epochs=25,
            batch_size=128,
            shuffle=True,
            validation_data=(x_test, x_test),
            callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])

The model works fine with the MNIST data set but not with my own data set. Any help would be appreciated.

  • Here input_img = Input(shape=(100,100,3)) you already specify 3 channels, so the error message contradicts that. And to convert your shape (100,100) to (100,100,1), use numpy.expand_dims. Commented Jun 4, 2019 at 6:30
  • I have changed the code a bit: I now read in a grayscale image from cv2 and did np.expand_dims(x_train, axis=3) to get (36, 100, 100, 1), but the model does nothing useful. It runs, but the loss is loss: -3104.3462 - val_loss: -2954.8867. My original autoencoder gave me loss: 0.1052 - val_loss: 0.1038 for MNIST. Commented Jun 4, 2019 at 6:53
  • I have also tried flattening the array before putting it into Input(). Commented Jun 4, 2019 at 7:28
  • Your input and output shapes are different, which should not be the case for an autoencoder. Commented Jun 4, 2019 at 7:46
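(The negative loss in the comments above is a telltale sign of unnormalized targets: binary cross-entropy assumes targets in [0, 1], and raw pixel values up to 255 push it far below zero. A quick NumPy check with assumed example values, computing the loss the way Keras does per element:)

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-element binary cross-entropy."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Target in [0, 1]: loss is non-negative, as expected
print(binary_crossentropy(1.0, 0.9))    # small positive value
# Raw uint8 pixel value as target: loss goes strongly negative
print(binary_crossentropy(255.0, 0.9))  # large negative value
```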

2 Answers


Your input and output shapes are different. That triggers the error (I think).

decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

should be

decoded = Conv2D(num_channels, (3, 3), activation='sigmoid', padding='same')(x)

7 Comments

Actually, in an encoder-decoder network you try to reconstruct the input, not predict a class. So the output must have the same shape as the input; that's what the 1x1 Conv2D does
Sorry, I meant num_channels. Shouldn't the number of channels be the same in input and output? In the posted code, I see 3 channels in the input and 1 in the output, and that is what I felt was triggering the error. The original problem was never grayscale to start with.
Oh yeah, that changes everything ahah! You are right, the channels must correspond!
Also, where do you find a 1x1 convolution? It was a 3x3 convolution with 1 output channel. Your answer is correct, but it does not solve the original problem. It is just hacking, I would say. :P
I know what a 1x1 conv filter is. I just don't see it anywhere in the code and you mentioned it. So was wondering.

I ran some tests, and with data loaded in grayscale like this:

img = cv2.imread(os.path.join(path, files), 0)  # 0 = cv2.IMREAD_GRAYSCALE

then expanding the dims of the final loaded array like:

x_train = np.expand_dims(x_train, -1)

and finally normalizing your data with a simple:

x_train = x_train / 255.

(the input of your model must be input_img = Input(shape=(100, 100, 1))),

the loss becomes normal again and the model runs well!
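(Put together, the preprocessing steps above can be sketched on a dummy batch standing in for the 36 loaded images; the shapes match the question's data:)

```python
import numpy as np

# Stand-in for 36 grayscale 100x100 frames read with cv2.imread(path, 0)
x_train = np.random.randint(0, 256, size=(36, 100, 100)).astype('float32')

# Add the channel axis the (100, 100, 1) input layer expects...
x_train = np.expand_dims(x_train, -1)
# ...and scale pixel values into [0, 1] so binary cross-entropy behaves
x_train = x_train / 255.

print(x_train.shape)  # (36, 100, 100, 1)
```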

UPDATE after comment

In order to keep all the RGB channels through the network, you need an output matching your input shape.
Here, if you want images with shape (100, 100, 3), you need an output of (100, 100, 3) from your decoder.

The decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x) line shrinks the output to shape (100, 100, 1).

So you simply need to change the number of filters; here we want 3 color channels, so the conv must be:

decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

4 Comments

That worked, thank you. Is there any way to keep it in color, though?
The problem is that the final conv layer shrinks your image to a (100, 100, 1) array. If you want to keep the 3 color channels, you need to change the last layer to decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x). If you do that, you no longer need to load in grayscale and expand the dims.
One last thing: be careful with your second decoder Conv2D layer, you set 168 filters instead of 16.
@ThibaultBacqueyrisses kindly update your answer with the output-channel detail, as it is the accepted answer. It might confuse people otherwise.
