Quick disclaimer: I'm pretty new to Keras, machine learning, and programming in general.

I'm trying to create a basic autoencoder for (currently) a single image. While it seems to run just fine, the output is just a white image. Here's what I've got:

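# imports (assuming the Keras 2.x module layout)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import load_img, img_to_array, array_to_img, save_img

img_height, img_width = 128, 128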

input_img = '4.jpg'
output_img = '5.jpg'

# load image
x = load_img(input_img)
x = img_to_array(x)  # array with shape (128, 128, 3)
x = x.reshape((1,) + x.shape)  # array with shape (1, 128, 128, 3)

# define input shape
input_shape = (img_height, img_width, 3)

model = Sequential()
# encoding
model.add(Conv2D(128, (3, 3), activation='relu', input_shape=input_shape,
                 padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))

# decoding
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(3, (3, 3), activation='sigmoid', padding='same'))

model.compile(loss='binary_crossentropy', optimizer='adam')
print(model.summary())

checkpoint = ModelCheckpoint("autoencoder-loss-{loss:.4f}.hdf5", monitor='loss', verbose=0, save_best_only=True, mode='min') 
model.fit(x, x, epochs=10, batch_size=1, verbose=1, callbacks=[checkpoint])

y = model.predict(x)

y = y[0, :, :, :]
y = array_to_img(y)
save_img(output_img, y)

I've looked at a handful of tutorials for reference, but I still can't figure out what my issue is.

Any guidance/suggestions/help would be greatly appreciated.

Thanks!

1 Answer

This solved the problem. The code was just missing:

x = x.astype('float32') / 255.

astype is a NumPy array method that returns a copy of the array converted to the given type, here 32-bit floats.
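As a small illustration of what that line does, here is a sketch using a made-up 2x2 RGB array in place of the real image. The array values and the names pixels and scaled are assumptions for the example, not taken from the question's data.

```python
import numpy as np

# Hypothetical stand-in for raw 8-bit pixel data: a tiny 2x2 RGB image.
pixels = np.array(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 255, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

# astype returns a new float32 array; dividing by 255 then maps
# the 0..255 range onto 0.0..1.0.
scaled = pixels.astype('float32') / 255.

print(scaled.dtype)                 # float32
print(scaled.min(), scaled.max())   # 0.0 1.0
```

The shape is unchanged; only the dtype and the value range differ.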

Dividing by 255 then scales the values. RGB channels are stored as 8-bit integers in the range 0 to 255 (2^8 - 1), so the result represents each colour as a decimal value between 0.0 and 1.0. That matters here because the model's final sigmoid layer can only output values between 0 and 1, so it cannot reproduce targets in the 0 to 255 range.
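To see why unscaled targets would plausibly push the output towards white, the sketch below evaluates the textbook binary cross-entropy formula for a single pixel. It is not Keras's actual implementation (which, as far as I know, also clips predictions and averages over pixels); the helper name binary_crossentropy and the sample pixel value 200 are assumptions for illustration.

```python
import math

def binary_crossentropy(target, pred):
    """Textbook per-pixel loss: -(t*log(p) + (1-t)*log(1-p))."""
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))

# Candidate sigmoid outputs, from mid-grey up to nearly saturated.
preds = [0.5, 0.784, 0.9, 0.999]

raw_target = 200.0              # hypothetical unscaled pixel value
scaled_target = 200.0 / 255.0   # the same pixel after dividing by 255

raw_losses = [binary_crossentropy(raw_target, p) for p in preds]
scaled_losses = [binary_crossentropy(scaled_target, p) for p in preds]

for p, raw, scaled in zip(preds, raw_losses, scaled_losses):
    print(f"pred={p:<6} raw-target loss={raw:10.3f}  scaled-target loss={scaled:.3f}")
```

With the raw target, the loss keeps falling as the prediction approaches 1, so the optimizer is rewarded for saturating every pixel. With the scaled target, the loss is lowest when the prediction is close to the true pixel value.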

5 Comments

That's really helpful. I guess what I'm really looking for is essentially the same image as output. For that, it doesn't seem like I should need any label, as the output should "match" the input. Can you point me in the direction as to how I might do that?
Oh I see. Looking around on the Keras blog, I was able to find this tutorial on convolutional autoencoders. Perhaps that could be what you're looking for?
That tutorial solved the problem (even though I've already seen it about 1000 times). I was missing x = x.astype('float32') / 255. which normalizes pixel values between 0 and 1. Thanks for the help!
That's awesome! The astype(..) bit is a numpy built-in function to convert the values contained in that vector to floats. This allows us to get decimal values too when the values are then divided by 255. RGB values are stored as 8 bit integers, so we divide the values in the vector by 255 (2^8 - 1), to represent the colour as a decimal value between 0.0 and 1.0.
This answer, though it happened to be useful to the OP, is incorrect: The model.fit(x, x, ...) is exactly the right thing to do, when you're training an autoencoder...
