
I am trying to create an autoencoder in TensorFlow without using contrib. Here is the original code:

https://github.com/Machinelearninguru/Deep_Learning/blob/master/TensorFlow/neural_networks/autoencoder/simple_autoencoder.py

Here is the program I modified:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt




ae_inputs = tf.placeholder(tf.float32, (None, 32, 32, 1))  # input to the network (MNIST images)
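# shape is NHWC (batch, height, width, channels); batch is None so any batch size can be fed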


xi = tf.nn.conv2d(ae_inputs, 
                 filter=tf.Variable(tf.random_normal([5,5,1,32])), 
                 strides=[1,2,2,1],
                 padding='SAME')
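# note: strides is also NHWC ([batch, height, width, channels]); the batch and channel strides must stay 1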
print("xi {0}".format(xi))

xi = tf.nn.conv2d(xi, 
                 filter=tf.Variable(tf.random_normal([5,5,32,16])), 
                 strides=[1,2,2,32],
                 padding='SAME')
print("xi {0}".format(xi))

xi = tf.nn.conv2d(xi, 
                 filter=tf.Variable(tf.random_normal([5,5,16,8])), 
                 strides=[1,4,4,16],
                 padding='SAME')
print("xi {0}".format(xi))






xo = tf.nn.conv2d_transpose(xi, 
                 filter=tf.Variable(tf.random_normal([5,5,16,8])), 
                 output_shape=[1, 8, 8, 16],
                 strides=[1,4,4,1],
                 padding='SAME')
print("xo {0}".format(xo))

xo = tf.nn.conv2d_transpose(xo, 
                 filter=tf.Variable(tf.random_normal([5,5,32,16])), 
                 output_shape=[1, 16, 16, 32],
                 strides=[1,2,2,1],
                 padding='SAME')
print("xo {0}".format(xo))

xo = tf.nn.conv2d_transpose(xo, 
                 filter=tf.Variable(tf.random_normal([5,5,1,32])), 
                 output_shape=[1, 32, 32, 1],
                 strides=[1,2,2,1],
                 padding='SAME')

print("xo {0}".format(xo))

And the output from the print statements is:

    xi Tensor("Conv2D:0", shape=(?, 16, 16, 32), dtype=float32)
    xi Tensor("Conv2D_1:0", shape=(?, 8, 8, 16), dtype=float32)
    xi Tensor("Conv2D_2:0", shape=(?, 2, 2, 8), dtype=float32)
    xo Tensor("conv2d_transpose:0", shape=(1, 8, 8, 16), dtype=float32)
    xo Tensor("conv2d_transpose_1:0", shape=(1, 16, 16, 32), dtype=float32)
    xo Tensor("conv2d_transpose_2:0", shape=(1, 32, 32, 1), dtype=float32)

It seems the output has the right shape, but I am not really sure about all the parameters in conv2d and conv2d_transpose.

Can someone correct my code if needed?

Edit: @Lau I added the relu function as you told me, but I don't know where to add the bias:

xi = tf.nn.conv2d(ae_inputs,
                 filter=tf.Variable(tf.random_normal([5,5,1,32])),
                 strides=[1,2,2,1],
                 padding='SAME')
xi = tf.nn.relu(xi)
# xi = max_pool(xi,2)
print("xi {0}".format(xi))

xi = tf.nn.conv2d(xi,
                 filter=tf.Variable(tf.random_normal([5,5,32,16])),
                 strides=[1,2,2,1],
                 padding='SAME')
xi = tf.nn.relu(xi)
# xi = max_pool(xi,2)
print("xi {0}".format(xi))

xi = tf.nn.conv2d(xi,
                 filter=tf.Variable(tf.random_normal([5,5,16,8])),
                 strides=[1,4,4,1],
                 padding='SAME')
xi = tf.nn.relu(xi)
# xi = max_pool(xi,4)
print("xi {0}".format(xi))






xo = tf.nn.conv2d_transpose(xi,
                 filter=tf.Variable(tf.random_normal([5,5,16,8])),
                 output_shape=[tf.shape(xi)[0], 8, 8, 16],
                 strides=[1,4,4,1],
                 padding='SAME')
xo = tf.nn.relu(xo)

print("xo {0}".format(xo))

xo = tf.nn.conv2d_transpose(xo,
                 filter=tf.Variable(tf.random_normal([5,5,32,16])),
                 output_shape=[tf.shape(xo)[0], 16, 16, 32],
                 strides=[1,2,2,1],
                 padding='SAME')
xo = tf.nn.relu(xo)

print("xo {0}".format(xo))

xo = tf.nn.conv2d_transpose(xo,
                 filter=tf.Variable(tf.random_normal([5,5,1,32])),
                 output_shape=[tf.shape(xo)[0], 32, 32, 1],
                 strides=[1,2,2,1],
                 padding='SAME')
xo = tf.nn.tanh(xo)
print("xo {0}".format(xo))
return xo

I don't understand what the difference is compared to the original code:

# encoder
# 32 x 32 x 1   ->  16 x 16 x 32
# 16 x 16 x 32  ->  8 x 8 x 16
# 8 x 8 x 16    ->  2 x 2 x 8
print('inputs {0}'.format(inputs))

net = lays.conv2d(inputs, 32, [5, 5], stride=2, padding='SAME')
print('net {0}'.format(net))

net = lays.conv2d(net, 16, [5, 5], stride=2, padding='SAME')
print('net {0}'.format(net))

net = lays.conv2d(net, 8, [5, 5], stride=4, padding='SAME')
print('net {0}'.format(net))

# decoder
# 2 x 2 x 8    ->  8 x 8 x 16
# 8 x 8 x 16   ->  16 x 16 x 32
# 16 x 16 x 32  ->  32 x 32 x 1
net = lays.conv2d_transpose(net, 16, [5, 5], stride=4, padding='SAME')
print('net {0}'.format(net))

net = lays.conv2d_transpose(net, 32, [5, 5], stride=2, padding='SAME')
print('net {0}'.format(net))

net = lays.conv2d_transpose(net, 1, [5, 5], stride=2, padding='SAME', activation_fn=tf.nn.tanh)

print('net {0}'.format(net))
return net
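(lays is tf.contrib.layers from the original code; its conv2d adds a bias and applies tf.nn.relu by default, which the bare tf.nn.conv2d calls above do not.)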

Edit 2:

@Lau I made a new version of the autoencoder with your modifications:

    mean = 0
    stddev = 0.1
    with tf.name_scope('L0'):
        xi = tf.nn.conv2d(ae_inputs,
                     filter=tf.truncated_normal([5,5,1,32], mean=mean, stddev=stddev),
                     strides=[1,1,1,1],
                     padding='SAME')
        xi =  tf.nn.bias_add(xi, bias_variable([32]))
        xi = max_pool(xi,2)
        print("xi {0}".format(xi))

    with tf.name_scope('L1'):
        xi = tf.nn.conv2d(xi,
                         filter=tf.truncated_normal([5,5,32,16], mean=mean, stddev=stddev),
                         strides=[1,1,1,1],
                         padding='SAME')
        xi =  tf.nn.bias_add(xi, bias_variable([16]))
        xi = max_pool(xi,2)
        print("xi {0}".format(xi))

    with tf.name_scope('L2'):
        xi = tf.nn.conv2d(xi,
                         filter=tf.truncated_normal([5,5,16,8], mean=mean, stddev=stddev),
                         strides=[1,1,1,1],
                         padding='SAME')
        xi =  tf.nn.bias_add(xi, bias_variable([8]))
        xi = max_pool(xi,4)
        print("xi {0}".format(xi))


    with tf.name_scope('L3'):
        xo = tf.nn.conv2d_transpose(xi,
                         filter=tf.truncated_normal([5,5,16,8], mean=mean, stddev=stddev),
                         output_shape=[tf.shape(xi)[0], 8, 8, 16],
                         strides=[1,4,4,1],
                         padding='SAME')
        xo =  tf.nn.bias_add(xo, bias_variable([16]))
        print("xo {0}".format(xo))

    with tf.name_scope('L4'):
        xo = tf.nn.conv2d_transpose(xo,
                         filter=tf.truncated_normal([5,5,32,16], mean=mean, stddev=stddev),
                         output_shape=[tf.shape(xo)[0], 16, 16, 32],
                         strides=[1,2,2,1],
                         padding='SAME')
        xo =  tf.nn.bias_add(xo, bias_variable([32]))
        print("xo {0}".format(xo))

    with tf.name_scope('L5'):
        xo = tf.nn.conv2d_transpose(xo,
                         filter=tf.truncated_normal([5,5,1,32], mean=mean, stddev=stddev),
                         output_shape=[tf.shape(xo)[0], 32, 32, 1],
                         strides=[1,2,2,1],
                         padding='SAME')
        xo =  tf.nn.bias_add(xo, bias_variable([1]))
        xo = tf.nn.tanh(xo)
        print("xo {0}".format(xo))

But the result is the same: the decoded values still do not match.

Edit 3:

I changed the filter definition from

filter=tf.truncated_normal([5,5,16,8], mean = mean, stddev=stdvev),

to

filter=tf.get_variable('filter2', [5, 5, 16, 8]),

The result seems to converge to a better value, but it still converges to a different one: 0.006 in the original code versus 0.015 in my version. I think it comes from the initial values of the filter and the bias. How can I manage that?
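For illustration, this is the kind of explicit initialization I mean (just a sketch; the names and values are placeholders):

    # sketch: explicit initializers for the filter and bias (variable names are placeholders)
    filter2 = tf.get_variable('filter2', [5, 5, 16, 8],
                              initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.1))
    bias2 = tf.get_variable('bias2', [8],
                            initializer=tf.zeros_initializer())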


1 Answer


You forgot a bias and an activation, so your network is weaker than a PCA. I suggest you use tf.layers instead. If you want to use tf.nn, then please use tf.get_variable. Furthermore, you have to add tf.nn.bias_add and tf.nn.relu (or any other activation).
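For example, a single encoder layer with tf.layers looks like this (a minimal sketch; tf.layers.conv2d creates the weight and bias variables itself and applies the activation):

    # convolution + bias + ReLU in one call, strided instead of pooled
    net = tf.layers.conv2d(ae_inputs, filters=32, kernel_size=[5, 5],
                           strides=2, padding='same', activation=tf.nn.relu)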

If you want to know whether the code works, just test it with:

sess = tf.Session()
sess.run(tf.global_variables_initializer())
test_output = sess.run(xo, feed_dict={ae_inputs: np.random.random((1, 32, 32, 1))})
print(test_output)
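If everything is wired correctly, this prints a (1, 32, 32, 1) array instead of raising a shape error.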

EDIT: OK, so the code you posted basically uses the tf.layers API, where bias and activation are included. The tf.nn API is more basic and only applies the convolution, without activation or bias.

Based on your edit, I think you want to implement the CAE with the nn API. A typical encoder layer would be this:

conv = tf.nn.conv2d(
                     input=input_tensor,
                     filter=tf.get_variable("conv_weight_name", shape=[height,
                                                                width,
                                                                number_input_feature_maps,
                                                                number_output_feature_maps]),
                     strides=[1, 1, 1, 1],
                     padding="SAME")
bias = tf.nn.bias_add(conv, tf.get_variable("name_bias",
                                            [number_output_feature_maps]))
layer_out = tf.nn.relu(bias)
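Note that tf.nn.conv2d expects the filter shape as [height, width, in_channels, out_channels], while tf.nn.conv2d_transpose expects [height, width, out_channels, in_channels]; that is why the last two entries are swapped in the next snippet.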

Here is a typical layer for transpose convolution:

conv_transpose = tf.nn.conv2d_transpose(value=input_tensor,
                       filter=tf.get_variable("deconnv_weight_name", shape=[height,
                                                                     width,
                                                                     number_output_feature_maps,
                                                                     number_input_feature_maps]),
                       output_shape=[batch_size, height_output, width_output, feature_maps_output],
                       strides=[1, 1, 1, 1])
bias = tf.nn.bias_add(conv_transpose, tf.get_variable("name_bias", shape=[number_output_feature_maps]))

layer_out = tf.nn.relu(bias)

If you have questions about the names, just ask in the comments.
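If you do not want to hard-code batch_size in output_shape, you can take it from the input at run time, for example (a sketch with the same placeholder names):

    # build output_shape dynamically from the input's batch dimension
    output_shape = tf.stack([tf.shape(input_tensor)[0],
                             height_output, width_output, feature_maps_output])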


Comments

I answered you in my original post.
Can you additionally post the link to the source? I will extend my answer later.
Thank you, I added the link.
I published the modifications in the original post. It seems I get a different result, but it still doesn't fit.
The code basis works now; please post a new question asking why your AE does not work. If you do not use an activation, your method is not an AE but a PCA. The reason activations are missing in tf.nn is that it is a more basic API. As I said, tf.layers is the corresponding one to tf.contrib.layers.
