
I am trying to implement a simple linear model in PyTorch that can be given x data and y data, and then trained to recognize the equation y = mx + b. However, whenever I test my model after training, it thinks the equation is y = mx + 2b. I'll show my code, and hopefully someone will be able to spot the issue. Thank you in advance for any help.

import torch

D_in = 500
D_out = 500
batch = 200

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, D_out),
)
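For reference, nn.Linear(D_in, D_out) computes the affine map y = x @ W.T + b, so to represent y = 3x + 4 elementwise the layer would have to learn a scaled identity weight and a constant bias. The _target names below are just for illustration:

W_target = 3 * torch.eye(D_out, D_in)  # the weight the layer would need
b_target = 4 * torch.ones(D_out)       # the bias the layer would need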

Next, I create some data and set a rule; let's do y = 3x + 4.

x_data = torch.rand(batch, D_in)
y_data = torch.randn(batch, D_out)

for i in range(batch):
    for j in range(D_in):
        y_data[i][j] = 3 * x_data[i][j] + 4  # model thinks y=mx+c -> y=mx+2c?

loss_fn = torch.nn.MSELoss(reduction='sum')  # size_average=False is deprecated; reduction='sum' is equivalent
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
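As an aside, the nested loops above are equivalent to a single vectorized expression that broadcasts the rule over the whole tensor, and it runs much faster:

y_data = 3 * x_data + 4  # same rule, applied to every element at once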

Now to training...

for epoch in range(500):
    y_pred = model(x_data)
    loss = loss_fn(y_pred, y_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
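To confirm training converges, printing the loss inside the loop helps; something like this (the interval is arbitrary):

    if epoch % 100 == 0:
        print(epoch, loss.item())  # the summed MSE should shrink steadily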

Then I test my model with a tensor of all ones.

test_data = torch.ones(batch, D_in)
y_pred = model(test_data)
print(y_pred)

Now, I'd expect to get 3*1 + 4 = 7, but instead, my model thinks it is 11.

tensor([[10.7286, 11.0499, 10.9448,  ..., 11.0812, 10.9387, 10.7516],
        [10.7286, 11.0499, 10.9448,  ..., 11.0812, 10.9387, 10.7516],
        [10.7286, 11.0499, 10.9448,  ..., 11.0812, 10.9387, 10.7516],
        ...,
        [10.7286, 11.0499, 10.9448,  ..., 11.0812, 10.9387, 10.7516],
        [10.7286, 11.0499, 10.9448,  ..., 11.0812, 10.9387, 10.7516],
        [10.7286, 11.0499, 10.9448,  ..., 11.0812, 10.9387, 10.7516]])

Similarly, if I change the rule to y = 3x + 8, my model guesses 19, so I am not sure what is going on. Why is the constant being added twice? By the way, if I just set the rule to y = 3x, my model correctly infers 3, and for y = mx in general it correctly infers m. For some reason, the constant term is throwing it off. Any help is much appreciated. Thanks!

  • What is the loss finally? Is it going to zero? Commented Jul 5, 2018 at 20:04
  • Yes, the loss goes to a very small number, as in 0.005 or less. Commented Jul 5, 2018 at 20:28
  • I doubt it. See M.deckers' answer below. Commented Jul 5, 2018 at 20:47

1 Answer


Your network does not train for long enough. It gets a vector of 500 features to describe a single datum, and it has to map that 500-feature input to an output of 500 values. Your training data is randomly created, so the task is not as simple as your toy rule suggests; I think you just have to train longer for the weights to approximate this function from R^500 to R^500.

As for the consistent 2c offset, I suspect it comes from the inputs: they are uniform on [0, 1] with mean 0.5, so a partially trained network can reproduce the constant c on the training data by spreading weight mass of about 2c across each row (2c · 0.5 = c) while the bias stays near its initialization; a test input of all ones then picks up the full 2c.

If I reduce the input and output dimensionality and increase the batch size, learning rate and training steps I get the expected result:

import torch

D_in = 100
D_out = 100
batch = 512

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, D_out),
)

x_data = torch.rand(batch, D_in)
y_data = torch.randn(batch, D_out)
for i in range(batch):
    for j in range(D_in):
        y_data[i][j] = 3 * x_data[i][j] + 4  # model thinks y=mx+c -> y=mx+2c?

loss_fn = torch.nn.MSELoss(reduction='sum')  # size_average=False is deprecated
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(10000):
    y_pred = model(x_data)
    loss = loss_fn(y_pred, y_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

test_data = torch.ones(batch, D_in)
y_pred = model(test_data)
print(y_pred)
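You can also inspect what the linear layer actually learned; if training has converged, the diagonal of the weight should be near 3, the off-diagonal entries near 0, and the bias near 4 (a quick sanity check on the model above):

layer = model[0]
print(layer.weight.diagonal().mean().item())  # should approach 3
print(layer.weight.sum(dim=1).mean().item())  # row sums should also approach 3
print(layer.bias.mean().item())               # should approach 4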

If you just want to approximate f(x) = 3x + 4 with a single scalar input, you could also set D_in and D_out to 1, for example as sketched below.
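A minimal sketch of that (the learning rate and step count here are my own choices, not taken from the question):

import torch

model = torch.nn.Linear(1, 1)  # one scalar in, one scalar out: y = w*x + b
x = torch.rand(512, 1)
y = 3 * x + 4

loss_fn = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(2000):
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(model.weight.item(), model.bias.item())  # should come out near 3 and 4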


1 Comment

Thank you! I'm just still curious as to why it would always have a consistent error of 2c instead of c...? But thank you a lot, this makes sense.
