I am trying to implement a simple linear model in PyTorch that is given x data and y data and is then trained to recover the equation y = mx + b. However, whenever I test my model after training, it behaves as if the equation were y = mx + 2b. My code is below; hopefully someone can spot the issue. Thank you in advance for any help.
import torch
D_in = 500
D_out = 500
batch=200
model=torch.nn.Sequential(
    torch.nn.Linear(D_in,D_out),
)
Next I create some data and set a rule. Let's do 3x+4.
x_data=torch.rand(batch,D_in)
y_data=torch.randn(batch,D_out)
for i in range(batch):
    for j in range(D_in):
        y_data[i][j]=3*x_data[i][j]+4 # model thinks y=mx+c -> y=mx+2c?
loss_fn=torch.nn.MSELoss(reduction='sum')
optimizer=torch.optim.Adam(model.parameters(),lr=0.001)
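As a side note, the nested loop above can probably be replaced by a single element-wise expression, since the rule 3x+4 is applied to every entry independently. The snippet below is only a sketch: the small sizes and the names `y_vectorized`, `y_loop`, and `same` are assumptions made for illustration, and the loop is kept only to check that both versions agree.

```python
import torch

# Small sizes chosen for this sketch only; the post uses batch=200, D_in=500.
batch, D_in = 4, 5

x_data = torch.rand(batch, D_in)

# Rule y = 3x + 4 applied to the whole tensor at once.
y_vectorized = 3 * x_data + 4

# Reference: the explicit double loop, as written in the post.
y_loop = torch.empty(batch, D_in)
for i in range(batch):
    for j in range(D_in):
        y_loop[i][j] = 3 * x_data[i][j] + 4

# True when both ways of building the targets give the same values.
same = torch.allclose(y_vectorized, y_loop)
```

The vectorized form avoids per-element Python indexing, which matters at 200 x 500 entries.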
Now to training...
for epoch in range(500):
    y_pred=model(x_data)
    loss=loss_fn(y_pred,y_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
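One thing that might be worth checking at this point is whether 500 epochs at lr=0.001 actually bring the loss close to zero. The following is a minimal sketch of the same loop that also records the loss per epoch. The smaller sizes, the fixed seed, the `losses` list, and the use of `reduction='sum'` for a summed MSE are assumptions for illustration, not part of the original run.

```python
import torch

torch.manual_seed(0)  # fixed seed, only so the sketch is repeatable

# Smaller sizes than in the post, chosen to keep the sketch fast.
batch, D = 32, 8
x_data = torch.rand(batch, D)
y_data = 3 * x_data + 4

model = torch.nn.Sequential(torch.nn.Linear(D, D))
loss_fn = torch.nn.MSELoss(reduction='sum')  # summed squared error
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

losses = []
for epoch in range(500):
    y_pred = model(x_data)
    loss = loss_fn(y_pred, y_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())  # scalar loss computed in this epoch

# A final value that is still far from zero suggests the fit has not converged.
print(losses[0], losses[-1])
```

Comparing the first and last recorded values gives a rough idea of how far training got before the model is tested.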
Then I test my model with a Tensor/matrix of just 1's.
test_data=torch.ones(batch,D_in)
y_pred=model(test_data)
Now, I'd expect to get 3*1 + 4 = 7, but instead, my model thinks it is 11.
tensor([[ 10.7286, 11.0499, 10.9448, ..., 11.0812, 10.9387,
10.7516],
[ 10.7286, 11.0499, 10.9448, ..., 11.0812, 10.9387,
10.7516],
[ 10.7286, 11.0499, 10.9448, ..., 11.0812, 10.9387,
10.7516],
...,
[ 10.7286, 11.0499, 10.9448, ..., 11.0812, 10.9387,
10.7516],
[ 10.7286, 11.0499, 10.9448, ..., 11.0812, 10.9387,
10.7516],
[ 10.7286, 11.0499, 10.9448, ..., 11.0812, 10.9387,
10.7516]])
Similarly, if I change the rule to y=3x+8, my model guesses 19. So, I am not sure what is going on. Why is the constant being added twice? By the way, if I just set the rule to y=3x, my model correctly infers 3, and for y=mx in general my model correctly infers m. For some reason, the constant term is throwing it off. Any help to solve this problem is much appreciated. Thanks!
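For anyone looking into this: torch.nn.Linear(D_in, D_out) is a fully connected layer, so each output is a weighted sum over all inputs plus a bias, not just a function of the input at the same position. For an all-ones test input, each prediction therefore equals the sum of one full weight row plus that output's bias. The sketch below splits a prediction into those two parts. It uses a small untrained stand-in layer (an assumption; in the code above the trained layer would be `model[0]`), and the names `from_weights`, `from_bias`, and `matches` are hypothetical.

```python
import torch

torch.manual_seed(0)  # fixed seed, only so the sketch is repeatable

D_in, D_out = 6, 6  # small hypothetical sizes
layer = torch.nn.Linear(D_in, D_out)  # untrained stand-in for model[0]

ones = torch.ones(1, D_in)
with torch.no_grad():
    y_pred = layer(ones)
    # For an all-ones input, output j is sum_k weight[j, k] + bias[j].
    from_weights = layer.weight.sum(dim=1)
    from_bias = layer.bias
    reconstructed = from_weights + from_bias

# True when the two-part split reproduces the layer's own output.
matches = torch.allclose(y_pred[0], reconstructed, atol=1e-6)
```

Printing `from_weights` and `from_bias` separately for the trained model would show how much of the constant ended up in the weights rather than in the bias.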