I have written a convolutional neural network from scratch before, but I've decided to use PyTorch for its speed. However, I could not find documentation on how to format the input for the Conv2d layer. In general, there seems to be a lot of overhead and wrappers that prevent me from seeing exactly what is happening and writing my code accordingly.
I have trained a model on the MNIST dataset and loaded the saved weights in order to run it (as per the tutorial):
import torch
from torch import nn
import torch.nn.functional as F

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, 3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.conv2 = nn.Conv2d(8, 8, 3, stride=1, padding=1)
        self.linear1 = nn.Linear(7 * 7 * 8, 128)
        self.linear2 = nn.Linear(128, 128)
        self.linear3 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 28x28 -> 14x14
        x = self.pool(F.relu(self.conv2(x)))  # 14x14 -> 7x7
        x = torch.flatten(x, 1)               # flatten everything except the batch dimension
        x = F.relu(self.linear1(x))
        x = F.relu(self.linear2(x))
        x = self.linear3(x)
        return x
my_model = NeuralNetwork()
my_model.load_state_dict(torch.load("model_weights.pth", weights_only=True))
my_model.eval()
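For reference, my understanding is that nn.Conv2d expects a 4D float tensor of shape (batch, channels, height, width), so a single image for this model should be (1, 1, 28, 28). A quick sanity check I tried (dummy is just a throwaway name):

dummy = torch.zeros(1, 1, 28, 28)  # one image, one channel, 28x28 pixels
with torch.no_grad():
    out = my_model(dummy)
print(out.shape)  # torch.Size([1, 10]), one logit per digit

That at least confirms the shape the model wants, but it doesn't tell me how to get my canvas data into that shape.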
Now, I have a web application where:
- The user draws on a 28x28 canvas in black and white.
- The drawing is put into a flattened array of size 784, consisting of 0's (white on canvas) and 1's (black on canvas). (e.g. [0, 0, 1, 1, 1, 1, 0, 0, ..., 1, 1])
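I am assuming the array is row-major, i.e. index i corresponds to row i // 28 and column i % 28 of the canvas. A quick check I used to convince myself of that (flat is a hypothetical example array):

flat = [0] * 784                           # hypothetical blank canvas
flat[0 * 28 + 5] = 1                       # pixel at row 0, column 5 drawn black
grid = torch.tensor(flat).reshape(28, 28)
print(grid[0, 5].item())                   # prints 1 if the row-major assumption holds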
Here is some sample code of what I wish to perform:
formatted_array = some_formatting_function(flattened_array_of_0_and_1)
x = torch.tensor(formatted_array)
pred = my_model(x)
guessed_digit = some_reading_function(pred)
print(guessed_digit)
# eventually return the guessed_digit
What should my some_formatting_function and some_reading_function be?
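Here is my current guess for these two functions, though I'm not sure it is correct (the shape handling and the argmax are assumptions on my part):

def some_formatting_function(flat):
    # guess: group the 784 values into 28 rows of 28 columns, convert to floats,
    # and wrap in batch and channel dimensions so that torch.tensor(...)
    # produces a float tensor of shape (1, 1, 28, 28)
    rows = [[float(v) for v in flat[r * 28:(r + 1) * 28]] for r in range(28)]
    return [[rows]]

def some_reading_function(pred):
    # pred should have shape (1, 10): one logit per digit class;
    # take the index of the largest logit as the guessed digit
    return pred.argmax(dim=1).item()

In particular, I don't know whether the raw 0/1 pixel values need the same normalization that was applied to the MNIST images during training.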