I'm working through the book "Make Your Own Neural Networks" and following its examples to implement my first neural network. I understand the basic concepts, in particular this equation, where the output is calculated as the matrix dot product of the weights and the inputs:

X = W * I

Where X is the output before applying the Sigmoid, W the link weights and I the inputs.

The book has a function that takes this input as an array and then converts that array to a two-dimensional one. My understanding is that the value of X is calculated from the following:

W = [0.1, 0.2, 0.3
     0.4, 0.5, 0.6
     0.7, 0.8, 0.9]

I = [1
     2
     3]
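As a minimal sketch of the equation above, assuming NumPy is imported as np and I is written as a column vector, the product can be checked row by row. The name X_manual is only illustrative and does not come from the book:

```python
import numpy as np

# Weights and inputs from the question; I is written as a column, shape (3, 1)
W = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6],
              [0.7, 0.8, 0.9]])
I = np.array([[1], [2], [3]])

# X = W . I as a matrix dot product
X = np.dot(W, I)

# The same product written out by hand, one row of W at a time
X_manual = np.array([
    [0.1 * 1 + 0.2 * 2 + 0.3 * 3],  # about 1.4
    [0.4 * 1 + 0.5 * 2 + 0.6 * 3],  # about 3.2
    [0.7 * 1 + 0.8 * 2 + 0.9 * 3],  # about 5.0
])

print(X.shape)                   # (3, 1)
print(np.allclose(X, X_manual))  # True
```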

So if I now pass in an array for my inputs like [1,2,3], why do I need to do the following to convert it to a 2-D array, as is done in the book:

inputs = numpy.array(inputs, ndmin=2).T

Any ideas?

  • Neural Networks typically use a two dimensional array for inputs and outputs with the underlying idea that you have n rows representing the observations, and m columns representing the features. In your case, you only have 1 feature, your input (which is also what you are trying to predict) and 3 observations. In other words it's more like a convention which happens to be useful for understanding how you use and store data. Commented Dec 6, 2019 at 12:31
  • Could you please elaborate that with an answer and an example explanation? Commented Dec 6, 2019 at 12:35
  • Imho @TheHalf-BloodPrince already pointed out everything relevant. Perhaps another point which might help is: By asserting that inputs are always two-dimensional, you make sure that the orientation of the resulting dot-product np.dot(W, I) is always a column vector for each feature (each independent variable/combination, etc. has its "own" row), independently of the number of features. For instance, for n_features=1 and with ndmin=1, you'd get a row-vector from np.dot(W, I), Commented Dec 6, 2019 at 12:44
  • which on the one hand does not follow conventions, and on the other hand possibly requires transposing etc., depending on the algorithm of the next step. Commented Dec 6, 2019 at 12:44
  • Thanks @Scotty1-! I added an answer with further explanations and an example if that can help! Commented Dec 6, 2019 at 12:55

1 Answer

Your input here is a one-dimensional list (or a one-dimensional array):

I = [1, 2, 3]

The idea behind this one-dimensional array is the following: if these numbers represent the width of a flower petal in centimetres, its length in centimetres, and its weight in grams, then your flower petal has a width of 1 cm, a length of 2 cm, and a weight of 3 g.

Converting your input I to a 2-D array is necessary here for two reasons:

  • first, by default, converting this list to a NumPy array using numpy.array(inputs) yields a one-dimensional array of shape (3,), which has no second dimension at all. Setting ndmin=2 forces at least two dimensions and gives shape (1, 3); the trailing .T then transposes it to (3, 1), a column vector. This avoids shape-related problems in NumPy, for instance when using matrix multiplication.
  • secondly, and perhaps more importantly, as I said in my comment, data in neural networks are conventionally stored in two-dimensional arrays. In the column vector used here, each row represents a different feature (so there is a unique list for each feature). In other words, it's just a conventional way to make sure you're not confusing apples and pears (in this case, length and weight).
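The shape at each stage of the conversion can be checked directly. This is a small sketch, assuming NumPy is imported as np; the variable names are only illustrative:

```python
import numpy as np

inputs_list = [1, 2, 3]

flat = np.array(inputs_list)           # plain conversion, one dimension
row = np.array(inputs_list, ndmin=2)   # ndmin=2 prepends an axis
column = row.T                         # transpose turns the row into a column

print(flat.shape)    # (3,)
print(row.shape)     # (1, 3)
print(column.shape)  # (3, 1)
```

Note that ndmin=2 alone produces a row; it is the transpose that produces the column.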

So when you do inputs = numpy.array(inputs, ndmin=2).T, you end up with:

array([[1],    # width
       [2],    # length
       [3]])   # weight

and not:

array([1, 2, 3])
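To illustrate why the difference matters, here is a hedged sketch that feeds both forms into np.dot. The weights are taken from the question; W_next is a made-up second-layer weight matrix used only for illustration:

```python
import numpy as np

W = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6],
              [0.7, 0.8, 0.9]])

flat = np.array([1, 2, 3])               # shape (3,)
column = np.array([1, 2, 3], ndmin=2).T  # shape (3, 1)

out_flat = np.dot(W, flat)      # result is 1-D
out_column = np.dot(W, column)  # result stays a column vector

print(out_flat.shape)    # (3,)
print(out_column.shape)  # (3, 1)

# Transposing a 1-D array does nothing, so its orientation cannot be changed
print(flat.T.shape)      # (3,)

# Hypothetical next layer with 2 nodes: the column shape carries through
W_next = np.full((2, 3), 0.5)
print(np.dot(W_next, out_column).shape)  # (2, 1)
```

Both calls compute the same numbers here; the 2-D form simply keeps the orientation explicit, which helps once later steps rely on transposes or outer products.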

Hope it made things a bit clearer!

6 Comments

I like your answer and thus upvoted, but there is still an error in it: the features are in general in the first dimension but in numpy in the second (axis=1), and the samples in the second but in numpy in the first (axis=0). So if your input contains three features (and NOT samples), it will be I = [[1], [2], [3]], resulting in inputs: array([[1, 2, 3]]), whereas three samples I = [1, 2, 3] will result in inputs: array([[1], [2], [3]]).
Well wanted to edit it to add some clarification and now it ended up confusing, because I only added half the edits before SO stopped me from being able to add edits... :D So for clarification: features are stored in the columns (axis=1), while samples are stored in the rows (axis=0).
So that means, each row represents one feature? In the example above the first row represents a feature vector consisting of width of different flower petals? Is my understanding correct?
No, assuming you have n_feats=2 features and n_smpls=4 samples, then inputs will look like inputs = np.random.rand(n_smpls, n_feats). That means that for each of the 4 samples you have 2 features, with the samples in the rows and the features in the columns. So if you select inputs[2, :] you will get the features of the third sample, and with inputs[:, 1] you will get the second feature of all samples.
Ok so that means, the inputs will be a 4*2 matrix with 4 samples and 2 features?
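The layout described in the comments above can be checked with a short sketch. It assumes NumPy's default_rng generator with an arbitrary seed in place of np.random.rand; the values themselves do not matter, only the shapes:

```python
import numpy as np

n_smpls, n_feats = 4, 2

# Random data standing in for real measurements (seed chosen arbitrarily)
rng = np.random.default_rng(0)
inputs = rng.random((n_smpls, n_feats))

third_sample = inputs[2, :]    # both features of the third sample
second_feature = inputs[:, 1]  # second feature across all four samples

print(inputs.shape)          # (4, 2)
print(third_sample.shape)    # (2,)
print(second_feature.shape)  # (4,)
```

So under this convention the inputs form a 4x2 matrix with samples in the rows (axis=0) and features in the columns (axis=1).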