Numpy array with different datatypes behaves strangely

Question

I am trying to extract data from MIDI files using the music21 Python module. The issue arises when I try to get the duration of a note that belongs to a tuplet (e.g. a 32nd-note triplet). The module usually returns the note length as a float measured in quarter lengths (e.g. a quarter note is 1.0, an eighth is 0.5, etc.). However, for triplets it returns a Python Fraction object (e.g. Fraction(1, 12)). I have the following loop:

for note in notes_from_stream:
    temp_arr = []
    temp_arr.append(get_midi_representation(note))
    temp_arr.append(note.duration.dots)
    temp_arr.append(note.duration.quarterLength)
    tup = float(note.duration.quarterLength.numerator) / (note.duration.quarterLength.denominator)
    temp_arr.append(tup)

    note_list_arr.append(temp_arr)

The temp_arr is added to note_list_arr at the end of each iteration. After the loop finishes, I create a new 2-D numpy array from note_list_arr with the numpy.asarray() function. The actual problem is that once all the data is in the numpy array, it has the following contents:

[[128 0 Fraction(1, 12) 0.08333333333333333]
[128 0 Fraction(1, 24) 0.041666666666666664]]

The problem with this is that it contains the Fraction object, but if I remove the line which puts it there (temp_arr.append(note.duration.quarterLength)) and leave only the one which calculates the real-number value of the fraction, I get the following:

[[  1.28000000e+02   0.00000000e+00   8.33333333e-02]
[  1.28000000e+02   0.00000000e+00   4.16666667e-02]]

All the values in the array get converted to floats with exponent notation. How can I avoid this?

hpaulj · Accepted Answer · 2016-02-24 02:55:01Z

What is the dtype for

[[128 0 Fraction(1, 12) 0.08333333333333333]
 [128 0 Fraction(1, 24) 0.041666666666666664]]

I'm guessing object. Each element is a pointer to a different kind of item, some integers, some floats, and some Fraction. This is nearly the same as a nested list. It's the result of numpy trying to put a diverse set of objects into one array.
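A minimal sketch to check that guess, assuming the rows look like the ones printed in the question (the `rows` list here is made up to match that output, not taken from the asker's MIDI data):

```python
from fractions import Fraction

import numpy as np

# Hypothetical rows mirroring the question's output: int, int, Fraction, float.
rows = [[128, 0, Fraction(1, 12), 1 / 12],
        [128, 0, Fraction(1, 24), 1 / 24]]

mixed = np.asarray(rows)

# NumPy has no numeric dtype that can hold a Fraction, so it falls back to
# object: every cell is a reference to the original Python object.
print(mixed.dtype)
print(mixed.shape)
print([type(value).__name__ for value in mixed[0]])
```

Because each cell still holds the original Python object, the integers stay integers here, at the cost of losing fast vectorized arithmetic.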

[[  1.28000000e+02   0.00000000e+00   8.33333333e-02]
 [  1.28000000e+02   0.00000000e+00   4.16666667e-02]]

looks like dtype float. Both the integers and the floats are stored and displayed as floats. Don't worry about the scientific notation; that's just what the display does to handle the range of values. They are regular floats.
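If the scientific notation is distracting, the display can be changed without touching the stored values. The sketch below assumes rows like those in the question (the `rows` list is illustrative) and uses `np.array2string` with `suppress_small=True`, which is one of several ways to request fixed-point output:

```python
import numpy as np

# Hypothetical rows mirroring the question's second output: ints and floats only.
rows = [[128, 0, 1 / 12],
        [128, 0, 1 / 24]]

floats = np.asarray(rows)
print(floats.dtype)  # ints and floats are promoted to one common float dtype

# Default printing may switch to scientific notation for this range of values.
print(floats)

# suppress_small=True asks for fixed-point formatting instead.
fixed_text = np.array2string(floats, suppress_small=True)
print(fixed_text)
```

Calling `np.set_printoptions(suppress=True)` should have the same effect on every later `print`, if a global setting is preferred.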

The first is an (n, 4) array: n notes, 4 values per note.

You probably could have written the creation as:

temp_arr = [get_midi_representation(note),
            note.duration.dots,
            note.duration.quarterLength,
            float(note.duration.quarterLength.numerator) /
                (note.duration.quarterLength.denominator)]
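As a possible simplification, `float()` accepts a `Fraction` directly, so the numerator/denominator division may not be needed. The helper name `quarter_length_as_float` below is hypothetical, and the sample values stand in for what `note.duration.quarterLength` is described as returning in the question:

```python
from fractions import Fraction


def quarter_length_as_float(quarter_length):
    """Convert a quarter length (float or Fraction) to a plain float."""
    # float() handles both cases, so no numerator/denominator arithmetic is needed.
    return float(quarter_length)


# Stand-in values: a triplet duration as a Fraction, a regular one as a float.
triplet = quarter_length_as_float(Fraction(1, 12))
eighth = quarter_length_as_float(0.5)
print(triplet, eighth)
```

This also sidesteps the case where the quarter length is already a float, since plain floats have no `numerator` attribute.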

2 Comments

MZokov Over a year ago

I have not set a specific dtype for any of the arrays, because when I tried to do so, I got a TypeError that had something to do with a buffer. Isn't there a way to avoid storing integers as floats? The cases where this happens in the data will not occur that often, so I would have some entries where the array consists mainly of floats and others where it has integers.

hpaulj Over a year ago

Read the docs about structured arrays and compound dtype. Look at SO questions about reading CSV files with genfromtxt.
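A rough sketch of what a structured array could look like for this data. The field names (`midi`, `dots`, `quarter_length`) and the sample rows are assumptions for illustration, not anything from the asker's code:

```python
import numpy as np

# Hypothetical compound dtype: two integer fields and one float field per note.
note_dtype = np.dtype([("midi", "i4"), ("dots", "i4"), ("quarter_length", "f8")])

# Each record is a tuple; structured arrays expect tuples rather than lists for rows.
rows = [(128, 0, 1 / 12),
        (128, 0, 1 / 24)]

notes = np.array(rows, dtype=note_dtype)

# Each field keeps its own dtype, so the integers are not converted to floats.
print(notes["midi"])
print(notes["quarter_length"])
```

The result is a 1-D array of records rather than an (n, 3) array, so columns are selected by field name instead of by column index.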
