I am trying to extract data from MIDI files using the music21 Python module. The issue arises when I read the duration of a note that belongs to a tuplet (e.g. a 32nd-note triplet). The module usually returns the note length as a float measured in quarter lengths (e.g. a quarter note is 1.0, an eighth is 0.5, etc.). However, for triplets it returns a Python Fraction object (e.g. Fraction(1, 12)). I have the following loop:

for note in notes_from_stream:
    temp_arr = []
    temp_arr.append(get_midi_representation(note))
    temp_arr.append(note.duration.dots)
    temp_arr.append(note.duration.quarterLength)
    tup = float(note.duration.quarterLength.numerator) / (note.duration.quarterLength.denominator)
    temp_arr.append(tup)

    note_list_arr.append(temp_arr)

The temp_arr is added to note_list_arr at the end of each iteration. After the loop finishes, I create a 2-D NumPy array from note_list_arr with the numpy.asarray() function. Once all the data is in the array, its contents look like this:

[[128 0 Fraction(1, 12) 0.08333333333333333]
[128 0 Fraction(1, 24) 0.041666666666666664]]

The problem with this is that it contains the Fraction object. If I remove the line that puts it there (temp_arr.append(note.duration.quarterLength)) and keep only the one that calculates the real-number value of the fraction, I get the following:

[[  1.28000000e+02   0.00000000e+00   8.33333333e-02]
[  1.28000000e+02   0.00000000e+00   4.16666667e-02]]

All the values in the array get converted to floats with exponent notation. How can I avoid this?

1 Answer

What is the dtype of

[[128 0 Fraction(1, 12) 0.08333333333333333]
 [128 0 Fraction(1, 24) 0.041666666666666664]]

I'm guessing object. Each element is a pointer to a Python object: some are integers, some are floats, and some are Fractions. This is nearly the same as a nested list. It's the result of numpy trying to put a diverse set of objects into one array.
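A quick way to check the guess is to rebuild both arrays from literal values and print their dtype. This is only a sketch: the rows below are copied from the printed output in the question, not produced by music21.

```python
from fractions import Fraction

import numpy as np

# Rows mixing ints, a Fraction and a float, like the first printed array.
mixed_rows = [
    [128, 0, Fraction(1, 12), 1 / 12],
    [128, 0, Fraction(1, 24), 1 / 24],
]
mixed = np.asarray(mixed_rows)
print(mixed.dtype)  # object
print(mixed.shape)  # (2, 4)

# Without the Fraction column, numpy can pick one numeric type for everything.
numeric_rows = [[128, 0, 1 / 12], [128, 0, 1 / 24]]
numeric = np.asarray(numeric_rows)
print(numeric.dtype)  # float64
```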

[[  1.28000000e+02   0.00000000e+00   8.33333333e-02]
 [  1.28000000e+02   0.00000000e+00   4.16666667e-02]]

looks like dtype float. Both the integers and the floats are stored and displayed as floats. Don't worry about the scientific notation; that's just what the display does to handle the range of values. They are regular floats.
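If the scientific notation is a nuisance when reading the output, the display can be changed without changing the stored values. A minimal sketch using np.array2string, with a precision of 4 chosen arbitrarily for illustration:

```python
import numpy as np

arr = np.asarray([[128, 0, 1 / 12], [128, 0, 1 / 24]])

# The stored values are ordinary float64 numbers, however they are printed.
print(arr.dtype)

# Format this one array in fixed-point notation; global print settings
# are left untouched.
text = np.array2string(arr, precision=4, suppress_small=True)
print(text)
```

np.set_printoptions(suppress=True) has a similar effect, but it applies to every array printed afterwards.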

Note that the first array has shape (n, 4): n notes, 4 values per note.

You probably could have written the creation as:

temp_arr = [get_midi_representation(note),
            note.duration.dots,
            note.duration.quarterLength,
            float(note.duration.quarterLength.numerator) /
                (note.duration.quarterLength.denominator)]
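The numerator/denominator division can probably be replaced by a plain float() call. The sketch below assumes quarterLength is always either a float or a Fraction, as the question describes; quarter_length_as_float is a hypothetical helper name, not part of music21.

```python
from fractions import Fraction


def quarter_length_as_float(quarter_length):
    """Return a quarter length (float, int or Fraction) as a plain float."""
    return float(quarter_length)


print(quarter_length_as_float(Fraction(1, 12)))  # triplet-style duration
print(quarter_length_as_float(1.0))              # ordinary quarter note

# A plain float has no .numerator attribute, so the division form only
# works when the value is a Fraction (or an int).
print(hasattr(1.0, "numerator"))  # False
```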

2 Comments

I have not set a specific dtype for any of the arrays, because when I tried to do so, I got a TypeError that mentioned a buffer. Isn't there a way to avoid storing integers as floats? This will not happen often in the data, so some entries would consist mainly of floats while others would contain integers.
Read the docs about structured arrays and compound dtype. Look at SO questions about reading CSV files with genfromtxt.
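As a rough illustration of the structured-array suggestion, a compound dtype lets the integer columns stay integers while the duration stays a float. The field names below are made up for this sketch. Building such an array from lists instead of tuples is a common source of type errors, though whether that caused the TypeError mentioned above is only a guess.

```python
import numpy as np

# Field names are illustrative, not taken from the question.
note_dtype = np.dtype([
    ("pitch", np.int64),
    ("dots", np.int64),
    ("quarter_length", np.float64),
])

# Each record must be a tuple (not a list) when building a structured array.
records = [(128, 0, 1 / 12), (128, 0, 1 / 24)]
notes = np.array(records, dtype=note_dtype)

print(notes["pitch"])           # integer column
print(notes["quarter_length"])  # float column
```

Each field is accessed by name and keeps its own dtype, so the result is a 1-D array of n records rather than an (n, 3) float array.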
