1

I have a 2D txt file:

[[1406], [1408], [1402], [1394, 102462], [1393], [20388], [20387, 20386], [1386], [1443, 1446, 766], [1432, 1438, 1430, 1416], [1442], [1434], [1430, 1416, 1417, 1419, 3446], [1429], [20011], [20015], [4435], [4441], [4443], [4444], [4448], [2433, 1413, 1418], [4450], [3444], [2478, 823, 3447], [3447], [2481, 1425, 942, 2476, 4449], [2482, 120, 3444], [13512], [3446], [13528]]

Is there any way that I can read this file into python? I tried with:

from numpy import genfromtxt
con2 = genfromtxt('muti.txt', delimiter=',')
con2 = con2.astype(np.int64)

And the result shows like:

nan
nan
nan
nan
nan
nan
nan
1413.0
nan
nan
nan
nan
823.0
nan
nan
nan
1425.0
942.0
2476.0
nan
nan
120.0
nan
nan
nan
nan

A lot of nan within the array. Could someone please help me with it?

1
  • What format does the input have? From what I understood, you could just remove the brackets. data = [int(t) for t in data.replace("]","").replace("[","").split()] Commented Dec 15, 2015 at 20:16

2 Answers 2

3

I don't know what functionality numpy has for this, but since your text file happens to be valid JSON, you could just load it as JSON, flatten it, and then convert the result to a numpy array.

>>> import json
>>> import numpy as np
>>> with open('muti.txt', 'r') as f: arr = json.load(f)
>>> np_arr = np.array([n for subarr in arr for n in subarr]).astype(np.int64)
>>> np_arr
array([  1406,   1408,   1402,   1394, 102462,   1393,  20388,  20387,
        20386,   1386,   1443,   1446,    766,   1432,   1438,   1430,
         1416,   1442,   1434,   1430,   1416,   1417,   1419,   3446,
         1429,  20011,  20015,   4435,   4441,   4443,   4444,   4448,
         2433,   1413,   1418,   4450,   3444,   2478,    823,   3447,
         3447,   2481,   1425,    942,   2476,   4449,   2482,    120,
         3444,  13512,   3446,  13528], dtype=int64)
Sign up to request clarification or add additional context in comments.

Comments

1

If this is just text, then just replace/remove the brackets, split, an cast to int.

txt = "[[1406], [1408], [1402], [1394, 102462], [1393], [20388], [20387, 20386], [1386], [1443, 1446, 766], [1432, 1438, 1430, 1416], [1442], [1434], [1430, 1416, 1417, 1419, 3446], [1429], [20011], [20015], [4435], [4441], [4443], [4444], [4448], [2433, 1413, 1418], [4450], [3444], [2478, 823, 3447], [3447], [2481, 1425, 942, 2476, 4449], [2482, 120, 3444], [13512], [3446], [13528]]"
data = [int(t) for t in txt.replace("]","").replace("[","").split(',')]

[1406,
1408,
1402,
1394,
...
13512,
3446,
13528]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.