I have a text file that looks like this:

# Comments 
PARAMETER  0  0
      1045        54
      1705         0                           time 1
         1        10       100   0.000e+00   9999   A
         2        20       200   0.2717072   9999   B
         3        30       300   0.0282928   9999   C
         1       174        92   2999.4514   9999   APEW-1
         2       174        92   54.952499   9999   ART-3A
         1       174        97   5352.1299   9999   APEW-2
         1       173       128   40.455467   9999   APEW-3
         2       173       128   1291.1320   9999   APEW-3
         3       173       128   86.562599   9999   ART-7B
...

I want to create a dictionary like the one below, skipping the header and certain columns and keeping only the data I need:

my_dict = {'A':(1,10,100),'B':(2,20,200), 'C':(3,30,300), 'APEW-1':(1,174,92), ...}

These data points are observation points, and their values are depth, y, x. One observation point can therefore have several rows, one per depth (the first column). I would like to avoid renaming the labels by adding a suffix for duplicates, and I wonder whether there is a way around it. What I want to do is look up an observation point by name and extract its coordinates. I am not sure a dictionary is the right tool for this. It is a small dataset and doesn't need to be fast. I am using NumPy and Python 2.7.

1 Answer

loadtxt can do it:

>>> import numpy as np
>>> # fn is the path to the text file
>>> dtype=np.rec.fromrecords([[0, 0, 0, b'APEW-1']]).dtype
>>> x = np.loadtxt(fn, skiprows=4, usecols=(0,1,2,5), dtype=dtype)
>>>
>>> result = {}
>>> for x0, x1, x2, key in x:
...     try:
...         result[key.decode()].append((x0,x1,x2))
...     except KeyError:
...         result[key.decode()] = [(x0,x1,x2)]
... 
>>> result
{'A': [(1, 10, 100)], 'B': [(2, 20, 200)], 'C': [(3, 30, 300)], 'APEW-1': [(1, 174, 92)], 'ART-3A': [(2, 174, 92)], 'APEW-2': [(1, 174, 97)], 'APEW-3': [(1, 173, 128), (2, 173, 128)], 'ART-7B': [(3, 173, 128)]}

Notes:

  • we abuse rec.fromrecords to create a compound dtype describing the columns; be sure to use a template string at least as long as the longest label you expect, since the string field has a fixed width

    • there is probably an official way of creating compound dtypes that doesn't involve creating a throw-away array, but this is easy and works
  • the loadtxt parameters are self-explanatory: skiprows skips the four header lines and usecols picks the depth, y, x and label columns; because of the compound dtype it returns a 1-D record array
  • if there were no duplicate keys, we could use a dict comprehension to translate the record array to a dict; f0-f3 are the auto-generated field names

    • to accommodate duplicates, we pack the values, which are tuples, in lists
    • most lists contain just one tuple, but some will have more
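
A minimal sketch of two of the notes above, assuming Python 3 and a recent NumPy: the compound dtype is spelled out with np.dtype instead of a throw-away array, and a dict comprehension builds the dict for the no-duplicates case. The field names (depth, y, x, name), the 'U16' label width and the in-memory sample are assumptions made for illustration, not part of the original answer.

```python
import io
import numpy as np

# Hypothetical stand-in for the file; same layout as the sample in the question.
sample = """# Comments
PARAMETER  0  0
      1045        54
      1705         0                           time 1
         1        10       100   0.000e+00   9999   A
         2        20       200   0.2717072   9999   B
         3        30       300   0.0282928   9999   C
"""

# Explicit compound dtype: three integers plus a fixed-width string.
# 'U16' is an assumed maximum label length; pick one that fits your labels.
dtype = np.dtype([('depth', 'i8'), ('y', 'i8'), ('x', 'i8'), ('name', 'U16')])

# skiprows=4 skips the comment line and the three header lines.
x = np.loadtxt(io.StringIO(sample), skiprows=4, usecols=(0, 1, 2, 5), dtype=dtype)

# Only safe when every name is unique: a repeated name would overwrite
# the earlier entry instead of keeping both.
unique = {str(r['name']): (int(r['depth']), int(r['y']), int(r['x'])) for r in x}
```

With named fields, a column can also be read directly, for example x['name'] or x['depth'].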

Python 2 version: the main differences are that there is no need for byte strings or decode, and that the dictionary does not preserve the order of the items.

>>> dtype=np.rec.fromrecords([[0, 0, 0, 'APEW-1']]).dtype
>>> x = np.loadtxt(fn, skiprows=4, usecols=(0,1,2,5), dtype=dtype)
>>>
>>> result = {}
>>> for x0, x1, x2, key in x:
...     try:
...         result[key].append((x0,x1,x2))
...     except KeyError:
...         result[key] = [(x0,x1,x2)]
... 
>>> result
{'A': [(1, 10, 100)], 'B': [(2, 20, 200)], 'C': [(3, 30, 300)], 'APEW-1': [(1, 174, 92)], 'ART-3A': [(2, 174, 92)], 'APEW-2': [(1, 174, 97)], 'APEW-3': [(1, 173, 128), (2, 173, 128)], 'ART-7B': [(3, 173, 128)]}
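
Since the goal in the question is to look up an observation point by name and get its coordinates, here is a rough sketch of how the resulting dict of lists could be used. It groups rows with collections.defaultdict rather than try/except, which is a swapped-in alternative and not what the answer above does. The helper name coords_at and the hard-coded rows are hypothetical and only there for illustration.

```python
from collections import defaultdict

# Rows as (depth, y, x, name), copied from the sample data in the question.
rows = [
    (1, 174, 92, 'APEW-1'),
    (1, 173, 128, 'APEW-3'),
    (2, 173, 128, 'APEW-3'),
    (3, 173, 128, 'ART-7B'),
]

# defaultdict(list) creates the empty list on first access,
# so no try/except KeyError is needed while grouping.
result = defaultdict(list)
for depth, y, x, name in rows:
    result[name].append((depth, y, x))


def coords_at(points, name, depth=None):
    """Return the (y, x) pairs stored for `name`, optionally for one depth only."""
    # .get() avoids inserting an empty list for unknown names.
    return [(y, x) for d, y, x in points.get(name, [])
            if depth is None or d == depth]


all_apew3 = coords_at(result, 'APEW-3')             # every depth
depth2_apew3 = coords_at(result, 'APEW-3', depth=2)  # one depth
missing = coords_at(result, 'NOPE')                  # unknown name
```

Keeping the depth inside each tuple means duplicate labels need no suffix; the name selects the point and the depth selects the row.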

10 Comments

Thank you so much Paul. I would never come to this solution anytime soon!
You are welcome. I know this has a few totally non-obvious steps. Found it only by trial-and-error.
Can you update your question and add a line like this to your example file? I'll check, then.
There is a problem when the last column contains string names like ABC-5D. The names in the last column can be a mix of letters, numbers and symbols of various lengths.
Sorted. But there is potentially a problem with duplicate keys. Let me know how you'd like them handled.