I have a large CSV file with about 200 header names, the first of which is empty. I want to select some columns and copy them to a new output.csv file. My problem is grabbing the column whose header has no name (the empty first element of the header row).

So the input.csv looks something like,

            ,header1,header2,header3,header4, ... , header200
            value0, value2, value2, value3, value4, ..., value200
            ,2,3,30,,, ... , 10
            66,2,3,30,, ... , 10

and so on (all rows have the same number of elements, even if some are empty).

After reading various questions, I recycled some code from "write CSV columns out in a different order in Python" to write,

import csv
from operator import itemgetter

SelectedSignals = ['header1', 'header4']

# Python 3: open CSV files in text mode with newline='' (Python 2 used 'rb' and 'wb')
with open('input.csv', newline='') as fiin, open('output.csv', 'w', newline='') as fiout:
    reader = csv.reader(fiin, delimiter=',')
    writer = csv.writer(fiout, delimiter=',')

    AllSignalNames = next(reader)  # the header row (reader.next() in Python 2)
    name2index = dict((name, index) for index, name in enumerate(AllSignalNames))
    writeindices = [name2index[name] for name in SelectedSignals]
    reorderfunc = itemgetter(*writeindices)  # picks the selected columns out of each row
    writer.writerow(SelectedSignals)

    for row in reader:
        writer.writerow(reorderfunc(row))

This gives the desired output, say,

            ,header1,header4
            value0, value4
            ,30
            66,30

But the problem comes when I set,

  SelectedSignals = [' ', 'header1',  'header4'] 

to grab the first column, which raises a KeyError.

I'm a Python beginner, so any hints are appreciated.

1 Answer

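When csv.reader parses the header row, the empty first header comes back as a zero-length string (''), not a space (' '), which is what you use in SelectedSignals. Since ' ' is not a key in name2index, the lookup name2index[' '] raises the KeyError; use '' in SelectedSignals instead.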
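To make this concrete, here is a minimal sketch, assuming Python 3 and a small made-up sample held in a string (via io.StringIO) in place of your real input.csv. It reuses the names from your code and only changes ' ' to ''.

```python
import csv
import io
from operator import itemgetter

# Made-up sample standing in for input.csv; the first header is empty.
sample = ",header1,header2,header3,header4\n66,2,3,30,10\n"

reader = csv.reader(io.StringIO(sample))
AllSignalNames = next(reader)
# The unnamed first column is parsed as '' (empty string), not ' '.
name2index = {name: index for index, name in enumerate(AllSignalNames)}

SelectedSignals = ['', 'header1', 'header4']  # '' selects the unnamed column
writeindices = [name2index[name] for name in SelectedSignals]
reorderfunc = itemgetter(*writeindices)

# Collect the selected columns of every data row.
selected_rows = [list(reorderfunc(row)) for row in reader]
```

Printing AllSignalNames on your real file is a quick way to check what the first header actually contains before building SelectedSignals.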

You could also add a fake column name to your name2index dict, for example name2index['header0'] = 0 just after name2index = ... and then use 'header0' in SelectedSignals.
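A rough sketch of the alias approach, assuming Python 3 and a hard-coded header list in place of the one read from the file. The output_header variable is a hypothetical addition: if you write SelectedSignals itself as the header row, the alias 'header0' ends up in output.csv, so you may prefer to write the original names back.

```python
# Hard-coded stand-in for the header row read from input.csv.
AllSignalNames = ['', 'header1', 'header2', 'header3', 'header4']
name2index = {name: index for index, name in enumerate(AllSignalNames)}

# Alias for the unnamed first column.
name2index['header0'] = 0

SelectedSignals = ['header0', 'header1', 'header4']
writeindices = [name2index[name] for name in SelectedSignals]

# Hypothetical extra step: recover the real header names for the output file,
# so the alias does not leak into output.csv.
output_header = [AllSignalNames[i] for i in writeindices]
```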

Alternatively, you could use a default value for the dict (when it can't find the header you want, it would use this default value): name2index.get(name, 0) instead of name2index[name] in your writeindices expression.
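The default-value variant might look like the sketch below, again assuming Python 3 and a hard-coded header list. Note the trade-off it illustrates: any name that is not found falls back to column 0, so a misspelled header (the made-up 'heder4' here) is silently replaced by the first column instead of raising KeyError.

```python
# Hard-coded stand-in for the header row read from input.csv.
AllSignalNames = ['', 'header1', 'header2', 'header3', 'header4']
name2index = {name: index for index, name in enumerate(AllSignalNames)}

# ' ' is not a key, so .get() falls back to index 0 (the unnamed column).
SelectedSignals = [' ', 'header1', 'header4']
writeindices = [name2index.get(name, 0) for name in SelectedSignals]

# Pitfall: a typo also falls back to index 0 without any error.
typo_indices = [name2index.get(name, 0) for name in ['header1', 'heder4']]
```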
