I have CSV files in which the first two columns are always present, but the number of remaining columns varies from file to file. A file can look like this:
Gondi,4012,227,233,157,158,149,158
Gondi,4013,227,231,156,159,145,153
Gondu,4014,228,233,157,158,145,153
Gondu,4015,227,231,156,159,149,158
For now I am working with NumPy, and my code for loading this data is:
import numpy as np

def readfile(fname):
    # Read the first line only to count the columns in this file
    with open(fname) as f:
        ncols = len(f.readline().split(','))
    # Each loadtxt call below re-reads the whole file
    name = np.loadtxt(fname, delimiter=',', usecols=[0], dtype=str)
    ind = np.loadtxt(fname, delimiter=',', usecols=[1], dtype=int)
    data = np.loadtxt(fname, delimiter=',', usecols=range(2, ncols), dtype=int)
    return data, name, ind
Can I do the same thing more efficiently with pandas, ideally without reading the file several times?
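For reference, here is a minimal sketch of the kind of single-pass pandas version being asked about. It assumes the file has no header row and that every column after the second holds integers. The function name `readfile_pandas` and the in-memory sample are hypothetical and used only for illustration.

```python
import io

import pandas as pd


def readfile_pandas(source):
    """Load the file once and split it into the same three arrays."""
    # header=None: the file has no header row, so columns are labelled 0..n-1
    df = pd.read_csv(source, header=None)
    name = df.iloc[:, 0].to_numpy(dtype=str)   # first column: names
    ind = df.iloc[:, 1].to_numpy(dtype=int)    # second column: integer ids
    data = df.iloc[:, 2:].to_numpy(dtype=int)  # all remaining columns
    return data, name, ind


# Sample rows from the question, read from memory instead of from disk
sample = io.StringIO(
    "Gondi,4012,227,233,157,158,149,158\n"
    "Gondi,4013,227,231,156,159,145,153\n"
    "Gondu,4014,228,233,157,158,145,153\n"
    "Gondu,4015,227,231,156,159,149,158\n"
)
data, name, ind = readfile_pandas(sample)
```

Because `df.iloc[:, 2:]` takes every column from the third onward, the column count does not have to be determined beforehand, and the file is parsed only once.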