I am trying to import a dataset from a CSV file using pandas. The problem is that some cells in the last 5 columns are empty in the first few rows; these cells only get populated in later rows. How do I import the entire dataset, including those columns, rather than only the non-empty ones?
1 Answer
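In the CSV file, each empty cell should simply be a field with nothing in it, so a run of empty cells shows up as a series of consecutive commas. pandas reads those fields as NaN and keeps the columns.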
import numpy as np
import pandas as pd
# Write a DataFrame with missing values to CSV
pd.DataFrame({'A': [1, 2, 3], 'B': [np.nan, 4, 5], 'C': [np.nan, 6, 7]}).to_csv('/tmp/df.csv')
# Show the raw file (this is an IPython shell escape; view the file however you like)
!cat /tmp/df.csv
,A,B,C
0,1,,
1,2,4.0,6.0
2,3,5.0,7.0
# Read the DataFrame back in from the CSV
pd.read_csv('/tmp/df.csv', index_col=0)
   A    B    C
0  1  NaN  NaN
1  2  4.0  6.0
2  3  5.0  7.0
Looking at the first (non-header) line of the CSV, you can see the blank cells are indicated by the commas with nothing after them.
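To check this against a layout closer to the one in the question, here is a minimal sketch that reads a small in-memory CSV in which the trailing columns are blank in the first rows. The column names and values are made up for illustration, and `io.StringIO` only stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical data: the last two columns are blank in the first two rows
# and only get values in the third row.
csv_text = (
    "id,x,y,z\n"
    "1,10,,\n"
    "2,20,,\n"
    "3,30,0.5,abc\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# All four columns are kept; the blank cells come back as NaN.
print(df.shape)
print(df.isna().sum())
```

If columns are still missing after a read like this, it is worth looking at the raw file first to see whether the early rows actually contain the empty fields (the trailing commas) or are simply shorter.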