This may be a naive question, but I could not find an answer after going through several articles. I have a dataset with a column that contains numeric, blank, or null values.
The data looks like this:
fund_value
Null
123
-10
I wrote a function to handle it, but it does not work and keeps raising the error shown below:
import pandas as pd

def values(x):
    if x:
        if int(x) > 0:
            return 'Positive'
        elif int(x) < 0:
            return 'Negative'
        else:
            return 'Zero'

df2 = pd.read_csv('/home/siddhesh/Downloads/s2s_results.csv')  # Assume this holds the query results
df2 = df2.astype(str)
df2['fund_value'] = df2.fund_value.apply(values)
Error:
Traceback (most recent call last):
  File "/home/../Downloads/pyspark/src/sample/actual_dataset_testin.py", line 31, in <module>
    df2['fund_value'] = df2.fund_value.apply(values)
  File "/home/../.local/lib/python3.8/site-packages/pandas/core/series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/home/../.local/lib/python3.8/site-packages/pandas/core/apply.py", line 1043, in apply
    return self.apply_standard()
  File "/home/../.local/lib/python3.8/site-packages/pandas/core/apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "/home/../Downloads/pyspark/src/sample/actual_dataset_testin.py", line 16, in values
    if int(x) > 0:
ValueError: invalid literal for int() with base 10: 'nan'
I also tried if x == "" and if not x:, but neither worked.
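The reason those checks do not help can be reproduced without the file. The sketch below assumes, as the 'nan' in the traceback suggests, that the missing cells reach values() as the text 'nan' after astype(str); the variable names are only for illustration.

```python
# str() of a float NaN is the text 'nan'; the traceback suggests astype(str)
# produced the same text for the missing fund_value cells.
x = str(float("nan"))

is_truthy = bool(x)     # non-empty string, so `if x:` passes and `if not x:` does not
is_empty = (x == "")    # False, so the x == "" check never matches

try:
    int(x)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(repr(x), is_truthy, is_empty)
print(error_message)
```

In other words, once the value is a string, a missing entry looks like any other non-empty text, and int() is the first place it fails.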
Expected Output:
fund_value
Zero
Positive
Negative
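For reference, one possible way to get this output is sketched below. It is not necessarily the best approach; it assumes that anything non-numeric (blank, Null, NaN) should map to 'Zero', as in the expected output above, and it uses a small in-memory CSV (csv_text, a made-up stand-in) instead of the real s2s_results.csv.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical in-memory CSV standing in for s2s_results.csv
csv_text = "fund_value\nNull\n123\n-10\n"
df2 = pd.read_csv(io.StringIO(csv_text))

# Non-numeric, blank and Null cells become NaN instead of raising
numeric = pd.to_numeric(df2["fund_value"], errors="coerce")

# Comparisons against NaN are False, so missing values fall through to the default
df2["fund_value"] = np.select(
    [numeric > 0, numeric < 0],
    ["Positive", "Negative"],
    default="Zero",
)

print(df2)
```

This skips astype(str) entirely, so the missing values stay as NaN until the labels are assigned.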
Is Null a string or a proper NaN?
NaN value