I want to convert a string column loaded from a JSON file into an integer using PySpark.

df1.select(df1["`result.price`"]).dtypes

Out[15]: [('result.price', 'string')]
 df1=df1.withColumn(df1.select(df1["`result.price`"]),F.col(df1.select(df1["`result.price`"])).cast(T.IntegerType()))

'DataFrame' object has no attribute '_get_object_id'
  • any suggestion? Commented Feb 23, 2022 at 15:49

1 Answer

If you want to modify inline:

Since you are trying to change the data type of a nested struct field, I think you need to apply a new StructType.

Take a look at this answer: https://stackoverflow.com/a/63270808/2956135

If you are okay with extracting to a different column:

df1 = df1.withColumn('price', F.col('result.price').cast(T.IntegerType()))

TL;DR

Why does your line give an error?

There are a few mistakes in this syntax.

df1 = df1.withColumn(df1.select(df1["`result.price`"]),F.col(df1.select(df1["`result.price`"])).cast(T.IntegerType()))

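First, the first argument of withColumn has to be a string: the name of the column you want to save the result as.

Second, the argument of F.col has to be a column name given as a string. df1.select(...) returns a DataFrame, which is neither a name nor a column, and that is what triggers the '_get_object_id' error.

So the following syntax should not throw an error; however, the cast value is saved to a new column literally named result.price, not written back into the nested field.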

df1 = df1.withColumn('result.price', F.col('result.price').cast(T.IntegerType()))