
I have a dataframe df1 in which all the columns are strings (100+ columns). Now I want to cast them to the appropriate types with schema inference.

For example, this is what we do when we have a CSV file and want schema inference: val df = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load(csvFilePath)

I am expecting something like val df = df1.read.option("inferSchema", "true")

The result should be a new dataframe with the appropriate data types.

Note that we can't change df1. I need a solution that will work on Databricks.

  • can you add sample input output dataframes? Commented May 14, 2024 at 11:01
  • Please provide enough code so others can better understand or reproduce the problem. Commented May 14, 2024 at 13:14

1 Answer


To change the data types of the columns based on their data, you would need to check each column's values against each candidate data type and cast accordingly. This involves a lot of manual work that depends on the data.

Instead, you can get the required dataframe directly by round-tripping through a CSV file.

First, write the dataframe to a CSV file in DBFS:

df1.write.csv('/FileStore/tables/two.csv', header=True)

Then, read the CSV file back into a dataframe with schema inference enabled:

df2 = spark.read.option('inferSchema', 'true').csv('/FileStore/tables/two.csv', header=True)
df2.printSchema()
df2.show()


