I am reading a CSV file in which every line ends with a trailing | delimiter. Because of this, the load method creates a last column in the DataFrame with no name and no values (Spark 1.6).

df.withColumnRenamed(df.columns(83),"Invalid_Status").drop(df.col("Invalid_Status"))

val df = sqlContext.read.format("com.databricks.spark.csv").option("delimiter","|").option("header","true").load("filepath") 
val df2 = df.withColumnRenamed(df.columns(83),"Invalid_Status")

I expected this result:
root
 |-- FddCell: string (nullable = true)
 |-- Trn_time: string (nullable = true)
 |-- CELLNAME.FddCell: string (nullable = true)
 |-- Invalid_Status: string (nullable = true)

but the actual output is:
root
 |-- FddCell: string (nullable = true)
 |-- Trn_time: string (nullable = true)
 |-- CELLNAME.FddCell: string (nullable = true)
 |-- : string (nullable = true)

The last column has no name and no values, so I have to drop it and then create a new column.
  • So you want it to be Null? What do you want the value of the column to be? Commented Jul 26, 2019 at 9:52

1 Answer

It is not completely clear whether you want to rename the column to Invalid_Status or drop it entirely. As I understand it, you are trying to operate on (rename or drop) the last column, which has no name.

Here are both solutions.

To rename the column while keeping its existing (blank) values:

val df2 = df.withColumnRenamed(df.columns.last,"Invalid_Status")

To drop the last column without knowing its name, use:

val df3 = df.drop(df.columns.last)

And then add the "Invalid_Status" column with default values:

import org.apache.spark.sql.functions.lit

val requiredDf = df3.withColumn("Invalid_Status", lit("Any_Default_Value"))
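
If the goal is instead to keep the existing unnamed column, give it a name, and fill it with a default value, the rename and the fill can probably be combined into one step. The following is only a minimal sketch, assuming the DataFrame has at least one column and that the unnamed column is the last one. `LastColumnFix` and `nameAndFillLastColumn` are hypothetical names introduced for illustration and are not part of Spark. It renames by position with `toDF`, so the empty column name never has to be looked up, and then relies on `withColumn` replacing a column whose name already exists.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

object LastColumnFix {
  // Hypothetical helper (not from the original answer): names the last column
  // and overwrites every value in it with a constant string.
  def nameAndFillLastColumn(df: DataFrame, name: String, default: String): DataFrame = {
    // Assumption: the unnamed column is the last one, so at least one column must exist.
    require(df.columns.nonEmpty, "DataFrame must have at least one column")

    // Keep all column names except the last and append the new name.
    val newNames = df.columns.dropRight(1) :+ name

    // toDF renames columns by position; withColumn then replaces the
    // renamed column in place because a column with that name already exists.
    df.toDF(newNames: _*).withColumn(name, lit(default))
  }
}
```

With the `df` loaded in the question, this could be called as `LastColumnFix.nameAndFillLastColumn(df, "Invalid_Status", "Any_Default_Value")`. The column keeps its position at the end of the schema, and no separate drop step is needed.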

8 Comments

I want to assign a default value to the column once it is renamed to "Invalid_Status".
I updated the answer with the default value in Invalid_Status Column. Please check if it solves your issue.
@Parthe you are creating a new column using withColumn(). I wanted to add a default value to the existing column, which exists but has no name.
This is already covered on the following page: stackoverflow.com/questions/50260820/… Hope this solves your problem.
My column also has no name and no values. For example, when I load the CSV data into a DataFrame, it is loaded with three columns. Note the last column: I want to give it a name and assign it some value.
 |-- DCR: string (nullable = true)
 |-- SHO: string (nullable = true)
 |-- : string (nullable = true)
