
I have a JSON data file containing a property [creationDate], which is a Unix epoch time stored as a "long" number type. The Apache Spark DataFrame schema looks like this:

root 
 |-- creationDate: long (nullable = true) 
 |-- id: long (nullable = true) 
 |-- postTypeId: long (nullable = true)
 |-- tags: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- title: string (nullable = true)
 |-- viewCount: long (nullable = true)

I would like to do a groupBy on a "creationDate_Year" column, which needs to be derived from "creationDate".

What's the easiest way to do this kind of conversion on a DataFrame using Java?

3 Answers


After checking the Spark DataFrame API and SQL functions, I came up with the snippet below:

DataFrame df = sqlContext.read().json("MY_JSON_DATA_FILE");

DataFrame df_DateConverted = df.withColumn("creationDt", from_unixtime(df.col("creationDate").divide(1000)));

The "creationDate" column is divided by 1000 because the time units differ: the original "creationDate" is a Unix epoch in milliseconds, whereas Spark SQL's "from_unixtime" expects a Unix epoch in seconds.
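The unit mismatch can be sanity-checked outside Spark with plain Python (a sketch using the sample value discussed in the comments below; this is stdlib code, not the Spark API):

```python
from datetime import datetime, timezone

millis = 1452066042000   # epoch in milliseconds, as stored in the JSON
seconds = millis / 1000  # from_unixtime expects seconds

dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(dt.year, dt.month)  # 2016 1
```

Dividing by 1000 first is what makes the conversion land on the correct date; feeding the raw millisecond value into a seconds-based API would produce a timestamp tens of thousands of years in the future.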


4 Comments

Was your original issue about the group-by granularity of the long creationDate?
Yes, I would like to group by "Year" & "Month" of "creationDate", then do some aggregation.
So what was wrong with grouping by the original creationDate column?
Because the original data type in the JSON is "long" (Unix epoch), I need to convert this field to "Year" and "Month". For example, 1452066042000 needs to become "2016" for a "creationDate_Year" column and "1" for a "creationDate_Month" column. This way, I can use Spark's df.groupBy() and other aggregation functions to compute results.
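The (year, month) grouping described in this comment can be illustrated outside Spark; below is a plain-Python sketch of the same logic (the records and field values are made up for illustration, and this is stdlib code rather than the Spark API):

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical rows with epoch-millisecond creationDate values
rows = [
    {"id": 1, "creationDate": 1452066042000, "viewCount": 10},  # Jan 2016
    {"id": 2, "creationDate": 1454851200000, "viewCount": 5},   # Feb 2016
]

counts = defaultdict(int)
for row in rows:
    dt = datetime.fromtimestamp(row["creationDate"] / 1000, tz=timezone.utc)
    counts[(dt.year, dt.month)] += 1  # group by (year, month)

print(dict(counts))
```

In Spark the same effect is achieved by adding year/month columns with withColumn and then calling df.groupBy() on them, so the conversion from milliseconds only has to happen once.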

PySpark: convert Unix epoch milliseconds to a DataFrame timestamp:

df.select(from_unixtime((df.my_date_column.cast('bigint')/1000)).cast('timestamp').alias('my_date_column'))



In Spark with Scala:

spark.sql("select from_unixtime(1593543333062/1000) as ts").show(false)
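As a quick check of that literal value with plain Python (assuming UTC; Spark's from_unixtime renders the result in the session time zone, so the wall-clock time may differ):

```python
from datetime import datetime, timezone

# Same millisecond value as the Scala example, divided by 1000 to get seconds
dt = datetime.fromtimestamp(1593543333062 / 1000, tz=timezone.utc)
print(dt.date())  # 2020-06-30
```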

