
I am trying to read data from GP and ingest it into HDFS using Spark. I need an integer column to partition the data that I read from the GP table. The problem is that I don't have a primary key column or any column with unique values. In this scenario, the column I can rely on most is the timestamp column, which I can convert to an Integer/Long.

The data in the timestamp column is in this format:

select max(last_updated_timestamp) from schema.tablename => 2018-12-13 13:29:55

Could anyone let me know how I can cast the timestamp column, including its milliseconds, to an epoch value that I can use in my Spark code?

1 Answer


You can use extract(epoch from last_updated_timestamp).
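For example, here is a minimal PySpark sketch of how that expression could be pushed down as an extra integer column and used as the Spark partition column. The JDBC URL, credentials, bounds, output path, and the ts_epoch column name below are placeholders, not from the original question:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gp-to-hdfs").getOrCreate()

    # Push down a subquery that adds an epoch-seconds column, cast to bigint
    # so Spark can use it as an integral partition column.
    query = """
        (select t.*,
                cast(extract(epoch from last_updated_timestamp) as bigint) as ts_epoch
           from schema.tablename t) as src
    """

    # Requires the PostgreSQL/Greenplum JDBC driver on the classpath.
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://gp-host:5432/dbname")  # placeholder
          .option("dbtable", query)
          .option("user", "gp_user")                               # placeholder
          .option("password", "gp_password")                       # placeholder
          .option("partitionColumn", "ts_epoch")
          .option("lowerBound", "1514764800")                      # e.g. 2018-01-01 UTC
          .option("upperBound", "1546300800")                      # e.g. 2019-01-01 UTC
          .option("numPartitions", "8")
          .load())

    df.write.mode("overwrite").parquet("hdfs:///data/tablename")

Note that lowerBound and upperBound only control how the ranges are split; rows outside them are still read, so deriving those values from the actual min/max epoch of the table gives more even partitions.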


2 Comments

OK, just asking: I am getting values with fractional digits, e.g. 1550551127.310969. Is it possible to get it as 1550551127310969?
I think it's OK for your needs to just multiply by 1000000: extract(epoch from last_updated_timestamp)*1000000, or by an even larger number if you get more digits after the decimal point.
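A sketch of that variant, using the same hypothetical subquery as above and casting the microsecond value to bigint so the partition column stays integral:

    # Variant of the pushed-down subquery if microsecond precision is needed
    # (assumes at most six digits after the decimal point, as noted above).
    query = """
        (select t.*,
                cast(extract(epoch from last_updated_timestamp) * 1000000 as bigint)
                    as ts_epoch_micros
           from schema.tablename t) as src
    """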
