I would like to convert a column of datetime strings to timestamps in dask_cudf and then sort the DataFrame by that column.

Example:

import cudf
import dask_cudf as ddf

# Sample data (replace with your actual data)
cdf = cudf.DataFrame({
    'city': ['Dallas', 'Bogota', 'Chicago', 'Juarez'],
    'timestamp': ['2019-12-29 14:15:08 UTC', '2019-12-30 10:30:15 UTC', '2019-12-31 18:45:30 UTC', '2020-01-01 03:20:45 UTC']
})

# Create a Dask-cuDF DataFrame
dask_df = ddf.from_cudf(cdf, npartitions=2)

def to_timestamp(x):
    import time
    import datetime
    element = datetime.datetime.strptime(x,"%Y-%m-%d %H:%M:%S UTC")
    return datetime.datetime.timestamp(element)

dask_df['timestamp'] = dask_df['timestamp'].map_partitions(to_timestamp, meta=("timestamp", "str"))

dask_df.head()

I got this error:

TypeError: strptime() argument 1 must be str, not Series

How can I do this for a large DataFrame in dask_cudf?

========== Update ==========

I have tried this:

   dask_df["timestamp"] = dask_df["timestamp"].map_partitions(to_timestamp, meta=("timestamp", "str"))

and got the same error:

  TypeError: strptime() argument 1 must be str, not Series

1 Answer

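This map_partitions thread seems to cover all the tricks of using map_partitions on a row-by-row basis. Note that map_partitions calls your function once per partition and passes it the whole Series, not one element at a time, which is why strptime complains that it received a Series instead of a str.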
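
As a rough sketch of what a partition-wise version could look like, here is the idea on a plain pandas Series, which I am using only as a CPU stand-in for one cuDF partition. The name to_epoch_seconds is made up for this example, the rows are the sample data from the question in shuffled order so the sort is visible, and I am assuming cudf.to_datetime can be swapped in for pd.to_datetime. I have not run this on a GPU, and some arguments (utc=True in particular) may need adjusting there.

```python
import pandas as pd

# Reference point for converting datetimes to seconds since the Unix epoch.
EPOCH = pd.Timestamp("1970-01-01", tz="UTC")


def to_epoch_seconds(partition: pd.Series) -> pd.Series:
    """Convert a whole Series of 'YYYY-MM-DD HH:MM:SS UTC' strings at once."""
    # The literal " UTC" suffix is matched by the format string;
    # utc=True marks the parsed values as UTC.
    parsed = pd.to_datetime(partition, format="%Y-%m-%d %H:%M:%S UTC", utc=True)
    # Integer division of the elapsed time by one second gives whole epoch seconds.
    return (parsed - EPOCH) // pd.Timedelta(seconds=1)


# Sample data from the question, deliberately out of order.
df = pd.DataFrame({
    "city": ["Juarez", "Dallas", "Chicago", "Bogota"],
    "timestamp": [
        "2020-01-01 03:20:45 UTC",
        "2019-12-29 14:15:08 UTC",
        "2019-12-31 18:45:30 UTC",
        "2019-12-30 10:30:15 UTC",
    ],
})

df["timestamp"] = to_epoch_seconds(df["timestamp"])
df = df.sort_values("timestamp")
print(df)
```

With dask_cudf, a function like this would be what you pass to map_partitions, with meta describing an integer column (for example meta=("timestamp", "int64")) rather than "str", followed by sort_values("timestamp") on the Dask DataFrame.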

Furthermore, you can refactor your function somewhat. The import statements can be moved outside the function so they are not executed on every call. You only use datetime in the function, so you can drop the import of time. The function could then look like this:

import datetime

def to_timestamp(x):
    # x is a single string such as "2019-12-29 14:15:08 UTC"
    datetime_object = datetime.datetime.strptime(x, "%Y-%m-%d %H:%M:%S UTC")
    timestamp = datetime.datetime.timestamp(datetime_object)
    return timestamp
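
One more thing to watch: strptime with this format returns a naive datetime, and timestamp() treats naive values as local time, so the result depends on the machine's time zone even though the strings say UTC. A small sketch that pins the zone explicitly (to_timestamp_utc is a name I made up for illustration):

```python
from datetime import datetime, timezone


def to_timestamp_utc(text: str) -> float:
    """Parse 'YYYY-MM-DD HH:MM:SS UTC' and return seconds since the Unix epoch."""
    # strptime matches the trailing "UTC" as literal text and returns a naive datetime.
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S UTC")
    # Attach UTC explicitly so timestamp() does not assume the local time zone.
    return naive.replace(tzinfo=timezone.utc).timestamp()


print(to_timestamp_utc("2019-12-29 14:15:08 UTC"))  # 1577628908.0
```

This still works on one string at a time, so it is only suitable for element-wise use, not as the function handed directly to map_partitions.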