I have two columns with date in the YYMMDD format and a time in the HHMMSS format, they are strings like 150103 132244. These are close to a quarter of a billion records. What would be the best way to sanitize the data prior to importing to PostgreSQL? Is there a way to do this while importing, for instance?
1 Answer
Your data can be converted to timestamp with time zone using the function to_timestamp():
with example(d, t) as (
    values ('150103', '132244')
)
select d, t, to_timestamp(concat(d, t), 'yymmddhh24miss')
from example;
d | t | to_timestamp
--------+--------+------------------------
150103 | 132244 | 2015-01-03 13:22:44+01
(1 row)
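If you want to sanity-check the format outside the database first, the same template can be expressed with Python's strptime (a quick sketch; `'yymmddhh24miss'` in Postgres corresponds to `'%y%m%d%H%M%S'` here):

```python
from datetime import datetime

# Parse the concatenated date + time strings from the question.
d, t = "150103", "132244"
ts = datetime.strptime(d + t, "%y%m%d%H%M%S")
print(ts.isoformat())  # 2015-01-03T13:22:44
```

Note this produces a naive timestamp; the time zone attached by `to_timestamp()` comes from the server's `TimeZone` setting.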
You can import a file into a table with temporary columns (d, t):
create table example(d text, t text);
copy example from ....
Then add a timestamp with time zone column, convert the data, and drop the redundant text columns:
alter table example add tstamp_column timestamptz;
update example
set tstamp_column = to_timestamp(concat(d, t), 'yymmddhh24miss');
alter table example drop d, drop t;
In short: COPY FROM into text columns and then re-format the fields once in the DB. But there are any number of scripting languages that could prepare the file first, if you preferred.
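If you go the pre-processing route, a minimal Python sketch might look like this. The file names and column layout are assumptions (a CSV whose first two columns are the date and time strings); it merges them into a single ISO-8601 timestamp that COPY can load straight into a timestamptz column:

```python
import csv
from datetime import datetime

def sanitize(in_path="raw.csv", out_path="clean.csv"):
    """Replace the leading YYMMDD and HHMMSS columns with one ISO timestamp.

    Streams row by row, so memory use stays flat even for a quarter of a
    billion records.
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for d, t, *rest in reader:
            ts = datetime.strptime(d + t, "%y%m%d%H%M%S")
            writer.writerow([ts.isoformat(sep=" "), *rest])
```

This trades one pass over the file for a simpler, single-column COPY and no UPDATE of the whole table afterwards.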