
I want to write the data into the MySQL database. I am reading the current data from the database first and then calculating new values. The new values should be written in the same order as the data in the database, as shown below. I don't want to overwrite existing data, and I don't want to use to_sql.

I receive the following error message:

(mysql.connector.errors.DatabaseError) 1265 (01000): Data truncated for column 'log_return' at row 1 [SQL: 'INSERT INTO

The full code is below.

import sqlalchemy as sqlal
import pandas as pd
import numpy as np

mysql_engine = sqlal.create_engine(xxx)
mysql_engine.raw_connection()

metadata = sqlal.MetaData()

product  = sqlal.Table('product', metadata,
                       sqlal.Column('ticker', sqlal.String(10), primary_key=True, nullable=False, unique=True),                   
                       sqlal.Column('isin', sqlal.String(12), nullable=True),
                       sqlal.Column('product_name', sqlal.String(80), nullable=True),
                       sqlal.Column('currency', sqlal.String(3), nullable=True),
                       sqlal.Column('market_data_source', sqlal.String(20), nullable=True),
                       sqlal.Column('trading_location', sqlal.String(20), nullable=True),
                       sqlal.Column('country', sqlal.String(20), nullable=True),
                       sqlal.Column('sector', sqlal.String(80), nullable=True)
                       )

market_price_data = sqlal.Table('market_price_data', metadata,
                                sqlal.Column('Date', sqlal.DateTime, nullable=True),
                                sqlal.Column('ticker', sqlal.String(10), sqlal.ForeignKey('product.ticker'), nullable=True), 
                                sqlal.Column('adj_close', sqlal.Float, nullable=True),
                                sqlal.Column('log_return', sqlal.Float, nullable=True)
                                ) 

metadata.create_all(mysql_engine) 

GetTimeSeriesLevels = pd.read_sql_query('SELECT Date, ticker, adj_close FROM market_price_data Order BY ticker ASC', mysql_engine)
GetTimeSeriesLevels['log_return'] = np.log(GetTimeSeriesLevels.groupby('ticker')['adj_close'].apply(lambda x: x.div(x.shift(1)))).dropna()
GetTimeSeriesLevels['log_return'].fillna('NULL', inplace=True)
insert_yahoo_data = market_price_data.insert().values(GetTimeSeriesLevels [['log_return']].to_dict('records'))
mysql_engine.execute(insert_yahoo_data)

The database currently looks like the following.

Date                ticker  adj_close log_return
2016-11-21 00:00:00 AAPL    111.73    NULL  
2016-11-22 00:00:00 AAPL    111.8     NULL  
2016-11-23 00:00:00 AAPL    111.23    NULL      
2016-11-25 00:00:00 AAPL    111.79    NULL  
2016-11-28 00:00:00 AAPL    111.57    NULL  
2016-11-23 00:00:00 ACN     119.82    NULL  
2016-11-25 00:00:00 ACN     120.74    NULL  
2016-11-28 00:00:00 ACN     120.76    NULL  
2016-11-29 00:00:00 ACN     120.94    NULL  
2016-11-30 00:00:00 ACN     119.43    NULL  
...

It should look like this:

Date                ticker  adj_close log_return
2016-11-21 00:00:00 AAPL    111.73    NULL
2016-11-22 00:00:00 AAPL    111.8     0.000626
2016-11-23 00:00:00 AAPL    111.23    -0.005111
2016-11-25 00:00:00 AAPL    111.79    0.005022
2016-11-28 00:00:00 AAPL    111.57    -0.001970
2016-11-21 00:00:00 ACN     119.68    NULL
2016-11-22 00:00:00 ACN     119.48    -0.001672521
2016-11-23 00:00:00 ACN     119.82    0.002841623
2016-11-25 00:00:00 ACN     120.74    0.007648857
2016-11-28 00:00:00 ACN     120.76    0.000165631
...
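For reference, the expected log_return values above can be reproduced in pandas alone (a minimal sketch using only the sample AAPL prices from the table, no database involved):

```python
import numpy as np
import pandas as pd

# Sample AAPL prices from the table above
prices = pd.DataFrame({
    'ticker': ['AAPL'] * 5,
    'adj_close': [111.73, 111.8, 111.23, 111.79, 111.57],
})

# Log return: natural log of each price divided by the previous price,
# computed per ticker; the first row of each ticker stays NaN.
prices['log_return'] = np.log(
    prices.groupby('ticker')['adj_close'].transform(lambda x: x / x.shift(1))
)
print(prices['log_return'].iloc[1:].round(6).tolist())
```

The rounded values match the expected column: 0.000626, -0.005111, 0.005022, -0.001970.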
  • @Parfait, thanks for the answer. I have now included the sqlalchemy import in the code above. Commented Jan 14, 2017 at 22:25

1 Answer


While, shamefully, I know only raw SQL and not sqlalchemy, consider dumping the pandas dataframe into a temp table and then joining it with the final table:

# DUMP TO TEMP TABLE (REPLACING EACH TIME)
GetTimeSeriesLevels.to_sql(name='log_return_temp', con=mysql_engine, if_exists='replace', 
                           index=False)

# SQL UPDATE (USING TRANSACTION)
with mysql_engine.begin() as conn:
    conn.execute("UPDATE market_price_data f" +
                 " INNER JOIN log_return_temp t" +
                 " ON f.Date = t.Date" +
                 " AND f.ticker = t.ticker" +
                 " SET f.log_return = t.log_return;")

mysql_engine.dispose()

Alternatively, consider doing your log transformation directly in MySQL! From what I can see in your pandas/numpy code, you are log-transforming the quotient of the current row's adj_close and the previous row's adj_close. MySQL can run a self join to line up current and previous rows, and its LOG() function computes the natural logarithm.

Below is a select statement that can be dumped to a temp table with CREATE TABLE ... AS or converted into a complex UPDATE query with nested SELECT statements:

SELECT t2.Date, t2.ticker, t2.adj_close,
       LOG(t2.adj_close / t1.adj_close) AS log_return
FROM
   (SELECT m.Date, m.ticker, m.adj_close,
           (SELECT COUNT(*) FROM market_price_data sub
            WHERE sub.Date <= m.Date AND sub.ticker = m.ticker) AS rnk
    FROM market_price_data m) AS t1

INNER JOIN
   (SELECT m.Date, m.ticker, m.adj_close,
           (SELECT COUNT(*) FROM market_price_data sub
            WHERE sub.Date <= m.Date AND sub.ticker = m.ticker) AS rnk
    FROM market_price_data m) AS t2

ON t1.rnk = (t2.rnk - 1) AND t1.ticker = t2.ticker

5 Comments

Thanks, somehow I receive the same error message: (mysql.connector.errors.DatabaseError) 1265 (01000): Data truncated for column 'log_return' at row 1 [SQL:
What is the decimal precision of your log_return data type? Or how many decimals do you allow? Round the Python figure to match.
Thanks, it works now because I had in the fillna a 'NULL' defined for NaN instead of 0. I would prefer having a NULL; maybe you have a hint? Thanks.
Fillna? Python's NaNs should translate to MySQL's NULL. So don't pass a string-quoted NULL. In fact, remove the fillna line, specifically GetTimeSeriesLevels['log_return'].fillna('NULL', inplace=True)
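To spell out that last point with a minimal sketch (not the poster's exact code): leave the first return of each ticker as NaN, and for a raw insert convert NaN to Python None, which the driver sends as SQL NULL, rather than the string 'NULL' that caused the truncation error:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ticker': ['AAPL', 'AAPL', 'ACN', 'ACN'],
    'adj_close': [111.73, 111.8, 119.82, 120.74],
})
df['log_return'] = np.log(
    df.groupby('ticker')['adj_close'].transform(lambda x: x / x.shift(1))
)
# Replace NaN with None so the DB driver emits NULL, not the string 'NULL'
records = df.astype(object).where(df.notna(), None).to_dict('records')
print([r['log_return'] for r in records])
```

Each ticker's first row carries None; the records list can then be passed to a plain SQLAlchemy insert without triggering the "Data truncated" error.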
Thanks a lot that is what I was looking for.
