3

Is there some sort of adaptor that allows querying a PostgreSQL database as if it were a pandas DataFrame?

1
  • You can read a PostgreSQL table (or, for that matter, the results of a SQL query) into a data frame. Is that what you want? Or are you looking for something that avoids SQL entirely? Commented Feb 25, 2016 at 22:50

2 Answers

4

Update (16th March 2016)

It is possible, but you would need a compiler that evaluates your pandas-style query and translates it into SQL statements.

The fact that SQL is a higher-level language, and that a DBMS plans each SQL statement with regard not only to the query but also to the data and its distribution, makes this really hard to do efficiently.

Wes McKinney is trying to do this with the Ibis project and has a nice writeup about some of the challenges.
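
To make the "compiler" idea concrete, here is a toy sketch of how dataframe-style method calls could be collected lazily and turned into a SQL string. This is not how Ibis works internally. The `LazyTable` class, its methods, and the table and column names are all made up for illustration, and the `%s` placeholders assume a driver that uses that parameter style.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical dataframe-like handle that only records operations."""
    name: str
    columns: Tuple[str, ...] = ()
    predicates: Tuple[tuple, ...] = ()

    def select(self, *cols: str) -> "LazyTable":
        # Record the projection; nothing is sent to the database yet.
        return LazyTable(self.name, tuple(cols), self.predicates)

    def where(self, column: str, op: str, value) -> "LazyTable":
        # Only a small whitelist of comparison operators is supported here.
        if op not in {"=", "!=", "<", "<=", ">", ">="}:
            raise ValueError(f"unsupported operator: {op}")
        return LazyTable(
            self.name, self.columns, self.predicates + ((column, op, value),)
        )

    def compile(self):
        """Translate the recorded operations into (sql, params).

        Note: identifiers are not validated or quoted in this sketch.
        """
        cols = ", ".join(self.columns) if self.columns else "*"
        sql = f"SELECT {cols} FROM {self.name}"
        params = []
        if self.predicates:
            clauses = []
            for column, op, value in self.predicates:
                clauses.append(f"{column} {op} %s")
                params.append(value)
            sql += " WHERE " + " AND ".join(clauses)
        return sql, params


query = (
    LazyTable("events")
    .select("user_id", "amount")
    .where("amount", ">", 100)
    .where("country", "=", "DE")
)
sql, params = query.compile()
print(sql, params)
```

Even this toy shows where the difficulty starts: filters and projections map onto SQL easily, but anything order-dependent or row-by-row in pandas has no such direct translation.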


Previous post

Unfortunately that's not possible, because SQL is a higher-level language than Python.

With pandas you specify both what you want and how to compute it, whereas with SQL you specify only what you want. The SQL server is then free to decide how to serve your query. When you add an index to a table, the SQL server can use that index to serve your query faster without you rewriting the query.

If you had to instruct the database how to execute a query, you would also have to rewrite your SQL statements whenever you wanted them to use a new index.
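
A small experiment illustrates the point about indexes. The sketch below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained), with a made-up `users` table. The query text never changes, yet the plan the database chooses does once an index exists. In PostgreSQL you would inspect the plan with `EXPLAIN` instead.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (city, name) VALUES (?, ?)",
    [(f"city{i % 50}", f"user{i}") for i in range(1000)],
)

QUERY = "SELECT name FROM users WHERE city = ?"


def plan(connection, sql, params):
    """Return the planner's description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, QUERY, ("city7",))
rows_before = conn.execute(QUERY, ("city7",)).fetchall()

# Add an index; the query string itself is left untouched.
conn.execute("CREATE INDEX idx_users_city ON users (city)")

plan_after = plan(conn, QUERY, ("city7",))
rows_after = conn.execute(QUERY, ("city7",)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
```

The same statement returns the same rows both times; only the execution strategy differs, and that decision is made by the database rather than by the person writing the query.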


That being said, I commonly use the pattern in neurite's answer for analysis: SQL performs the initial aggregation (and reduces the size of the data), and pandas handles the remaining operations.
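
A minimal sketch of that pattern, assuming a made-up `sales` table and using an in-memory SQLite connection in place of PostgreSQL so that it runs on its own. With a real database you would pass a SQLAlchemy engine as `con` instead.

```python
import sqlite3

import pandas as pd

# Stand-in database: in-memory SQLite instead of PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("north", 10.0),
        ("north", 30.0),
        ("south", 5.0),
        ("south", 15.0),
        ("west", 40.0),
    ],
)

# Step 1: let the database aggregate, so only one row per region
# is transferred into memory.
summary = pd.read_sql(
    "SELECT region, SUM(amount) AS total, COUNT(*) AS n "
    "FROM sales GROUP BY region ORDER BY region",
    con=conn,
)

# Step 2: continue the analysis in pandas on the small aggregated result.
summary["mean_amount"] = summary["total"] / summary["n"]
summary["share"] = summary["total"] / summary["total"].sum()

print(summary)
```

The raw table never has to fit in memory; only the grouped result does.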


1

Not sure if this is exactly what you want, but you can load PostgreSQL tables into pandas and manipulate them from there.

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql.html
http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html

Shamelessly stolen from the pages referenced above:

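import pandas
from sqlalchemy import create_engine

# Connect through the pg8000 driver; replace the credentials and database with your own.
engine = create_engine(
    'postgresql+pg8000://scott:tiger@localhost/test',
    isolation_level='READ UNCOMMITTED'
)
# Replace <TABLE> with the name of the table to load.
df = pandas.read_sql('SELECT * FROM <TABLE>;', con=engine)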

1 Comment

Thanks, but the table is too big to be loaded into memory (and moving it to an HDF5-type storage is not planned for the near future). What I am looking for is an interface that interacts directly with the database.
