I have about 100 .db files stored on my Google Drive which I want to run the same SQL query on. I'd like to store these query results in a single .csv file.

I've managed to use the following code to write the results of a single SQL query into a .csv file, but I am unable to make it work for multiple files.

import sqlite3
import pandas as pd

conn = sqlite3.connect('/content/drive/My Drive/Data/month_2014_01.db')

df = pd.read_sql_query("SELECT * FROM messages INNER JOIN users ON messages.id = users.id WHERE text LIKE '%house%'", conn)

df.to_csv('/content/drive/My Drive/Data/Query_Results.csv')

This is the code that I have used so far to try and make it work for all files, based on this post.

databases = []

directory = '/content/drive/My Drive/Data/'
for filename in os.listdir(directory):
    flname = os.path.join(directory, filename)
    databases.append(flname)

for database in databases:
    try:
        with sqlite3.connect(database) as conn:

            conn.text_factory = str
            cur = conn.cursor()
            cur.execute(row["SELECT * FROM messages INNER JOIN users ON messages.id = users.id WHERE text LIKE '%house%'"])
            df.loc[index,'Results'] = cur.fetchall()

    except sqlite3.Error as err:
        print ("[INFO] %s" % err)

But this throws me an error: TypeError: tuple indices must be integers or slices, not str. I'm obviously doing something wrong and I would much appreciate any tips that would point towards an answer.
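
The message itself says that a tuple is being indexed with a string. In the loop above, the likeliest place is `row["SELECT ..."]`, since `cur.execute()` expects the SQL string directly. A minimal reproduction, assuming `row` is a tuple left over from earlier code (it is never defined in the snippet, so the value below is a hypothetical stand-in):

```python
import sqlite3

# Hypothetical stand-in for `row`; the snippet above never defines it.
row = ("month_2014_01.db", 42)

query = "SELECT 1 AS answer"

try:
    row[query]  # indexing a tuple with a str raises TypeError
except TypeError as err:
    message = str(err)
    print(message)

# cursor.execute() takes the SQL string itself, not row[...]
conn = sqlite3.connect(":memory:")
result = conn.execute(query).fetchall()
conn.close()
print(result)
```

Passing the query string straight to `cur.execute(...)` removes this particular error; `df` and `index` in the following line would still need to be defined somewhere.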

  • Which line of code throws the error? Commented Apr 25, 2020 at 18:16
  • you should consider accepting Parfait's excellent answer Commented Oct 24, 2021 at 0:19

1 Answer

Consider building a list of data frames, then concatenating them into a single data frame with pandas.concat:

import os
import sqlite3
import pandas as pd

gdrive = "/content/drive/My Drive/Data/"
sql = """SELECT * FROM messages 
          INNER JOIN users ON messages.id = users.id 
          WHERE text LIKE '%house%'
      """

def build_df(db):
    # RUN THE QUERY AGAINST ONE DATABASE FILE AND RETURN A DATA FRAME
    with sqlite3.connect(os.path.join(gdrive, db)) as conn:
        df = pd.read_sql_query(sql, conn)

    return df

# BUILD LIST OF DFs WITH LIST COMPREHENSION
df_list = [build_df(db) for db in os.listdir(gdrive) if db.endswith('.db')]

# CONCATENATE ALL DFs INTO SINGLE DF FOR EXPORT
final_df = pd.concat(df_list, ignore_index = True)

final_df.to_csv(os.path.join(gdrive, 'Query_Results.csv'), index = False)
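
One caveat on the `with` block in build_df: sqlite3's connection context manager commits or rolls back the transaction, but it does not close the connection. With about 100 files that may leave many connections open until they are garbage collected. A minimal stdlib sketch of the difference, using a throwaway database in a temporary directory (the file name, table `t`, and the helper `fetch_rows` are made up for illustration):

```python
import os
import sqlite3
import tempfile
from contextlib import closing

# Throwaway database so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

with sqlite3.connect(path) as conn:
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")

# The transaction was committed on exit, but conn is still open here.
still_open = conn.execute("SELECT x FROM t").fetchall()
conn.close()


def fetch_rows(db_path, sql):
    """Run sql against one database file and close the connection afterwards."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(sql).fetchall()


rows = fetch_rows(path, "SELECT x FROM t")
print(still_open, rows)
```

The same `closing(...)` wrapper should work around the `sqlite3.connect(...)` call in build_df, since `pd.read_sql_query` only needs an open connection for the duration of the call.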

Better yet, consider SQLite's ATTACH DATABASE and append query results into a master table. This avoids pulling in pandas, a heavy third-party data science library, for a simple data migration task. It also keeps all the data inside SQLite, so you do not have to worry about data type conversion or I/O transfer issues.

import csv
import os
import sqlite3

gdrive = "/content/drive/My Drive/Data/"
first_db = 'month_2014_01.db'

with sqlite3.connect(os.path.join(gdrive, first_db)) as conn:
    # CREATE MASTER TABLE FROM THE FIRST DATABASE'S OWN TABLES
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS master_query")
    cur.execute("""CREATE TABLE master_query AS
                   SELECT * FROM messages
                   INNER JOIN users
                       ON messages.id = users.id
                   WHERE text LIKE '%house%'
                """)
    conn.commit()

    # ITERATIVELY ATTACH THE REMAINING DATABASES AND APPEND RESULTS
    for db in os.listdir(gdrive):
        if db.endswith('.db') and db != first_db:
            # ATTACH NEEDS THE FULL PATH, NOT JUST THE FILE NAME
            cur.execute("ATTACH DATABASE ? AS tmp", [os.path.join(gdrive, db)])
            cur.execute("""INSERT INTO master_query
                           SELECT * FROM tmp.messages
                           INNER JOIN tmp.users
                               ON tmp.messages.id = tmp.users.id
                           WHERE text LIKE '%house%'
                        """)
            conn.commit()                       # COMMIT BEFORE DETACHING
            cur.execute("DETACH DATABASE tmp")

    # WRITE ROWS TO CSV
    data = cur.execute("SELECT * FROM master_query")

    # PYTHON 3: OPEN IN TEXT MODE WITH newline='' FOR THE csv MODULE
    with open(os.path.join(gdrive, 'Query_Results.csv'), 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow([i[0] for i in cur.description])  # HEADERS
        writer.writerows(data)                            # DATA

    cur.close()
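
To try the ATTACH approach without touching the real files, here is a self-contained sketch that builds two tiny databases in a temporary directory and runs the same attach, insert, detach loop over them. It is a variation under stated assumptions, not the exact code above: the column layout (`id`, `name`, `text`), the sample rows, the helper `make_source_db`, and the separate `master.sqlite` file are all invented for illustration. Writing to a separate master file means none of the source databases is modified.

```python
import csv
import os
import sqlite3
import tempfile
from contextlib import closing

workdir = tempfile.mkdtemp()  # stand-in for the Google Drive folder


def make_source_db(path, messages):
    """Create a tiny database with hypothetical messages/users tables."""
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
        conn.execute("CREATE TABLE messages (id INTEGER, text TEXT)")
        conn.executemany("INSERT INTO users VALUES (?, ?)",
                         [(i, "user%d" % i) for i, _ in messages])
        conn.executemany("INSERT INTO messages VALUES (?, ?)", messages)
        conn.commit()


make_source_db(os.path.join(workdir, "month_2014_01.db"),
               [(1, "my house"), (2, "my car")])
make_source_db(os.path.join(workdir, "month_2014_02.db"),
               [(3, "house hunting"), (4, "big house")])

query = """SELECT m.id, u.name, m.text
           FROM tmp.messages AS m
           INNER JOIN tmp.users AS u ON m.id = u.id
           WHERE m.text LIKE '%house%'"""

sources = sorted(f for f in os.listdir(workdir) if f.endswith(".db"))
master_path = os.path.join(workdir, "master.sqlite")  # not matched by '.db'
csv_path = os.path.join(workdir, "Query_Results.csv")

with closing(sqlite3.connect(master_path)) as conn:
    conn.execute("CREATE TABLE master_query (id INTEGER, name TEXT, text TEXT)")
    for name in sources:
        conn.execute("ATTACH DATABASE ? AS tmp", (os.path.join(workdir, name),))
        conn.execute("INSERT INTO master_query " + query)
        conn.commit()  # end the transaction before detaching
        conn.execute("DETACH DATABASE tmp")

    cur = conn.execute("SELECT * FROM master_query ORDER BY id")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # headers
        writer.writerows(cur)                                 # data

# Read the CSV back to check what was exported.
with open(csv_path, newline="") as f:
    exported = list(csv.reader(f))
print(exported)
```

Selecting explicit columns instead of `SELECT *` also sidesteps the duplicate `id` column that the join would otherwise produce.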