
I have a CSV file that looks something like this:

Date,Person,Time Out,Time Back,Restaurant,Calories,Delicious?
6/20/2016,August,11:58,12:45,Black Bear,850,Y
6/20/2016,Marcellus,12:00,12:30,Brought Lunch,,Y
6/20/2016,Jessica,11:30,12:30,Wendy's,815,N
6/21/2016,August,12:05,1:01,Brought Lunch,,Y

So far I have managed to print each row into a list of strings (ex. - ['Date', 'Person', 'Time Out', etc.] or ['6/20/2016', 'August', '11:58' etc.]).

Now I need to do 2 more things:

  1. Add an ID header and sequential numeric string to each row (for ex. - ['ID', 'Date', 'Person', etc.] and ['1', '6/20/2016', 'August', etc.])
  2. Separate the rows so that each can be formatted into an INSERT statement, rather than just having the program print every row one after another (for ex. - INSERT INTO Table ['ID', 'Date', 'Person', etc.] VALUES ['1', '6/20/2016', 'August', etc.])

Here is the code that has gotten me as far as I am now:

import csv

with open('test.csv', 'r') as openFile:
    csvFile = csv.reader(openFile)
    for row in csvFile:
        print(row)
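A minimal sketch of both remaining steps could use enumerate to prepend a sequential ID and build one INSERT statement per row. The table name Lunch is a placeholder, and the sample data is embedded via io.StringIO so the snippet runs on its own (in practice, pass the opened test.csv instead):

```python
import csv
import io

# Sample rows from the question; in practice use open('test.csv', newline='')
data = """Date,Person,Time Out,Time Back,Restaurant,Calories,Delicious?
6/20/2016,August,11:58,12:45,Black Bear,850,Y
6/20/2016,Marcellus,12:00,12:30,Brought Lunch,,Y
"""

def make_inserts(lines, table='Lunch'):                 # 'Lunch' is a placeholder
    reader = csv.reader(lines)
    header = ['ID'] + next(reader)                      # step 1: add an ID header
    cols = ', '.join(f'`{c}`' for c in header)
    stmts = []
    for i, row in enumerate(reader, start=1):           # step 1: sequential ID string
        vals = ', '.join("'" + v + "'" for v in [str(i)] + row)
        stmts.append(f'INSERT INTO {table} ({cols}) VALUES ({vals});')  # step 2
    return stmts

for stmt in make_inserts(io.StringIO(data)):
    print(stmt)
```

Note this simple quoting assumes the CSV values contain no quote characters; for real data, parameterized queries are safer.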
  • Is the ID column in your SQL table a primary key? If so, you could rely on SQL's auto_increment property and just ignore the ID column in the insert query. Commented Jun 30, 2016 at 22:57
  • Why don't you use MySQL's built-in LOAD DATA INFILE to load directly from the CSV file to the database, instead of parsing it in Python? Commented Jul 1, 2016 at 0:19
  • @Mumpo, Yes it is. I am not very familiar with MySQL, so I did not know this was an option, but it seems far more viable than what I had listed above. Thanks for the tip. Commented Jul 1, 2016 at 15:53
  • @Barmar, Mainly because there are some other things that I want to do with the file before inserting it into MySQL, and I figured it wouldn't be too hard to mess around with this first. However, if it doesn't work, that is definitely an option. Thanks for pointing that out. Commented Jul 1, 2016 at 15:55

3 Answers


Try this (I ignored the ID part, since you can use MySQL's auto_increment):

import csv

openFile = open('test.csv', 'r')
csvFile = csv.reader(openFile)
header = next(csvFile)                              # read the header row once
headers = map(lambda x: '`' + x + '`', header)      # backtick-quote column names
insert = 'INSERT INTO Table (' + ", ".join(headers) + ") VALUES "
for row in csvFile:
    values = map(lambda x: '"' + x + '"', row)      # quote each value
    print(insert + "(" + ", ".join(values) + ");")
openFile.close()

6 Comments

That worked. Thanks for the help! As a side note, do you know if there would be any way to get rid of unwanted spaces between values in the VALUES list?
Those spaces were intentional, just replace the line before the last with this one: print (insert +"("+ ",".join(values) +");" )
Oh no, I understand that part; I was more so referring to the actual strings of data themselves (for ex. - 'UNKNOWN ', to instead be 'UNKNOWN',)
You can use .strip() to achieve that. values = map((lambda x: '"'+x.strip()+'"'), row)
Again worked perfectly. And just to make sure I get my money's worth, how would I go about getting this to work on a group of files in a directory rather than just one named file? I've tried using glob.glob(filePath) and then csv.reader(filePathObj), but instead of getting back the contents of the files, I get back the names instead (for ex. - (INSERT INTO Table 'C:/Users/etc') VALUES ('C:/Users/etc'))
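On the glob question in the last comment: glob.glob returns path strings, so each path has to be opened before being handed to csv.reader; passing the path string directly makes csv.reader iterate over the characters of the path itself. A minimal sketch (the helper name and pattern are illustrative):

```python
import csv
import glob

def read_all_rows(pattern):
    """Collect rows from every CSV file matching a glob pattern."""
    rows = []
    # glob.glob yields file *names*; open each one so csv.reader
    # parses the file's contents rather than the path string.
    for path in sorted(glob.glob(pattern)):
        with open(path, 'r', newline='') as f:
            rows.extend(csv.reader(f))
    return rows
```

Something like read_all_rows('C:/Users/etc/*.csv') would then feed all rows into the same INSERT-building loop.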

You can use these functions if you want to maintain type conversion; I have used them to put data into Google BigQuery with a SQL string statement.

PS: You can add other types to the function

import csv

def convert(value):
    # Try numeric types first; avoid shadowing the built-in `type`
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    # All numeric conversions failed, so it is a string
    return value


def construct_string_sql(file_path, table_name, schema_name):
    string_SQL = ''
    try:
        with open(file_path, 'r') as file:
            reader = csv.reader(file)
            headers = ','.join(next(reader))
            for row in reader:
                row = str([convert(x) for x in row])[1:-1]
                string_SQL += f'INSERT INTO {schema_name}.{table_name}({headers}) VALUES ({row});'
    except (OSError, StopIteration):  # narrow the bare except: file errors or an empty file
        return ''

    return string_SQL
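To see what the str(...)[1:-1] trick produces, here is a small self-contained check (convert is repeated so the snippet runs on its own); stripping the list's brackets leaves a comma-separated values string in which converted numbers lose their quotes:

```python
def convert(value):
    # Same fallthrough as above: int, then float, else keep the string.
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value

row = ['6/20/2016', 'August', '850']
# str() of the converted list, minus the surrounding brackets
rendered = str([convert(x) for x in row])[1:-1]
print(rendered)  # '6/20/2016', 'August', 850
```

Note this relies on Python's repr quoting, which uses single quotes, matching standard SQL string literals.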

Comments


You can use this open-source tool to generate batch INSERT statements: https://simranjitk.github.io/sql-converter/.

1 Comment

Winner 👑 And great that this is fully private. Top!
