72

I'm trying to import a very large .csv file (~4 GB) into MySQL. I was considering using phpMyAdmin, but it has a maximum upload size of 2 MB. Someone told me that I have to use the command line.

I was going to use these directions to import it: http://dev.mysql.com/doc/refman/5.0/en/mysqlimport.html#c5680

What would be the command to set the first row of the .csv file as the column names in the MySQL table? This option is available through phpMyAdmin, so there must be a MySQL command-line equivalent too, right? Please help me. Thank you.

-Raj

9 Answers

159

Try this command:

 load data local infile 'file.csv' into table table
 fields terminated by ','
 enclosed by '"'
 lines terminated by '\n'
 (column1, column2, column3,...)

The fields here are the actual table fields that the data needs to sit in. The enclosed by and lines terminated by clauses are optional and can help if you have columns enclosed with double quotes, such as Excel exports, etc.

For further details check the manual.

To use the first row as the table column names, just skip that row when it is read (with ignore 1 lines) and list the column names yourself in the command.
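For example, a minimal sketch, assuming a table named mytable whose columns col1, col2, col3 match the CSV header (all names here are placeholders):

 load data local infile 'file.csv' into table mytable
 fields terminated by ','
 enclosed by '"'
 lines terminated by '\n'
 ignore 1 lines
 (col1, col2, col3)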


18 Comments

A few questions. Firstly, do column1, column2, etc. need to have quotes around them? And my issue is that there are 50+ column names I need to import. The first line contains all of those names, so if there were some way for MySQL to read the first line and set them as the column names, that would be best. It would be way too tedious to write each name one by one. Thank you.
You don't need to write them out; the column names are already comma separated, so just cut the first line from your csv file and paste it into the command (see the sketch after these comments). And as far as I know, phpMyAdmin reads the csv first to generate a query like this and then does the import.
Hmm, I did not think of that. Let me give that a go. Thank you. Do you, by any chance, know how to copy from a text editor and paste into the Ubuntu terminal?
Either right-click and you'll get a Paste option, or use Ctrl+Shift+V.
Okay, well I need to upload the 4 GB file to the server and try it. Thanks for the help!
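A quick way to grab that header row so it can be pasted into the column list (a sketch, assuming the file is named file.csv as in the answer above):

head -n 1 file.csv
# prints something like: column1,column2,column3,...
# paste that comma-separated list into the (...) part of the LOAD DATA command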
23

try this:

mysql -uusername -ppassword --local-infile scrapping -e "LOAD DATA LOCAL INFILE 'CSVname.csv'  INTO TABLE table_name  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'"

1 Comment

This worked for me; however, I was getting the error ERROR 3948 (42000) at line 1: Loading local data is disabled; this must be enabled on both the client and server sides, and the resolution was at stackoverflow.com/a/60717467/6487887. (Added here so that other folks can go directly to that link.)
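In case it helps others hitting the same error, a rough sketch of enabling it on both sides (this assumes you have privileges to change global variables on the server):

# server side: run as a user with sufficient privileges
mysql -uroot -p -e "SET GLOBAL local_infile = 1"

# client side: pass the flag when connecting
mysql --local-infile=1 -uusername -p db_name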
16

You could do a

mysqlimport --columns='head -n 1 $yourfile' --ignore-lines=1 dbname $yourfile

That is, assuming your file is comma separated rather than semicolon separated; otherwise you might need to run it through sed first.

4 Comments

does the table have to be created already with the headers? And what format is $yourfile, would *.csv work?
Your import file needs the headers; 'head -n 1 $yourfile' returns the first comma-separated row of your CSV file, and --ignore-lines=1 then skips that row so it is not inserted into your table. The default delimiter is tab (\t), so also add a --fields-terminated-by=',' clause to use a comma delimiter instead. Please read linux.die.net/man/1/mysqlimport and dev.mysql.com/doc/refman/5.7/en/load-data.html
Shouldn't this be mysqlimport --columns=$(head -n 1 FILE) --ignore-lines=1 dbname FILE, where FILE is the CSV file? Having --columns='head -n 1 $yourfile' produces a syntax error. Also, you may have to add the option --local, since many MySQL servers are by default configured with the --secure-file-priv option.
--columns='head -n 1 $yourfile' raises a syntax error; --columns=$(head -n 1 FILE) worked.
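Putting the corrections from these comments together, a working invocation would look roughly like this (a sketch, assuming a comma-separated data.csv whose columns match an existing table named data in dbname):

mysqlimport --local --ignore-lines=1 --fields-terminated-by=',' --columns="$(head -n 1 data.csv)" -u username -p dbname data.csv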
10

You can simply import with:

mysqlimport --ignore-lines=1 --lines-terminated-by='\n' --fields-terminated-by=',' --fields-enclosed-by='"' --verbose --local -uroot -proot db_name csv_import.csv

Note: the CSV file name and table name must be the same, because mysqlimport derives the target table name from the file name.
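If your file name doesn't match the table, one workaround is simply to copy (or symlink) the file first; a sketch, assuming the target table is called products:

cp csv_import.csv products.csv
mysqlimport --ignore-lines=1 --lines-terminated-by='\n' --fields-terminated-by=',' --fields-enclosed-by='"' --verbose --local -uroot -p db_name products.csv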


6

For importing a csv with a header row using mysqlimport, just add

--ignore-lines=N

(ignores the first N lines of the data file)

This option is described on the page you've linked.

2 Comments

I'm not trying to ignore the first line. I want to use the first line as column headers.
You can't do this with mysqlimport, but you can add the option --columns=column_list to give the command the order of your csv fields for your table.
1

Another option is to use the csvsql command from the csvkit library.

Example usage directly on command line:

csvsql --db mysql:///test --tables yourtable --insert yourfile.csv

This can be executed directly on the command line, or built into a python or shell script for automation if you need to do this for a number of files.

csvsql allows you to create database tables on the fly based on the structure of your csv, so it is a low-code way of getting the first row of your csv to automatically become the MySQL column names.
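For example, to preview the CREATE TABLE statement that csvsql infers from the header row before touching any database (a sketch, assuming csvkit is installed and yourfile.csv exists):

csvsql -i mysql yourfile.csv
# prints the inferred CREATE TABLE statement without connecting to a database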

Full documentation and further examples here: https://csvkit.readthedocs.io/en/1.0.3/scripts/csvsql.html


0

I know this says command line, but here is a quick tidbit to try that might work. If you've got MySQL Workbench and the csv isn't too large, you can simply:

  • SELECT * FROM table
  • Copy entire CSV
  • Paste csv into the query results section of Workbench
  • Hope for the best

I say hope for the best because this is MySQL Workbench; you never know when it's going to explode.


If you want to do this on a remote server, you would do

mysql -h<server|ip> -u<username> -p --local-infile bark -e "LOAD DATA LOCAL INFILE '<filename.csv>'  INTO TABLE <table>  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'"

Note: I didn't put a password after -p, as putting one on the command line is considered bad practice.


0

Most answers miss an important point: if your csv file was exported from Microsoft Excel on Windows and you import it in a Linux environment, you will get unexpected results.

The correct syntax would be

load data local infile 'file.csv' into table table fields terminated by ',' enclosed by '"' lines terminated by '\r\n'

The difference here is '\r\n' as against simply '\n', because Windows line endings include a carriage return.
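If you're not sure which line endings a file has, a quick check and in-place conversion on Linux could look like this (dos2unix is assumed to be installed; sed works as an alternative):

file data.csv          # reports "CRLF line terminators" for Windows-style files
dos2unix data.csv      # converts \r\n to \n in place
# or: sed -i 's/\r$//' data.csv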


0

Most of the answers above are correct and revolve around uploading the data from the terminal with local_infile, but the problem with this approach is that if you are on shared hosting with a phpMyAdmin instance, you might get stuck with the setting below, where your shared hosting provider won't let you change local_infile.

+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| local_infile  | OFF   |
+---------------+-------+

As a workaround, I had to insert about 200,000 rows into the database, so I wrote the shell script below, which did the job. You can increase or decrease the BATCH_SIZE as per your use case.

#!/bin/bash

# MySQL credentials
DB_HOST="host"
DB_USER="db_user"
DB_PASS="db_pass"
DB_NAME="db_name"
TABLE_NAME="table_name"

# Path to the CSV file
CSV_FILE="data.csv"

# Field Separator (comma in this case)
IFS=','

# Batch size
BATCH_SIZE=1000
counter=0
SQL_BATCH="INSERT INTO $TABLE_NAME (sub_category, product_name, product_composition, product_price, product_manufactured, product_desc, product_usp, product_interactions) VALUES "

# Read CSV file line by line
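# Note: splitting on IFS=',' is naive CSV parsing; fields that themselves
# contain commas or quoted strings will not be handled correctly.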
while read -r sub_category product_name product_composition product_price product_manufactured product_desc product_usp product_interactions; do

  # Escape single quotes to prevent SQL syntax errors
  sub_category=$(echo "$sub_category" | sed "s/'/''/g")
  product_name=$(echo "$product_name" | sed "s/'/''/g")
  product_composition=$(echo "$product_composition" | sed "s/'/''/g")
  product_price=$(echo "$product_price" | sed "s/'/''/g")
  product_manufactured=$(echo "$product_manufactured" | sed "s/'/''/g")
  product_desc=$(echo "$product_desc" | sed "s/'/''/g")
  product_usp=$(echo "$product_usp" | sed "s/'/''/g")
  product_interactions=$(echo "$product_interactions" | sed "s/'/''/g")

  # Append the current row values to the SQL batch
  SQL_BATCH="$SQL_BATCH ('$sub_category', '$product_name', '$product_composition', '$product_price', '$product_manufactured', '$product_desc', '$product_usp', '$product_interactions'),"

  # Increment the counter
  ((counter++))

  # If we have reached the batch size, execute the SQL
  if [[ $counter -eq $BATCH_SIZE ]]; then
    # Remove the last comma and add a semicolon to complete the SQL statement
    SQL_BATCH="${SQL_BATCH%,};"
    
    # Execute the batch insert
    mysql -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASS" -D "$DB_NAME" -e "$SQL_BATCH"
    
    # Reset the batch and counter
    SQL_BATCH="INSERT INTO $TABLE_NAME (sub_category, product_name, product_composition, product_price, product_manufactured, product_desc, product_usp, product_interactions) VALUES "
    counter=0
  fi

done < "$CSV_FILE"

# Execute the remaining records if there are any
if [[ $counter -gt 0 ]]; then
  # Remove the last comma and add a semicolon
  SQL_BATCH="${SQL_BATCH%,};"
  
  # Execute the remaining batch
  mysql -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASS" -D "$DB_NAME" -e "$SQL_BATCH"
fi

echo "Data import complete."

This workaround might take some time for large data sets, but it does the job.

