
I'm trying to take backups of the tables in my database server.

I have around 200 tables, and I have a shell script that contains a backup command for each table, like:

backup.sh

psql -u username ..... table1 ... file1;
psql -u username ..... table2 ... file2;
psql -u username ..... table3 ... file3;

I can run the script and create backups on my machine, but since there are 200 tables the commands run sequentially and it takes a lot of time. I want to run the backup commands in parallel. I have seen articles suggesting the use of && after each command, or the nohup or wait commands.

But I don't want to edit the script and change around 200 commands that way.

Is there any way to run this list of shell commands in parallel, something like Node.js does? Is it possible, or am I looking at this the wrong way?

Sample command in the script:

psql --host=somehost --port=5490 --username=user --dbname=db -c '\copy dbo.tablename TO "/home/username/Desktop/PostgresFiles/tablename.csv" with DELIMITER ","';
  • Chances are the database is already giving you the tables as fast as it can, and attempting to make it do more things at the same time will just produce congestion.
  • My shell commands use \copy, which is client-side CSV creation and was working fine when I ran the script normally. But with the approach mentioned above, it treats \copy as COPY, the server-side copy, which the db user doesn't have access to.
  • @tripleee I updated the question with a sample command from the script.
  • That should not happen unless there are things going on which you aren't showing, such as passing the command to a remote shell.
  • I tried the command mentioned in the answer below by drldcsta.

3 Answers


You can leverage xargs to run commands in parallel AND control the number of concurrent jobs. Running 200 backup jobs at once might overwhelm your database and result in less than optimal performance.

Assuming you have backup.sh with one backup command per line

xargs -P5 -I{} bash -c "{}" < backup.sh

The commands in backup.sh should be modified to allow quoting (using single quotes where possible and escaping double quotes):

psql --host=somehost --port=5490 --username=user --dbname=db -c '\copy dbo.tablename TO \"/home/username/Desktop/PostgresFiles/tablename.csv\" with DELIMITER \",\"';

Here -P5 controls the number of concurrent jobs. This will be able to process command lines WITHOUT unescaped double quotes. For the above script, you change "\copy ..." to '\copy ...'.

A simpler alternative would be to use a helper backup-table.sh, which takes two parameters (table, file), and use

xargs -P5 -I{} backup-table.sh "{}" < tables.txt

And put all the complex quoting into the backup-table.sh
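For illustration, a minimal sketch of what such a helper might look like (hypothetical; it reuses the host, port, user, and output directory from the question, takes the table name as its first argument, and falls back to a derived CSV path when no output file is given):

#!/usr/bin/env bash
# backup-table.sh -- hypothetical helper: export one table to a CSV file.
# Usage: backup-table.sh <tablename> [outfile]
set -euo pipefail

table=$1
outfile=${2:-/home/username/Desktop/PostgresFiles/${table}.csv}

# All the awkward quoting lives here, so the xargs command line stays trivial.
psql --host=somehost --port=5490 --username=user --dbname=db \
     -c "\copy dbo.${table} TO '${outfile}' with DELIMITER ','"

With the xargs invocation above and a tables.txt holding one table name per line, only the first parameter is supplied and the output file is derived from the table name. The helper needs to be executable (chmod +x backup-table.sh) and can then be called as ./backup-table.sh.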


6 Comments

I'm getting psql: warning: extra command-line argument error for all the words in the script.
Oops, I did not realize that the full command has double quotes. I've posted a small update - but it will require changing the double quotes in backup.sh to single quotes, as per the answer. Hope it gets you more mileage.
Same issue even after replacing " with ' and ' with ".
Can you share a sample line from backup.sh (after the changes)?
Updated in the question. BTW, I'm using Ubuntu. I suppose it shouldn't cause an issue.
doit() {
  # Export one table to CSV; GNU parallel passes the table name in as $1.
  table=$1
  psql --host=somehost --port=5490 --username=user --dbname=db -c '\copy dbo.'"$table"' TO "/home/username/Desktop/PostgresFiles/'"$table"'.csv" with DELIMITER ","'
}
export -f doit   # make the function visible to the shells parallel spawns

sql --listtables -n postgresql://user:pass@host:5490/db | parallel -j0 doit
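If the sql helper used above is not installed, the table list can also come straight from psql. A hedged sketch, assuming the tables live in the dbo schema as in the question, and capping concurrency at 5 rather than -j0 (which runs as many jobs as it can):

psql --host=somehost --port=5490 --username=user --dbname=db -At \
     -c "SELECT tablename FROM pg_tables WHERE schemaname = 'dbo'" |
  parallel -j5 doit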

Comments


Is there any logic in the script other than the individual commands? (E.g. any ifs or processing of output?)

If it's just a file with a list of commands, you could write a wrapper for the script (or run a loop from the CLI), e.g.:

$ cat help.txt
echo 1
echo 2
echo 3

$ while read -r i;do bash -c "$i" &done < help.txt
[1] 18772
[2] 18773
[3] 18774
1
2
3
[1]   Done                    bash -c "$i"
[2]-  Done                    bash -c "$i"
[3]+  Done                    bash -c "$i"

$ while read -r i;do bash -c "$i" &done < help.txt
[1] 18820
[2] 18821
[3] 18822
2
3
1
[1]   Done                    bash -c "$i"
[2]-  Done                    bash -c "$i"
[3]+  Done                    bash -c "$i"

Each line of help.txt contains a command, and I run a loop where I take each command and run it in a subshell. (This is a simple example where I just background each job. You could get more complex using something like xargs -P or parallel, but this is a starting point.)
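A minimal variation of the same loop for the backup script from the question, which also waits for every background job to finish before the shell moves on (assuming backup.sh holds one psql command per line, as described in the question):

while read -r i; do
  bash -c "$i" &   # run each backup command in its own background subshell
done < backup.sh
wait               # block until every backgrounded backup has exited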

2 Comments

My shell commands use \copy, which is client-side CSV creation and was working fine when I ran the script normally. But with the approach mentioned above, it treats \copy as COPY, the server-side copy, which the db user doesn't have access to.
You need to use read -r so that read doesn't mangle backslashes. You basically always want the -r option with read unless you specifically require the wacky legacy behavior of the original sh.
