
Apologies if this has been answered, I'm somewhat new to Linux but I didn't see anything here that was on target.

Anyway, I'm running this command:

find 2013-12-28 -name '*.gz' | xargs zcat | gzip > /fast/me/2013-12-28.csv.gz

The issue is that I need to run this command for about 250 distinct dates, so doing this one at a time is quite tedious.

What I want to do is have a script that will increment the date by 1 day after the "find" and in the file name. I really don't even know what this would look like, what commands to use, etc.

Background:

The find command is being run in a folder that's full of folders, one per day of data. Each day's folder contains 24 subfolders, and each subfolder holds about 100 gzipped CSV files. Running find from two levels up is necessary because it has to scan through every subfolder to combine all the data. The end result is that all the gzipped files are combined into one large gzipped file.
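To make the layout concrete, here is a miniature reproduction of the structure described above (all names are illustrative, scaled down to two subfolders and one file each) that the single-date pipeline can be tested against:

```shell
#!/bin/bash
# Sketch: recreate a tiny version of the described layout in a scratch
# directory and combine one day's files, using the naming from the question.
set -e
tmp=$(mktemp -d)
cd "$tmp"

mkdir -p 2013-12-28/0-0 2013-12-28/0-1
echo "a,1" | gzip > 2013-12-28/0-0/00:00:00.csv.gz
echo "b,2" | gzip > 2013-12-28/0-1/00:05:00.csv.gz

# find descends into every subfolder; zcat decompresses each file, and
# the concatenated stream is recompressed into one archive
find 2013-12-28 -name '*.gz' | xargs zcat | gzip > 2013-12-28.csv.gz

zcat 2013-12-28.csv.gz   # prints both rows; order depends on find
```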

If anyone can help it would be hugely appreciated, otherwise I have about 250 more commands to execute, which obviously will suck.

  • Do these top-level folders contain the dates in their names? That would make things easier. Commented Dec 4, 2014 at 21:00
  • Hi eigenchris, yes they do. Commented Dec 4, 2014 at 21:02
  • Hi eigenchris, yes the top level folders are all named like "2014-01-01", "2014-01-02", etc. The subfolders are named "0-0", "0-1", "0-2", etc. The actual files look like "00:00:00.csv.gz", "00:05:00.csv.gz", "00:10:00.csv.gz", etc. Commented Dec 4, 2014 at 21:06
  • Are there actually dates being left out (excluded)? Commented Dec 4, 2014 at 21:12

2 Answers


What about something like this?

next_date="2013-12-28"
for i in {1..250}; do
    find "$next_date" -name '*.gz' | xargs zcat | gzip > "/fast/me/$next_date.csv.gz"
    next_date=$(date -d "$next_date +1 day" +%Y-%m-%d)
done

It should iterate through 250 dates, ending with:

2014-08-25
2014-08-26
2014-08-27
2014-08-28
2014-08-29
2014-08-30
2014-08-31
2014-09-01
2014-09-02
2014-09-03
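The relative-date arithmetic that drives the loop can be sanity-checked on its own. This assumes GNU date (the `-d` flag with relative-date strings is a GNU coreutils feature, not POSIX):

```shell
#!/bin/bash
# Check GNU date's relative-date arithmetic across a year boundary
d="2013-12-31"
next=$(date -d "$d +1 day" +%Y-%m-%d)
echo "$next"   # 2014-01-01
```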

3 Comments

  • Hmm, it made the files OK, but it didn't seem to execute the find command correctly. The output files should be between 500 MB and 1.5 GB, but the ones it made are empty.
  • Thanks for that though! Awesome stuff :)
  • OK, it worked! Thanks so much! I just had to remove the quotes in the 4th line around $next_date.

jmunsch's solution works very well if the dates are sequential. Otherwise you could do this:

(edited to replace dash characters with colons)

for folderName in $(find . -mindepth 1 -maxdepth 1 -type d)  # assumes folder names contain no whitespace
do
   date=$(basename "$folderName")
   dateWithColons=$(echo "$date" | sed "s#-#:#g")  # this will replace - with :
   find "$folderName" -name '*.gz' | xargs zcat | gzip > "/fast/me/$dateWithColons.csv.gz"
done
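Since the date folders sit directly under the working directory, a plain glob does the same job without `find`'s word-splitting risk, and bash parameter expansion can replace the `sed` call. A sketch, demonstrated in a scratch directory (the question's /fast/me is replaced here with a local "out" folder):

```shell
#!/bin/bash
# Glob-based variant: loop over immediate subdirectories with a glob;
# a trailing slash in the pattern matches directories only.
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir -p 2014-01-02/0-0 out
echo "x,9" | gzip > 2014-01-02/0-0/00:00:00.csv.gz

for folderName in 2*/; do               # 2*/ matches the date folders here
    date=${folderName%/}                # strip the trailing slash
    dateWithColons=${date//-/:}         # bash substitution instead of sed
    find "$folderName" -name '*.gz' | xargs zcat | gzip > "out/$dateWithColons.csv.gz"
done

ls out   # 2014:01:02.csv.gz
```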

Comments
