Using cut and grep commands in unix

Question

I have a file (file1.txt) with text as:

aaa,,,,,
aaa,10001781,,,,
aaa,10001782,,,,
bbb,10001783,,,,

My file2 contents are:

11111111
10001781
11111222

I need to look up the second field of each line of file1.txt in file2 and delete the line from file1.txt if it matches. So the output will be:

aaa,,,,,
aaa,10001782,,,,
bbb,10001783,,,,

Can I use grep and cut commands for this?

John1024 · Accepted Answer · 2015-06-08 17:30:40Z

8

This prints lines from file1.txt only if the second field is not in file2:

$ awk -F, 'FNR==NR{a[$1]=1; next;} !a[$2]' file2 file1.txt
aaa,,,,,
aaa,10001782,,,,
bbb,10001783,,,,

How it works

This works by reading file2 and recording every value seen in an associative array a. Then, each line in file1.txt is printed only if its column 2 is not in a. In more detail:

FNR==NR{a[$1]=1; next;}

When reading file2, set a[$1] to 1 to signal that we have seen the value on this line. We then instruct awk to skip the rest of the commands and start over on the next line.

This section is only run for file2 because file2 is listed first on the command line and FNR==NR only when we are reading the first file listed on the command line. This is because FNR is the number of lines read from the current file and NR is the total number of lines read so far. These two are equal only for the first file.

!a[$2]

When reading file1.txt, a[$2] evaluates to true if column 2 was seen in file2. Since ! is negation, !a[$2] evaluates to true when column 2 was not seen. When this evaluates to true, the line is printed.
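To watch the FNR==NR test at work, the following sketch recreates the two sample files from the question in a scratch directory and prints both counters for every line. The scratch directory, the use of mktemp -d, and the variable name trace are assumptions made only for this illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'aaa,,,,,' 'aaa,10001781,,,,' 'aaa,10001782,,,,' 'bbb,10001783,,,,' > "$tmpdir/file1.txt"
printf '%s\n' 11111111 10001781 11111222 > "$tmpdir/file2"

# Print the file name, NR and FNR for each line. The two counters
# agree only while the first file (file2) is being read.
trace=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR, (FNR==NR ? "first file" : "later file") }' file2 file1.txt)
printf '%s\n' "$trace"
```

The three lines of file2 report equal values for NR and FNR. Once awk moves on to file1.txt, FNR restarts at 1 while NR continues from 4, so the first block no longer runs.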

Alternative

This is the same logic, expressed in a slightly different style, as suggested in the comments by Tom Fenech:

$ awk -F, 'FNR==NR{a[$1]; next;} !($2 in a)' file2 file1.txt
aaa,,,,,
aaa,10001782,,,,
bbb,10001783,,,,
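Both styles print the same lines, but they treat the array differently: looking up a[$2] creates an empty entry for any key that was not there yet, while $2 in a only tests membership. The sketch below counts the keys left in a at the end of each run. The scratch directory and the variable names keys_lookup and keys_in are hypothetical, and the normal printing is replaced by a dummy counter so that only the key count is output.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'aaa,,,,,' 'aaa,10001781,,,,' 'aaa,10001782,,,,' 'bbb,10001783,,,,' > "$tmpdir/file1.txt"
printf '%s\n' 11111111 10001781 11111222 > "$tmpdir/file2"

# Style 1: !a[$2] references a[$2], which adds an empty entry for every
# second field that was not already a key.
keys_lookup=$(awk -F, 'FNR==NR{a[$1]=1; next} !a[$2]{kept++} END{n=0; for (k in a) n++; print n}' "$tmpdir/file2" "$tmpdir/file1.txt")

# Style 2: ($2 in a) only tests membership and leaves the array unchanged.
keys_in=$(awk -F, 'FNR==NR{a[$1]; next} !($2 in a){kept++} END{n=0; for (k in a) n++; print n}' "$tmpdir/file2" "$tmpdir/file1.txt")

printf 'keys after !a[$2]: %s\n' "$keys_lookup"
printf 'keys after $2 in a: %s\n' "$keys_in"
```

With the sample data the first style ends with 6 keys (the 3 from file2 plus the empty string, 10001782 and 10001783), and the second ends with the original 3. For small inputs this makes no practical difference; for a very large file1.txt the in form keeps memory use tied to the size of file2.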

edited Jun 8, 2015 at 17:30

answered Jun 8, 2015 at 7:06

John1024


5 Comments

user1768029 Over a year ago

Thanks for your reply. As jurgemaister pointed out, all command versions on my Solaris box point to the old folder '/usr/bin/'. Hence I get the error "awk: syntax error near line 1, awk: bailing out near line 1".

anishsane Over a year ago

Try running which -a awk gawk. See if you have a newer awk installed.

John1024 Over a year ago

@user1768029 OK. In place of the default awk, try running nawk or /usr/xpg4/bin/awk or /usr/xpg6/bin/awk.

Tom Fenech Over a year ago

I would have probably gone with simply setting the key in the array a[$1]; then using $2 in a but either way works. Nice explanation anyway.

John1024 Over a year ago

@TomFenech Very good; I added code with your approach to the answer.

Jørgen R · Accepted Answer · 2015-06-08 07:09:50Z

1

Solution with grep

$ grep -vf file2 file1.txt
aaa,,,,,
aaa,10001782,,,,
bbb,10001783,,,,

John1024's awk solution would be faster for large files though.

answered Jun 8, 2015 at 7:09

Jørgen R


9 Comments

user1768029 Over a year ago

I get the error "grep: illegal option -- f". I am using the bash shell. Can we use this command in bash?

Jørgen R Over a year ago

Which OS and which version of grep do you use?

user1768029 Over a year ago

SunOS boc02 5.10 Generic_144488-12 sun4u sparc SUNW,Sun-Fire-V240

tripleee Over a year ago

Or just update your PATH to have /usr/xpg4/bin before /usr/bin. Usually this is set in your .profile or similar.

tripleee Over a year ago

This assumes that the field cannot occur as a substring of an unwanted field, and that it will not occur in another column than the one you intended to search. For these reasons, the Awk solution is much superior.
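The substring problem described in the comment above can be reproduced with a small change to the sample data. In this sketch file2 is deliberately modified to contain the short ID 1000178 (not part of the original question), and each ID is then rewritten as a regular expression tied to the second field. It assumes the IDs contain no regex metacharacters and that grep supports -E and -f; the scratch directory, the patterns file, and the variable names loose and strict are hypothetical.

```shell
# Sample file1.txt from the question, in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'aaa,,,,,' 'aaa,10001781,,,,' 'aaa,10001782,,,,' 'bbb,10001783,,,,' > "$tmpdir/file1.txt"
# Modified file2: 1000178 is a substring of every ID in file1.txt.
printf '%s\n' 11111111 10001781 1000178 > "$tmpdir/file2"

# Plain substring matching drops every line containing 1000178 anywhere.
loose=$(grep -vf "$tmpdir/file2" "$tmpdir/file1.txt")

# Turn each non-empty line of file2 into a pattern anchored to field 2:
#   ^[^,]*,VALUE(,|$)
sed -n '/./s/.*/^[^,]*,&(,|$)/p' "$tmpdir/file2" > "$tmpdir/patterns"
strict=$(grep -E -v -f "$tmpdir/patterns" "$tmpdir/file1.txt")

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

Here the plain grep -vf keeps only aaa,,,,, because 1000178 matches inside 10001782 and 10001783 as well, whereas the anchored patterns remove only the line whose second field is exactly 10001781. The awk answer does the same exact comparison without generating patterns.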
