Background information:

I am trying to write a small shell script that searches for a pattern (a string) in a .fas file and prints the line and position where the pattern was found. The following code snippet works when I call the shell script:

Script (search.sh):

#!/bin/bash

awk 's=index($0, "CAATCTCC"){print "line=" NR, "start position=" s}' 100nt_upstream_of_mTSS.fas 

Command line call:

$ ./search.sh

First problem:

When I change the script to:

awk 's=index($0, "CAATCTCC"){print "line=" NR, "start position=" s}' 

and do the following command line call in my bash:

$ ./search.sh 100nt_upstream_of_mTSS.fas

"Nothing" happens: something is running, but it takes far too long and no results appear, so I terminate the process.

Worth knowing:

  • I am in the directory where search.sh is located
  • the file 100nt_upstream_of_mTSS.fas is located there, too
  • search.sh is executable

I might be "screen blind", but I can't find the reason why I am unable to pass a command-line argument to my script.


Solution - see comments

Note: Only the first occurrence of the pattern in a line is found this way.
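
The note above follows from how index() works: it returns only the position of the first match in the string it is given. A minimal sketch of one way to report every occurrence is to call index() repeatedly on the remainder of the line. The function name find_all and the variable names are assumptions made for illustration, not part of the original script, and overlapping matches are reported too.

```shell
#!/bin/bash
# Sketch: print every occurrence (including overlapping ones) of a fixed string.
# find_all is a hypothetical helper; $1 = file to search, $2 = motif.
find_all() {
    awk -v motif="$2" '
    BEGIN { if (motif == "") exit 1 }   # an empty motif has no meaningful position
    {
        offset = 0                      # number of characters already skipped
        rest = $0
        while ((i = index(rest, motif)) > 0) {
            print "line=" NR, "start position=" (offset + i)
            offset += i                 # continue one character after this match start
            rest = substr(rest, i + 1)
        }
    }' "$1"
}

# Run the search only when the script is called with a file and a motif.
if [ "$#" -eq 2 ]; then
    find_all "$1" "$2"
fi
```

Saved as a script, this would be called like the original, for example with a file name and CAATCTCC as the two arguments.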


Second problem:

Furthermore, I would like to make the motif (the string I search for) variable. I tried this:

Script:

#!/bin/bash
FILE=$1
MOTIF=$2
awk 's=index($0, "$MOTIF"){print "line=" NR, "start position=" s}' "$FILE"

Command line call:

$ ./search.sh 100nt_upstream_of_mTSS.fas CAATCTCC

Idea: The first command-line argument worked and was substituted correctly. Why is the second one not substituted correctly?
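
The difference comes from shell quoting: "$FILE" sits outside the single-quoted awk program, so the shell expands it, while $MOTIF sits inside the single quotes, where the shell expands nothing and awk sees the literal text. A small sketch of the effect follows; the sample sequence and the variable names literal and passed are made up for illustration.

```shell
#!/bin/bash
MOTIF=CAATCTCC
line='TTCAATCTCCGG'   # made-up test sequence containing the motif

# Inside single quotes the shell leaves $MOTIF untouched, so awk searches
# for the literal characters "$MOTIF" and index() returns 0.
literal=$(printf '%s\n' "$line" | awk '{ print index($0, "$MOTIF") }')

# -v copies the shell value into an awk variable before the program runs.
passed=$(printf '%s\n' "$line" | awk -v motif="$MOTIF" '{ print index($0, motif) }')

echo "single-quoted: $literal, via -v: $passed"
# prints: single-quoted: 0, via -v: 3
```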

Solution so far:

Script:

#!/bin/bash
file=$1
awk -v s="$2" 'i=index($0, s){print "line: " NR, "pos: " i}' "$file"

Testing:

Testfile (test.txt):

GAGAGAGAGA
CTCTCTCTCT
TATATATATA
CGCGCGCGCG
CCCCCCCCCC
GGGGGGGGGG
AAAAAAAAAA
TTTTTTTTTT
TGATTTTTTT
CCCCCCCCGA

$ ./search.sh test.txt GA

will print:

line: 1 pos: 1
line: 4 pos: 2
line: 6 pos: 1
line: 9 pos: 2
line: 10 pos: 9

This script prints the line number and the first match position of only the first character of my pattern. How can I get all matches printed and have the full pattern used?

1 Answer

As far as I understand, you want to pass the input file (the file the script should process) as an argument. If that is the case, the following may help:

cat search.sh
#!/bin/bash
variable=$1
awk 's=index($0, "CAATCTCC"){print "line=" NR, "start position=" s}' "$variable"

./search.sh 100nt_upstream_of_mTSS.fas 

1 Comment

Yes, this helped! So awk needs the variable parameter at the end, and I missed that? I feel like I do not fully understand the awk structure yet, but I will do some more research.
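
On the awk structure: the command line has the form awk 'program' [file ...]. Arguments given to the shell script are not handed to awk automatically, and when awk gets no file name it reads standard input, which is why the first version appeared to hang (it was waiting for input from the terminal). A minimal sketch, assuming a wrapper function named search_motif (a hypothetical name) that forwards all of its arguments to awk:

```shell
#!/bin/bash
# search_motif is a hypothetical wrapper; the motif is hard-coded as in the question.
search_motif() {
    # "$@" forwards every file name given to the function on to awk.
    # With no file names at all, awk falls back to reading standard input.
    awk 's = index($0, "CAATCTCC") { print "line=" NR, "start position=" s }' "$@"
}
```

With this form, both search_motif 100nt_upstream_of_mTSS.fas and a pipe such as cat 100nt_upstream_of_mTSS.fas | search_motif would feed the same data to awk.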
