
Below is my script. I have added many print statements to work out why it is only accessing the first array element. The pattern match works. The array holds at least 40 elements; I have checked and it is full. I have printed each line, and each line prints.

my $index = 0;
open(FILE, "$file") or die "\nNot opening $file for reading\n\n";
open(OUT, ">$final") or die "Did not open $final\n";
while (<FILE>) {
    foreach my $barcode (@barcode) {
        my @line = <FILE>;
        foreach $_ (@line) {
            if ($_ =~ /Barcode([0-9]*)\t$barcode[$index]\t$otherarray[$index]/) {
                my $bar = $1;
                $_ =~ s/.*//;

                print OUT ">Barcode$bar"."_"."$barcode[$index]\t$otherarray[$index]";
            }
            print OUT $_;
        }
        $index++;
    }
}

Okay, let's say the input was:

File:
Barcode001    001    abc
Barcode002    002    def
Barcode003    003    ghi

@barcode holds:

001
002
003

@otherarray holds:

abc
def
ghi

The script currently prints only:

Barcode001_001 abc

It should be printing:

>Barcode001_001    abc
>Barcode002_002    def
>Barcode003_003    ghi

There should be a whole load more, up to ~40 lines.

Any ideas? There must be something wrong with the way I am accessing the array elements? Or incrementing? Hoping this isn't something too silly! Thanks in advance.

It needs the index because I am trying to match the arrays in parallel, as they are ordered: each line in the file has to match the elements at the corresponding index of both arrays.

  • Don't use $_ for outer loops. Name the variables. Commented Mar 9, 2015 at 15:59
  • @choroba, I have changed as you suggested, still not working though? Commented Mar 9, 2015 at 16:12
  • What does @barcode look like? Commented Mar 9, 2015 at 16:18
  • The @barcode looks like : 001, 002, 003, and the output expected would be Barcode001_001, Barcode002_002, Barcode003_003 Commented Mar 9, 2015 at 16:22
  • I have edited my question, sorry for my lack of clarity, I was trying to simplify it, I require the index because I am trying to match more than one array but at the same element index per line. Commented Mar 9, 2015 at 16:33

3 Answers

2

It's a little hard to answer with certainty without more information about the contents of @barcode and FILE, but there is something odd in your code that may well be the problem.

The construct while (<FILE>) { ... } will, until end of file, read a line from FILE into $_ and then execute the contents of the loop. In your code, you also read all the lines from FILE from within the loop that iterates over @barcode. I think it is likely that you intended to check each line from FILE against all the elements of @barcode, which would make the loop look like the following:

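while (my $line = <FILE>) {
    my $matched = 0;
    foreach my $barcode (@barcode) {
        if ($line =~ /Barcode([0-9]*)\t$barcode/) {
            my $bar = $1;
            print OUT ">Barcode$bar"."_"."$barcode\n";
            $matched = 1;
            last;    # stop at the first barcode that matches
        }
    }
    # Print the line unchanged once, and only if no barcode matched.
    print OUT $line unless $matched;
}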

I've taken the liberty of doing a bit of code tidying, but I may have made some unwarranted assumptions.

0

Your core problem in the above is that in your first iteration you slurp all of your file into @line. But because it's lexically scoped to the loop, it disappears when that iteration completes.

Furthermore:

  • I would strongly suggest that you don't use $_ like that. $_ is a special variable that's set implicitly in loops, so name your loop variables explicitly instead; reusing $_ across nested loops is a sure way to cause yourself pain.

  • turn on use strict; and use warnings;

  • use 3 argument open with a lexical filehandle.

  • perltidy your code, so the bracketing looks right.

  • you've a search and replace pattern on $_ that strips everything except the trailing newline (. doesn't match a newline by default), but then you're trying to print it. You may well not be printing what you think you're printing.

  • You're accessing <FILE> outside and inside your loop. This will cause you problems.

  • Barcode([0-9]*) - with a '*' there you're saying 'zero or more' is valid. You may want to consider \d+ - one or more digits.

  • referencing multiple arrays by index is messy. I'd suggest coalescing them into a hash lookup (lookup by key - barcode)
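
The two regex points in the list above can be checked in isolation. This is only a small sketch: the sample strings are made up to follow the tab-separated layout from the question, and the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line in the tab-separated layout from the question.
my $line = "Barcode001\t001\tabc\n";

# '.' does not match a newline by default, so s/.*// removes the text
# but leaves the trailing "\n" behind.
( my $stripped = $line ) =~ s/.*//;
print "after s/.*// the line is ",
    ( $stripped eq "\n" ? "a lone newline" : "something else" ), "\n";

# '*' accepts zero digits, so a line with no number at all still matches;
# '+' requires at least one digit.
my $no_digits = "Barcode\t001\tabc\n";
my $star_matches = $no_digits =~ /Barcode([0-9]*)\t/ ? 1 : 0;
my $plus_matches = $no_digits =~ /Barcode(\d+)\t/    ? 1 : 0;
print "[0-9]* matched: $star_matches, \\d+ matched: $plus_matches\n";
```

The leftover newline is probably why the original print statement gets away without a "\n" of its own: the following print OUT $_ supplies it.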

This line:

my @line = <FILE>;

reads your whole file into @line. But you do this inside the while loop that iterates over each line in <FILE>. Don't do that: the first pass reads everything that is left, so every later read gets nothing.
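
To see the effect without the real input file, here is a minimal sketch that uses an in-memory filehandle as a stand-in. The sample data and variable names are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An in-memory filehandle stands in for the real input file (hypothetical data).
my $data = "Barcode001\t001\tabc\nBarcode002\t002\tdef\nBarcode003\t003\tghi\n";
open( my $in, '<', \$data ) or die "Cannot open in-memory file: $!";

my $first = <$in>;    # scalar context: reads a single line
my @rest  = <$in>;    # list context: reads every remaining line
my @again = <$in>;    # the handle is at end of file, so this list is empty

printf "slurp got %d lines, second slurp got %d lines\n",
    scalar @rest, scalar @again;
close $in;
```

The first scalar read plays the part of while (<FILE>), and the two list reads play the part of my @line = <FILE> on the first and second pass of the foreach loop.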

Is this something like what you wanted?

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my @barcode = qw (
    001
    002
    003
);

my @otherarray = qw (
    abc
    def
    ghi
);

my %lookup;
@lookup{@barcode} = @otherarray;

print Dumper \%lookup;

#commented because I don't have your source data
#my $file  = "input_file_name";
#my $final = "output_file_name";

#open( my $input,  "<", $file )  or die "\nNot opening $file for reading\n\n";
#open( my $output, ">", $final ) or die "Did not open $final\n";

#while ( my $line = <$input> )
while ( my $line = <DATA> ) {
    foreach my $barcode (@barcode) {
        if ( my ($bar) = ( $line =~ /Barcode(\d+)\s+$barcode/ ) ) {
            print ">Barcode$bar" . "_" . "$barcode $lookup{$barcode}\n";
            #print {$output} ">Barcode$bar" . "_" . "$barcode $lookup{$barcode}\n";
        }
    }
}

__DATA__
Barcode001    001
Barcode002    002
Barcode003    003

Prints:

$VAR1 = {
          '001' => 'abc',
          '002' => 'def',
          '003' => 'ghi'
        };
>Barcode001_001 abc
>Barcode002_002 def
>Barcode003_003 ghi

2 Comments

Sorry, that is only a small part of my code, I always use strict and warnings. The search and replace is there because the script itself is a little more detailed and I cut down. However, the logic of your script is what I currently have in place. It is still just printing only the first element repeatedly. I will edit the question to make it clearer, thank you for trying to help :)
Looking at your script, I think the problem is my @line = <FILE>, which will slurp the whole file, but do so into a lexically scoped variable, which will be discarded at the end of that iteration of the loop and replaced with an empty one on the next iteration (because you've already read in all of <FILE>).
0

It turns out it was a simple issue, as I had suspected, it being a Monday. A colleague went through it with me, and the problem was the placement of the index:

my $index;    # Previously initialised once, here, and incremented only after
              # the inner loop had finished, so every line was compared with
              # element 0 of the arrays and only the first element was printed.
open(FILE, "$file") or die "\nNot opening $file for reading\n\n";
open(OUT, ">$final") or die "Did not open $final\n";
while (<FILE>) {
    $index = 0; # New placement of $index for initialising.
    foreach my $barcode (@barcode) {
        my @line = <FILE>;
        foreach $_ (@line) {
            if ($_ =~ /Barcode([0-9]*)\t$barcode[$index]\t$otherarray[$index]/) {
                my $bar = $1;
                $_ =~ s/.*//;
                print OUT ">Barcode$bar"."_"."$barcode[$index]\t$otherarray[$index]";
            }
            print OUT $_;
            $index++; # Increment here, once per line.
        }
        #$index++; # Old placement.
    }
}

Thanks to everyone for their responses. For my original, poorly worded question they would have worked and may be more efficient, but for the purpose of the script and my edited question it needs to be this way.
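
For anyone who wants the same per-line index idea without the nested reads, here is a rough sketch. It is not the script above: relabel_lines is a hypothetical helper, the lines are passed in as an array rather than read from FILE, and it assumes line N of the input pairs with element N of both arrays.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rewrites line N if it matches element N of both arrays.
sub relabel_lines {
    my ( $lines, $barcodes, $others ) = @_;
    my @out;
    my $index = 0;
    for my $line (@$lines) {
        my ( $code, $other ) = ( $barcodes->[$index], $others->[$index] );
        $index++;    # advance once per line, whether or not it matched
        if (   defined $code
            && defined $other
            && $line =~ /^Barcode(\d+)\t\Q$code\E\t\Q$other\E$/ )
        {
            push @out, ">Barcode${1}_$code\t$other\n";
        }
        else {
            push @out, $line;    # pass non-matching lines through unchanged
        }
    }
    return @out;
}

# Sample data taken from the question.
my @barcode    = qw(001 002 003);
my @otherarray = qw(abc def ghi);
my @lines      = (
    "Barcode001\t001\tabc\n",
    "Barcode002\t002\tdef\n",
    "Barcode003\t003\tghi\n",
);

print relabel_lines( \@lines, \@barcode, \@otherarray );
```

In a real script the for loop would presumably become while ( my $line = <$in> ) with the same index handling; \Q...\E is there only so that array values containing regex metacharacters are matched literally.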
