-4

I'm working on the following pipeline (codeshare.io/243VJE):

#!/bin/bash
#SBATCH --nodes=1 --ntasks=1 --cpus-per-task=10
#SBATCH --time=2-00:00:00
#SBATCH --mem=40gb
#
#SBATCH --job-name=pan_pca
#SBATCH --output=pan_pca.out
#SBATCH --array=[1-46]%46
#
#SBATCH --partition=DSS,NBFC

NAMES=$1
d=$(sed -n "$SLURM_ARRAY_TASK_ID"p $NAMES)


ID="$(echo ${d}/*_hap[0-1]_hprc_r2.fa.gz | sed 's#/path/to/temp_pca/[A-Z,0-9,a-z]\+_[0-9,a-z]\+/##' | sed 's#_hap[0-9]_hprc_r2.fa.gz##')"


HAP=( "hap0" "hap1" "hap2")

SEX="$(echo ${d} | sed 's#/path/to/temp_pca/[A-Z,0-9,a-z]\+_##')"

CHR_13=("chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
        "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17"
        "chr18" "chr19" "chr20" "chr21" "chr22" "chrX" "chrY" "chrM" )
CHR_38=("chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
        "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17"
        "chr18" "chr19" "chr20" "chr21" "chr22" "chrX" "chrY" "chrUn"
        "chrM" "chrEBV" )
CHR_P=("chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
       "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17"
       "chr18" "chr19" "chr20" "chr21" "chr22" "chrY" "chrUn" )
CHR_M=("chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
       "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17"
       "chr18" "chr19" "chr20" "chr21" "chr22" "chrX" "chrUn" "chrM" )

SEQKIT="./tools/seqkit"
BGZIP="./tools/bgzip"

cd ~

#PREPARES THE INPUT
if [ -e $d/select.done ]
then
  echo "regions selecion done!"
else
  for h in "${HAP[@]}"; do
        if [ "$h" == "hap0" ]
        then
            if [ "$SEX" == "ref" ]
            then
                for c in "${CHR_38[@]}"; do
          ###generate chromosome FASTA
                    ${SEQKIT} grep -f $d/${ID}_${c}_${h}_contigs_one.list $d/${ID}_${h}_hprc_r2.fa.gz -j 10 |
            ${BGZIP} -@10 > $d/${ID}_${c}.fna.gz
                done
            elif [ "$SEX" == "t2t" ]; then
                for c in "${CHR_13[@]}"; do
          ###generate chromosome FASTA
                    ${SEQKIT} grep -f $d/${ID}_${c}_${h}_contigs_one.list $d/${ID}_${h}_hprc_r2.fa.gz -j 10 |
            ${BGZIP} -@10 > $d/${ID}_${c}.fna.gz
                done
            elif [ "$SEX" == "m" ] || [ "$SEX" == "f" ]; then
                echo "nothing to be done, running in diploid mode"
            fi
        fi
        else
            if [ "$SEX" == "m" ]
      then
        if [ "$h" == "hap1" ]
        then
          for c in "${CHR_P[@]}"; do
            ###generate chromosome FASTA
            ${SEQKIT} grep -f $d/${ID}_${c}_${h}_contigs_one.list $d/${ID}_${h}_hprc_r2.fa.gz -j 10 |
              ${BGZIP} -@10 > $d/${ID}_${c}_${h}.fna.gz
          done
        else
          for c in "${CHR_M[@]}"; do
            ###generate chromosome FASTA
            ${SEQKIT} grep -f $d/${ID}_${c}_${h}_contigs_one.list $d/${ID}_${h}_hprc_r2.fa.gz -j 10 |
              ${BGZIP} -@10 > $d/${ID}_${c}_${h}.fna.gz
                    done
        fi
            elif [ "$SEX" == "f" ]; then
        for c in "${CHR_M[@]}"; do
          ###generate chromosome FASTA
                    ${SEQKIT} grep -f $d/${ID}_${c}_${h}_contigs_one.list $d/${ID}_${h}_hprc_r2.fa.gz -j 10 |
            ${BGZIP} -@10 > $d/${ID}_${c}_${h}.fna.gz
                done
            elif [ "$SEX" == "ref" ] || [ "$SEX" == "t2t" ]; then
                echo "nothing to be done, running in haploid mode"
            fi
  done
  touch $d/select.done
fi

and wish to automate all steps; however, for some reason it's throwing this error at an else...

syntax error near unexpected token 'else'

Even pasting it in shellcheck.net doesn't seem to help, or at least I cannot figure out what to do. I'm a bit at a loss, let me know whether there is something I'm missing, thanks!


P. S. removing the first if conditional where I check for "$h" == "hap0" seems to solve the issue, but at that point for two samples I have to run things interactively...

15
  • 5
    Just taking a step back here: this really is hard to read due to inconsistent indent depth and random placement of line breaks before or after then. You also have a control flow syntax problem, which probably means you have led yourself astray over which ifs, fis and elses belong together. Throw a code indenter at your script. (Alternatively, I'm not convinced a shell scripting language is what you'd want to use here: other programming languages have much more expressive ways.) Commented yesterday
  • 3
    (There's a fair bit of algorithmic criticism I'd have here, too: grepping in text files is fast, gzip decompression isn't, and you're decompressing the same files over and over again just to grep for different strings in them, for example) Commented yesterday
  • 2
    Even if you don't find a tool for fixing the indentation on shell code (and I don't know of tools for that either), at least fix the indentation manually. Almost any sensible editor should be able to add a level of indentation with tab, and to remove one with backspace. Here, splitting the code into functions would also help in keeping the whole thing manageable, because then you have less stuff at a time to look at. Anywhere you have if foo; then lots of stuff here; else lots of other stuff; fi, the inner blocks should be functions. Commented yesterday
  • 2
    -1 ... your code is very messy, which makes it difficult to read ... that makes it harder to spot errors, and makes it easier to introduce errors while writing the code ... please fix the indentation levels ... the error may even show itself Commented yesterday
  • 2
    Please make sure to apply the feedback you get from your previous questions to your current coding, e.g. unix.stackexchange.com/questions/799725/… and unix.stackexchange.com/questions/799725/… tell you about shellcheck.net, quoting variables, and variable naming conventions, all of which apply to your current script. Commented yesterday

4 Answers

7

You have a spurious fi (or else) in the middle of your block.

Shellcheck does point this out as its very first error:

Line 44:
        else
        ^-- SC1009 (info): The mentioned syntax error was in this else clause.

Typically this means you have a dangling else (which you do).

Here's a cleaned up and tagged version of your code loop:

#PREPARES THE INPUT
if [ -e "$d/select.done" ]
then
    echo "regions selection done!"

else
    for h in "${HAP[@]}"
    do
        if [ "$h" == "hap0" ]
        then
            if [ "$SEX" == "ref" ]
            then
                for c in "${CHR_38[@]}"
                do
                    ###generate chromosome FASTA
                    "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$d/${ID}_${c}.fna.gz"
                done
            elif [ "$SEX" == "t2t" ]
            then
                for c in "${CHR_13[@]}"
                do
                    ###generate chromosome FASTA
                    "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$d/${ID}_${c}.fna.gz"
                done
            elif [ "$SEX" == "m" ] || [ "$SEX" == "f" ]
            then
                echo "nothing to be done, running in diploid mode"

            fi
        fi    ## UNINTENDED ??

        else    ## OTHERWISE WHAT'S THIS ??

            if [ "$SEX" == "m" ]
            then
                if [ "$h" == "hap1" ]
                then
                    for c in "${CHR_P[@]}"
                    do
                        ###generate chromosome FASTA
                        "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$d/${ID}_${c}_${h}.fna.gz"
                    done
                else
                    for c in "${CHR_M[@]}"
                    do
                        ###generate chromosome FASTA
                        "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$d/${ID}_${c}_${h}.fna.gz"
                    done
                fi

            elif [ "$SEX" == "f" ]
            then
                for c in "${CHR_M[@]}"
                do
                    ###generate chromosome FASTA
                    "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$d/${ID}_${c}_${h}.fna.gz"
                done
            elif [ "$SEX" == "ref" ] || [ "$SEX" == "t2t" ]
            then
                echo "nothing to be done, running in haploid mode"
            fi

        ## MISSING fi

    done
    touch "$d/select.done"
fi

Using a consistent indentation approach would help, whether that's putting do and then on the same line as the corresponding for or if, or putting them on the line below (as I have done).
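
The pairing problem can be reproduced in a few lines and caught before the job is ever submitted. The following is only a sketch: the two toy scripts are made up for illustration and are not your pipeline. It relies on bash -n, which parses a script without running any of its commands.

```shell
#!/bin/bash
# Sketch: reproduce the stray-fi problem in miniature and catch it with bash -n.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Broken: a second "fi" closes the outer "if" too early, so the "else"
# that follows has nothing left to attach to.
cat > "$tmp/broken.sh" <<'EOF'
for h in hap0 hap1; do
    if [ "$h" == "hap0" ]; then
        if [ "$SEX" == "ref" ]; then
            echo "reference"
        fi
    fi
    else
        echo "other haplotype"
    fi
done
EOF

# Fixed: one "fi" per "if", with "else" placed before the outer "fi".
cat > "$tmp/fixed.sh" <<'EOF'
for h in hap0 hap1; do
    if [ "$h" == "hap0" ]; then
        if [ "$SEX" == "ref" ]; then
            echo "reference"
        fi
    else
        echo "other haplotype"
    fi
done
EOF

# bash -n only checks the syntax; nothing in the scripts is executed.
if bash -n "$tmp/broken.sh" 2>/dev/null; then broken_status=ok; else broken_status=error; fi
if bash -n "$tmp/fixed.sh"  2>/dev/null; then fixed_status=ok;  else fixed_status=error;  fi
echo "broken.sh: $broken_status, fixed.sh: $fixed_status"
```

This prints "broken.sh: error, fixed.sh: ok". Running bash -n on the real script before calling sbatch gives the same kind of early warning.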

You should also double-quote your variables pretty much whenever you use them. There's no harm in quoting when it's not strictly necessary, but leaving the quotes off, as much of your code does, can have unintended consequences. I've provided a cleaned-up version here, but the same approach should apply to the upper configuration/definition block too.
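
To make the quoting point concrete, here is a small sketch. The path is hypothetical and deliberately contains a space, and count_args is a helper invented for this example.

```shell
#!/bin/bash
# Sketch: how an unquoted expansion is split into several arguments.
count_args() { echo "$#"; }

d="/tmp/temp pca/HG001_m"      # hypothetical path containing a space

unquoted=$(count_args $d/select.done)     # split at the space: 2 arguments
quoted=$(count_args "$d/select.done")     # kept whole: 1 argument
echo "unquoted: $unquoted, quoted: $quoted"
```

This prints "unquoted: 2, quoted: 1". With a path like that, a test such as [ -e $d/select.done ] receives two words where it expects one file name, whereas the quoted form always passes exactly one.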

11
  • 3
    I wonder if using case...esac would make the code more readable? Following the ifs and elses and elifs is a pain. Commented yesterday
  • 2
    And maybe functions to handle each case, those for generate chromosome.. loops can be abstracted. Commented yesterday
  • 1
    Well, bioinformatics or not, repeating those loops is unnecessary and should be moved into a function. This is, however, not a code review forum, so whatever solves the given issue is ok by me (but if it were me, I would definitely rewrite the code to be much shorter and easier to maintain). Commented yesterday
  • 1
    All I'm really saying is that I'd want to understand the intent of the code - preferably including an English language description - before I started rewriting it. Commented yesterday
  • 3
    @Matteo all the more important that you are strict about consistency in your code formatting! Seriously, I've advised and helped quite a lot of students who weren't that experienced with writing code but had to for their research. "Keep your code consistently formatted to spot your own mistakes" is the number 1 trick. (Number 2 is "learn git and commit your code often, and make sure you write meaningful commit messages, so you can always go back and compare what worked and what now doesn't work", because it saves a lot of days over the course of a thesis.) Commented yesterday
3

Not an answer, just a suggestion of an enhancement to the code Chris Davies provided - instead of writing essentially the same loop+grep+zip multiple times, even without using a function you could move the loop+grep+zip to a single location (untested and no attempt to fix existing issues):

for h in "${HAP[@]}"
do
    unset chrs
    if [ "$h" == "hap0" ]
    then
        if [ "$SEX" == "ref" ]
        then
            chrs=( "${CHR_38[@]}" )
        elif [ "$SEX" == "t2t" ]
        then
            chrs=( "${CHR_13[@]}" )
        elif [ "$SEX" == "m" ] || [ "$SEX" == "f" ]
        then
            echo "nothing to be done, running in diploid mode"
        fi
    fi    ## UNINTENDED ??

    else    ## OTHERWISE WHAT'S THIS ??

        if [ "$SEX" == "m" ]
        then
            if [ "$h" == "hap1" ]
            then
                chrs=( "${CHR_P[@]}" )
            else
                chrs=( "${CHR_M[@]}" )
            fi

        elif [ "$SEX" == "f" ]
        then
            chrs=( "${CHR_M[@]}" )
        elif [ "$SEX" == "ref" ] || [ "$SEX" == "t2t" ]
        then
            echo "nothing to be done, running in haploid mode"
        fi

    ## MISSING fi

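    for c in "${chrs[@]}"
    do
        ###generate chromosome FASTA
        out="$d/${ID}_${c}_${h}.fna.gz"
        [ "$h" == "hap0" ] && out="$d/${ID}_${c}.fna.gz"    # hap0 output has no haplotype suffix in the original script
        "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$out"
    done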

done

but for conciseness and maintainability you might want to use a case statement instead of all those if-elses, e.g.:

for h in "${HAP[@]}"; do
    unset chrs
    case "${h}_${SEX}" in
        hap0_m|hap0_f     ) echo "nothing to be done, running in diploid mode" ;;
        hap0_ref          ) chrs=( "${CHR_38[@]}" ) ;;
        hap0_t2t          ) chrs=( "${CHR_13[@]}" ) ;;

        hap1_m            ) chrs=( "${CHR_P[@]}" ) ;;
        hap1_f            ) chrs=( "${CHR_M[@]}" ) ;;
        hap1_ref|hap1_t2t ) echo "nothing to be done, running in haploid mode" ;;

        hap2_m|hap2_f     ) chrs=( "${CHR_M[@]}" ) ;;
        hap2_ref|hap2_t2t ) echo "nothing to be done, running in haploid mode" ;;
    esac

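    for c in "${chrs[@]}"; do
        ###generate chromosome FASTA
        out="$d/${ID}_${c}_${h}.fna.gz"
        [ "$h" == "hap0" ] && out="$d/${ID}_${c}.fna.gz"    # hap0 output has no haplotype suffix in the original script
        "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | "$BGZIP" -@10 > "$out"
    done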
done
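
Before trusting a table like this, it may help to print what every haplotype/sex pair selects and compare that against what you intended. The following is a dry-run sketch only: pick_set is a helper invented for this example, it echoes array names rather than touching any files, and the mapping is my reading of the if/else chain in the question, so check it against your own expectations.

```shell
#!/bin/bash
# Sketch: print which chromosome set each haplotype/sex pair selects.
# The set names mirror the arrays in the question; no tool is run.
pick_set() {
    case "${1}_${2}" in
        hap0_ref)                              echo CHR_38 ;;
        hap0_t2t)                              echo CHR_13 ;;
        hap1_m)                                echo CHR_P ;;
        hap1_f|hap2_m|hap2_f)                  echo CHR_M ;;
        hap0_m|hap0_f|hap[12]_ref|hap[12]_t2t) echo none ;;
        *)                                     echo unknown ;;   # catches typos in the sample list
    esac
}

for sex in m f ref t2t; do
    for h in hap0 hap1 hap2; do
        printf '%-4s %-4s -> %s\n' "$sex" "$h" "$(pick_set "$h" "$sex")"
    done
done
```

The extra *) branch is a design choice rather than something the original had: it makes an unexpected value visible instead of silently doing nothing.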
1
  • 1
    many thanks. I might need to re-run this pipeline on far more samples; I'm really happy to test this code and have something more concise and easy to edit for future tasks! Commented 19 hours ago
3

I've had a go at attempting to convert your script to use shell functions and to use nested case statements instead of if statements. I've also fixed numerous quoting errors, but I may have missed a few.

It doesn't actually answer your question; Chris Davies' answer does a good job of that. It just shows what is IMO a better (easier to read, easier to spot errors in, and easier to modify) way of doing basically the same thing.

DO NOT RUN THIS CODE AS-IS, IT IS FOR ILLUSTRATION PURPOSES ONLY

I can not emphasise that warning enough! I do not know if it accurately implements the logical flow you were attempting, it was just my best guess at deciphering your intentions from all the if statements. You need to, at bare minimum, check every single one of the case clauses to make sure they are correct and that each clause is executing the correct function with the correct arguments.

The functions could certainly be optimised, Marcus Müller's comments about repeatedly decompressing the same files are particularly relevant here - you may want to create a temp directory (e.g. with mktemp -d or similar) before the main loop (for h in "${HAP[@]}"), decompress all the files into it, and (inside the functions) grep the uncompressed files. Then delete the temp directory after the main loop.

It might even be worthwhile writing an awk or perl script or one-liner for the functions to grep all the uncompressed files in one pass rather than multiple passes. Also worth noting is that perl can read compressed files directly (for example via the core IO::Uncompress::Gunzip module), so if doing just one pass there'd be no need to decompress the files before grepping them.
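
As a rough sketch of the decompress-once, single-pass idea: the script below uses toy data, with made-up contig names and list files standing in for your *_contigs_one.list files, and plain awk standing in for seqkit grep. It assumes each list holds one contig ID per line and that the ID is the first word of the FASTA header.

```shell
#!/bin/bash
# Sketch only: toy files stand in for the real contig lists and assembly.
set -euo pipefail

work=$(mktemp -d)                 # scratch directory, removed on exit
trap 'rm -rf "$work"' EXIT

# Hypothetical inputs: one contig list per chromosome and a gzipped FASTA.
printf 'ctgA\n'       > "$work/chr1.list"
printf 'ctgB\nctgC\n' > "$work/chr2.list"
printf '>ctgA\nACGT\n>ctgB\nGGCC\n>ctgC\nTTAA\n>ctgD\nNNNN\n' |
    gzip > "$work/asm.fa.gz"

# Decompress once instead of once per chromosome.
gzip -dc "$work/asm.fa.gz" > "$work/asm.fa"

# Single pass: load every list, then route each FASTA record to its file.
awk -v outdir="$work" '
    !fasta {                               # still reading the list files
        chr = FILENAME
        sub(/.*\//, "", chr)               # strip the directory part
        sub(/\.list$/, "", chr)            # strip the extension
        chr_of[$1] = chr
        next
    }
    /^>/ {                                 # header line: pick the output file
        id  = substr($1, 2)
        out = (id in chr_of) ? outdir "/" chr_of[id] ".fa" : ""
    }
    out != "" { print > out }              # copy header and sequence lines
' "$work/chr1.list" "$work/chr2.list" fasta=1 "$work/asm.fa"

chr1_records=$(grep -c '^>' "$work/chr1.fa")
chr2_records=$(grep -c '^>' "$work/chr2.fa")
echo "chr1: $chr1_records record(s), chr2: $chr2_records record(s)"
```

This prints "chr1: 1 record(s), chr2: 2 record(s)"; the contig that appears in no list is dropped. The fasta=1 between the file names is an ordinary awk command-line assignment that takes effect once the lists have been read. In a real run you would still want to compress the per-chromosome outputs afterwards.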

Speaking of perl, check out bioperl.org. And the equivalent for python, biopython.org. You may find that there are already well-tested and documented solutions to or examples for many of the things you're writing yourself.

You probably also don't need the $SEQKIT or $BGZIP variables since this version uses functions and the program names they contain only get used a few times. Hard-coding them in the functions would probably be fine.

And you can undoubtedly come up with much better function names than I did. Names that reflect what the functions do, rather than just based on the style of output filenames. seq_diploid and seq_haploid, perhaps...don't know, I know programming, not genetics.

Anyway:

#!/bin/bash
#SBATCH --nodes=1 --ntasks=1 --cpus-per-task=10
#SBATCH --time=2-00:00:00
#SBATCH --mem=40gb
#
#SBATCH --job-name=pan_pca
#SBATCH --output=pan_pca.out
#SBATCH --array=[1-46]%46
#
#SBATCH --partition=DSS,NBFC

NAMES=$1
d=$(sed -n "$SLURM_ARRAY_TASK_ID"p "$NAMES")

ID=$(printf "%s" "$d"/*_hap[0-1]_hprc_r2.fa.gz |
       sed 's#/path/to/temp_pca/[A-Z,0-9,a-z]\+_[0-9,a-z]\+/##;
            s#_hap[0-9]_hprc_r2.fa.gz##')

HAP=(hap0 hap1 hap2)

SEX=$(printf "%s" "$d" | sed 's#/path/to/temp_pca/[A-Z,0-9,a-z]\+_##')

# Eliminated duplication of common array elements
COMMON=(chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chr10 chr11
        chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22)
CHR_13=("${COMMON[@]}" chrX chrY chrM)
CHR_38=("${COMMON[@]}" chrX chrY chrUn chrM chrEBV)
CHR_P=( "${COMMON[@]}" chrY chrUn)
CHR_M=( "${COMMON[@]}" chrX chrUn chrM)

SEQKIT='./tools/seqkit'
BGZIP='./tools/bgzip'

seq_c () {
  ### Generate chromosome FASTA with output files "$d/${ID}_${c}.fna.gz"
  # INPUT:
  # $1  - directory
  # $2  - ID
  # $3  - hap
  # $4+ - chr array

  # define the following variables to be local to the
  # function so they don't interfere with any global
  # variables in the main part of the script that might
  # have the same name.
  local d ID h c

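  # Read the three fixed arguments first, then drop them all at once so
  # that "$@" is left holding only the chromosome names. (Shifting after
  # each assignment would renumber the remaining arguments.)
  d=$1
  ID=$2
  h=$3
  shift 3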

  for c in "$@"; do
    "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" \
                      "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 | 
      "$BGZIP" -@10 > "$d/${ID}_${c}.fna.gz"
  done
}

seq_c_h () {
  ### Generate chromosome FASTA with output files "$d/${ID}_${c}_${h}.fna.gz"
  # INPUT:
  # $1  - directory
  # $2  - ID
  # $3  - hap
  # $4+ - chr array

  local d ID h c
  d=$1
  ID=$2
  h=$3
  shift 3

  for c in "$@"; do
    "$SEQKIT" grep -f "$d/${ID}_${c}_${h}_contigs_one.list" \
                      "$d/${ID}_${h}_hprc_r2.fa.gz" -j 10 |
      "$BGZIP" -@10 > "$d/${ID}_${c}_${h}.fna.gz"
  done
}


cd ~

# PREPARES THE INPUT
if [ -e "$d/select.done" ]; then
  echo "regions selection done!"
else
  for h in "${HAP[@]}"; do
    case "$h" in
      hap0) case "$SEX" in
              ref) seq_c "$d" "$ID" "$h" "${CHR_38[@]}" ;;
              t2t) seq_c "$d" "$ID" "$h" "${CHR_13[@]}" ;;
              m|f) echo "nothing to be done, running in diploid mode" ;;
            esac ;;

      hap1) case "$SEX" in
              m) seq_c_h "$d" "$ID" "$h" "${CHR_P[@]}" ;;
              # To me, it seems there maybe should be other $SEX cases for
              # hap1 here, but your original if statements didn't have them
            esac ;;

         *) case "$SEX" in
              m) seq_c_h "$d" "$ID" "$h" "${CHR_M[@]}" ;;
              f) seq_c_h "$d" "$ID" "$h" "${CHR_M[@]}" ;;
              ref|t2t) echo "nothing to be done, running in haploid mode" ;;
            esac ;;
    esac
  done
  touch "$d/select.done"
fi
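
One detail worth seeing in isolation is how a function can take a few fixed arguments followed by a whole array. Each shift renumbers the remaining positional parameters, so one way to keep things straight is to read the fixed arguments first and then call shift 3 once. In this sketch show_args, /data and HG001 are made-up names, and the function only prints the name stem it would build.

```shell
#!/bin/bash
# Sketch: fixed arguments first, then "the rest" as the chromosome list.
show_args() {
    local d=$1 id=$2 h=$3
    shift 3                      # "$@" now holds only the chromosome names
    local c
    for c in "$@"; do
        printf '%s/%s_%s_%s\n' "$d" "$id" "$c" "$h"
    done
}

chrs=(chr1 chrX)
show_args /data HG001 hap1 "${chrs[@]}"
```

This prints /data/HG001_chr1_hap1 and /data/HG001_chrX_hap1, one per line. Quoting "${chrs[@]}" at the call site is what keeps each array element as its own argument.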
8
  • 2
    The * (default / fall-through) case could probably be hap2 instead, as that's the only other possible value for $h in your code. Depends on whether you re-use roughly the same code in other scripts with more cases. In which case, it would probably make more sense to just write one script that works for all of them. Wherever possible, try to parameterise stuff so that you don't repeat yourself - repeating stuff is a very common source of errors, you end up with multiple slightly different and conflicting versions of the same thing and never know which is the "latest" or "most correct". Commented yesterday
  • I'd avoid the if...else nesting, something like: if "..selection done"; then exit 0 and below the actual process, which would result IMO in a clearer flow and code reading. Commented 23 hours ago
  • @schrodingerscatcuriosity yeah, I thought about doing that, but I didn't know if the OP intended to do anything after the fi of that if statement. And the main point was to replace the bulk of the if/else/elif statements with much simpler case statements and some function definitions. Commented 21 hours ago
  • @cas thanks, this makes the use of functions clearer to me; I will attempt to use them in the next pipeline I build, taking inspiration from this. If I understood correctly, since your code is splitting cases by hap, it is correct that hap1 has only one case, because it is the only haplotype that carries a Y chromosome in male individuals. Commented 19 hours ago
  • This is lovely, thanks cas! Commented 15 hours ago
1

And here's my attempt to tidy up a little. I quoted all (or at least most) variables; I replaced your CAPS names with lower case, since it is bad practice to use capitalized names for unexported shell variables; and I collapsed your chromosome arrays, since they all share the same set of autosomal chromosomes and differ only in the sex, mitochondrial and unaligned ones.

Of course this has not been tested as I don't have access to your data, but it should do pretty much exactly what your script does:

#!/bin/bash
#SBATCH --nodes=1 --ntasks=1 --cpus-per-task=10
#SBATCH --time=2-00:00:00
#SBATCH --mem=40gb
#
#SBATCH --job-name=pan_pca
#SBATCH --output=pan_pca.out
#SBATCH --array=[1-46]%46
#
#SBATCH --partition=DSS,NBFC



## Convert repeated code into functions
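generate_fasta(){
  local d=$1
  local id=$2
  local c=$3
  local h=$4
  ## hap0 output has no haplotype suffix in the original script; hap1/hap2 output does
  local out="$d/${id}_${c}_${h}.fna.gz"
  [ "$h" = "hap0" ] && out="$d/${id}_${c}.fna.gz"
  "$seqkit" grep -f "$d/${id}_${c}_${h}_contigs_one.list" "$d/${id}_${h}_hprc_r2.fa.gz" -j 10 |
    "$bgzip" -@10 > "$out"
}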

d=$(sed -n "$SLURM_ARRAY_TASK_ID"p "$1")

## Avoid upper case variable names for unexported variables
id="$(echo ${d}/*_hap[0-1]_hprc_r2.fa.gz | sed 's#/path/to/temp_pca/[A-Z,0-9,a-z]\+_[0-9,a-z]\+/##' | sed 's#_hap[0-9]_hprc_r2.fa.gz##')"

hap=( "hap0" "hap1" "hap2")

sex="$(echo ${d} | sed 's#/path/to/temp_pca/[A-Z,0-9,a-z]\+_##')"

## combine your chr arrays, no need to redefine everything
main_chroms=("chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
             "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17"
             "chr18" "chr19" "chr20" "chr21" "chr22" )

chr_13=( "${main_chroms[@]}" "chrX" "chrY" "chrM" )
chr_38=( "${main_chroms[@]}" "chrX" "chrY" "chrUn" "chrM" "chrEBV")
chr_P=( "${main_chroms[@]}" "chrY" "chrUn")
chr_M=( "${main_chroms[@]}" "chrX" "chrUn" "chrM")

seqkit="./tools/seqkit"
bgzip="./tools/bgzip"


cd ~

#PREPARES THE INPUT
if [ -e "$d"/select.done ]
then
  echo "region selection done!"
else
  for h in "${hap[@]}"; do
    ## Reset on every iteration so nothing leaks from the previous haplotype
    chroms=()
    mode=""
    ## Use variables to decide what you need
    case $sex in
      "m")
        case $h in
          "hap0")
            mode="diploid";;
          "hap1")
            chroms=("${chr_P[@]}");;
          "hap2")
            chroms=("${chr_M[@]}");;
        esac;;
      "f")
        case $h in
          "hap0")
            mode="diploid";;
          "hap1"|"hap2")
            chroms=("${chr_M[@]}");;
        esac;;
      "ref")
        case $h in
          "hap0")
            chroms=("${chr_38[@]}");;
          "hap1"|"hap2")
            mode="haploid";;
        esac;;
      "t2t")
        case $h in
          "hap0")
            chroms=("${chr_13[@]}");;
          "hap1"|"hap2")
            mode="haploid";;
        esac;;
      *)
        echo "Unknown sex $sex";;
    esac
    if (( ${#chroms[@]} > 0 )); then
      for c in "${chroms[@]}"; do
        ###generate chromosome FASTA
        ## (prefix the next line with echo for a dry run)
        generate_fasta "$d" "$id" "$c" "$h"
      done
    elif [[ -n $mode ]]; then
      echo "nothing to be done, running in $mode mode"
    fi
  done
  touch "$d"/select.done
fi

If I were you, I would also replace my ugly and clunky case statement with the far more elegant and concise version provided by Ed Morton.

1
  • thanks for this. I see what you mean: Ed provided a cleaner way, but your approach is also a big step forward compared to mine! Thanks for sharing; I will try to apply it to future cases. Commented 19 hours ago
