I want to search an array for the element that begins with the value of a variable and return the two characters that follow that prefix.

array=( 7501 7302 8403 9904 )

if var = 73, result desired is 02
if var = 75, result desired is 01
if var = 84, result desired is 03
if var = 99, result desired is 04

Sorry if this is an elementary question, but I've tried variations of cut and grep and cannot find the solution.

Any help is greatly appreciated.

2 Answers

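You can use this search function, built on printf and awk:

srch() {
    # Print one array element per line; awk prints the remainder of each
    # element whose first two characters equal the search key.
    printf '%s\n' "${array[@]}" |
        awk -v s="$1" 'substr($1, 1, 2) == s { print substr($1, 3) }'
}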

Then use it as:

srch 75
01

srch 73
02

srch 84
03

srch 99
04

10 Comments

Hmm. If it really is just two-digit keys, you're right that we can't have inputs so long that the difference between O(1) and O(n) is all that big in practice; we would probably stay in the space where startup and subshell/pipeline costs are the main drivers. OTOH, if this lookup is done inside a tight loop (i.e., once per line of input), then we could get to a place where the calls add up, just from the aforementioned constant-factor expenses.
@CharlesDuffy: 1e+37 iterations (awk: r:0.573s, u:0.276s, s:0.253s) vs (map: r:0.215s, u:0.104s, s:0.079s)
Thanks for this benchmark. The above function will become faster if I run printf "%s\n" "${array[@]}" > tempFile.tmp once and then keep only awk -v s="$1" 'substr($1, 1, 2) == s{ print substr($1, 3)}' tempFile.tmp in the function body.
@l'L'l, might I ask for implementation details on that test? I'd like to reproduce -- I'm seeing a much bigger difference than that between for ((i=0; i<1000000; ++i)); do : "${replacements[37]}"; done -- at 15.4s per million iterations -- and for ((i=0; i<1000000; ++i)); do srch 37; done >/dev/null (which I'm too impatient to let complete before posting this comment).
@CharlesDuffy: Sorry, I was away for a few minutes. You have to consider that the test I ran was on a Cray XE6, so it was blazingly fast. gist.github.com/anonymous/27859e55f9726619f339db0fb96d0b30. I shortened the loops since you're probably not on a Cray :) I think you'll see a drastic difference between the map method and awk in a normal setting in the time it actually takes.
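
For anyone who wants to reproduce the comparison discussed in these comments, here is a rough sketch of a timing harness. It is not the script from the gist above; the iterations variable and its default of 200 are assumptions chosen so the awk loop finishes quickly, and the absolute numbers will vary a lot between machines.

```bash
array=( 7501 7302 8403 9904 )

# awk-based lookup from the first answer
srch() {
  printf '%s\n' "${array[@]}" |
    awk -v s="$1" 'substr($1, 1, 2) == s { print substr($1, 3) }'
}

# sparse-array map from the second answer
replacements=( )
for arg in "${array[@]}"; do
  replacements[${arg:0:2}]=${arg:2}
done

# Hypothetical knob (not from the original thread); raise it for steadier numbers.
iterations=${iterations:-200}

echo "map lookup, $iterations iterations:"
time for ((i = 0; i < iterations; ++i)); do : "${replacements[73]}"; done

echo "awk lookup, $iterations iterations:"
time for ((i = 0; i < iterations; ++i)); do srch 73; done >/dev/null
```

The map loop only expands a variable, while each srch call forks a pipeline and starts awk, so the gap mostly measures process startup rather than the search itself.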

Since bash arrays are sparse, even in older versions of bash that don't have associative arrays (mapping arbitrary strings as keys), you could have a regular array that has keys only for numeric indexes that you wish to map. Consider the following code, which takes your input array and generates an output array of that form:

array=( 7501 7302 8403 9904 )

replacements=( )                    # create an empty array to map source to dest
for arg in "${array[@]}"; do        # for each entry in our array...
  replacements[${arg:0:2}]=${arg:2} # map the first two characters to the remainder.
done

Running declare -p replacements after the above code dumps a description of the resulting array:

# "declare -p replacements" will then print this description of the new array generated...
# ...by the code given above:
declare -a replacements='([73]="02" [75]="01" [84]="03" [99]="04")'

You can then trivially look up any entry in it as a constant-time operation that requires no external commands:

$ echo "${replacements[73]}"
02

...or iterate through the keys and associated values independently:

for key in "${!replacements[@]}"; do
  value=${replacements[$key]}
  echo "Key $key has value $value"
done

...which will emit:

Key 73 has value 02
Key 75 has value 01
Key 84 has value 03
Key 99 has value 04
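
One caveat worth sketching: indexed-array subscripts are evaluated as arithmetic, so a prefix with a leading zero such as 08 would be read as an invalid octal number. The variant below is a minimal sketch that forces base 10 with the 10# prefix and wraps the lookup in a function. The lookup name and the extra 0805 entry are assumptions added for illustration, and it assumes the keys consist only of decimal digits.

```bash
array=( 7501 7302 8403 9904 0805 )   # 0805 added here to show a leading-zero key

replacements=( )
for arg in "${array[@]}"; do
  # 10# forces base-10, so a prefix such as "08" is not parsed as octal
  replacements[10#${arg:0:2}]=${arg:2}
done

# lookup KEY: print the suffix mapped to KEY, or return 1 if there is none.
# (hypothetical helper name, not part of the original answer)
lookup() {
  local key=$((10#$1))
  if [[ -n ${replacements[key]+set} ]]; then
    printf '%s\n' "${replacements[key]}"
  else
    return 1
  fi
}

lookup 73   # prints 02
lookup 08   # prints 05
```

On bash 4.0 or later, an associative array created with declare -A treats keys as plain strings, which avoids the octal issue without the 10# prefix.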

3 Comments

Nice solution. Question: what happens if you have duplicate entries (e.g., [73] x 2)?
@l'L'l, the latter overwrites the former. I don't see anything specified in the question that that behavior would conflict with.
Hmm. I suppose correct behavior depends on what the OP wants -- the way I read the awk-based answer, it would print suffixes for both matching entries if you had a duplicate. Could extend this to do likewise, if desired (or extend the awk answer to use first or last match only), but probably worth getting an explicit spec.
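
If keeping every match turns out to be the desired behavior, one possible extension is sketched below. It appends each suffix to a space-separated string per key instead of overwriting. The srch_all name and the duplicate 7309 entry are assumptions for illustration, and it assumes suffixes contain no whitespace and keys have no leading zeros.

```bash
array=( 7501 7302 7309 8403 9904 )   # 7309 added to demonstrate a duplicate key

replacements=( )
for arg in "${array[@]}"; do
  key=${arg:0:2}
  # Append with a separating space only when the key already holds a value.
  replacements[key]+="${replacements[key]:+ }${arg:2}"
done

# srch_all KEY: print every suffix stored for KEY, one per line
# (hypothetical helper name); returns non-zero when KEY has no entries.
srch_all() {
  local -a parts
  read -r -a parts <<< "${replacements[$1]}"
  (( ${#parts[@]} )) && printf '%s\n' "${parts[@]}"
}

srch_all 73   # prints 02 and 09 on separate lines
```

This mirrors what the awk-based answer does with duplicates, while single-match keys behave as before.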
