
I have a string of space-separated name=value words, for example a=1 b=2 c=3 a=50. I want to parse it and build another string, a=50 b=2 c=3, which is the same except that when the name before the = appears a second time, the later value overwrites the earlier one, so only unique names remain on the left of =. Here is what I have so far:

a="a=1 b=2 c=3 a=50"
o=()

for i in $a
do
    reg=${i%=*}
    if [[ ${o[*]} == *"$reg"* ]]
    then
        o=$(echo ${o[*]} | sed -e "s/\$reg=\S/\$i")
    else
        o+=( $i )
    fi
done

What am I doing wrong here?

  • Why is the result a=5, not a=50? Commented Dec 19, 2014 at 22:28
  • Use an associative array whose keys are the words before =. Commented Dec 19, 2014 at 22:28
  • Did you mean to use double quotes with sed? Commented Dec 19, 2014 at 22:29
  • Does the ordering matter on the output end? Commented Dec 19, 2014 at 22:31
  • nope, that doesn't matter. Commented Dec 19, 2014 at 22:32

3 Answers


I'd take an entirely different approach, not based on regular expressions or string rewriting.

declare -A values=( )              # Initialize an associative array ("hash", "map")
while IFS= read -r -d' ' word; do  # iterate over input words, separated by spaces
  if [[ $word = *=* ]]; then       # ignore any word that doesn't have an "=" in it
    values[${word%%=*}]=${word#*=} # add everything before the "=" as a key...
  fi                               # ...with everything after the "=" as a value
done <<<"$a "                      # read from the question's $a; the trailing space terminates the last word

for key in "${!values[@]}"; do     # Then iterate over keys we found
  value="${values[$key]}"          # ...extract the values for each...
  printf '%s=%s ' "$key" "$value"  # ...and print the pairs.
done
echo                               # When done iterating, print a newline.

Because the words are being processed first-to-last through the string, updates take effect before the print loop is reached.
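As a rough sketch of the same idea packaged for reuse, the function below takes the string as an argument instead of reading standard input. The name dedupe_pairs is made up for illustration, the sorted output is an extra assumption added here so the result is reproducible, and bash 4 or later is assumed for associative arrays.

```bash
#!/usr/bin/env bash
# dedupe_pairs: hypothetical helper name, not part of the original answer.
# Assumes bash 4+ (associative arrays) and whitespace-separated name=value words.
dedupe_pairs() {
  local -A values=()
  local -a words=()
  local word key
  read -r -a words <<<"$1"            # split the first line of the input on whitespace
  for word in "${words[@]}"; do
    [[ $word == ?*=* ]] || continue   # skip words without "=" or with an empty name
    values[${word%%=*}]=${word#*=}    # a later word with the same name overwrites the earlier value
  done
  # Associative arrays have no defined iteration order, so sort for stable output.
  for key in "${!values[@]}"; do
    printf '%s=%s\n' "$key" "${values[$key]}"
  done | sort | paste -sd' ' -
}

dedupe_pairs "a=1 b=2 c=3 a=50"
```

Sorting is optional since the asker said order does not matter; drop the sort stage to get whatever order bash happens to yield.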



Using awk

$ awk -F= -v RS=" |\n" '{a[$1]=$2} END{for (k in a) printf "%s=%s ",k,a[k]}' <<<"a=1 b=2 c=3 a=50"
a=50 b=2 c=3

How it works:

  • -F=

    Set the field separator to be an equal sign.

  • -v RS=" |\n"

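    Set the record separator to be either a space or a newline. Treating RS as a regular expression is an extension supported by gawk and mawk; POSIX only specifies a single-character RS.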

  • a[$1]=$2

    Update associative array a with the latest value.

  • END{for (k in a) printf "%s=%s ",k,a[k]}

    In no particular order, print out the final values.
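If the original order matters, one possible variation is to record each name the first time it is seen and replay that list in the END block. The sketch below assumes that approach; awk_dedupe is a made-up wrapper name, and tr is used to split the words so that no regular-expression RS is needed.

```bash
#!/usr/bin/env bash
# awk_dedupe: hypothetical wrapper name, shown only as an illustration.
awk_dedupe() {
  printf '%s\n' "$1" | tr -s ' ' '\n' | awk -F= '
    NF >= 2 {
      if (!($1 in val)) order[++n] = $1     # remember the position of each new name
      val[$1] = substr($0, length($1) + 2)  # keep everything after the first "="; last one wins
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%s=%s%s", order[i], val[order[i]], (i < n ? " " : "\n")
    }'
}

awk_dedupe "a=1 b=2 c=3 a=50"
```

Each name keeps the slot of its first occurrence while taking the value of its last one.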

Using bash

Like Charles Duffy's approach, this uses read -d" " to parse the string. This approach, however, uses IFS="=" to separate names and values.

Two loops are required. The first gathers the values. The second reassembles the updated values in the original order:

a="a=1 b=2 c=3 a=50"
declare -A b
while IFS== read -d" " name value
do
    b["$name"]="$value"
done <<<"$a "

declare -A seen
while IFS== read -d" " name value
do
    [ "${seen[$name]}" ] || o="${o:+$o }$name=${b[$name]}"   # ${o:+$o } avoids a leading space
    seen[$name]=1
done <<<"$a "
echo "$o"
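The two passes can probably be folded into one by keeping an indexed array of names in first-seen order next to the associative array of latest values. The following is a sketch under that assumption; dedupe_in_order and its variable names are invented for this example, and bash 4 or later is assumed.

```bash
#!/usr/bin/env bash
# dedupe_in_order: hypothetical name; a single-pass variant of the two-loop idea.
dedupe_in_order() {
  local -A latest=()
  local -a order=() words=()
  local word name out=
  read -r -a words <<<"$1"                          # split the first line of the input on whitespace
  for word in "${words[@]}"; do
    [[ $word == ?*=* ]] || continue                 # skip words without "=" or with an empty name
    name=${word%%=*}
    [[ ${latest[$name]+set} ]] || order+=("$name")  # record a name only on first sight
    latest[$name]=${word#*=}                        # the last value seen wins
  done
  for name in "${order[@]}"; do
    out+="${out:+ }$name=${latest[$name]}"          # join with single spaces, no leading space
  done
  printf '%s\n' "$out"
}

dedupe_in_order "a=1 b=2 c=3 a=50"
```

The ${latest[$name]+set} test distinguishes an unset key from one whose value is empty, so a pair like x= is still tracked.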

1 Comment

Nice. The only quibble I have here is that iterating through the string a second time, it reads to me like you could be printing keys twice. Perhaps use a second associative array to track what has and hasn't been printed to avoid that? Retaining order is a nice feature (and one I paid no attention to).

Easily done with perl:

echo "a=1 b=2 c=3 a=50" \
  | sed "s/ /\n/g" \
  | perl -e '
my %hash = ();
while(<>){
  $line = $_;
  if($line =~ m/(\S+)=(\S+)/) {
    $hash{$1} = $2;
  }
}
for $key (sort keys %hash) {
  print "$key=$hash{$key}\n";
}'

...or, all on one line:

echo "a=1 b=2 c=3 a=50" | sed "s/ /\n/g" | perl -e 'my %hash = (); while(<>){ $line = $_; if($line =~ m/(\S+)=(\S+)/) { $hash{$1} = $2; } } for $key (sort keys %hash) { print "$key=$hash{$key}\n"; }'
