Search for duplicate elements in an array

Question

This one works:

arr[0]="XX1 1"
arr[1]="XX2 2" 
arr[2]="XX3 3"
arr[3]="XX4 4"
arr[4]="XX5 5"
arr[5]="XX1 1"
arr[6]="XX7 7"
arr[7]="XX8 8"

duplicate() { printf '%s\n' "${arr[@]}" | sort -cu |& awk -F: '{ print $5 }'; }

duplicate_match=$(duplicate)

echo "array: ${arr[@]}"

# echo "duplicate: $duplicate_match"

[[ ! $duplicate_match ]] || { echo "Found duplicate:$duplicate_match"; exit 0; }

echo "no duplicate"

With the same code, this one doesn't work. Why?

arr[0]="XX"
arr[1]="wXyz" 
arr[2]="ABC"
arr[3]="XX"

Your code doesn't actually work, because sort -cu fails when the input is not already sorted; the duplicate it finds in the first data set just happens to be the first item that occurs out of sorted order.
– chepner, Commented Feb 26, 2014 at 23:02
The pipe-ampersand combination is only valid in c-shell, not in bash.
– thom, Commented Feb 26, 2014 at 23:04
@chepner Thanks, I will look into how to sort my array in the right place.
– user3353499, Commented Feb 26, 2014 at 23:13
@chepner Thanks, I stand corrected. Pipe-ampersand is indeed valid.
– thom, Commented Feb 27, 2014 at 0:59
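
To illustrate the point made in the first comment, here is a minimal sketch of a sort-based check that sorts the elements before looking for repeats. It swaps sort -cu for sort piped into uniq -d, which is a different technique from the one in the question, and it assumes no array element contains a newline. The sample data is the second array from the question.

```shell
#!/usr/bin/env bash
# Sample data: the second array from the question.
arr=("XX" "wXyz" "ABC" "XX")

# Sort first so that equal elements become adjacent, then let uniq -d
# print one copy of every line that occurs more than once.
duplicate() { printf '%s\n' "${arr[@]}" | sort | uniq -d; }

duplicate_match=$(duplicate)

if [[ -n $duplicate_match ]]; then
    echo "Found duplicate: $duplicate_match"
else
    echo "no duplicate"
fi
```

Because the result no longer depends on the original order of the elements, the same function should behave the same way on both arrays from the question.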

anubhava · Accepted Answer · 2014-02-26 23:15:49Z

To check for duplicates, this code is much simpler and works in both cases:

uniqueNum=$(printf '%s\n' "${arr[@]}"|awk '!($0 in seen){seen[$0];c++} END {print c}')

(( uniqueNum != ${#arr[@]} )) && echo "Found duplicates"

EDIT: To print duplicates use this awk:

printf '%s\n' "${arr[@]}"|awk '!($0 in seen){seen[$0];next} 1'

The awk command stores each line in the array seen if it isn't already there, and next then skips to the next line. The trailing 1 is only reached for lines that are already in seen, so it prints only the duplicates.
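
Put together, the two commands above can be tried on the second array from the question. This is only a sketch: the variable name dups is made up for the example, and it assumes a non-empty array whose elements contain no newlines.

```shell
#!/usr/bin/env bash
# Sample data: the second array from the question.
arr=("XX" "wXyz" "ABC" "XX")

# Count distinct elements; a mismatch with the array length means duplicates exist.
uniqueNum=$(printf '%s\n' "${arr[@]}" | awk '!($0 in seen){seen[$0]; c++} END {print c}')
(( uniqueNum != ${#arr[@]} )) && echo "Found duplicates"

# Print every repeated occurrence; the first occurrence is skipped by next.
# "dups" is a hypothetical variable name used only in this example.
dups=$(printf '%s\n' "${arr[@]}" | awk '!($0 in seen){seen[$0]; next} 1')
echo "duplicates: $dups"
```

An element that appears three times is printed twice by the second command, since every occurrence after the first counts as a duplicate.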

edited Feb 26, 2014 at 23:15

answered Feb 26, 2014 at 23:01


4 Comments

user3353499 Over a year ago

Thanks anubhava, I need to study your code to fully understand it. How can I echo the duplicate element with it, please? Also, can anyone correct my code, please? I've been on this for two hours, and ending up using other code without understanding mine is frustrating :(

anubhava Over a year ago

See chepner's answer below for why your code failed, if you want to understand it.

anubhava Over a year ago

I have also added some explanation to my answer.

anubhava Over a year ago

@Neeraj: Try this: printf '%s\n' "${arr[@]}" | awk '!seen[$0]++ {} END {print length(seen)}'

Krister Janmore · Answer · 2019-06-05 18:46:43Z

Slightly silly solution here. I just wanted to see if I could do this in a single command without explicit pipes. (I think for very large arrays/array elements, explicit pipes might actually be more efficient.)

Note that this is a test for the presence of duplicate array elements, and doesn't output the duplicates themselves, although the awk command on its own will do that. Also note that if you're unlucky enough to have array elements that contain spaces, the below won't evaluate as described.

[[ $( awk -v RS=" " ' a[$0]++ ' <<< "${arr[@]} " ) ]] && echo "dups found"

Explanation:

awk -v RS=" "

Run the awk program that follows on each input record, using a space as the record separator. Basically, this makes awk treat each array element as a separate "line".

' a[$0]++ '

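awk command that does two things:
- Return the value at key $0 in array a. If this is greater than 0, print the record, because a pattern without an action prints by default. Compare to awk ' { $1=$2 } 1 '
- Add 1 to the value at key $0 in array a. The increment happens after the value has been read.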

<<< "${arr[@]} "

As the input of the awk command, use the string created when you print each element in arr as a separate word, i.e. separated by spaces, plus an additional space at the end.
The space between } and " is actually really important. A here-string appends a newline to its input, so without the extra space the final array element would reach awk with that newline attached and would never match an earlier, identical element.

[[ $( ... ) ]]

If the enclosed awk command produces any output at all, the test returns exit status 0, i.e. true.
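
For arrays whose elements may contain spaces, a different technique may be easier to reason about. The sketch below is not the awk one-liner above: it tests for duplicates with a bash associative array and no external process at all. The function name has_dups is made up for the example, and it assumes bash 4 or newer.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
# Sample data with spaces inside the elements.
arr=("XX1 1" "XX2 2" "XX1 1" "wXyz")

# has_dups is a hypothetical helper: returns 0 if any argument repeats, 1 otherwise.
has_dups() {
    local -A seen=()
    local elem
    for elem in "$@"; do
        # Prefix the key with "x" so that an empty element is still a valid subscript.
        if [[ -n ${seen["x$elem"]-} ]]; then
            return 0
        fi
        seen["x$elem"]=1
    done
    return 1
}

has_dups "${arr[@]}" && echo "dups found"
```

Like the one-liner, this only tests for the presence of duplicates and does not print them.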
