I'm making a program that compares two text files and returns their similarity (based on a given algorithm). For each unique word in the first file, I want to find the probability that it occurs in the second file. But whenever I run my program, the similarity returned is always 0.0. This is what I have right now:

public static double nestedLoop(String[] doc1, String[] doc2) {
    // nested loop: sort doc1, for each unique word in doc1, find all
    // occurrences in doc2 using a sequential search

    java.util.Arrays.sort(doc1);

    double similarity = 0.0;

    for (int i = 0; i < doc1.length - 1; i++) {
        if (doc1[i] != doc1[i + 1]) {
            String unique = doc1[i];
            double count = 0.0;
            for (int j = 0; j < doc2.length; j++) {
                if (unique == doc2[j]) {
                    count++;
                    similarity += count / doc2.length;
                }
            }
        }
    }
    return similarity;
}

Can someone tell me what is going on?

1 Answer

if (unique == doc2[j]) {

should be

if (unique.equals(doc2[j])) {

Same for if (doc1[i] != doc1[i + 1]) {

should be:

 if (!(doc1[i].equals(doc1[i + 1]))) {
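For reference, here is a sketch of the question's method with only those two comparisons changed. The wrapper class name NestedLoopSimilarity is made up for this example, and everything else (the loop bounds and the way similarity is accumulated) is left exactly as posted, so treat it as an illustration of the fix rather than a reviewed implementation of the algorithm.

```java
public class NestedLoopSimilarity {
    // Same logic as the method in the question; only the two String
    // comparisons now use equals() instead of == and !=.
    public static double nestedLoop(String[] doc1, String[] doc2) {
        java.util.Arrays.sort(doc1);

        double similarity = 0.0;

        // Loop bound kept from the question: the last element of doc1
        // is never picked as a unique word by this loop.
        for (int i = 0; i < doc1.length - 1; i++) {
            // Compare contents, not references.
            if (!doc1[i].equals(doc1[i + 1])) {
                String unique = doc1[i];
                double count = 0.0;
                for (int j = 0; j < doc2.length; j++) {
                    // Compare contents, not references.
                    if (unique.equals(doc2[j])) {
                        count++;
                        similarity += count / doc2.length;
                    }
                }
            }
        }
        return similarity;
    }
}
```

With words read from files at runtime, the equals() version can now find matches, so the result is no longer stuck at 0.0.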

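String comparison should always use equals() instead of ==, because == compares object references while equals() compares the characters. (The exception is comparing String literals, which are interned and therefore share one reference.)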
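A small sketch of the difference, in case it helps. The class StringCompareDemo and its two helper methods are hypothetical names used only for this illustration.

```java
public class StringCompareDemo {
    // Hypothetical helper: true when both strings hold the same characters.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    // Hypothetical helper: true only when both variables refer to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "word";
        String built = new String("word"); // distinct object, equal contents

        System.out.println(sameObject(literal, built));   // false
        System.out.println(sameContents(literal, built)); // true
        System.out.println(sameObject(literal, "word"));  // true: literals are interned
    }
}
```

Strings read from a file or produced by split() are created at runtime, so they behave like built above: == is false even when the text matches.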

Please read How do I compare strings in Java?

4 Comments

WOOOOOW I am stupid. I've definitely run into this before. I'll give it a try. Thanks!
Actually, one more small question. If I have to use equals() for String comparison, how come in the first if statement I can use != to compare doc1[i] and doc1[i + 1]? Aren't they also Strings?
@59eagle: Valid question, yes you need to use .equals() there also.
@ColinD: Added clarification.
