1

I'm trying to remove duplicate IDs from a list in a browser I'm working on. The list is converted to an array and then written to a RichTextBox. I need this for a generic bookmarking system I'm building. The problem is that the scrape still produces duplicates, even after I added Distinct() to the code.

string html = WebsCon2.ExecuteJavascriptWithResult("document.getElementsByTagName('html')[0].innerHTML");
var htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(html);
var playerIds = new List<string>();
var playerNodes = htmlDoc.DocumentNode.SelectNodes("//a[contains(@href, '/link/profile-view.jsp?user=')]").Distinct();
foreach (var playerNode in playerNodes)
{
    string href = playerNode.Attributes["href"].Value;
    var parts = href.Split(new char[] { '=' }, StringSplitOptions.RemoveEmptyEntries);
    if (parts.Length > 1)
    {
        playerIds.Add(parts[1]);
    }
    string Target = string.Join("", playerIds.ToArray());
    PlayerID.Text = Target;
}

So is there another way I can remove the duplicates?

2 Answers

3

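The problem is that Distinct() doesn't compare your nodes by their content; it compares them by reference, so two separate node objects are never treated as equal even when they represent the same link. If you want nodes that are distinct by InnerText, you can use: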

var playerNodes = htmlDoc.DocumentNode
            .SelectNodes("//a[contains(@href, '/link/profile-view.jsp?user=')]")
            .GroupBy(node => node.InnerText)
            .Select(g => g.First());

Or you can use the DistinctBy method from MoreLINQ (LINQ itself also ships a DistinctBy since .NET 6):

var playerNodes = htmlDoc.DocumentNode
            .SelectNodes("//a[contains(@href, '/link/profile-view.jsp?user=')]")
            .DistinctBy(node => node.InnerText);
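
To make the reference-versus-content point concrete, here is a minimal sketch that runs without HtmlAgilityPack. The `Link` class is a hypothetical stand-in for a node type that does not override Equals, and `UserId` and `DistinctByUserId` are illustrative helpers, not part of any library. It also assumes you want to deduplicate on the ID in the href rather than on the link text, and that `user=` is the last query parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an HTML node: a reference type that does not override Equals.
class Link
{
    public string Href { get; }
    public Link(string href) { Href = href; }
}

static class LinkDedup
{
    // Hypothetical helper: returns the text after "user=", or "" if the marker is absent.
    public static string UserId(string href)
    {
        const string marker = "user=";
        int i = href.IndexOf(marker, StringComparison.Ordinal);
        return i < 0 ? "" : href.Substring(i + marker.Length);
    }

    // Groups links by user ID and keeps the first link of each group.
    public static List<Link> DistinctByUserId(IEnumerable<Link> links)
    {
        return links.GroupBy(l => UserId(l.Href))
                    .Select(g => g.First())
                    .ToList();
    }

    static void Main()
    {
        var links = new List<Link>
        {
            new Link("/link/profile-view.jsp?user=100"),
            new Link("/link/profile-view.jsp?user=200"),
            new Link("/link/profile-view.jsp?user=100"),
        };

        // Three separate objects, so reference-based Distinct() removes nothing.
        Console.WriteLine(links.Distinct().Count());          // 3

        // Grouping by the extracted ID collapses the repeated user.
        Console.WriteLine(DistinctByUserId(links).Count);     // 2
    }
}
```

With real nodes, the same idea would mean grouping on a value derived from the href attribute instead of on InnerText, if the link text is not unique per user.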

2

Declare playerIds as a HashSet<string>, which guarantees that it never holds duplicates. That's the easy solution to your problem. Look at the HashSet constructors that accept a comparer (for example StringComparer.OrdinalIgnoreCase) if you want to control whether the strings are compared case-sensitively or case-insensitively.

You can drop the Distinct() call you're making. Used the way you call it, it doesn't remove anything.
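
A minimal sketch of that approach, assuming the hrefs have already been pulled out of the nodes. The sample href values and the `PlayerIdCollector` class are hypothetical, and the case-insensitive comparer is just one possible choice. The Split logic mirrors the question's code, and the joined text is built once after the loop instead of on every iteration.

```csharp
using System;
using System.Collections.Generic;

static class PlayerIdCollector
{
    // Collects the part after '=' from each href into a set, so repeats are dropped.
    public static HashSet<string> Collect(IEnumerable<string> hrefs)
    {
        // OrdinalIgnoreCase is an assumption: "Alice" and "alice" count as the same ID.
        var playerIds = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (var href in hrefs)
        {
            var parts = href.Split(new[] { '=' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length > 1)
            {
                // Add returns false and leaves the set unchanged when the ID is already present.
                playerIds.Add(parts[1]);
            }
        }
        return playerIds;
    }

    static void Main()
    {
        // Hypothetical href values standing in for the scraped anchors.
        var hrefs = new[]
        {
            "/link/profile-view.jsp?user=Alice",
            "/link/profile-view.jsp?user=bob",
            "/link/profile-view.jsp?user=alice",
        };

        var playerIds = Collect(hrefs);
        Console.WriteLine(playerIds.Count); // 2

        // Build the display text once, after all IDs are collected.
        Console.WriteLine(string.Join(Environment.NewLine, playerIds));
    }
}
```

In the question's code, the final string would be assigned to PlayerID.Text after the foreach loop rather than inside it.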
