1

I'm stuck on the following problem:

I have a list with a ton of duplicative data. This includes entry numbers and names.

The following gives me a list of unique (non duplicative) names of people from the Data2014 table:

   tablequery = c.execute("SELECT * FROM Data2014")
   tablequery_results = list(people2014)
   people2014_count = len(tablequery_results)
   people2014_list = []
    for i in tablequery_results:
        if i[1] not in people2014_list:
            people2014_list.append(i[1])
    people2014_count = len(people2014_list)
    # for i in people2014_list:
    #     print(i)

Now that I have a list of people. I need to iterate through tablequery_results again, however, this time I need to find the number of unique entry numbers each person has. There are tons of duplicates in the tablequery_results list. Without creating a block of code for each individual person's name, is there a way to iterate through tablequery_results using the names from people2014_list as the unique identifier? I can replicate the code from above to give me a list of unique entry numbers, but I can't seem to match the names with the unique entry numbers.

Please let me know if that does not make sense.

Thanks in advance!

4
  • 2
    Why don't you do the grouping and aggregating using SQL directly? Commented Oct 25, 2017 at 17:01
  • The only way I know how is c.execute("SELECT * FROM Data2014" WHERE Name = 'John'). This still leaves me with the problem of replicating it for every single person on the list, right? Just so you know, I'm not super familiar with SQL, so I'm probably wrong. Commented Oct 25, 2017 at 17:05
  • 1
    If you used a set instead of a list for people2014_list, then it would not contain duplicates. As was previously suggested, though, using the SQL DISTINCT or GROUP BY statements may be the better way to go. Commented Oct 25, 2017 at 17:54
  • 1
    Thanks for your responses. Your comments made me realize I needed to learn a bit more sql. So I did, and was able to come up with a solution. So thank you! Commented Oct 26, 2017 at 13:35

1 Answer 1

1

I discovered my answer after delving into SQL a bit more. This gives me a list with two columns. The person's name in the first column, and then the numbers of entries that person has in the second column.

def people_data():
    data_fetch = c.execute("SELECT person, COUNT(*) AS `NUM` FROM Data2014 WHERE ACTION='UPDATED' GROUP BY Person ORDER BY NUM DESC")
    people_field_results = list(data_fetch)
    people_field_results_count = len(people_field_results)
    for i in people_field_results:
        print(i)
    print(people_field_results_count)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.