I am looking for the most efficient way to create an integer array from data in a CSV file.

The CSV file ("sample.csv") contains data like this:

prediction1 prediction2 prediction3
low low low
low high high

where low = 1 and high = 3.

I want to read this data from the CSV and build an array that looks like this:

array =  
[[1,1,1],  
[1,3,3]]

    import csv
    import sys
    
    num1 = 'low'
    num2 = 'high'
    
    
    csv_file = csv.reader(open('sample.csv', "r"), delimiter=",")
    
    count = 0
    
    for row in csv_file:
    
        if count == 1:
            if num1 == row[0]:
                dat1 = 1
        
            elif num2 == row[0]:
                dat1 = 3
            if num1 == row[1]:
                dat2 = 1
        
            elif num2 == row[1]:
                dat2 = 3
                
            if num1 == row[2]:
                dat3 = 1
        
            elif num2 == row[2]:
                dat3 = 3
        
            
        count = count + 1

    array =[dat1,dat2,dat3]

This approach works but seems very inefficient. I am looking for an alternative, more optimized way to achieve this.

2 Answers

Use a dict for the lookup together with a list comprehension.

For example:

    import csv

    check = {'low': 1, 'high': 3}

    with open('sample.csv', newline='') as infile:
        csv_file = csv.reader(infile)
        next(csv_file)  # skip header
        result = [[check[c] for c in row] for row in csv_file]
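
If you want to try the dict-lookup idea without creating a file first, here is a minimal sketch that feeds the reader from an in-memory string. The function name `load_levels` and the `LEVELS` constant are my own (hypothetical), and I am assuming the file is comma-separated:

```python
import csv
import io

# Lookup table from label to integer code (values taken from the question)
LEVELS = {"low": 1, "high": 3}

def load_levels(lines, mapping=LEVELS):
    """Map every cell of a CSV with a header row to its integer code."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    return [[mapping[cell.strip()] for cell in row] for row in reader]

# In-memory stand-in for sample.csv (assumed to be comma-separated)
sample = io.StringIO(
    "prediction1,prediction2,prediction3\n"
    "low,low,low\n"
    "low,high,high\n"
)
result = load_levels(sample)
print(result)  # [[1, 1, 1], [1, 3, 3]]
```

Since `csv.reader` accepts any iterable of lines, the same function should work with an open file object in place of the `StringIO`.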

1 Comment

Nice and clean solution! I would only add that, in case the CSV contains unwanted values (as sometimes happens), I would use check.get(c) instead of check[c] and handle any None values in the result array afterwards.

As the CSV file is pretty simple, you could even do it without the csv module:

    # Open the CSV file
    with open('sample.csv') as file:
        # Read all the lines and store them in a list
        lines = file.readlines()

    # Remove the CSV header
    lines = lines[1:]

    # Build the array, mapping each value through the dict
    array = [
        [{'low': 1, 'high': 3}.get(value.strip()) for value in line.split(',')]
        for line in lines
    ]

    # Print the array
    print(array)

This would print the following array: [[1, 1, 1], [1, 3, 3]]

Note the use of the dict's get method to avoid errors just in case the CSV has unwanted values. The array would contain a None value in those cases.
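
To illustrate what handling those None values could look like, here is a rough sketch that keeps `get` but also records where the unknown values were found. The helper name `convert_rows` and the example value `medium` are made up for illustration:

```python
# Lookup table from label to integer code (values taken from the question)
LEVELS = {"low": 1, "high": 3}

def convert_rows(rows, mapping=LEVELS):
    """Return (array, unknown), where unknown lists (row, col, raw_value)."""
    array, unknown = [], []
    for r, row in enumerate(rows):
        converted = []
        for c, value in enumerate(row):
            code = mapping.get(value.strip())  # None for unmapped labels
            if code is None:
                unknown.append((r, c, value))
            converted.append(code)
        array.append(converted)
    return array, unknown

# Already-split rows; "medium" is a hypothetical unwanted value
rows = [["low", "medium", "high"], ["high", "low", "low"]]
array, unknown = convert_rows(rows)
print(array)    # [[1, None, 3], [3, 1, 1]]
print(unknown)  # [(0, 1, 'medium')]
```

Whether you then drop those rows, substitute a default, or raise an error depends on what your data is supposed to look like.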
