Something wrong with output from list in Python

Question

I want a Python program that reads a list of words from a text file and prints the content of the file as two lists. The data in the text file is in this form:

A Alfa
B Betta
C Charlie

I want a Python program to print out one list with A,B,C and one with Alfa, Betta, Charlie.

This is what I've written:

english2german = open('english2german.txt', 'r')
englist = []
gerlist = []

for i, line in enumerate(english2german):
    englist[i:], gerlist[i:] = line.split()

This creates two lists, but they only contain the first letter of each word. How can I make my code print out the whole word?

mipadi · Accepted Answer · 2009-04-13 07:30:34Z

6

You want something like this:

english2german = open("english2german.txt")
englist = []
gerlist = []

for line in english2german:
    (e, g) = line.split()
    englist.append(e)
    gerlist.append(g)

The problem with your code before is that englist[i:] is actually a slice of a list, not just a single index. A string is also iterable, so you were basically stuffing a single letter into several indices. In other words, something like gerlist[0:] = "alfa" actually results in gerlist = ['a', 'l', 'f', 'a'].
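To see the slice behaviour in isolation, here is a minimal sketch that replays the loop from the question on the three sample lines and compares it with append(). The `lines` list stands in for the file and is an assumption made for illustration.

```python
# Sample data standing in for the contents of english2german.txt.
lines = ["A Alfa", "B Betta", "C Charlie"]

# Slice assignment, as in the question: the right-hand string is iterated
# character by character and replaces everything from index i onward.
englist, gerlist = [], []
for i, line in enumerate(lines):
    englist[i:], gerlist[i:] = line.split()

print(englist)  # ['A', 'B', 'C']
print(gerlist)  # ['A', 'B', 'C', 'h', 'a', 'r', 'l', 'i', 'e']

# append() stores each word as a single element.
eng_ok, ger_ok = [], []
for line in lines:
    e, g = line.split()
    eng_ok.append(e)
    ger_ok.append(g)

print(ger_ok)  # ['Alfa', 'Betta', 'Charlie']
```

Each pass overwrites the tail of gerlist with the letters of the current word, which is why only the first letter of the earlier words survives.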

answered Apr 13, 2009 at 7:30

mipadi


Community · Accepted Answer · 2017-05-23 12:20:37Z

6

And even shorter than amo-ej1's answer, and likely faster:

In [1]: english2german = open('english2german.txt')
In [2]: eng, ger = zip(*( line.split() for line in english2german ))
In [3]: eng
Out[3]: ('A', 'B', 'C')
In [4]: ger
Out[4]: ('Alfa', 'Betta', 'Charlie')

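Note that this reads the whole file into memory in any Python version: the * unpacking has to consume every line before zip is called, so all the words are held at once. Swapping zip for izip from itertools (or using Python 3's lazy zip) does not change that, so this approach is best suited to files that fit comfortably in memory.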
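For readers unfamiliar with the idiom, zip(*rows) passes each row to zip as a separate argument, and zip then groups the first items together and the second items together. A minimal sketch with the sample data inlined (the `rows` name is only for illustration):

```python
rows = [["A", "Alfa"], ["B", "Betta"], ["C", "Charlie"]]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2]).
eng, ger = zip(*rows)

print(eng)  # ('A', 'B', 'C')
print(ger)  # ('Alfa', 'Betta', 'Charlie')

# The results are tuples; convert them if lists are required.
englist, gerlist = list(eng), list(ger)
```

One caveat: with an empty input there is nothing to unpack, so the assignment to eng, ger raises ValueError.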

edited May 23, 2017 at 12:20


answered Apr 13, 2009 at 7:58

Autoplectic


3 Comments

dbr Over a year ago

That's.. horrible. It might be faster, but I really doubt it's "usefully-faster", and it's far harder to read (the * especially)

Autoplectic Over a year ago

it's the 'unzip' operation, it's a fairly common idiom to join up pairs of things.

dbr Over a year ago

I've benchmarked the zip method against the code in mipadi's answer. zip is slightly slower with a small set of data, but slightly quicker with 10,000 lines... but the difference is about 0.05 on each..

ZeD · Accepted Answer · 2009-04-13 14:04:15Z

3

Just an addition: you're working with files, so please close them :) or use the with statement, which closes the file for you:

with open('english2german.txt') as english2german:
    englist, gerlist = zip(*(line.split() for line in english2german))

answered Apr 13, 2009 at 14:04

ZeD


amo-ej1 · Accepted Answer · 2009-04-13 07:32:34Z

1

Like this, you mean?

english2german = open('k.txt', 'r')
englist = []
gerlist = []

for line in english2german:
    # Split each line once and reuse both parts.
    parts = line.split()
    englist.append(parts[0])
    gerlist.append(parts[1])

print(englist)
print(gerlist)

which generates:

['A', 'B', 'C']
['Alfa', 'Betta', 'Charlie']

answered Apr 13, 2009 at 7:32

amo-ej1


ibz · Accepted Answer · 2009-04-13 08:46:23Z

1

The solutions already posted are OK if none of the words contain spaces (i.e. each line contains exactly one space). If I understand correctly, you are trying to build a dictionary, so I would suggest you consider that a definition can also be a multi-word expression. In that case, you'd be better off separating the word from its definition with some character other than a space. Something like "|", which cannot appear in a word.

Then, you do something like this:

for line in english2german:
    # Strip the trailing newline so it does not end up in g.
    (e, g) = line.rstrip("\n").split("|")
    englist.append(e)
    gerlist.append(g)

answered Apr 13, 2009 at 8:46

ibz


2 Comments

S.Lott Over a year ago

-1: changing the file format. Use partition instead of split: same effect, no change to the file format.

ibz Over a year ago

Oh well, I didn't say he has to change the file format! I just suggested. I don't really see how partition can fix the problem I described, anyway.
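As a rough sketch of what the partition suggestion could look like: str.partition(" ") splits only at the first space, so the file format stays unchanged and the translation may contain several words. This assumes the English key itself is a single word; a multi-word key would still need a different separator, which is the case ibz describes. The sample lines are made up for illustration.

```python
# Hypothetical sample lines; the second translation contains a space.
lines = ["A Alfa\n", "B Betta Two\n", "C Charlie\n"]

englist, gerlist = [], []
for line in lines:
    # partition returns (before, separator, after) around the first space.
    e, _, g = line.rstrip("\n").partition(" ")
    englist.append(e)
    gerlist.append(g)

print(englist)  # ['A', 'B', 'C']
print(gerlist)  # ['Alfa', 'Betta Two', 'Charlie']
```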

Community · Accepted Answer · 2017-05-23 11:55:43Z

Slightly meta-answer(?) to Autoplectic's suggestion of using zip()

With 3 lines in the input file (from the supplied data in the question):

The zip() method takes an average of 0.404729390144 seconds, compared to 0.341339087486 with the simple for loop constructing two lists (the code from mipadi's currently accepted answer).

With 10,000 lines in the input file (randomly generated 3-12 character words; I reduced the timeit.repeat() values to 100 runs, repeated twice):

zip() took an average of 1.43965339661 seconds, compared to 1.52318406105 with the for loop.

Both benchmarks were done using Python version 2.5.1

Hardly a huge difference. Given how much more readable the simple for loop is, I would recommend using it. The zip code might be a bit quicker with large files, but the difference is only about 0.083 seconds with 10,000 lines.

Benchmarking code:

import timeit

# https://stackoverflow.com/questions/743248/something-wrong-with-output-from-list-in-python/743313#743313
code_zip = """english2german = open('english2german.txt')
eng, ger = zip(*( line.split() for line in english2german ))
"""

# https://stackoverflow.com/questions/743248/something-wrong-with-output-from-list-in-python/743268#743268
code_for = """english2german = open("english2german.txt")
englist = []
gerlist = []

for line in english2german:
    (e, g) = line.split()
    englist.append(e)
    gerlist.append(g)
"""

for code in [code_zip, code_for]:
    t = timeit.Timer(stmt=code)
    try:
        # 10 repeats of 10,000 executions each.
        times = t.repeat(10, 10000)
    except Exception:
        t.print_exc()
    else:
        print("Code:")
        print(code)
        print("Time:")
        print(times)
        print("Average:")
        print(sum(times) / len(times))
        print("-" * 20)
