I've got a data file where each "row" is delimited by \n\n\n. My solution is to isolate those rows by first slurping the file, and then splitting rows:

for row in slurped_file.split('\n\n\n'):
    ...

Is there an "awk-like" approach I could take to parse the file as a stream in Python 2.7.9, splitting records on a given string value? Thanks.

  • Is there a specific reason the file.read(num_bytes) method doesn't work for you? Just trying to better understand the requirements. It seems a lazy-generator based on reading bytes into a buffer and yielding split strings would be ideal for this. Commented Feb 19, 2015 at 17:48
  • There is a bug/feature request for such a thing to be added to the Python standard library; see also this question, but there is an easier workaround too. Commented Feb 19, 2015 at 18:06
  • The \n\n\n sequences delimit large blocks of data (which will fit in memory, but I don't know the size of those blocks in advance). Commented Feb 19, 2015 at 18:09
  • I take it that this really means 2 empty lines? Commented Feb 19, 2015 at 18:18
  • Yes, three consecutive line feeds when parsing with od -c. Commented Feb 24, 2015 at 9:32

1 Answer

There is no such thing in the standard library, but we can write a custom generator to iterate over such records:

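def chunk_iterator(iterable):
    """Yield chunks of lines separated by two consecutive empty lines."""
    chunk = []
    empty_lines = 0  # consecutive empty lines seen so far
    for line in iterable:
        chunk.append(line)
        if line == '\n':
            empty_lines += 1
            if empty_lines == 2:
                # Drop the two empty lines that form the delimiter; the
                # last data line keeps its own trailing '\n'.
                yield ''.join(chunk[:-2])
                empty_lines, chunk = 0, []
        else:
            empty_lines = 0

    # Whatever follows the last delimiter (may be an empty string).
    yield ''.join(chunk)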

Use as:

with open('filename') as f:
    for chunk in chunk_iterator(f):
        ...

This relies on the file object's per-line iteration, which is implemented in C in CPython, so it should be faster than a general record-separator solution.
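
For comparison, here is a minimal sketch of what a general record-separator generator could look like, along the lines of the buffer-based approach suggested in the comments. The names iter_records, separator and bufsize are made up for this illustration and are not part of any standard library API. It reads fixed-size blocks with read(), splits the buffer on the separator, and holds back the last piece because it may be incomplete or end with part of a separator.

```python
import io


def iter_records(stream, separator='\n\n\n', bufsize=8192):
    """Lazily yield records from `stream`, split on `separator`.

    Hypothetical helper: `stream` is any object with a read(n) method
    that returns text, such as an open file.
    """
    buf = ''
    while True:
        block = stream.read(bufsize)
        if not block:
            break  # end of stream
        buf += block
        parts = buf.split(separator)
        # The last piece may be an unfinished record, or end with the
        # start of a separator, so keep it for the next round.
        buf = parts.pop()
        for part in parts:
            yield part
    # Whatever remains after the last separator (may be empty).
    yield buf


if __name__ == '__main__':
    # A tiny bufsize forces separators to straddle block boundaries.
    sample = io.StringIO(u'row one\n\n\nrow two')
    for record in iter_records(sample, bufsize=4):
        print(repr(record))
```

Like str.split, this sketch yields a trailing empty string when the input ends with the separator, and, unlike chunk_iterator above, it does not leave a trailing '\n' on each record. It works with any separator string, at the cost of doing the buffering and splitting in Python code.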
