This is a follow-up to my previous question, "Python string split using regex". I'm trying to parse lines like:

123foo  bar456  baz
123foo, bar456, baz
123foo > 13.0  bar456 = 1024  baz
123foo > 13.0, bar456 = 1024, baz

The items are in format:

String1 [OP String2]

String1 and String2 can both contain letters, digits and '.' (e.g. abc123, 1.2.3)
OP can be one of: <, >, <=, >=, =
The ',' separator between items is optional. What I want to extract is String1 from each item.

So the result for each of the lines above is just:

['123foo', 'bar456', 'baz']

How can I do this in Python?

  • Please show us what you have tried. Commented Nov 17, 2013 at 11:22
  • Please give the output you want for each input. Commented Nov 17, 2013 at 11:25

1 Answer

The code from the previous question, modified to accept digits as well:

import re

with open("input") as f:
    for line in f:
        line = line.strip()
        # chop a line into expressions of the form: str [OP str]
        # ('.' is allowed in both strings, e.g. 1.2.3; \w already covers digits)
        exprs = re.split(r'([\w.]+\s*(?:[!<>=]=?\s*[\w.]*)?\s*,?\s*)', line)
        for expr in exprs:
            # chop each expression into tokens and get the str part
            tokens = re.findall(r'([\w.]+)\s*(?:[!<>=]=?\s*[\w.]*)?,?', expr)
            if tokens:
                print(tokens)
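
If one flat list per line is wanted, as in the expected output in the question, a single re.findall call may be enough. The following is only a sketch and not part of the original answer: the name first_operands and the ITEM pattern are made up for illustration, and it assumes String1 and String2 consist only of word characters (letters, digits, underscore) and '.'.

```python
import re

# String1, optionally followed by "OP String2"; only String1 is captured.
# The operator set (<=, >=, <, >, =) is taken from the question.
ITEM = re.compile(r'([\w.]+)(?:\s*(?:<=|>=|<|>|=)\s*[\w.]+)?')


def first_operands(line):
    """Return the String1 part of every item found in the line."""
    return ITEM.findall(line)


samples = [
    "123foo  bar456  baz",
    "123foo, bar456, baz",
    "123foo > 13.0  bar456 = 1024  baz",
    "123foo > 13.0, bar456 = 1024, baz",
]
for sample in samples:
    print(first_operands(sample))  # ['123foo', 'bar456', 'baz'] for each sample
```

Because the pattern has exactly one capturing group, findall returns a plain list of strings. Commas and extra whitespace between items are simply skipped, since they can never start a match.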

6 Comments

This seems right; I'm verifying it against my inputs.
You should not have needed to ask a new question. The author of the previous answer would have happily updated his answer if you had explained your requirement again :). You should accept this answer, but rather than opening a new question for a small modification, you can just ask in the comments or edit your question.
Thanks for the reminder. I'm not very familiar with Stack Overflow yet; I'll remember this :)
Thanks for understanding. But could you accept this answer now? Accepted answers will ensure that the question is not shown to people logging in now as "unanswered".
See this: docs.python.org/2/library/re.html. Read the whole page, or at least the whole regex syntax section. Here is what it does: (?:...) is a non-capturing version of regular parentheses. It matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
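
To illustrate the difference described above, here is a small sketch comparing a capturing and a non-capturing group in re.findall. The sample text and variable names are made up for the illustration.

```python
import re

text = "bar456 = 1024"

# Two capturing groups: findall returns one tuple per match.
with_capture = re.findall(r'(\w+)\s*(=\s*\w+)?', text)

# The optional part is non-capturing: only the first group is returned.
without_capture = re.findall(r'(\w+)\s*(?:=\s*\w+)?', text)

print(with_capture)     # [('bar456', '= 1024')]
print(without_capture)  # ['bar456']
```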