
I want to split strings on a comma delimiter, but only when the comma is preceded by a match of a certain regex. My strings have the format "(bunch of stuff that might have commas) FOO_REGEX, (other stuff that might have commas) FOO_REGEX, ...", and splitting only on the commas preceded by FOO_REGEX should yield: ["(bunch of stuff that might have commas) FOO_REGEX", "(other stuff that might have commas) FOO_REGEX", etc.].

As a concrete example, consider splitting the following string:

"hi, hello! $$asdf, I am foo, bar $$jkl, cool" 

into this list of three strings:

["hi, hello! $$asdf", 
"I am foo, bar $$jkl", 
"cool"]

Is there an easy way to do this in Python?

2 Answers


You could use re.findall instead of re.split.

>>> import re
>>> s = "hi, hello! $$asdf, I am foo, bar $$jkl, cool"
>>> [j for i in re.findall(r'(.*?\$\$[^,]*),\s*|(.+)', s) for j in i if j]
['hi, hello! $$asdf', 'I am foo, bar $$jkl', 'cool']
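
For readers who find the one-line pattern hard to parse, here is the same regex spelled out with re.VERBOSE. This is only an annotated restatement of the pattern above, not a different technique; the names pattern and chunks are introduced purely for illustration.

```python
import re

# Same pattern as above, written with re.VERBOSE so each part can be commented.
pattern = re.compile(r"""
    (.*?\$\$[^,]*)   # group 1: shortest run up to a $$ token, plus the rest of that token
    ,\s*             # the delimiter comma and any whitespace after it (not captured)
    |                # otherwise, when no further "$$token," exists...
    (.+)             # group 2: whatever remains after the last delimiter
""", re.VERBOSE)

s = "hi, hello! $$asdf, I am foo, bar $$jkl, cool"

# findall returns one (group1, group2) tuple per match; exactly one of the two is non-empty.
chunks = [g1 or g2 for g1, g2 in pattern.findall(s)]
print(chunks)
```

Because the $$ token is consumed by the match rather than tested in a look-behind, it can have any length, which is why this works with the standard re module.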

OR

Use the external regex module, which supports variable-length look-behind assertions; the standard re module only supports fixed-width look-behind.

>>> import regex
>>> s = "hi, hello! $$asdf, I am foo, bar $$jkl, cool"
>>> regex.split(r'(?<=\$\$[^,]*),\s*', s)
['hi, hello! $$asdf', 'I am foo, bar $$jkl', 'cool']

1 Comment

Hope this gets added soon, because the link you gave is awesome.

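You can use a positive look-behind with the standard re module if FOO_REGEX is fixed-width. In the example below, the look-behind is the literal "$$asdf", so the string is split only at the comma that follows "$$asdf" and not at the one after "$$jkl".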

See a sample working program:

import re
s = 'hi, hello! $$asdf, I am foo, bar $$jkl, cool'
# Fixed-width look-behind: split on ", " only when it directly follows the literal "$$asdf"
splts = re.split(r'(?<=\$\$asdf), *', s)
print(splts)

Output:

['hi, hello! $$asdf', 'I am foo, bar $$jkl, cool'] 
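
If the token is not fixed-width, one possible workaround that stays within the standard re module is to match the token together with the comma and cut the string at each comma found. The sketch below assumes FOO_REGEX is something like \$\$\w+; that pattern, the helper name split_after, and its sep parameter are all hypothetical and chosen only for illustration.

```python
import re

def split_after(pattern, text, sep=r",\s*"):
    """Split text on sep, but only where sep directly follows a match of pattern.

    Hypothetical helper. The token pattern is consumed normally instead of being
    placed in a look-behind, so it may be variable-width. It must not define a
    group named "sep" itself.
    """
    token_then_sep = re.compile(r"(?:%s)(?P<sep>%s)" % (pattern, sep))
    parts, start = [], 0
    for m in token_then_sep.finditer(text):
        # Keep everything up to the separator (token included), drop the separator.
        parts.append(text[start:m.start("sep")])
        start = m.end("sep")
    parts.append(text[start:])  # the remainder after the last qualifying separator
    return parts

s = 'hi, hello! $$asdf, I am foo, bar $$jkl, cool'
# Assumed token pattern: "$$" followed by one or more word characters.
print(split_after(r"\$\$\w+", s))
```

With the sample string this splits after both "$$asdf" and "$$jkl", while the other commas are left alone.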
