Simple one here but I'm fairly new to Python.

I have a string like this:

this is page one of an article 
<!--pagebreak page two --> this is page two 
<!--pagebreak--> this is the third page 
<!--pagebreak page four --> last page
// newlines added for readability

I need to split the string using this regex: <!--pagebreak(*.?)--> - the idea is that sometimes the <!--pagebreak--> comments have a 'title' (which I use in my templates), other times they don't.

I tried this:

re.split("<!--pagebreak*.?-->", str)

which returned only the items with 'titles' in the pagebreak (and didn't split them correctly either). What am I doing wrong here?

3 Answers

Change *.? into .*?:

re.split("<!--pagebreak.*?-->", str)

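Your current regex reads as: zero or more literal k's (k*), optionally followed by one arbitrary character (.?). That is why it can only match a pagebreak comment with nothing, or a single character, between pagebreak and -->.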
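
To see the difference concretely, here is a small sketch that runs both patterns against the sample string from the question. The variable names (text, broken, fixed) are only illustrative.

```python
import re

text = (
    "this is page one of an article \n"
    "<!--pagebreak page two --> this is page two \n"
    "<!--pagebreak--> this is the third page \n"
    "<!--pagebreak page four --> last page"
)

# Original pattern: "k*" then ".?", so at most one character may sit
# between "pagebreak" and "-->". Only the untitled break matches.
broken = re.split(r"<!--pagebreak*.?-->", text)

# Corrected pattern: ".*?" lazily consumes any title text up to the first "-->".
fixed = re.split(r"<!--pagebreak.*?-->", text)

print(len(broken))  # 2
print(len(fixed))   # 4
```

The broken pattern splits only at the untitled comment, which leaves the titled comments embedded in the two resulting chunks.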

Also, I would recommend using raw strings (r"...") for your regular expressions. It's not necessary in this case, but it's an easy way to spare yourself a few headaches.
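
As a rough illustration of the kind of headache raw strings avoid, consider a pattern containing \b. This example is not from the question; the sample sentence and variable names are made up for the demonstration.

```python
import re

sentence = "a page of text"

# In a normal string literal, "\b" is the backspace character (one character).
cooked = "\bpage\b"
# In a raw string literal, r"\b" stays backslash + b, which re reads as a word boundary.
raw = r"\bpage\b"

print(len(cooked), len(raw))                # 6 8
print(re.search(cooked, sentence))          # None
print(re.search(raw, sentence).group())     # page
```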

4 Comments

.*? doesn't make sense in regex.
@jpm Yes it does. It's a . with a lazy * quantifier.
How can you forget about laziness? It is the greatest of all programming virtues.
@MattAndrews Not at all. We all make typos from time to time.

You swapped the . with the *. The correct regex is:

<!--pagebreak.*?-->

Definitely an issue of swapping the . and the *. The "." matches any character (except a newline, by default) and the asterisk repeats it zero or more times. The trailing "?" makes that repetition non-greedy, so the match stops at the first --> instead of running on to the last one.

import re

s = """this is page one of an article 
<!--pagebreak page two --> this is page two 
<!--pagebreak--> this is the third page 
<!--pagebreak page four --> last page"""

print(re.split(r'<!--pagebreak.*?-->', s))

Outputs:

['this is page one of an article \n', ' this is page two \n', ' this is the third page \n', ' last page']
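
Since the question mentions using the optional titles in templates, one possible extension is to put the title part in a capturing group: re.split then returns the captured text between the pieces. This is only a sketch; the pairing into (title, body) tuples and the names parts, bodies, titles and pages are assumptions, not something the question specifies.

```python
import re

text = (
    "this is page one of an article \n"
    "<!--pagebreak page two --> this is page two \n"
    "<!--pagebreak--> this is the third page \n"
    "<!--pagebreak page four --> last page"
)

# With a capturing group, re.split interleaves the captured titles:
# [body, title, body, title, body, ...]
parts = re.split(r"<!--pagebreak(.*?)-->", text)

bodies = [p.strip() for p in parts[0::2]]
# The first page has no preceding pagebreak, so give it an empty title.
titles = [""] + [t.strip() for t in parts[1::2]]

pages = list(zip(titles, bodies))
for title, body in pages:
    print(repr(title), "->", repr(body))
```

An untitled <!--pagebreak--> yields an empty string for its title, so a template can test for that and fall back to a default.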
