Regular expression to return all characters between two strings

Question

How can I design a regular expression that captures all the characters between two strings? Specifically, from this big string:

Studies have shown that...[^title=Fish consumption and incidence of stroke: a meta-analysis of cohort studies]... Another experiment demonstrated that... [^title=The second title]

I want to extract all the characters between [^title= and ], that is, Fish consumption and incidence of stroke: a meta-analysis of cohort studies and The second title.

I think I will have to use re.findall(), and that I can start with this: re.findall(r'\[([^]]*)\]', big_string), which will give me all the matches between the square brackets [ ], but I'm not sure how to extend it.

icedtrees · Accepted Answer · 2014-02-20 07:55:47Z

>>> import re
>>> text = "Studies have shown that...[^title=Fish consumption and incidence of stroke: a meta-analysis of cohort studies]... Another experiment demonstrated that... [^title=The second title]"
>>> re.findall(r"\[\^title=(.*?)\]", text)
['Fish consumption and incidence of stroke: a meta-analysis of cohort studies', 'The second title']

Here is a breakdown of the regex:

\[ is an escaped [ character.

\^ is an escaped ^ character.

title= matches title=

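(.*?) matches any characters except newlines, non-greedily, and puts them in a group, which is what findall returns. Because it is non-greedy, it stops at the first place the rest of the pattern can match.

\] is an escaped ] character, so each match ends at the first closing bracket after title=.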
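To make the non-greedy point concrete, here is a small sketch that compares the pattern above with a greedy version and with a negated character class, which extends the pattern from the question. The between() helper is hypothetical, not part of the re module, and its use of re.DOTALL (so that matches may span newlines) is an assumption about what you want.

```python
import re

text = (
    "Studies have shown that...[^title=Fish consumption and incidence of "
    "stroke: a meta-analysis of cohort studies]... Another experiment "
    "demonstrated that... [^title=The second title]"
)

# Non-greedy: each match stops at the first closing bracket.
lazy = re.findall(r"\[\^title=(.*?)\]", text)

# Greedy: one match that runs on to the last closing bracket in the text.
greedy = re.findall(r"\[\^title=(.*)\]", text)

# Negated character class: [^]]* means "any run of characters other than ]".
negated = re.findall(r"\[\^title=([^]]*)\]", text)


def between(start, end, s):
    """Hypothetical helper: all text between two literal delimiter strings."""
    # re.escape lets the delimiters contain regex metacharacters such as [ and ^.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # re.DOTALL is an assumption: it lets a match span line breaks.
    return re.findall(pattern, s, flags=re.DOTALL)


print(lazy)
print(len(greedy))
print(negated == lazy)
print(between("[^title=", "]", text) == lazy)
```

For this input the non-greedy and negated-class patterns return the same two titles, while the greedy pattern returns a single string that swallows everything up to the final ]. The negated class also matches across newlines, which . does not do unless re.DOTALL is set.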
