whats the difference between '{m}' and '{m,n}?' in http://docs.python.org/library/re.html it says '{m,n}?' matches numbers in range m to n times, but it is not a greedy search. Therefore if its not a greedy search wouldn't it only match up to m no matter what?
-
I'm not sure about Python-flavored regexes, but most regex flavors (and most programming languages for that matter) have a few functionally identical constructs. This isn't one, but you shouldn't be surprised if you find them.Chris Lutz– Chris Lutz2011-02-03 07:20:20 +00:00Commented Feb 3, 2011 at 7:20
-
@chris: when in doubt, it's easy to compare the regex system from several languages using online tools: PHP and javascript got regex.larsolavtorvik.com while python got ksamuel.pythonanywhere.com. here you can see easily it's not a new construct.Bite code– Bite code2012-01-28 15:01:36 +00:00Commented Jan 28, 2012 at 15:01
2 Answers
{m,n}? will preferably match only m repetitions, but it will expand as needed up to n repetitions if that's necessary for a longer match.
Compare ^x{2}y$ and ^x{2,4}?y$:
The former will fail on xxxy whereas the latter will match.
To summarize:
x{m}: Match x exactly m times.
x{m,n}: Try to match x n times, but if that causes the overall match to fail, give back as needed, but match at least m times (greedy quantifier).
x{m,n}?: Try to match x m times, but if that causes the overall match to fail, expand as needed, but match at most n times (lazy quantifier).
2 Comments
re should be RegularExpression instead of re. Is there any reason for not doing it that way ?It's easiest to see with an example using two matching groups:
>>> re.match(r'(x{1,3}?)(.*)', 'xxxxx').groups()
('x', 'xxxx')
>>> re.match(r'(x{1,3})(.*)', 'xxxxx').groups()
('xxx', 'xx')
In other words, {n,m} and {n,m}? are both able to match exactly the same things; what it changes is where the groupings happen when there's more than one way to match.