9

The string looks like this (\n is used to break the lines):

MySQL-vm
Version 1.0.1

WARNING:: NEVER EDIT/DELETE THIS SECTION

What I want is only 1.0.1.

I am trying re.search(r"Version+'([^']*)'", my_string, re.M).group(1) but it is not working.

re.findall(r'\d+', version) gives me a list of the numbers, which I then have to join back together.

How can I improve the regex?

5
  • I would suggest parsing the string first and applying the regex only to the relevant parts. That will make your life easier. Commented Oct 21, 2014 at 7:16
  • 2
    "Version+" means match V-e-r-s-i-o and then one or more ns. Commented Oct 21, 2014 at 7:26
  • it repeats only the n one or more times. Commented Oct 21, 2014 at 7:32
  • Look closely at the attempt, re.search(r"Version+'([^']*)'", my_string, re.M).group(1). What is the intended purpose of the 's? Based on how it is constructed, it looks as though you are quite deliberately looking for a single-quoted string (i.e.: a single-quote, some not-single-quote characters, and then a closing single-quote). Now, look carefully at the input: does the data you want actually look like that? I don't see quotes around the 1.0.1, therefore there is no reason to look for them. Why was this not closed as an obvious typo at the time? Commented Aug 12, 2022 at 4:08
  • What does "not working" mean for your re.search().group(1) line? See minimal reproducible example for more info. Commented Aug 1, 2024 at 19:58
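
The points raised in these comments can be checked directly. The following is a minimal sketch, assuming the input string from the question; it shows that the original pattern finds nothing because it requires single quotes, and that the `+` applies only to the final `n`.

```python
import re

my_string = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"

# Pattern from the question: "Version", optional extra n's, then a single-quoted string.
original = re.search(r"Version+'([^']*)'", my_string, re.M)
print(original)  # None: no single quote follows "Version" in the input

# Calling .group(1) on None is what raises the AttributeError.
try:
    original.group(1)
except AttributeError as exc:
    print("AttributeError:", exc)

# The "+" repeats only the final n, so "Versionnn" is matched in full.
extra_ns = re.search(r"Version+", "Versionnn 2")
print(extra_ns.group(0))  # Versionnn
```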

6 Answers

22

Use the regex below and take the version number from group 1.

Version\s*([\d.]+)

DEMO

>>> import re
>>> s = """MySQL-vm
... Version 1.0.1
... 
... WARNING:: NEVER EDIT/DELETE THIS SECTION"""
>>> re.search(r'Version\s*([\d.]+)', s).group(1)
'1.0.1'

Explanation:

Version                  'Version'
\s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times)
(                        group and capture to \1:
  [\d.]+                   any character of: digits (0-9), '.' (1
                           or more times)
)                        end of \1
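
If the text might not contain a version line at all, re.search returns None and .group(1) would fail. A small sketch of a wrapper around the same pattern follows; the names `VERSION_RE` and `extract_version` are hypothetical and not part of the answer.

```python
import re

# Same pattern as above, compiled once. The name is illustrative only.
VERSION_RE = re.compile(r"Version\s*([\d.]+)")

def extract_version(text):
    """Return the digits-and-dots run after 'Version', or None if absent."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None

s = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"
print(extract_version(s))           # 1.0.1
print(extract_version("MySQL-vm"))  # None
```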

3

We can use Python's re library. The regex below only handles versions made up of digits.

import re

# Raw string so that \. reaches the regex engine unchanged
versions = re.findall(r'[0-9]+\.[0-9]+\.?[0-9]*', AVAILABLE_VERSIONS)

unique_versions = set(versions)  # convert to a set to keep only unique versions

Where AVAILABLE_VERSIONS is a string containing the versions.
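
A set has no order, and sorting version strings alphabetically puts 1.10 before 1.2.0. The sketch below shows one way to order the unique versions numerically; the sample value of AVAILABLE_VERSIONS and the helper `version_key` are made up for illustration.

```python
import re

# Made-up sample input; the real contents of AVAILABLE_VERSIONS are not shown in the answer.
AVAILABLE_VERSIONS = "pkg 1.0.1, pkg 1.10, pkg 1.2.0, pkg 1.0.1"

versions = re.findall(r"[0-9]+\.[0-9]+\.?[0-9]*", AVAILABLE_VERSIONS)
unique_versions = set(versions)

def version_key(v):
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically.

    Empty parts are skipped because the pattern can match a trailing dot.
    """
    return tuple(int(part) for part in v.split(".") if part)

ordered = sorted(unique_versions, key=version_key)
print(ordered)  # ['1.0.1', '1.2.0', '1.10']
```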


1

You can also try a positive lookbehind, which does not consume characters in the string but only asserts whether a match is possible. With the regex below you don't need findall or a capturing group; the whole match is the version number.

(?<=Version )[\d.]+


Explanation:

  (?<=                     look behind to see if there is:
    Version                  'Version '
  )                        end of look-behind
  [\d.]+                   any character of: digits (0-9), '.' (1 or more times)
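
As a quick sketch of how this looks in Python, assuming the string from the question: the whole match is read with group(0). Note that Python's re module only accepts fixed-width lookbehinds, so a variant such as (?<=Version\s*) is rejected when the pattern is compiled.

```python
import re

s = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"

# The lookbehind is not part of the match, so group(0) is just the version.
match = re.search(r"(?<=Version )[\d.]+", s)
print(match.group(0))  # 1.0.1

# A variable-width lookbehind is not supported by the re module.
try:
    re.compile(r"(?<=Version\s*)[\d.]+")
    variable_width_ok = True
except re.error as exc:
    variable_width_ok = False
    print("re.error:", exc)
```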


1
(?<=Version\s)\S+

Try this. Use it with re.findall.

import re

x="""MySQL-vm
  Version 1.0.1

  WARNING:: NEVER EDIT/DELETE THIS SECTION"""

print(re.findall(r"(?<=Version\s)\S+",x))

Output: ['1.0.1']

See demo.

http://regex101.com/r/dK1xR4/12


1

https://regex101.com/r/5Us6ow/1

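A somewhat recursive pattern to match versions like 1, 1.0, and 1.0.1:

import re

def version_parser(v):
    # (?:...) groups without capturing; each nested level allows one more ".N" part
    versionPattern = r'\d+(?:\.(\d+(?:\.(\d+)*)*)*)*'
    regexMatcher = re.compile(versionPattern)
    return regexMatcher.search(v).group(0)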
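
For comparison, here is a flatter pattern for the same shapes (1, 1.0, 1.0.1), offered as a sketch rather than as this answer's method. It uses fullmatch, so it validates a whole string instead of searching inside one; the names `DOTTED_VERSION` and `is_dotted_version` are hypothetical.

```python
import re

# One or more digit groups separated by single dots. Illustrative name only.
DOTTED_VERSION = re.compile(r"\d+(?:\.\d+)*")

def is_dotted_version(v):
    """True if the entire string is digits separated by single dots."""
    return DOTTED_VERSION.fullmatch(v) is not None

for candidate in ["1", "1.0", "1.0.1", "1.", "1..0", "v1.0"]:
    print(candidate, is_dotted_version(candidate))
```

The first three candidates are accepted; the trailing dot, the doubled dot, and the leading letter are rejected.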


0

This is an old question, but none of the answers cover corner cases such as Version 1.2.3. (ending with a dot) or Version 1.2.3.A (ending with a non-numeric value). Here is my solution:

import re

ver = "Version 1.2.3.9\nWarning blah blah..."
# The trailing \d forces the matched version to end with a digit
print(bool(re.match(r"Version\s*[\d.]+\d", ver)))
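
To see what the trailing \d changes, the sketch below adds a capturing group to both the greedy [\d.]+ pattern and the pattern above and compares what each extracts. The capturing groups and the sample strings are additions for illustration. One caveat: because [\d.]+\d needs at least two characters, a single-digit version such as "Version 1" does not match at all.

```python
import re

GREEDY = re.compile(r"Version\s*([\d.]+)")             # may keep a trailing dot
ENDS_WITH_DIGIT = re.compile(r"Version\s*([\d.]+\d)")  # must end on a digit

samples = [
    "Version 1.2.3.9\nWarning blah blah...",
    "Version 1.2.3.",
    "Version 1.2.3.A",
    "Version 1",
]

results = {}
for text in samples:
    greedy = GREEDY.search(text)
    strict = ENDS_WITH_DIGIT.search(text)
    results[text] = (
        greedy.group(1) if greedy else None,
        strict.group(1) if strict else None,
    )
    print(repr(text), results[text])
```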
