Python regex to extract version from a string [closed]

Question

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

This question does not appear to be about programming within the scope defined in the help center.

Closed last year.

The string looks like this: (\n used to break the line)

MySQL-vm
Version 1.0.1

WARNING:: NEVER EDIT/DELETE THIS SECTION

What I want is only 1.0.1.

I am trying re.search(r"Version+'([^']*)'", my_string, re.M).group(1) but it is not working.

re.findall(r'\d+', version) gives me a list of the individual numbers, which I then have to join back together.

How can I improve the regex?

I would suggest first parsing the string and applying the regex only to the relevant parts. It will make your life easier.
– Maroun, Commented Oct 21, 2014 at 7:16
"Version+" means match V-e-r-s-i-o and then one or more ns.
– Joel Cornett, Commented Oct 21, 2014 at 7:26
Look closely at the attempt, re.search(r"Version+'([^']*)'", my_string, re.M).group(1). What is the intended purpose of the 's? Based on how it is constructed, it looks as though you are quite deliberately looking for a single-quoted string (i.e.: a single-quote, some not-single-quote characters, and then a closing single-quote). Now, look carefully at the input: does the data you want actually look like that? I don't see quotes around the 1.0.1, therefore there is no reason to look for them. Why was this not closed as an obvious typo at the time?
– Karl Knechtel, Commented Aug 12, 2022 at 4:08
What does "not working" mean for your re.search().group(1) line? See minimal reproducible example for more info.
– TylerH, Commented Aug 1, 2024 at 19:58
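The points made in the comments can be checked directly. The following is a minimal sketch, assuming the sample string from the question (the variable names are illustrative): the attempted pattern finds nothing, so re.search returns None and calling .group(1) on that result raises AttributeError.

```python
import re

my_string = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"

# "Version+" means "Versio" followed by one or more "n"; the pattern then
# demands a single-quoted string, which the input does not contain.
attempt = re.search(r"Version+'([^']*)'", my_string, re.M)
print(attempt)  # None, so attempt.group(1) would raise AttributeError

# re.findall(r"\d+", ...) returns the digit runs separately; joining them
# with "." rebuilds the version only because this string has no other digits.
parts = re.findall(r"\d+", my_string)
rejoined = ".".join(parts)
print(parts, rejoined)
```

Rejoining the digit runs is fragile as soon as the surrounding text contains any other number, which is why the answers below anchor on the word "Version" instead.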

Avinash Raj · Accepted Answer · 2014-10-21 07:31:50Z

22

Use the regex below and take the version number from capture group 1.

Version\s*([\d.]+)

DEMO

>>> import re
>>> s = """MySQL-vm
... Version 1.0.1
... 
... WARNING:: NEVER EDIT/DELETE THIS SECTION"""
>>> re.search(r'Version\s*([\d.]+)', s).group(1)
'1.0.1'

Explanation:

Version                  'Version'
\s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times)
(                        group and capture to \1:
  [\d.]+                   any character of: digits (0-9), '.' (1
                           or more times)
)                        end of \1
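If the input might not contain a version line at all, re.search returns None and .group(1) fails. A small wrapper along these lines handles that case; the names extract_version and VERSION_RE are hypothetical, and only the pattern comes from the answer.

```python
import re

# Pattern taken from the answer above; the names below are illustrative.
VERSION_RE = re.compile(r"Version\s*([\d.]+)")

def extract_version(text):
    """Return the captured version string, or None when nothing matches."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None

sample = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"
print(extract_version(sample))
print(extract_version("MySQL-vm only"))
```

Note that [\d.]+ accepts any mix of digits and dots, so a trailing dot in the input would be captured as part of the version.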

edited Oct 21, 2014 at 7:31

answered Oct 21, 2014 at 7:17

Avinash Raj


Rittick Paul (edited by 0sVoid) · Answer · 2022-08-16 09:31:52Z

3

We can use Python's re module. The regex below is for versions made up of numbers only.

import re

# Raw string so that \. reaches the regex engine as an escaped dot.
versions = re.findall(r'[0-9]+\.[0-9]+\.?[0-9]*', AVAILABLE_VERSIONS)

unique_versions = set(versions)  # convert to a set to get unique versions

Here AVAILABLE_VERSIONS is a string containing the versions.
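As a rough illustration, here is the same approach run on a made-up input. The answer does not show AVAILABLE_VERSIONS, so the sample string is an assumption, and the numeric sort at the end is an optional extra that is not part of the answer.

```python
import re

# Hypothetical input; the original answer does not show AVAILABLE_VERSIONS.
AVAILABLE_VERSIONS = "app 1.0.1, app 2.3, app 1.0.1, app 1.10.0"

versions = re.findall(r"[0-9]+\.[0-9]+\.?[0-9]*", AVAILABLE_VERSIONS)
unique_versions = set(versions)

def as_numbers(version):
    # Compare components as integers so that 1.10.0 sorts after 1.9.x;
    # empty parts (from a stray trailing dot) are skipped.
    return tuple(int(part) for part in version.split(".") if part)

ordered = sorted(unique_versions, key=as_numbers)
print(versions)
print(ordered)
```

Because the pattern requires at least one dot, a bare "1" is not matched, and the optional \.? means a trailing dot can end up inside a match.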

edited Aug 16, 2022 at 9:31

0sVoid


answered Aug 12, 2022 at 3:53

Rittick Paul


Braj · Answer · 2014-10-21 07:55:16Z

1

You can also use a positive lookbehind, which does not consume characters in the string but only asserts whether a match is possible at that position. With the regex below you don't need a capture group: the whole match is the version number.

(?<=Version )[\d.]+


Explanation:

  (?<=                     look behind to see if there is:
    Version                  'Version '
  )                        end of look-behind
  [\d.]+                   any character of: digits (0-9), '.' (1 or more times)
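In Python this could look like the sketch below, assuming the sample string from the question. It also shows one limitation worth knowing: Python's re module only accepts fixed-width lookbehinds, so the literal space cannot be swapped for \s* the way it can in the capture-group version.

```python
import re

s = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"

# The lookbehind is not part of the match, so group(0) is the version itself.
match = re.search(r"(?<=Version )[\d.]+", s)
version = match.group(0) if match else None
print(version)

# A variable-width lookbehind such as (?<=Version\s*) is rejected by the
# re module when the pattern is compiled.
try:
    re.compile(r"(?<=Version\s*)[\d.]+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```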

edited Oct 21, 2014 at 7:55

answered Oct 21, 2014 at 7:38

Braj


vks · Answer · 2014-10-21 08:07:11Z

1

(?<=Version\s)\S+

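Try this. Use it with re.findall.

import re

x = """MySQL-vm
  Version 1.0.1

  WARNING:: NEVER EDIT/DELETE THIS SECTION"""

# The lookbehind anchors on "Version" plus one whitespace character;
# \S+ then takes the rest of the non-whitespace token.
print(re.findall(r"(?<=Version\s)\S+", x))

Output: ['1.0.1']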

See demo.

http://regex101.com/r/dK1xR4/12

answered Oct 21, 2014 at 8:07

vks


yourstruly · Answer · 2020-03-17 21:16:20Z

1

https://regex101.com/r/5Us6ow/1

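A repeating pattern to match versions like 1, 1.0, 1.0.1:

import re

def version_parser(v):
    # Digit runs separated by single dots, e.g. 1, 1.0, 1.0.1.
    # The pattern as first posted nested groups starting with "=?", which
    # matches an optional literal "="; a non-capturing group (?:...) was
    # presumably intended.
    version_pattern = r'\d+(?:\.\d+)*'
    match = re.search(version_pattern, v)
    # Return None instead of raising when the string has no version.
    return match.group(0) if match else None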
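To see what "=?" does in a pattern of this shape, the sketch below compares the nested form with a flat non-capturing form. The flat pattern is an assumption about the intent, not something the answer states, and the variable names are illustrative.

```python
import re

nested = re.compile(r'\d+(=?\.(\d+(=?\.(\d+)*)*)*)*')  # nested form using "=?"
flat = re.compile(r'\d+(?:\.\d+)*')                     # assumed intent

results = {}
for text in ("1", "1.0", "1.0.1", "1=.0", "1.0.1."):
    results[text] = (nested.search(text).group(0), flat.search(text).group(0))
    print(repr(text), results[text])
```

Both agree on 1, 1.0 and 1.0.1. They differ on the last two inputs: the nested form accepts the stray "=" and keeps the trailing dot, while the flat form stops at the last digit.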

answered Mar 17, 2020 at 21:16

yourstruly


Payam · Answer · 2022-01-24 23:56:55Z

0

Old question, but none of the answers cover corner cases such as Version 1.2.3. (ending with a dot) or Version 1.2.3.A (ending with a non-numeric value). Here is my solution:

import re

ver = "Version 1.2.3.9\nWarning blah blah..."
# Requiring a final \d keeps a trailing "." or ".A" out of the match.
print(bool(re.match(r"Version\s*[\d.]+\d", ver)))
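The snippet above only reports whether the string matches. To pull the version out while handling the same corner cases, one option is a stricter capture group, sketched below. The pattern and the name strict_version are assumptions for illustration, not part of the answer.

```python
import re

# Assumed stricter pattern: digit runs joined by single dots, so the capture
# can never end in "." and never includes a non-numeric component.
STRICT_VERSION_RE = re.compile(r"Version\s*(\d+(?:\.\d+)*)")

def strict_version(text):
    """Return the dotted numeric version after 'Version', or None."""
    match = STRICT_VERSION_RE.search(text)
    return match.group(1) if match else None

cases = ["Version 1.2.3.9\nWarning blah blah...", "Version 1.2.3.", "Version 1.2.3.A", "Version 1"]
extracted = [strict_version(text) for text in cases]
print(extracted)
```

This form also accepts a single-component version such as "Version 1", which [\d.]+\d does not, because that pattern needs at least two characters after the whitespace.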

answered Jan 24, 2022 at 23:56

Payam
