2

I have a string like this:

var hours_tdate = ['22','23','<span style="color:#1d953f;">0</span>','<span style="color:#1d953f;">1</span>'];

This is part of a js file. I want to use a regex to extract the numbers from the string above and get output like this:

[22,23,0,1]

I have tried:

re.findall('var hours_tdate = \[(.*)\];', string)

And it gives me:

'22','23','<span style="color:#1d953f;">0</span>','<span style="color:#1d953f;">1</span>'

I don't know why it has no match when I tried:

re.findall('var hours_tdate = \[(\d*)\];', string)
1
  • First things first: it should be \d+ along with word boundaries. regex101.com/r/nS1xG6/1 (commented Apr 18, 2016 at 3:42)

2 Answers

1

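The second pattern finds nothing because \[(\d*)\] requires the brackets to contain only digits, while the actual content also has quotes, commas, and span tags. Use \d+ with word boundaries to extract only the numbers: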

\b\d+\b


Python Code

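import re

# \b\d+\b matches runs of digits that are not attached to letters or underscores
p = re.compile(r'\b\d+\b')
test_str = "var hours_tdate = ['22','23','<span style=\"color:#1d953f;\">0</span>','<span style=\"color:#1d953f;\">1</span>'];"

# Prints ['22', '23', '0', '1']; the digits inside #1d953f are skipped
print(re.findall(p, test_str))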


NOTE: Digits in the variable name won't be matched as long as they are attached to letters or underscores (for example hours_tdate2), because there is no word boundary between them.
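
The question asks for integers ([22,23,0,1]), while findall returns strings. A minimal sketch of the conversion, reusing the same pattern and test string (the name hours is only illustrative):

```python
import re

test_str = "var hours_tdate = ['22','23','<span style=\"color:#1d953f;\">0</span>','<span style=\"color:#1d953f;\">1</span>'];"

# findall returns a list of strings, so convert each match with int()
hours = [int(m) for m in re.findall(r'\b\d+\b', test_str)]
print(hours)  # [22, 23, 0, 1]
```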


0

Here is another approach:

['>](\d+)['<]
# one of ' or >
# followed by digits
# followed by one of ' or <

In Python Code:

import re
rx = r"['>](\d+)['<]"
matches = [match.group(1) for match in re.finditer(rx, string)]

Or use lookarounds to match only the digits, so no capture group is needed:

(?<=[>'])\d+(?=[<'])

Again, in Python Code:

import re
rx = r"(?<=[>'])\d+(?=[<'])"
matches = re.findall(rx, string)
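
If the js file contains other quoted numbers, the patterns above would pick those up as well. One possible way around that, sketched here under the assumption that the array sits on a single line as in the question, is to capture the array body first and then extract the digits from it. The js_source text and its first line are invented for illustration.

```python
import re

# Hypothetical stand-in for the js file; the "other" variable is made up
js_source = """var other = ['99'];
var hours_tdate = ['22','23','<span style="color:#1d953f;">0</span>','<span style="color:#1d953f;">1</span>'];
"""

# Step 1: capture everything between the brackets of hours_tdate
body = re.search(r"var hours_tdate = \[(.*?)\];", js_source)

# Step 2: pull out digit runs that sit between quotes or tag brackets
hours = []
if body:
    hours = [int(n) for n in re.findall(r"(?<=[>'])\d+(?=[<'])", body.group(1))]

print(hours)  # [22, 23, 0, 1]
```

Restricting the second search to the captured group keeps numbers from other variables out of the result.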
