
I am trying to split strings into parts using Python 3 regular expressions (the re module).

My input:

{'factorial.2.0.0.zip', 'Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip', 'Newtonsoft.Json.9.0.1.zip'}

I want to get only the name and only the version of each package, like this:

  • factorial.2.0.0.zip
    • factorial
    • 2.0.0
  • Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip
    • Microsoft ASP.NET Web API 2.2 Client Libraries
    • 5.2.3

and so on. This is my code:

if diff is not None:
    for values in diff.values():
        for value in values:
            temp = ''
            temp1 = ''
            temp = re.findall('[aA-zZ]+[0-9]*', value) #name pack
            temp1 = re.findall('\d+', value) #version
            print(temp)
            print(temp1)

My wrong output:

 temp:
 ['Microsoft', 'ASP', 'NET', 'Web', 'API', 'Client', 'Libraries', 'zip']
 ['Newtonsoft', 'Json', 'zip']
 ['factorial', 'zip']

temp1:
['2', '0', '0']
['2', '2', '5', '2', '3']
['9', '0', '1']

Expected output:

temp:
['Microsoft', 'ASP', 'NET', 'Web', 'API', 'Client', 'Libraries']
['Newtonsoft', 'Json']
['factorial']

temp1:
['2', '0', '0']
['5', '2', '3']
['9', '0', '1']

How can I fix this so that "zip" and the extra numbers (the 2.2 inside the name) are excluded from the results? Maybe there is another way to solve my problem.

  • I'd strongly recommend getting rid of meaningless identifiers such as temp, whatever else you change. Commented Dec 25, 2016 at 16:57

1 Answer

Something like this?

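import re

a = {'factorial.2.0.0.zip', 'Newtonsoft.Json.9.0.1.zip',
     'Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip',
     'namepack010.0.0.153.212583'}

for b in a:
    # Groups: name, then one separator (dot or space), then a three-part
    # version, then either ".zip" or a fourth version component.
    c = re.findall(r'(.*?)[. ](\d+\.\d+\.\d+)(\.zip|\.\d+)$', b)[0]
    if c[2] == '.zip':
        print(c[0], '||', c[1])
    else:
        # No ".zip" suffix: the last group is part of the version.
        print(c[0], '||', c[1] + c[2])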

Output:

Newtonsoft.Json || 9.0.1
namepack010 || 0.0.153.212583
Microsoft ASP.NET Web API 2.2 Client Libraries || 5.2.3
factorial || 2.0.0
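
The same idea can be wrapped in a small function. This is only a sketch: the names PACKAGE_RE and split_package are hypothetical, and the pattern assumes that the version is a run of dot-separated numbers at the end of the string, optionally followed by ".zip", and that a single dot or space separates it from the name.

```python
import re

# Assumed layout: <name><dot or space><n.n[.n...]>[.zip]
PACKAGE_RE = re.compile(
    r'^(?P<name>.+?)[. ](?P<version>\d+(?:\.\d+)+)(?:\.zip)?$'
)

def split_package(filename):
    """Return (name, version), or None if the string does not fit the assumed layout."""
    match = PACKAGE_RE.match(filename)
    if match is None:
        return None
    return match.group('name'), match.group('version')

name, version = split_package(
    'Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip')
print(name)     # Microsoft ASP.NET Web API 2.2 Client Libraries
print(version)  # 5.2.3

# The word and number lists from the question can be derived afterwards.
print(re.findall(r'[a-zA-Z]+', name))  # ['Microsoft', 'ASP', 'NET', 'Web', 'API', 'Client', 'Libraries']
print(version.split('.'))              # ['5', '2', '3']
```

Returning None for a string that does not match avoids the IndexError that indexing an empty findall result with [0] would raise.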

Don't use [aA-zZ] to match all letters. The range A-z also covers the ASCII characters that sit between Z and a ([, \, ], ^, _ and `), so it matches some special characters as well. Use [a-zA-Z] instead.

See this question for more detail: Why is this regex allowing a caret?
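
A quick way to see the difference between the two character classes is a minimal check like the one below. The sample string 'a^b_c' is made up for illustration.

```python
import re

# ASCII characters located between 'Z' (code 90) and 'a' (code 97)
between = ''.join(chr(code) for code in range(ord('Z') + 1, ord('a')))
print(between)  # [\]^_`

sample = 'a^b_c'
print(re.findall(r'[aA-zZ]', sample))   # ['a', '^', 'b', '_', 'c']
print(re.findall(r'[a-zA-Z]', sample))  # ['a', 'b', 'c']
```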


3 Comments

Thanks, that really helped, but I found a package name that this regex does not handle. It looks like this: namepack010.0.0.153.212583. Your regex returns ('namepack010.0.0.153.', '12583'). Could you help me again? The correct result for this package is ('namepack010', '0.0.153.212583').
My solution: print(re.findall('(.*?\d*\s*)\.*(\d*[^a-zA-Z]*).zip', b)[0])
@teror4uks Modified. Check now.
