
I have a string like below.

s = '({[test1, test2 ; test3 (New) ]})'

I have a regex that removes the brackets and converts the string into a list, even when the items are separated by a mix of delimiters such as a;b,c. The regex:

output = [i for i in re.split(r'\s*[(){}<>\[\],;\'"]\s*', s) if i]

But this regex also removes brackets inside the list items ((New) in my case).

How can I apply this regex only to the beginning and end of the string? I know it can be done using ^, but I'm not sure how.

Expected Output

['test1', 'test2', 'test3 (New)']

Output coming from above regex

['test1', 'test2', 'test3', 'New']

Any help?

  • What is your expected output? Commented May 22, 2018 at 9:00
  • Updated my question Commented May 22, 2018 at 9:02
  • Is adding the brackets to the list element an option? Is the (New) element always the last in the list? Commented May 22, 2018 at 9:10
  • No, I mean something like (New) may appear in any element of my string. I want to remove brackets only from the start and end of the string. Commented May 22, 2018 at 9:12
  • @JayeshDhandha, what should be the result for this string -[({[test1, test2(3) ; test3 (New) ]})]-? Commented May 22, 2018 at 9:32

2 Answers

import re
s = '({[test1, test2 ; test3 (New) ]})'

Based on your comment below I assume that the number of opening brackets of the whole string is equal to the number of closing brackets.

To remove the outer brackets, first count the opening brackets at the start of the string:

m = re.match(r'[({\[]*', s)
n_brckt = m.end()

Then remove the outer brackets, but only if any were found (with a count of zero, s[0:-0] would yield an empty string):

if n_brckt > 0:
    s = s[n_brckt:-n_brckt]
s = s.strip()

In: s
Out: 'test1, test2 ; test3 (New)'

Then you can split at all occurrences of commas or semicolons, optionally surrounded by spaces:

In: re.split(r' *[,;]+ *', s)
Out: ['test1', 'test2', 'test3 (New)']
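
The steps above can be combined into one function. This is only a sketch under the same assumption as the answer (as many closing brackets at the end as opening brackets at the start, with no spaces between the closing brackets); the name split_items and the inclusion of < and > from the question's regex are illustrative choices, not part of the original answer.

```python
import re

def split_items(s):
    """Strip an equal number of outer brackets, then split on , or ;."""
    s = s.strip()
    # Count the run of opening brackets at the start (may be zero).
    n = re.match(r'[(\[{<]*', s).end()
    # Slice only when brackets were found and the last n characters
    # really are closing brackets; s[0:-0] would return ''.
    if n and re.fullmatch(r'[)\]}>]+', s[-n:]):
        s = s[n:-n]
    # Split on commas/semicolons with optional surrounding spaces
    # and drop empty pieces.
    return [part for part in re.split(r'\s*[,;]+\s*', s.strip()) if part]

print(split_items('({[test1, test2 ; test3 (New) ]})'))
print(split_items('test1, test2'))
```

A string without any outer brackets is passed through to the split unchanged, so it no longer produces [''].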

6 Comments

I can't use strip, as there may be a different number of brackets at the start and end of the string. I can't hard-code a count of 3.
So what do the beginning and the end of the strings look like? This should be made clear, otherwise it's just a guessing game...
It's not fixed. Sometimes it has [{( and sometimes only [{, and the same goes for the end of the string.
You have missed one thing: if my string doesn't contain any brackets, your output will be [''], which is not right.
Are you telling me that the string might have no outer brackets at all...?

Using re.search

import re
s = "({[test1, test2 ; test3 (New) ]})"
# Lazily capture the text between the first [ and the next ]
m = re.search(r"\[(.*?)\]", s)
if m:
    # Normalize ; to , then split and trim each item
    print([i.strip() for i in m.group(1).replace(";", ",").split(",")])

Output:

['test1', 'test2', 'test3 (New)']
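
For inputs with repeated outer brackets, such as (({{[[test1, test2 ; test3 (France)]]}})), the lazy pattern above stops at the first ] and leaves a stray [ in the first item. One possible alternative, sketched here as an assumption rather than as part of this answer, is to peel off matching bracket pairs from both ends until none remain. The names PAIRS, peel_outer, and to_list are hypothetical.

```python
import re

# Opening bracket -> matching closing bracket.
PAIRS = {'(': ')', '[': ']', '{': '}', '<': '>'}

def peel_outer(s):
    """Remove matching bracket pairs from both ends, one pair at a time."""
    s = s.strip()
    # Continue while the first character is an opener whose closer
    # is the last character.
    while len(s) >= 2 and PAIRS.get(s[0]) == s[-1]:
        s = s[1:-1].strip()
    return s

def to_list(s):
    """Peel the outer brackets, then split on , or ; and drop empty items."""
    return [p for p in re.split(r'\s*[,;]\s*', peel_outer(s)) if p]

print(to_list('(({{[[test1, test2 ; test3 (France)]]}}))'))
print(to_list('({[test1, test2 ; test3 (New) ]})'))
```

This only looks at the first and last characters, so it assumes the outer brackets wrap the whole string. An input such as (a), (b) would be peeled incorrectly, and a real parser is the safer choice for arbitrary nesting.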

4 Comments

Good Answer. Thanks!
Check output for (({{[[test1, test2 ; test3 (France)]]}}))
For deeply nested structures, a regex will not work reliably and can cause unwanted issues. It is best to find a parser that suits your needs. Ex: stackoverflow.com/questions/5454322/…
Please note that if m: is not a sufficient test for success when the pattern can match the empty string (such as [({[]* with re.match), because the match object is then always truthy. You could use if m.end():, but writing if m.end() > 0: or something similar is clearer to read and understand immediately.
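
A short check of when truth-testing a match object is informative may help here. This is a sketch with illustrative variable names (text, empty, missing): a pattern that can match the empty string always yields a match object with re.match, while a pattern that requires at least [ and ] makes re.search return None when nothing is found.

```python
import re

text = 'no brackets here'

# [({\[]* can match zero characters, so re.match always succeeds.
empty = re.match(r'[({\[]*', text)
print(empty is not None)  # a match object exists even though nothing was consumed
print(empty.end())        # length of the matched prefix

# \[(.*?)\] needs a literal [ and ], so re.search fails on this text.
missing = re.search(r'\[(.*?)\]', text)
print(missing)
```

In the first case only the match length says whether any bracket was found; in the second, if m: is a reliable test.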
