Adding different items to the list from the string

Question

Let's say, there is a string that can be:

"3d6"
"t3d6"
"3h4d6"
"t3h4d6"
"6 t3d6"
"6 t3h4d6+6" and similar possible variations.

Is there a way to run through the string and add the items from it to the list based on them being letters/numbers/symbols/other stuff? For example, "6 t3h10d6+6" would result in [6, " ", "t", 3, "h", 10, "d", 6, "+", 6] as the list.

I'm working on a Telegram bot that uses Python to answer users' inputs with dice calculation results, and the hardest part is processing the user input. Right now the input is processed by a clumsy tangle of if statements. There might be a better way to process the user input, and I would be glad to hear your advice!

Not a duplicate of this question. That question is about breaking a string into a list of characters, basically turning a string into a list of strings. My question is about breaking a string into a list of different items that can be both strings and integers.

Possible duplicate of Break string into list of characters in Python
– samiles, Commented Sep 4, 2017 at 14:10

coder · Accepted Answer · 2017-09-04 18:20:48Z

2

You could also use a regex for that (by telling it to find all runs of digits, runs of whitespace, and runs of other non-digit characters):

>>> x
'6 t3h4df6d+!643'
>>> __import__("re").findall(r'\d+|\s+|\D+', x)
['6', ' ', 't', '3', 'h', '4', 'df', '6', 'd+!', '643']

As you can see the above expression doesn't separate d from +!. If that is a problem you can slightly modify the above regular expression to:

>>> x = '6 t3h4df6d+!643'
>>> import re
>>> re.findall(r'\d+|\s+|[a-zA-Z]+|\D+', x)
['6', ' ', 't', '3', 'h', '4', 'df', '6', 'd', '+!', '643']

which separates them completely!

Update

If you want to split runs of letters into single characters (e.g. "xy" as ['x','y']) you can change the above regular expression to:

>>> x = '6 t3h4df6d+!643'
>>> __import__("re").findall(r'\d+|\s+|\w|\D+', x)
['6', ' ', 't', '3', 'h', '4', 'd', 'f', '6', 'd', '+!', '643']
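
As a comment below points out, `findall` returns every item as a string. A minimal sketch of converting the numeric tokens to integers, reusing the last pattern above; the function name `tokenize` is made up for illustration, and `str.isdecimal` is used on the assumption that any token `int()` should convert consists only of decimal digits:

```python
import re

def tokenize(text):
    # Same pattern as in the update: digit runs, whitespace runs,
    # single word characters, then runs of anything else.
    tokens = re.findall(r'\d+|\s+|\w|\D+', text)
    # Turn purely numeric tokens into ints; leave everything else as a string.
    return [int(tok) if tok.isdecimal() else tok for tok in tokens]

print(tokenize("6 t3h10d6+6"))
# [6, ' ', 't', 3, 'h', 10, 'd', 6, '+', 6]
```

Note that consecutive symbols such as "+!" still come back as one item with this pattern.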

edited Sep 4, 2017 at 18:20

answered Sep 4, 2017 at 14:12

coder


7 Comments

Andy K Over a year ago

very very pythonic

Andy K Over a year ago

And you taught me something new. I was not even aware of __import__("re").findall(regex, variable). Wow!

Antony Petrushko Over a year ago

Wow, thanks! You seem to be the only person to notice that "10" should result in [10] and not in [1, 0].

Nikhil Rupanawar Over a year ago

__import__("re").findall(regex,variable) I like that !

Antony Petrushko Over a year ago

re.findall('\d+|\s+|\D+', "6 t3h10d6+1") returns ['6', ' ', 't', '3', 'h', '10', 'd', '6', '+', '1']. The items inside of the list are all of the string type but, though not mentioned in the question, it is not a problem at all.


Nikhil Rupanawar · Accepted Answer · 2017-09-04 14:55:48Z

1

If you are comfortable using list comprehensions:

  [ int(c) if c.isdigit() else c for c in "6 t3h4d6+6" ]

Output

[6, ' ', 't', 3, 'h', 4, 'd', 6, '+', 6]
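
One caveat worth illustrating: the comprehension looks at one character at a time, so a multi-digit number is split into its digits. A small sketch of that behaviour (the helper name `per_char` is only for illustration):

```python
def per_char(text):
    # Each character is handled on its own, so "10" becomes 1 and 0.
    return [int(c) if c.isdigit() else c for c in text]

print(per_char("6 t3h10d6+6"))
# [6, ' ', 't', 3, 'h', 1, 0, 'd', 6, '+', 6]
```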

edited Sep 4, 2017 at 14:55

answered Sep 4, 2017 at 14:17

Nikhil Rupanawar


1 Comment

Eric Duminil Over a year ago

Note that this wouldn't extract 10 from "6 t3h10d6+6"

Eric Duminil · Accepted Answer · 2017-09-04 20:50:53Z

1

You could use groupby, isdigit and a list comprehension to achieve the desired result:

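from itertools import groupby

text = "6 t3h10d6+6"
# Group consecutive characters sharing the same (is digit, is whitespace) key.
substrings = groupby(text, lambda c: (c.isdigit(), c.isspace()))
# Digit runs become ints; every other run stays a string.
print([int(''.join(chars)) if is_digit else ''.join(chars)
       for (is_digit, _), chars in substrings])
# => [6, ' ', 't', 3, 'h', 10, 'd', 6, '+', 6]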

Note that '10' is parsed as 10, not '10' or [1, 0].

edited Sep 4, 2017 at 20:50

answered Sep 4, 2017 at 15:02

Eric Duminil


3 Comments

Antony Petrushko Over a year ago

And I also can note that " t" is parsed as " t" and not as [" ", "t"]. Close enough!

Eric Duminil Over a year ago

@AntonyPetrushko: Indeed. How should "ab" be parsed?

Antony Petrushko Over a year ago

Good observation! Sorry for not giving a better example in the question. "ab" should be parsed as ["a", "b"] and " b" as [" ", "b"]. Basically it should turn a string into a list of numbers (NOT digits) and single characters.
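
For the behaviour described in the last comment (numbers kept whole, everything else as single characters), one possible groupby-based sketch follows. It is an adaptation, not the answer's original code; the function name `split_numbers_and_chars` is hypothetical, and grouping on `str.isdecimal` alone is an assumption about what should count as a number:

```python
from itertools import groupby

def split_numbers_and_chars(text):
    result = []
    # Consecutive decimal digits form one group; consecutive
    # non-digits form another.
    for is_number, run in groupby(text, key=str.isdecimal):
        if is_number:
            result.append(int(''.join(run)))  # whole number, e.g. 10
        else:
            result.extend(run)  # one list item per character
    return result

print(split_numbers_and_chars("6 t3h10d6+6"))
# [6, ' ', 't', 3, 'h', 10, 'd', 6, '+', 6]
print(split_numbers_and_chars("ab"))
# ['a', 'b']
```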
