Suppose there is a string that can look like any of these:

  • "3d6"
  • "t3d6"
  • "3h4d6"
  • "t3h4d6"
  • "6 t3d6"
  • "6 t3h4d6+6"

and similar variations.

Is there a way to walk through the string and add its items to a list according to whether they are letters, numbers, symbols, or something else? For example, "6 t3h10d6+6" would result in [6, " ", "t", 3, "h", 10, "d", 6, "+", 6] as the list.

I'm working on a Telegram bot that uses Python to answer users' inputs with dice calculation results, and the hardest part is processing the user input. Right now the input is processed by a clumsy tangle of if statements. There might be a better way to process the user input, and I would be glad to hear your advice!

This is not a duplicate of this question. That question is about breaking a string into a list of characters, basically turning a string into a list of strings. My question is about breaking a string into a list of different items that can be both strings and integers.

3 Answers

You could also use a regular expression for that, matching runs of digits, runs of whitespace, and runs of any other non-digit characters:

>>> x
'6 t3h4df6d+!643'
>>> __import__("re").findall(r'\d+|\s+|\D+', x)
['6', ' ', 't', '3', 'h', '4', 'df', '6', 'd+!', '643']

As you can see, the above expression doesn't separate d from +!. If that is a problem, you can slightly modify the regular expression to:

>>> import re
>>> x = '6 t3h4df6d+!643'
>>> re.findall(r'\d+|\s+|[a-zA-Z]+|\D+', x)
['6', ' ', 't', '3', 'h', '4', 'df', '6', 'd', '+!', '643']

which separates them completely!
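
One caveat that may be worth checking: `\D+` is greedy and also matches letters, so a symbol followed directly by a letter (for example "+d") still comes back as a single token. Below is a minimal sketch of a stricter pattern, assuming only ASCII letters matter; the name `TOKEN_RE` is made up for illustration.

```python
import re

# Hypothetical pattern name. The last alternative excludes digits, ASCII
# letters and whitespace, so it can no longer swallow a following letter.
TOKEN_RE = re.compile(r'\d+|\s+|[a-zA-Z]+|[^\da-zA-Z\s]+')

# The pattern from the answer glues "+" and "d" together ...
loose = re.findall(r'\d+|\s+|[a-zA-Z]+|\D+', '6+d6')
# ... while the stricter pattern keeps them apart.
strict = TOKEN_RE.findall('6+d6')

print(loose)   # ['6', '+d', '6']
print(strict)  # ['6', '+', 'd', '6']
```

For the sample string '6 t3h4df6d+!643' both patterns give the same list.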

Update

If you want letters to be split into single characters (e.g. "xy" as ['x', 'y']), you can change the regular expression to:

>>> x = '6 t3h4df6d+!643'
>>> __import__("re").findall(r'\d+|\s+|\w|\D+', x)
['6', ' ', 't', '3', 'h', '4', 'd', 'f', '6', 'd', '+!', '643']
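
To get the mixed list from the question (integers for numbers, single-character strings for everything else), the matches can be post-processed. Here is a minimal sketch; the function name `tokenize` is hypothetical, and it assumes every non-digit character should become its own item.

```python
import re

def tokenize(text):
    """Split text into ints (runs of digits) and single non-digit characters."""
    # \d+ grabs a whole number such as "10"; \D grabs exactly one other character.
    parts = re.findall(r'\d+|\D', text)
    # str.isdecimal() agrees with what \d matches, so int() is safe here.
    return [int(p) if p.isdecimal() else p for p in parts]

print(tokenize("6 t3h10d6+6"))
# [6, ' ', 't', 3, 'h', 10, 'd', 6, '+', 6]
```

Keeping the conversion in a separate step leaves the regular expression simple, and the type of each item (`int` or `str`) can later be used to tell numbers from operators.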

7 Comments

very very pythonic
and you taught me a new stuff. I was not even aware of that... omg... the __import__("re").findall(regex,variable) OMG!!! wowwow
Wow, thanks! You seem to be the only person to notice that "10" should result in [10] and not in [1, 0].
__import__("re").findall(regex,variable) I like that !
re.findall('\d+|\s+|\D+', "6 t3h10d6+1") returns ['6', ' ', 't', '3', 'h', '10', 'd', '6', '+', '1']. The items inside of the list are all of the string type but, though not mentioned in the question, it is not a problem at all.

If you are comfortable using list comprehensions:

  [ int(c) if c.isdigit() else c for c in "6 t3h4d6+6" ]

Output

[6, ' ', 't', 3, 'h', 4, 'd', 6, '+', 6]
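
The comprehension looks at one character at a time, so a multi-digit number such as 10 would come out as 1, 0. Here is a sketch of a plain loop that buffers consecutive digits before converting them; `split_numbers` is a hypothetical helper name, not part of any library.

```python
def split_numbers(text):
    """Return ints for digit runs and single characters for everything else."""
    result = []
    digits = ""  # buffer for the number currently being read
    for ch in text:
        if ch.isdecimal():
            digits += ch
            continue
        if digits:
            # The number has ended; flush it before handling this character.
            result.append(int(digits))
            digits = ""
        result.append(ch)
    if digits:
        # Flush a number that sits at the very end of the string.
        result.append(int(digits))
    return result

print(split_numbers("6 t3h10d6+6"))
# [6, ' ', 't', 3, 'h', 10, 'd', 6, '+', 6]
```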

1 Comment

Note that this wouldn't extract 10 from "6 t3h10d6+6"

You could use groupby, isdigit and a list comprehension to achieve the desired result:

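from itertools import groupby

text = "6 t3h10d6+6"
# Group consecutive characters that share the same (is digit, is space) key.
substrings = groupby(text, lambda c: (c.isdigit(), c.isspace()))
# Digit runs become ints; every other run stays a string.
print([int(''.join(chars)) if is_digit else ''.join(chars)
       for (is_digit, _), chars in substrings])
# => [6, ' ', 't', 3, 'h', 10, 'd', 6, '+', 6]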

Note that '10' is parsed as 10, not '10' or [1, 0].
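
With the key used above, neighbouring characters that are neither digits nor spaces stay glued together (for instance "ab" comes back as 'ab'). If single characters are wanted instead, one possible variation groups only on whether a character is a digit. The name `split_with_groupby` is hypothetical.

```python
from itertools import groupby

def split_with_groupby(text):
    """Digit runs become ints; every other character becomes its own item."""
    result = []
    for is_number, group in groupby(text, key=str.isdecimal):
        if is_number:
            result.append(int(''.join(group)))
        else:
            # extend() adds the characters of the run one by one
            result.extend(group)
    return result

print(split_with_groupby("6 ab10+"))
# [6, ' ', 'a', 'b', 10, '+']
```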

3 Comments

And I also can note that " t" is parsed as " t" and not as [" ", "t"]. Close enough!
@AntonyPetrushko: Indeed. How should "ab" be parsed?
Good observation! Sorry for not giving a better example in the question. "ab" should be parsed as ["a", "b"] and " b" as [" ", "b"]. Basically it should turn a string into a list of numbers (NOT digits) and single characters.
