0

This is my current code:

def poisci_pare(besedilo):
    import re
    seznam = re.split("[.]", besedilo)
    return seznam

This returns the following (we assume sentences always end with a period):

poisci_pare("Hello world. This is great.")
>>>output: ["Hello world", " This is great", ""]

What would I have to write to get Python to split the string like this instead?

poisci_pare("Hello world. This is great.")
>>>output: [["Hello", "world"], ["This", "is", "great"]]
  • I'm actually surprised that worked. A "." typically means any character in a regex; I guess when it's inside square brackets it's treated as a literal. Commented Nov 5, 2014 at 20:10
  • Yeah, I didn't think it would work in the first place, but after some experimenting with re.split I got it to work perfectly. Commented Nov 5, 2014 at 20:15
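
To illustrate the point raised in the comments: inside a character class the dot loses its wildcard meaning, so "[.]" matches only a literal period, the same as an escaped "\.". A minimal sketch, where the sample string is made up purely for illustration:

```python
import re

sample = "a.b,c"  # hypothetical sample input, not from the question

# An unescaped dot outside a character class matches any character,
# so every character acts as a separator and only empty strings remain.
wildcard = re.split(".", sample)

# Inside square brackets the dot is a literal period.
in_class = re.split("[.]", sample)

# An escaped dot is also a literal period.
escaped = re.split(r"\.", sample)

print(wildcard)
print(in_class)
print(escaped)
```

The last two calls split only at the real period, which is why the pattern in the question behaves as intended.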

2 Answers

3
def poisci_pare(text):
    sents = text.split('.')
    answer = [sent.split() for sent in sents if sent]
    return answer

Output:

In [8]: poisci_pare("Hello world. This is great.")
Out[8]: [['Hello', 'world'], ['This', 'is', 'great']]
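
One edge case worth noting: `if sent` only filters out empty strings, so a trailing space after the final period leaves a whitespace-only piece that turns into an empty list. A minimal sketch of a variant that strips each piece before testing it (the name `poisci_pare_strict` is hypothetical, chosen only for this example):

```python
def poisci_pare_strict(text):
    # Hypothetical variant: strip each piece before the truth test so that
    # whitespace-only fragments are dropped along with empty ones.
    return [sent.split() for sent in text.split('.') if sent.strip()]


# Note the trailing space after the last period.
result = poisci_pare_strict("Hello world. This is great. ")
print(result)
```

With the plain `if sent` filter, the same input would end with an extra `[]` entry.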

0

This will also do the trick:

text = "Hello world. This is great."  # avoid shadowing the built-in input()
print([s.split() for s in text.split('.') if s.split()])
[['Hello', 'world'], ['This', 'is', 'great']]
