
I am trying to replace the words in each sentence with a placeholder like xxx. The words "hello" and "morning" should be printed as-is because they appear in another list (mylist). The following code runs, but it prints extra placeholders.

import re

mylist = ['hello', 'morning']
nm = [
    "Hello World Robot Morning.",
    "Hello Aniket Fine Morning.",
    "Hello Paresh Good and bad Morning.",
]



def punctuations(string):
    pattern = re.compile(r"(?u)\b\w\w+\b")
    result = pattern.match(string)
    myword = result.group()
    return myword


for x in nm:
    newlist = list()
    for y in x.split():
        for z in mylist:
            if z.lower() == punctuations(y.lower()):
                newlist.append(y)
            else:
                newlist.append("xxx")
    print(newlist)

Output:

['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']

Expected output:

['Hello', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
  • Can you explain your code? Is it an arbitrary exercise, or does it have a specific purpose? Commented Dec 1, 2019 at 10:11

2 Answers


You're reaching for Python's built-in string functions and regular expressions when your problem is better solved with formal parsing using a Parsing Expression Grammar (PEG):

For example:

import pyparsing as pp

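# Keyword matches the whole word; Word("hello") would instead match any run of the characters h, e, l, o.
expr = pp.OneOrMore(
    pp.Keyword("hello")
    | pp.Keyword("world")
    | pp.Word(pp.alphas).setParseAction(pp.replaceWith("XXX"))
)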

expr.parseString("hello foo bar world")

Yields:

(['hello', 'XXX', 'XXX', 'world'], {})

See the pyparsing module and its documentation.


1 Comment

Nice, you captured the essential behaviour of the OP's code and found an effective library (and concept, PEG) to solve it elegantly👍

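Your inner loop appends a placeholder for every entry of mylist that does not match, so each word produces one element per list entry. Instead, break as soon as the word is found, and append the placeholder only after every element of mylist has been checked without a match. A for ... else clause does exactly that: the else branch runs only when the loop finishes without hitting break.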

for x in nm:
    newlist = list()
    for y in x.split():
        for z in mylist:
            if z.lower() == punctuations(y.lower()):
                newlist.append(y)
                break
        else:
            newlist.append('xxx')
    print(newlist)
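
As an alternative sketch, the inner loop can be dropped entirely by turning mylist into a set and testing membership once per word. The helper name mask, the compiled pattern WORD, and the simplified pattern \w+ are assumptions made for illustration and are not part of the original code. Unlike punctuations, this version treats a token with no word characters as a non-match instead of raising an error.

```python
import re

# Hypothetical helper pattern: the first run of word characters in a token.
WORD = re.compile(r"\w+")


def mask(sentence, keep, placeholder="xxx"):
    """Return the tokens of `sentence`, replacing every token not in `keep`."""
    keep = {w.lower() for w in keep}  # set gives one lookup per token
    result = []
    for token in sentence.split():
        m = WORD.search(token.lower())
        core = m.group() if m else ""  # no word characters: never matches
        result.append(token if core in keep else placeholder)
    return result


mylist = ["hello", "morning"]
nm = [
    "Hello World Robot Morning.",
    "Hello Aniket Fine Morning.",
    "Hello Paresh Good and bad Morning.",
]

for x in nm:
    print(mask(x, mylist))
```

Each token is appended exactly once, so the three sentences from the question produce the expected lists, and the cost per word no longer grows with the length of mylist.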
