I've been trying to clean up some text. I got stuck on the regex and eventually settled on re.sub, but now I get a syntax error. Original code:

# Test for name cleanup

import re

input = u'CHEZ MADU 東久留米店(シェマディ)【東京都東久留米市】'

pattern = re.compile(ur'(【(.*?)\】)', re.UNICODE)\

print(re.sub(input, pattern, ''))

This gave me the following error:

  File "retest01.py", line 6
    pattern = re.compile(ur'(【(.*?)\】)', re.UNICODE)\
                                      ^
SyntaxError: invalid syntax

I also tested the code from another regex thread: "python regular expression with utf8 issue".

It gave the same error. What could be the source of the problem here?

  • What version of Python are you using? If you are on version 3, try removing the u prefix from the strings, since all strings are Unicode. Commented May 16, 2017 at 14:22
  • Python 3. Following the answer, pattern = re.compile(u'(【(.*?)\】)') works for me. Commented May 16, 2017 at 15:04
  • In Python 3, you do not need the u prefix. Commented May 16, 2017 at 15:55

2 Answers

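If you drop the raw-string part of the prefix (use u'...' instead of ur'...'), it works fine for me. Additionally, the arguments to re.sub are in the wrong order in your code. The pattern comes first, then the replacement, then the string to search: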

re.sub(pattern, repl, string, count=0, flags=0)

This didn't throw an error for me:

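import re
# Sample string from the question; on Python 3 the u prefix is optional.
input = u'CHEZ MADU 東久留米店(シェマディ)【東京都東久留米市】'
# Plain u'' literal instead of ur''; 】 is not a metacharacter, so no backslash is needed.
pattern = re.compile(u'(【(.*?)】)', re.UNICODE)
# Argument order: pattern, replacement, then the string to search.
print(re.sub(pattern, '', input))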

This works on Python 2 and 3, but you don't need the u prefix on Python 3.
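
As a minimal sketch of the argument order, assuming Python 3 (so no u prefix), the same cleanup can be written with either the module-level re.sub or the compiled pattern's own sub method. The names text, cleaned and cleaned_via_method are only illustrative, and the pattern is simplified by dropping the unused groups:

```python
import re

# Same sample string as in the question.
text = 'CHEZ MADU 東久留米店(シェマディ)【東京都東久留米市】'

# Non-greedy match of everything between the lenticular brackets.
# 】 is an ordinary character in a regex, so it needs no backslash.
pattern = re.compile('【.*?】')

# Module-level form: pattern first, replacement second, subject string last.
cleaned = re.sub(pattern, '', text)

# Method form on the compiled pattern: replacement first, then the string.
cleaned_via_method = pattern.sub('', text)

print(cleaned)
```

Both calls return the input with the trailing 【...】 part removed. Passing the string as the first argument, as in the question, makes re.sub treat it as the pattern instead.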


1 Comment

Thank you. Looks like a beginner's mistake on my part; I will be more careful.

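The ur'...' prefix is not valid in Python 3. The u prefix came back in Python 3.3, but the combined ur prefix was deliberately left out (see http://bugs.python.org/issue15096 ).

Somewhat surprisingly, the syntax error is reported at the end of the string rather than at the prefix: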

>>> ru'my string'
  File "<stdin>", line 1
    ru'my string'
                ^
SyntaxError: invalid syntax

So, in Python 3, you can use either:

  • 'my string' or u'my string', which mean the same thing (the latter was reintroduced in Python 3.3 for compatibility with Python 2 code, see PEP 414)
  • or r'my string with \backslashes' for a "raw" string.
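
The prefix rules above can be checked with a short sketch, assuming Python 3.3 or later. The helper is_valid_literal is a hypothetical name introduced here for illustration; it uses the built-in compile() so that the invalid prefix can be tested without breaking the script itself:

```python
import re

# In Python 3 the u prefix is accepted but changes nothing.
assert u'my string' == 'my string'

# A raw string keeps its backslashes, which is handy for regex escapes.
assert r'\d+' == '\\d+'
assert re.findall(r'\d+', 'a1b22') == ['1', '22']


def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, '<demo>', 'eval')
        return True
    except SyntaxError:
        return False


# Hypothetical helper applied to the three prefixes discussed above.
ur_ok = is_valid_literal("ur'abc'")
r_ok = is_valid_literal("r'abc'")
u_ok = is_valid_literal("u'abc'")
print(ur_ok, r_ok, u_ok)
```

On Python 3.3+ only the ur form should be rejected, so for a Unicode regex pattern the r prefix alone is enough.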
