I have a user-defined string that I want to use in a regex, with one small improvement: an apostrophe in it should match any of three apostrophe characters instead of just one. For example:
import re
APOSTROPHES = re.escape('\'\u2019\u02bc')
word = re.escape("п'ять")
word = ''.join([s if s not in APOSTROPHES else '[%s]' % APOSTROPHES for s in word])
It works well for Latin text, but for Unicode input the list comprehension produces the following string:
"[\\'\\\\u2019\\\\u02bc]\xd0[\\'\\\\u2019\\\\u02bc]\xbf[\\'\\\\u2019\\\\u02bc][\\'\\\\u2019\\\\u02bc][\\'\\\\u2019\\\\u02bc]\xd1[\\'\\\\u2019\\\\u02bc]\x8f[\\'\\\\u2019\\\\u02bc]\xd1[\\'\\\\u2019\\\\u02bc]\x82[\\'\\\\u2019\\\\u02bc]\xd1[\\'\\\\u2019\\\\u02bc]\x8c"
It looks like it finds the backslashes in both strings and then substitutes APOSTROPHES for each of them.
Also, print(list(w for w in APOSTROPHES)) gives ['\\', "'", '\\', '\\', 'u', '2', '0', '1', '9', '\\', '\\', 'u', '0', '2', 'b', 'c'].
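To check the backslash suspicion, here is a small sketch. It assumes the escaped value is exactly the character list printed above, written out as a literal so that it does not depend on how a particular Python version's re.escape behaves. The names escaped_class and raw_apostrophes are made up for illustration.

```python
# The escaped value from the printed list above, written out as a literal:
# \ ' \ \ u 2 0 1 9 \ \ u 0 2 b c
escaped_class = "\\'\\\\u2019\\\\u02bc"

# `x in some_string` is a substring test, so a lone backslash is "found"
# in the escaped string even though it is not one of the apostrophes.
backslash_matches = "\\" in escaped_class

# Testing against a set of the raw, unescaped characters does not have
# that problem: only the three apostrophe characters are members.
raw_apostrophes = {"'", "\u2019", "\u02bc"}
backslash_in_set = "\\" in raw_apostrophes

print(backslash_matches, backslash_in_set)
```

So every backslash that re.escape added to the word passes the membership test and gets replaced by the character class.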
How can I avoid this? I want to get "\п[\'\u2019\u02bc]\я\т\ь"
In other words, how do I replace "'" with ['\\u2019\\u02bc] after applying re.escape?
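One possible way to get that behaviour, as a sketch rather than a definitive fix: keep the apostrophes unescaped, walk the word one character at a time, and call re.escape only on the characters that are not apostrophes. This assumes Python 3 text strings (or unicode literals on Python 2). build_pattern and APOSTROPHE_CLASS are hypothetical names introduced here.

```python
import re

# Raw, unescaped apostrophe characters: ASCII ', U+2019 and U+02BC.
APOSTROPHES = "'\u2019\u02bc"
# None of these characters is special inside [...], so no escaping is needed.
APOSTROPHE_CLASS = "[%s]" % APOSTROPHES


def build_pattern(word):
    """Escape `word` one character at a time, replacing every apostrophe
    variant with a character class that matches all three."""
    return "".join(
        APOSTROPHE_CLASS if ch in APOSTROPHES else re.escape(ch)
        for ch in word
    )


pattern = re.compile(build_pattern("п'ять"))

# Each spelling of the apostrophe should match the same pattern.
for candidate in ("п'ять", "п\u2019ять", "п\u02bcять"):
    print(candidate, bool(pattern.fullmatch(candidate)))
```

Whether re.escape puts a backslash in front of the Cyrillic letters depends on the Python version, so the pattern text may not look exactly like the string above, but it should match the same input.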