I'm using preg_replace to match improperly encoded UTF-8 characters and replace them with the correct characters. I've created an "old" array containing the wrong character sequences, and a corresponding "new" array with the replacements. Here is a snippet of each array:
$old = array(
'/â€/',
'/â€™/',
);
$new = array(
'†',
'’',
);
(Note: If you're curious about why I'm doing this, read more here)
A sample string that may contain the wrong data could be:
The programmerâ€™s becoming very frustrated
Which should become:
The programmer's becoming very frustrated
I'm using this function:
$result = preg_replace($old, $new, $str);
But the subject is actually becoming:
The programmer†™s becoming very frustrated
It's clear that PHP is doing what I call a non-greedy match on the subject (not the correct term to use here, I know). preg_replace executes the replacement for the first pair in the old/new arrays without considering whether a different pattern in the pattern array may be more appropriate. Because the first pattern is a prefix of the second, it consumes part of the sequence and the second pattern can no longer match. If I reverse the order of the replacement pairs, then it works as expected.
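To make the behaviour concrete, here is a minimal sketch of what I believe is happening, assuming preg_replace walks the pattern array in order and feeds the result of each replacement into the next pattern. The helper name apply_in_order is hypothetical and only exists for this illustration; the patterns and sample string are the ones from the snippets above.

```php
<?php
// Hypothetical helper: mimics what preg_replace appears to do with array
// arguments, i.e. apply each pattern in turn to the output of the previous one.
function apply_in_order(array $old, array $new, string $str): string
{
    foreach ($old as $i => $pattern) {
        $str = preg_replace($pattern, $new[$i], $str);
    }
    return $str;
}

$old = array('/â€/', '/â€™/');
$new = array('†', '’');
$str = 'The programmerâ€™s becoming very frustrated';

// Shorter pattern first: it eats the "â€" prefix, so the longer pattern
// has nothing left to match on the second pass.
echo apply_in_order($old, $new, $str), PHP_EOL;

// Longer pattern first (both arrays reversed together so the pairs stay aligned).
echo apply_in_order(array_reverse($old), array_reverse($new), $str), PHP_EOL;
```

With the original order, the first line printed is the broken string from above; with both arrays reversed, the longer sequence is replaced before the shorter one can interfere.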
My question is: Is there an approach that will allow preg_replace to consider all elements of the pattern array before executing a replacement, or is my only option to re-order the arrays?