I'm trying to write a function that pulls the direction out of a street name, if the name contains one.

For instance, for the street Burnhamthorpe Rd West it would pull out the word West.

I have done that with the following regex:

(?:\b(W)\b)|(?:\b(West)\b)

The problem is that when I hit the address Brown's Line and use the regex for the South direction, it pulls out the s from Brown's:

(?:\b(S)\b)|(?:\b(South)\b)
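
To make the failure concrete, here is a minimal sketch of what the pattern above does. The apostrophe is not a word character, so `\b` matches on both sides of the lone "s". The helper name `firstDirectionMatch` is made up for this illustration and is not part of the original code.

```php
<?php
// Hypothetical helper: returns the first match of the question's pattern, or null.
function firstDirectionMatch(string $address): ?string
{
    $pattern = '/(?:\b(S)\b)|(?:\b(South)\b)/i';
    return preg_match($pattern, $address, $m) === 1 ? $m[0] : null;
}

// \b sits between "'" and "s", and again between "s" and the space.
var_dump(firstDirectionMatch("Brown's Line")); // string(1) "s"

// Without the apostrophe there is no boundary before the "s", so nothing matches.
var_dump(firstDirectionMatch("Browns Line"));  // NULL
```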

My PHP looks as follows:

private function breakUpStreetNameParts($address)
{
    $directions = ["N" => "North", "S" => "South", "E" => "East", "W" => "West"];

    $direction = "";
    foreach ($directions as $short_dir => $long_dir) {
        if (preg_match_all("/(?:\b({$short_dir})\b)|(?:\b($long_dir)\b)/i", $address, $parts)) {
            $direction = $parts[0][0];
            $patterns = [
                "/{$parts[0][0]}/",
                "/\s\s/"
            ];
            $replacements = [
                "",
                " "
            ];
            $address = trim(preg_replace($patterns, $replacements, $address));
        }
    }

    return [
        "st_name"       => $address, 
        "st_direction"  => $direction
    ];
}

How do I write the regex so that it works correctly and does not treat the apostrophe as a word boundary?

EDIT: I added my fix as an answer below :)

  • Good luck with this. Consider something like North Road or West Boulevard, named after Mrs. North and Mr. West. Commented Oct 30, 2015 at 16:52
  • I guess you probably want to look for white space separator (\s) rather than a word break (\b)? However, I think you have a bigger issue in that your premise of extracting "North" "South" "East" and "West" is going to result in false positives, even if you just take the word on its own. I know of places named "West End" and "North End", etc, and roads leading to them called "West End Road". You can't take the "West" out of that without destroying its meaning. Commented Oct 30, 2015 at 16:52
  • Since you are looking for just 4 words, why not use strpos? Never mind the issue raised by Marc above. Commented Oct 30, 2015 at 16:53
  • I have thought of that. I already break up the street name and street suffix and do a check for it, so I'll just extend that to the street direction. Commented Oct 30, 2015 at 16:56
  • You could standardize on how a country's Post system scans for addresses. It's pretty arcane and complicated though. Commented Oct 30, 2015 at 17:05
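
The whitespace-separator suggestion in the comments can be sketched with lookarounds, which check for surrounding whitespace without consuming it. This is only an illustration: the helper name `findDirectionToken` is hypothetical, and the last call shows that the false-positive concern about names like "West End Road" still applies.

```php
<?php
// Hypothetical helper: finds a direction that stands alone as a whitespace-delimited token.
function findDirectionToken(string $address): ?string
{
    // (?<!\S) means "not preceded by a non-space character";
    // (?!\S) means "not followed by a non-space character".
    // Long forms come first so "North" is tried before "N".
    $pattern = '/(?<!\S)(?:North|South|East|West|N|S|E|W)(?!\S)/i';
    return preg_match($pattern, $address, $m) === 1 ? $m[0] : null;
}

var_dump(findDirectionToken("Brown's Line"));          // NULL
var_dump(findDirectionToken("Burnhamthorpe Rd West")); // string(4) "West"
var_dump(findDirectionToken("West End Road"));         // string(4) "West", a false positive
```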

1 Answer

I sorted it out. The following works for my test addresses, and it leaves street names that begin with a direction untouched:

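private function breakUpStreetNameParts($address)
{
    $directions = ["N" => "North", "S" => "South", "E" => "East", "W" => "West"];

    $direction = "";
    foreach ($directions as $short_dir => $long_dir) {
        // Require whitespace before the direction and whitespace or end of string after it,
        // so the "s" in "Brown's" no longer matches.
        if (preg_match_all("/(?:(?:\s)({$short_dir})(?:\s|$))|(?:(?:\s)({$long_dir})(?:\s|$))/i", $address, $parts)) {
            // The full match includes the surrounding whitespace; trim it if that matters to the caller.
            $direction = $parts[0][0];
            $patterns = [
                // Escape the matched text before reusing it as a pattern.
                "/" . preg_quote($parts[0][0], "/") . "/",
                "/\s\s/"
            ];
            $replacements = [
                // Replace with a space so a direction in the middle does not glue two words together.
                " ",
                " "
            ];
            $address = trim(preg_replace($patterns, $replacements, $address));
        }
    }

    return [
        "st_name"       => $address,
        "st_direction"  => $direction
    ];
}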


Street name: Brown's Line
Array
(
    [st_name] => Brown's Line
    [st_direction] =>
)

Before breakup: Burnhamthorpe Road West
Array
(
    [st_name] => Burnhamthorpe Road
    [st_direction] =>  West
)

Before breakup: West Ave
Array
(
    [st_name] => West Ave
    [st_direction] =>
)
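
As a possible variation (not the code above), the direction can be accepted only as the last token of the address. That returns it without the leading space and leaves names such as "West End Road" alone. This is a sketch under that assumption: the function name `splitTrailingDirection` is hypothetical, and leading directions such as "N Main St" are deliberately not handled.

```php
<?php
// Hypothetical helper: splits off a direction only when it is the final token.
function splitTrailingDirection(string $address): array
{
    $address = trim($address);
    // Group 1: everything up to the last non-space character before the direction.
    // Group 2: the direction itself, captured without surrounding whitespace.
    $pattern = '/^(.*\S)\s+(North|South|East|West|N|S|E|W)$/i';
    if (preg_match($pattern, $address, $m) === 1) {
        return ["st_name" => $m[1], "st_direction" => $m[2]];
    }
    // No trailing direction: return the address unchanged.
    return ["st_name" => $address, "st_direction" => ""];
}

print_r(splitTrailingDirection("Burnhamthorpe Road West"));
print_r(splitTrailingDirection("West End Road"));
```

The first call yields the street name "Burnhamthorpe Road" with direction "West"; the second returns the address unchanged with an empty direction.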