
I have virtually no experience with regex, but I'm trying my best.

I have a string like this:

$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";

I want to get an array looking like this:

Array[0] = "Name=Sök"
Array[1] = "Value=2"
Array[2] = "Title=Combine me"
Array[3] = "Options=[Item1=1,Item2=2,Item3=3]"

What I have managed to do so far is:

preg_match_all("/[^,]*[\w\d]*=[^,]*/",$fString,$Data);

But I can't figure out how to handle the last "Options" item, which gets split at its inner commas:

Array ( [0] => Array ( [0] => Name=Sök [1] => Value=2 [2] => Title=Combine me [3] => Options=[Item1=1 [4] => Item2=2 [5] => Item3=3] ) )

...and why is the result an array inside an array?!?


[EDIT]

I guess I need to explain the whole idea of what I'm trying to do here, I'm not sure I'm on the right track any more.

I have created some classes where I store all the "persistent" variables in an array. I have a function that serializes this array so it can be stored in a database.

I know all about the serialize() function, but I'm doing some filtering so I can't use it as it is, and I also prefer to have the result more readable for manual editing. This array can have nested arrays within it, and these need to be preserved. When I read it all back from the database, the original array must be recreated.

I had it all working with eval(), but ran into trouble with nested arrays because the " or ' characters were breaking the main outer string. So this approach was an attempt to serialize everything without nested strings that needed to be preserved.

So if I can solve the nested data with preg_match_all I'm there, otherwise I need to come up with another solution.

I guess the data needs to be escaped as well, for characters such as , and [ ].

4
  • The result is always an array. $Data always holds a multidimensional array with several result types. Commented Jan 11, 2011 at 10:27
  • Can there be a comma in the values, such as "Title=Combine,me"? If yes, how is that case handled? Commented Jan 11, 2011 at 10:52
  • You'll need a mini parser then. Look at what @SirDarius did. You only need something to look out for the nearest , or [, and descend on [ or return the collected list on ]. Commented Jan 11, 2011 at 15:48
  • After reading your edit, I can say that this use case calls for JSON serialization through the json_encode function. JSON is human-editable and allows array nesting. Plus, it's damn easy to decode to php variables using json_decode. Commented Jan 11, 2011 at 16:57
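
The JSON suggestion in the last comment could look roughly like the sketch below. The `$settings` array is hypothetical data modelled on the string in the question, and the two flags are optional (they assume PHP 5.4 or later).

```php
<?php
// Hypothetical data modelled on the string in the question.
$settings = array(
    'Name'    => 'Sök',
    'Value'   => 2,
    'Title'   => 'Combine me',
    'Options' => array('Item1' => 1, 'Item2' => 2, 'Item3' => 3),
);

// JSON_PRETTY_PRINT keeps the text easy to edit by hand;
// JSON_UNESCAPED_UNICODE stores "ö" as-is instead of "\u00f6".
$json = json_encode($settings, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);

// Passing true as the second argument returns associative arrays
// instead of stdClass objects, so nested arrays come back as arrays.
$restored = json_decode($json, true);

var_dump($restored === $settings); // bool(true)
```

The input strings must be valid UTF-8 for json_encode() to accept them.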

5 Answers

2

Here is a function that will do basically what you need:

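function explode_me($str) {
    $a = array();     // finished items
    $v = "";          // item currently being collected
    $ignore = false;  // true while inside a [...] block (one level only)
    for ($i = 0; $i < strlen($str); $i++) {
        if ($str[$i] == ',' && !$ignore) {
            // a comma outside brackets ends the current item
            $a[] = $v;
            $v = "";
        }
        else if ($str[$i] == '[' && !$ignore) {
            // entering a bracketed block: stop splitting on commas
            $ignore = true;
            $v .= $str[$i];
        }
        else if ($str[$i] == ']' && $ignore) {
            // leaving the bracketed block: commas split again
            $ignore = false;
            $v .= $str[$i];
        }
        else {
            $v .= $str[$i];
        }
    }
    $a[] = $v;  // the last item has no trailing comma
    return $a;
}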

To test it:

$str = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";
$a = explode_me($str);

print_r($a);

which prints:

Array
(
    [0] => Name=Sök
    [1] => Value=2
    [2] => Title=Combine me
    [3] => Options=[Item1=1,Item2=2,Item3=3]
)
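
The function above tracks a single on/off flag, so it handles one level of brackets. If nested blocks are needed, one possible variation is to count the nesting depth instead. This is only a sketch; `explode_nested` is a made-up name and not part of the original answer.

```php
<?php
// Variant of explode_me() that tracks nesting depth instead of a flag.
function explode_nested($str) {
    $parts = array();
    $current = '';
    $depth = 0;                 // number of currently open "[" brackets
    $length = strlen($str);
    for ($i = 0; $i < $length; $i++) {
        $char = $str[$i];
        if ($char === ',' && $depth === 0) {
            // A top-level comma ends the current item.
            $parts[] = $current;
            $current = '';
            continue;
        }
        if ($char === '[') {
            $depth++;
        } elseif ($char === ']' && $depth > 0) {
            $depth--;
        }
        $current .= $char;
    }
    $parts[] = $current;        // the last item has no trailing comma
    return $parts;
}

print_r(explode_nested("Controls=[Options=[Item1=1,Item2=2],Name=Foo],Value=2"));
// Array
// (
//     [0] => Controls=[Options=[Item1=1,Item2=2],Name=Foo]
//     [1] => Value=2
// )
```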

1 Comment

I used your JSON suggestion, seems to work like a charm so far :D
2
(\w+)=(\[[^\]]+\]|[^,]+)

This breaks down as:

(\w+)        # a word (store in match group 1)
=            # the "=" character
(            # begin match group 2
  \[         #   a "[" character
  [^\]]+     #   one or more characters other than "]"
  \]         #   a "]" character
  |          #   or...
  [^,]+      #   one or more characters other than a comma
)            # end match group 2

Apply with preg_match_all():

$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";

$matches = array();
preg_match_all("/(\\w+)=(\\[[^\\]]+\\]|[^,]+)/", $fString, $matches);

Which results in something even more detailed than you wanted to have:

Array
(
    [0] => Array
        (
            [0] => Name=Sök
            [1] => Value=2
            [2] => Title=Combine me
            [3] => Options=[Item1=1,Item2=2,Item3=3]
        )

    [1] => Array
        (
            [0] => Name
            [1] => Value
            [2] => Title
            [3] => Options
        )

    [2] => Array
        (
            [0] => Sök
            [1] => 2
            [2] => Combine me
            [3] => [Item1=1,Item2=2,Item3=3]
        )

)

$matches[0] is what you wanted. $matches[1] and $matches[2] are the property names and values separately, which lets you use them right away instead of taking an extra step to split things like "Options=[Item1=1,Item2=2,Item3=3]" at the correct =.
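
If you would rather have one sub-array per match than one per capture group, preg_match_all() accepts the PREG_SET_ORDER flag. A minimal sketch, using the same pattern; the variable names `$sets` and `$pairs` are only illustrative:

```php
<?php
$fString = "Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]";

// PREG_SET_ORDER groups the results per match rather than per capture
// group: $set[0] is the whole match, $set[1] the name, $set[2] the value.
preg_match_all('/(\w+)=(\[[^\]]+\]|[^,]+)/', $fString, $sets, PREG_SET_ORDER);

$pairs = array();
foreach ($sets as $set) {
    $pairs[$set[1]] = $set[2];
}

print_r($pairs);
// Array
// (
//     [Name] => Sök
//     [Value] => 2
//     [Title] => Combine me
//     [Options] => [Item1=1,Item2=2,Item3=3]
// )
```

The default, PREG_PATTERN_ORDER, is what produces the array of arrays shown above.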

2 Comments

@Max: And that would be what? :)
Well, why the answer is an array with arrays, very handy :D
1

If you could change the separators between the items (where it says Item1=1,Item2=2,Item3=3 to something like Item1=1|Item2=2|Item3=3) you could easily use explode(',',$fString) to convert a string to an array.

I can also offer this piece of code that changes the separators without regex, as I have no experience with it. Note that it relies on every nested key starting with "Item":

$newstr = str_replace(',Item','|Item',$fString);
$newarray = explode(',',$newstr);

$newarray will look like this:

Array[0] = "Name=Sök"
Array[1] = "Value=2"
Array[2] = "Title=Combine me"
Array[3] = "Options=[Item1=1|Item2=2|Item3=3]"

1 Comment

I had that in mind too before, but I need to be able to iterate the [...] blocks back in again to unwrap them.
1

This is a problem that lends itself more to parsing than to regex extraction. But you can separate out the special case to make it work:

preg_match_all("/(\w+)=( \w[^,]+ | \[[^\]]+\] )/x", $str, $m);
$things = array_combine($m[1], $m[2]);

This gives you a PHP array like the following (you can still access $m[0] for the unparsed strings):

[Name] => Sök
[Title] => Combine me
[Options] => [Item1=1,Item2=2,Item3=3]

You can reapply the function on Options to explode that too.

The trick again is differentiating between values that start with a word character (\w: a letter, digit or underscore) and the options enclosed in \[...\]. For the latter, [^\]]+ matches everything up to the closing bracket, and that's it.
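
Note that Value=2 does not appear in the output above: \w[^,]+ needs at least two characters, so one-character values are skipped. Changing the + to * avoids that. A sketch of the adjusted pattern, wrapped in a hypothetical helper `parse_pairs` so it can be reapplied to the Options value:

```php
<?php
// Hypothetical helper name; the pattern is the one from this answer with
// "+" changed to "*" so that one-character values such as "2" match too.
function parse_pairs($str) {
    preg_match_all('/(\w+)=( \w[^,]* | \[[^\]]+\] )/x', $str, $m);
    return array_combine($m[1], $m[2]);
}

$things = parse_pairs("Name=Sök,Value=2,Title=Combine me,Options=[Item1=1,Item2=2,Item3=3]");

// Strip the surrounding brackets before reapplying the function.
$options = parse_pairs(substr($things['Options'], 1, -1));

print_r($things);   // Name, Value, Title and Options, with Value => 2 included
print_r($options);  // Item1 => 1, Item2 => 2, Item3 => 3
```

This still handles only one level of brackets; nested [...] blocks need a real parser.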

2 Comments

Thanks, this was actually the result I was aiming for. Just one problem: when I iterate over the [...] parameter (removing the [ and ] first), i.e. "Item1=1,Item2=2,Item3=3", it fails because there is no match for the digits. I tried to put in a \d but can't make it work. Further, I need to be able to have nested [...] blocks such as Controls=[Options=[Item1=1,Item2=2],Name=Foo]. Maybe this isn't the right approach?!?
@MaxKielland: You don't need \d. The failure above was the +, change it into a \w[^,]*. But as for nested lists, I'm afraid, regular expressions become difficult to manage. You could try to nest another \\[...\\] within though.
1

So, here is another approach. It's a mini parser for nested structures. Adapt the regex if you need escape codes.

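function parse(&$s) {
    $r = array();   // collected key => value pairs
    $key = "";      // most recently seen key
    // Consume the input up to and including the next delimiter.
    while (strlen($s) && preg_match("/^(.*?)([=,\[\]])/", $s, $m)) {
        $s = substr($s, 1 + strlen($m[1]));
        switch ($m[2]) {
            case "=":
               $key = $m[1];
               break;
            case ",":
               // Skip if a nested list was just stored under this key.
               if (!isset($r[$key])) {
                  $r[$key] = $m[1];
               }
               break;
            case "[":
               $r[$key] = parse($s);   // descend into the nested list
               break;
            case "]":
               // Store the last value of the list before returning.
               if (!isset($r[$key])) {
                  $r[$key] = $m[1];
               }
               return $r;
        }
    }
    if (strlen($s)) { $r[$key] = $s; } // remainder, including a literal "0"
    return $r;
}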
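
For the way back (array to string), something along these lines might serve as a starting point. It is a sketch only: `to_bracket_string` is a made-up name, and it assumes keys and values never contain "=", ",", "[" or "]", since no escaping is done.

```php
<?php
// Hypothetical inverse of the parser: turns a nested array back into the
// bracket format. No escaping is done, so keys and values must not
// contain "=", ",", "[" or "]".
function to_bracket_string(array $data) {
    $parts = array();
    foreach ($data as $key => $value) {
        $parts[] = is_array($value)
            ? $key . '=[' . to_bracket_string($value) . ']'   // nested list
            : $key . '=' . $value;                            // plain value
    }
    return implode(',', $parts);
}

echo to_bracket_string(array(
    'Controls' => array(
        'Options' => array('Item1' => 1, 'Item2' => 2),
        'Name'    => 'Foo',
    ),
    'Value' => 2,
));
// Controls=[Options=[Item1=1,Item2=2],Name=Foo],Value=2
```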

3 Comments

I will have a look at this and compare with JSON proposed by SirDarius. I also read a bit about the Serializable Interface and it looks like I can do something in this direction...
@MaxKielland: It's actually based on a JSON parser I wrote. And I've tested it with nested lists already. I've compacted it a bit too much for this case, so you will need print statements to debug what's going on. But give it a try on one of your complex examples.
I will get back, when I have implemented and tested this. For the moment I'm testing the components to be serialized.
