
Today I was messing around with JavaScript regexes and found this:

//Suppose
var one = 'HELLOxBYE';
var two = 'HELLOBYE';

You could write a regex that tries to capture the 'x' in either of these ways:

/^HELLO(x?)BYE$/ //(A)

//or

/^HELLO(x)?BYE$/ //(B)

I found that when you use (A) on var two, the capture group comes back as an empty string '', while with (B) the capture group comes back as null.

You have to be careful with that.

Does anyone know whether this behavior is consistent across browsers?

I've tested this on Google Chrome 15 (WebKit).

UPDATE: Whoa, just did some tests on Internet Explorer 8, and it returns an empty string '' for both cases. So my conclusion is that the best alternative is to use (A) and then test for an empty string.

  • You are using two different regexes and getting two different results, so how can you call that a cross-browser issue? It would only be one if the same regex gave different results. Commented Jan 31, 2012 at 5:38
  • Can you please show the code you are using to test the matches? (Even better, set it up at jsfiddle.net.) @xdazz - Different browsers do give different results. I initially posted an answer about this being the expected result, but when I tried it in IE I got an empty string instead of null. Commented Jan 31, 2012 at 5:54
  • @nnnnnn Yes, that is what I said: if the same regex gives different results, then it is a cross-browser issue. Commented Jan 31, 2012 at 6:00
  • @nnnnnn, I believe you are right about the expected behavior being empty string in case (A) and null in case (B) (as in Perl). And IE is just messing it up. (; Commented Jan 31, 2012 at 6:51

1 Answer


Technically (A) should return '' on HELLOBYE because the capturing brackets can capture either an 'x' or an empty string, since the ? is inside the capturing group.

Whereas in (B), the capturing brackets can only ever capture the string x. If the x is not present, then the group is never captured at all, because the entire group is optional, as opposed to the regex within the group.

Subtle difference!
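
A minimal sketch of the two cases, using the strings and patterns from the question. `firstGroup` is a hypothetical helper name introduced here for illustration, and the results in the comments assume an engine that follows the ECMAScript spec, where a group that took no part in the match is reported as undefined (the question reports '' in IE 8 instead).

```javascript
// Patterns (A) and (B) from the question.
var reA = /^HELLO(x?)BYE$/; // (A): the ? is inside the capturing group
var reB = /^HELLO(x)?BYE$/; // (B): the ? makes the whole group optional

// Hypothetical helper: capture group 1, or null if the regex does not match at all.
function firstGroup(re, str) {
  var m = re.exec(str);
  return m === null ? null : m[1];
}

var aOne = firstGroup(reA, 'HELLOxBYE'); // 'x'
var aTwo = firstGroup(reA, 'HELLOBYE');  // '' (the group matched an empty string)
var bOne = firstGroup(reB, 'HELLOxBYE'); // 'x'
var bTwo = firstGroup(reB, 'HELLOBYE');  // undefined in a spec-conforming engine

console.log(JSON.stringify([aOne, aTwo, bOne]), typeof bTwo);
```

Note that `undefined == null` is true, so a loose comparison against null will also succeed for (B) in such an engine.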

So a browser or regex engine will always return '' for (A). For (B), the ECMAScript spec says a group that took no part in the match is reported as undefined (which is loosely equal to null), but not every implementation follows it - Chrome distinguishes between "the group matched an empty string" and "the group didn't match at all", whereas IE 8 doesn't make this distinction (or if it does, it coerces the result for the second case into an empty string).

Summary -- use (A) because you know that if there is no x then the capturing group definitely matches ''. Using (B) depends on whether a browser distinguishes between "zero-length match" and "no match at all".
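
One way to apply that advice, sketched under the assumption that the input has the HELLO...BYE shape from the question. `hasX` is a made-up name for illustration, not part of any library.

```javascript
// Hypothetical helper built on pattern (A): the group always holds a string
// when the regex matches, so an empty-string check is enough.
function hasX(str) {
  var m = /^HELLO(x?)BYE$/.exec(str);
  if (m === null) {
    return null; // the string does not match the pattern at all
  }
  return m[1] !== ''; // '' means the optional x was absent
}

console.log(hasX('HELLOxBYE'), hasX('HELLOBYE'), hasX('GOODBYE'));
```

Returning null for a non-matching string keeps "no x" separate from "not a HELLO...BYE string at all"; if that distinction does not matter, returning false in both cases would be simpler.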


1 Comment

Exactly what I thought, so we can conclude that this behavior is not consistent across browsers.
