11

I am writing a set of RegExps to translate a CSS selector into arrays of ids and classes.

For example, I would like '#foo#bar' to return ['foo', 'bar'].

I have been trying to achieve this with

"#foo#bar".match(/((?:#)[a-zA-Z0-9\-_]*)/g)

but it returns ['#foo', '#bar'], even though I expected the non-capturing group (?:#) to leave the # character out of the result.

Is there a better solution than slicing each one of the returned strings?

  • Here’s a one-liner: str.replace(/[^#]+|(#[a-zA-Z0-9\-_]*)/g, '$1').split('#').slice(1) Commented Jun 2, 2012 at 18:50
  • split doesn't work in IE8 Commented Sep 16, 2014 at 0:45
  • @webaba Why would IE8 even be relevant for anything in September 2014 unless it's a specific request? Commented Feb 13, 2015 at 14:26

7 Answers

12

You could use .replace() or .exec() in a loop to build an Array.

With .replace():

var arr = [];
"#foo#bar".replace(/#([a-zA-Z0-9\-_]*)/g, function(s, g1) {
    arr.push(g1);
});

With .exec():

var arr = [],
    s = "#foo#bar",
    re = /#([a-zA-Z0-9\-_]*)/g,
    item;

while (item = re.exec(s))
    arr.push(item[1]);
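
The same .exec() loop can be wrapped in a small helper that covers both halves of the original goal (ids and classes). This is only a minimal sketch: the name collectGroup is made up for illustration, and the character class is the one from the question, not a full CSS identifier grammar.

```javascript
// Hypothetical helper: returns capture group 1 of every match of `re` in `str`.
// `re` must carry the g flag, otherwise exec() never advances and the loop never ends.
function collectGroup(re, str) {
    var out = [], m;
    re.lastIndex = 0; // start from the beginning even if `re` was used before
    while ((m = re.exec(str)) !== null) {
        out.push(m[1]);
    }
    return out;
}

var selector = "#foo.baz#bar";
var ids = collectGroup(/#([a-zA-Z0-9\-_]+)/g, selector);      // ["foo", "bar"]
var classes = collectGroup(/\.([a-zA-Z0-9\-_]+)/g, selector); // ["baz"]
```

The sketch uses + rather than * so that a bare "#" or "." does not produce an empty string in the result.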

5

It matches #foo and #bar because the outer group (group 1) is capturing. The inner group (?:#) is non-capturing, but that is probably not the group you are looking at.

If you were not using global matching mode, an immediate fix would be to use /(?:#)([a-zA-Z0-9\-_]*)/ instead and read the first capture group of the result.

With global matching mode the result cannot be had in just one line, because match then returns only the complete matches and drops the capture groups. Using regular expressions only (i.e. no string operations), you would need to do it this way:

var re = /(?:#)([a-zA-Z0-9\-_]*)/g;
var matches = [], match;
while (match = re.exec("#foo#bar")) {
    matches.push(match[1]);
}
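
To make the difference concrete, here is a small sketch of how match behaves with and without the g flag for the same pattern. The variable names are illustrative only.

```javascript
var reSingle = /#([a-zA-Z0-9\-_]*)/;  // no g flag
var reGlobal = /#([a-zA-Z0-9\-_]*)/g; // g flag

// Without g: one result array holding the whole match followed by its groups.
var single = "#foo#bar".match(reSingle);
// single[0] is "#foo" (whole match), single[1] is "foo" (group 1)

// With g: every whole match, but no capture groups at all.
var all = "#foo#bar".match(reGlobal);
// all is ["#foo", "#bar"]
```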


1 Comment

No need with this to group the hash key at all (and then exclude it).
2

I'm not sure if you can do that using match(), but you can do it by using the RegExp's exec() method:

var pattern = /#([a-zA-Z0-9\-_]+)/g; // regex literal: inside a string passed to new RegExp, '\-' would lose its backslash
var matches, ids = [];

while (matches = pattern.exec('#foo#bar')) {
    ids.push( matches[1] ); // -> 'foo' and then 'bar'
}


1

When this answer was written, JavaScript regular expressions had no lookbehind assertion (one was added later, in ECMAScript 2018). With lookbehind you could do this:

/(?<=#)[a-zA-Z0-9\-_]*/g

Where lookbehind is not available, I think post-processing with split is your best bet.


1

You can use a negative lookahead assertion:

"#foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g);  // ["foo", "bar"]

1 Comment

It does return ['foo', 'bar'], but won't search for the #, so "#foo#bar.foobar".match(/(?!#)[a-zA-Z0-9\-_]+/g); will return ['foo', 'bar', 'foobar']
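
A short sketch of the point made in this comment: because the character class can never match '#', the (?!#) lookahead filters nothing, and the pattern returns every identifier-like run. Requiring the '#' in the match and slicing it off is one way (an assumption, not the answer's own method) to keep only the ids.

```javascript
var input = "#foo#bar.foobar";

// Lookahead version: also picks up the class name after the dot.
var loose = input.match(/(?!#)[a-zA-Z0-9\-_]+/g);
// ["foo", "bar", "foobar"]

// Dropping the lookahead gives exactly the same result.
var withoutLookahead = input.match(/[a-zA-Z0-9\-_]+/g);
// ["foo", "bar", "foobar"]

// Matching the '#' itself and removing it afterwards keeps only the ids.
var idsOnly = (input.match(/#[a-zA-Z0-9\-_]+/g) || []).map(function (s) {
    return s.slice(1);
});
// ["foo", "bar"]
```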
1

The lookbehind assertion mentioned some years ago by mVChr was added in ECMAScript 2018. This allows you to write:

'#foo#bar'.match(/(?<=#)[a-zA-Z0-9\-_]*/g) (returns ["foo", "bar"])

(A negative lookbehind is also possible: (?<!#) asserts that the preceding character is not #. Like every lookaround, it consumes no characters, so nothing it checks ends up in the match.)
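
As a rough sketch of both kinds of lookbehind on a mixed selector (this assumes an engine with ECMAScript 2018 support; the variable names are illustrative):

```javascript
const sel = "#foo.baz#bar";

// Positive lookbehind: names directly preceded by '#' (ids) or '.' (classes).
const idNames = sel.match(/(?<=#)[a-zA-Z0-9\-_]+/g) || [];     // ["foo", "bar"]
const classNames = sel.match(/(?<=\.)[a-zA-Z0-9\-_]+/g) || []; // ["baz"]

// Negative lookbehind: names NOT preceded by '#' or by another name character.
// Excluding name characters stops a match from starting in the middle of "foo".
const notIds = sel.match(/(?<![#a-zA-Z0-9\-_])[a-zA-Z0-9\-_]+/g) || []; // ["baz"]
```

The `|| []` fallback is there because match returns null when nothing matches.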


0

MDN documents that "Capture groups are ignored when using match() with the global /g flag", and recommends using matchAll() instead. matchAll() is missing from older browsers (it was not yet in Edge or iOS Safari when this was written), and you still need to skip past the complete match (which includes the #) to reach the group.
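
For reference, a minimal matchAll() sketch, assuming an environment that supports it. Each item produced by the iterator is a match array, so index 1 is the first capture group.

```javascript
// matchAll() requires the g flag and returns an iterator of match arrays.
const idList = Array.from(
    "#foo#bar".matchAll(/#([a-zA-Z0-9\-_]+)/g),
    m => m[1] // keep only capture group 1
);
// ["foo", "bar"]

// With no matches the iterator is simply empty, so no null check is needed.
const noIds = Array.from(
    "nothing matches".matchAll(/#([a-zA-Z0-9\-_]+)/g),
    m => m[1]
);
// []
```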

A simpler solution is to slice off the leading prefix, if you know its length - here, 1 for #.

const results = ('#foo#bar'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);

The || [] part is necessary in case there is no match: match then returns null, and calling null.map would throw a TypeError.

const results = ('nothing matches'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);

