JavaScript Regular Expression String Match/Replace

Question

Given the string: "{abc}Lorem ipsum{/abc} {a}dolor{/a}"

I want to find occurrences of curly-brace "tags", store each tag together with the index where it was found, and remove it from the original string. I want to repeat this for each occurrence, but because part of the string is removed each time, the indices must be correct for the shortened string; I can't find all the indices first and then remove the tags at the end. For the example above, what should happen is:

1. Search the string...
2. Find "{abc}" at index 0
3. Push { tag: "{abc}", index: 0 } into an array
4. Delete "{abc}" from string
5. Repeat from step 1 until no more matches can be found

Given this logic, "{/abc}" should be found at index 11 - since "{abc}" has already been removed.

I basically need to know where those "tags" start and end without actually having them as part of the string.

I'm almost there using regular expressions but it sometimes skips occurrences.

let BETWEEN_CURLYS = /{.*?}/g;
let text = '{abc}Lorem ipsum{/abc} {a}dolor{/a}';
let match = BETWEEN_CURLYS.exec(text);
let tags = [];

while (match !== null) {
    tags.push(match);
    text = text.replace(match[0], '');
    match = BETWEEN_CURLYS.exec(text);
}

console.log(text); // should be; Lorem ipsum dolor
console.log(tags);

/**
 * almost there...but misses '{a}'
 * [ '{abc}', index: 0, input: '{abc}Lorem ipsum{/abc} {a}dolor{/a}' ]
 * [ '{/abc}', index: 11, input: 'Lorem ipsum{/abc} {a}dolor{/a}' ]
 * [ '{/a}', index: 20, input: 'Lorem ipsum {a}dolor{/a}' ]
 */

Wiktor Stribiżew · Accepted Answer · 2017-11-13 17:23:25Z


You need to subtract the match length from the regex's lastIndex value after each removal. Otherwise the next iteration starts farther along than expected: the input becomes shorter, but lastIndex is not changed when you call replace to remove the {...} substring:

let BETWEEN_CURLYS = /{.*?}/g;
let text = '{abc}Lorem ipsum{/abc} {a}dolor{/a}';
let match = BETWEEN_CURLYS.exec(text);
let tags = [];

while (match !== null) {
    tags.push(match);
    text = text.replace(match[0], '');
    // The string just got shorter by match[0].length, so move lastIndex back by the same amount.
    BETWEEN_CURLYS.lastIndex -= match[0].length; // HERE
    match = BETWEEN_CURLYS.exec(text);
}

console.log(text); // should be; Lorem ipsum dolor
console.log(tags);
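
As a rough alternative sketch (not part of the fix above), the same indices can probably be computed in a single pass by giving String.prototype.replace a callback and keeping a running total of the characters removed so far. The helper name stripTags and the { tag, index } result shape are assumptions made for illustration, following the object shape described in the question.

```javascript
// Hypothetical helper (an illustration, not the original answer's code):
// strips {...} tags in one pass and records where each tag would sit
// in the stripped string.
function stripTags(input) {
    const tags = [];
    let removed = 0; // total length of all tags removed so far

    const text = input.replace(/{.*?}/g, (tag, offset) => {
        // `offset` is the position in the ORIGINAL string; subtracting the
        // length already removed gives the index in the stripped string.
        tags.push({ tag, index: offset - removed });
        removed += tag.length;
        return ''; // delete the tag
    });

    return { text, tags };
}

const result = stripTags('{abc}Lorem ipsum{/abc} {a}dolor{/a}');
console.log(result.text); // Lorem ipsum dolor
console.log(result.tags); // indices 0, 11, 12 and 17
```

This avoids mutating lastIndex by hand, at the cost of no longer having the full exec match objects in the array.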

A relevant note from the RegExp#exec reference to bear in mind:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test() will also advance the lastIndex property).
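
To make the quoted behaviour concrete, here is a small sketch of how lastIndex moves between exec() calls, and how a stale value skips a match once the string has been shortened. The sample string '{a}x{b}' is made up for illustration; the value 17 is the lastIndex left behind after '{/abc}' is matched in the question's loop.

```javascript
const re = /{.*?}/g;
const sample = '{a}x{b}'; // made-up sample string

const first = re.exec(sample);
console.log(first[0], first.index, re.lastIndex);   // {a} 0 3

const second = re.exec(sample);
console.log(second[0], second.index, re.lastIndex); // {b} 4 7

const third = re.exec(sample);
console.log(third, re.lastIndex);                   // null 0 (reset after a failed match)

// A stale lastIndex on a shortened string skips earlier matches:
// the search starts at index 17, which is already past '{a}' at index 12.
re.lastIndex = 17;
const skipped = re.exec('Lorem ipsum {a}dolor{/a}');
console.log(skipped[0], skipped.index);             // {/a} 20
```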

edited Nov 13, 2017 at 17:23

answered Nov 13, 2017 at 17:05

Wiktor Stribiżew


4 Comments

zfrisch Over a year ago

This is a really good answer. I was working on something similar but I didn't realize .lastIndex was a property. +1

Lewis Peel Over a year ago

Ahh, that makes sense! Thanks so much!

Wiktor Stribiżew Over a year ago

@LewisPeel I was trying to come up with the replacing approach I mentioned in your previous post, but I see that it is not quite possible to easily get it to work since the input part will be tricky to handle.

Lewis Peel Over a year ago

@WiktorStribiżew Yeah, I think we should all forget my previous post...too many crossed wires haha
