Given the string: "{abc}Lorem ipsum{/abc} {a}dolor{/a}"

I want to be able to find occurrences of curly-brace "tags", store each tag and the index where it was found, and remove it from the original string. I want to repeat this for each occurrence, but because I'm removing part of the string each time, each index must be correct for the shortened string; I can't find all the indices THEN remove the tags at the end. For the example above, what should happen is:

  • Search the string...
  • Find "{abc}" at index 0
  • Push { tag: "{abc}", index: 0 } into an array
  • Delete "{abc}" from string
  • Repeat step 1 until no more matches can be found

Given this logic, "{/abc}" should be found at index 11 - since "{abc}" has already been removed.

I basically need to know where those "tags" start and end without actually having them as part of the string.

I'm almost there using regular expressions but it sometimes skips occurrences.

let BETWEEN_CURLYS = /{.*?}/g;
let text = '{abc}Lorem ipsum{/abc} {a}dolor{/a}';
let match = BETWEEN_CURLYS.exec(text);
let tags = [];

while (match !== null) {
    tags.push(match);
    text = text.replace(match[0], '');
    match = BETWEEN_CURLYS.exec(text);
}

console.log(text); // should be; Lorem ipsum dolor
console.log(tags);

/**
 * almost there...but misses '{a}'
 * [ '{abc}', index: 0, input: '{abc}Lorem ipsum{/abc} {a}dolor{/a}' ]
 * [ '{/abc}', index: 11, input: 'Lorem ipsum{/abc} {a}dolor{/a}' ]
 * [ '{/a}', index: 20, input: 'Lorem ipsum {a}dolor{/a}' ]
 */

1 Answer

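You need to subtract the match length from the regex's lastIndex after each removal. Calling replace to remove the {...} substring shortens the input but does not change lastIndex, so the next exec call starts farther along than expected. In your example, lastIndex is 17 after "{/abc}" is matched at index 11; once that tag is removed, "{a}" sits at index 12, behind the search position, and is skipped: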

let BETWEEN_CURLYS = /{.*?}/g;
let text = '{abc}Lorem ipsum{/abc} {a}dolor{/a}';
let match = BETWEEN_CURLYS.exec(text);
let tags = [];

while (match !== null) {
    tags.push(match);
    text = text.replace(match[0], '');
    // Rewind lastIndex by the removed length, since text is now shorter
    BETWEEN_CURLYS.lastIndex = BETWEEN_CURLYS.lastIndex - match[0].length; // HERE
    match = BETWEEN_CURLYS.exec(text);
}

console.log(text); // should be; Lorem ipsum dolor
console.log(tags);
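
As an alternative, here is a minimal sketch that avoids mutating the string inside the loop. It relies on the fact that String.prototype.replace passes the match offset (in the original string) to a callback; subtracting the total length of the tags already removed gives the index in the stripped text. The helper name extractTags and the { tag, index } result shape are assumptions for illustration, not part of the code above.

```javascript
// Hypothetical helper: strips {...} tags and records where each one
// would sit in the stripped text.
function extractTags(input) {
    const tags = [];
    let removed = 0; // total length of the tags stripped so far

    const text = input.replace(/{.*?}/g, (tag, offset) => {
        // offset refers to the ORIGINAL string, so shift it left by
        // the combined length of all previously removed tags
        tags.push({ tag, index: offset - removed });
        removed += tag.length;
        return '';
    });

    return { text, tags };
}

const result = extractTags('{abc}Lorem ipsum{/abc} {a}dolor{/a}');
console.log(result.text); // Lorem ipsum dolor
console.log(result.tags.map(t => t.index)); // [ 0, 11, 12, 17 ]
```

Because the regex is only ever applied to the unmodified input, there is no lastIndex bookkeeping to get wrong.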

A note from the RegExp#exec reference to bear in mind:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test() will also advance the lastIndex property).
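
To make the behavior described in that reference concrete, the following small sketch logs lastIndex after each exec call on an unchanged string. The variable names and the sample string are made up for the demonstration.

```javascript
const re = /{.*?}/g;
const sample = '{a}x{b}';

const first = re.exec(sample);
console.log(first[0], first.index, re.lastIndex); // {a} 0 3

// The second call resumes from lastIndex (3), not from the start
const second = re.exec(sample);
console.log(second[0], second.index, re.lastIndex); // {b} 4 7

// No match left: exec returns null and lastIndex resets to 0
const third = re.exec(sample);
console.log(third, re.lastIndex); // null 0
```

The regex object itself carries the search position, which is why shortening the string between calls leaves that position pointing too far to the right.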

4 Comments

This is a really good answer. I was working on something similar but I didn't realize .lastIndex was a property. +1
Ahh, that makes sense! Thanks so much!
@LewisPeel I was trying to come up with the replacing approach I mentioned in your previous post, but I see that it is not quite possible to easily get it to work since the input part will be tricky to handle.
@WiktorStribiżew Yeah, I think we should all forget my previous post...too many crossed wires haha
