I would like to write a simple interpreter in JavaScript/Node. I ran into an obstacle when it comes to generating tokens.
var code = 'if (a > 2 && b<4) c = 10;';
code.match(/\W+/g)
// [" (", " > ", " && ", "<", ") ", " = ", ";"]
code.match(/\w+/g)
// ["if", "a", "2", "b", "4", "c", "10"]
As shown, \W+ lets me get the special characters and \w+ lets me get the words. How can I get both in one array, in source order, something like below:
// ["if", "(", "a", ">", "2", "&&", "b", "<", "4", ")", "c", "=", "10", ";"]
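One way to get both kinds of token in a single pass is to put the two patterns into one alternation and replace \W+ with a class that also excludes whitespace. This is only a rough sketch: the function name roughTokens is made up for illustration, and the pattern still lumps adjacent punctuation (for example "+-") into one token.

```javascript
// Sketch: one regex with two alternatives, tried left to right at each position.
//   \w+       a run of word characters (identifiers, keywords, numbers)
//   [^\w\s]+  a run of characters that are neither word characters nor whitespace
// Whitespace matches neither alternative, so it is skipped between matches.
function roughTokens(source) {
  // match() with the g flag returns null when nothing matches, hence the fallback
  return source.match(/\w+|[^\w\s]+/g) || [];
}

var code = 'if (a > 2 && b<4) c = 10;';
console.log(roughTokens(code));
// ["if", "(", "a", ">", "2", "&&", "b", "<", "4", ")", "c", "=", "10", ";"]
```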
\W+ is extremely naive. Consider (!a+-1): /\w+|\W+/ will generate (!, a, +-, 1, ) while the correct tokenization is (, !, a, +, -1, ).
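A sketch of a less naive approach is to list the operators explicitly instead of relying on \W+, so that "(!" and "+-" are split into separate tokens. The operator set below and the function name tokenize are assumptions for illustration, not a complete grammar. Folding a leading minus into the number that follows is just one convention, chosen here to match the tokenization given above; many lexers emit "-" on its own and leave unary minus to the parser.

```javascript
// Hypothetical tokenizer sketch; the operator set is an assumption.
function tokenize(source) {
  // Alternatives are tried in order, so two-character operators come before
  // the single-character ones. Characters that match nothing are skipped silently.
  var pattern = /\d+|[A-Za-z_]\w*|&&|\|\||[=!<>]=|[-+*\/=<>!(){};]/g;
  var raw = source.match(pattern) || [];
  var operatorLike = /^(?:[-+*\/=<>!(]|&&|\|\||[=!<>]=)$/;
  var tokens = [];
  for (var i = 0; i < raw.length; i++) {
    var prev = tokens[tokens.length - 1];
    // An operand is expected at the start of input, or after an operator or "("
    var expectsOperand = prev === undefined || operatorLike.test(prev);
    if (raw[i] === '-' && expectsOperand && /^\d+$/.test(raw[i + 1] || '')) {
      // The minus cannot be binary here, so fold it into the number
      tokens.push('-' + raw[i + 1]);
      i++;
    } else {
      tokens.push(raw[i]);
    }
  }
  return tokens;
}

console.log(tokenize('(!a+-1)'));
// ["(", "!", "a", "+", "-1", ")"]
console.log(tokenize('if (a > 2 && b<4) c = 10;'));
// ["if", "(", "a", ">", "2", "&&", "b", "<", "4", ")", "c", "=", "10", ";"]
```

A real interpreter would normally also record each token's type and position and report unknown characters instead of skipping them.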