4

I have a string like this:

var s = '{ "foo": { "bar": { "baz": 5 } } } Some more content';

Is there a reliable way in JavaScript to split the string into the JSON and non-JSON part? The string always begins with valid JSON, but other than that the JSON can contain anything, so tricks like counting braces won't work.

The best option I see at the moment is to use the json-lint module to get the position of the error and split at the error position.

Any other ideas?

4
  • 8
    Personally I'd work on getting rid of the code that makes a mess like that in the first place. How is that produced? Commented May 23, 2016 at 19:34
  • Can the 'Some more content' contain a closing brace character? Ideally I would also recommend just altering how this is done. Use a Tuple<,> or something and put the Json in Item1 and the 'Some more content' in Item2. Commented May 23, 2016 at 19:43
  • @Pointy It's produced by embedding "JSON start markers" like json= in a text that I parse. I could cheat and disallow a ' character and enclose the string in single quotes, but I'd like a better solution, if possible. If not, I'll probably just disallow text after the JSON. Commented May 23, 2016 at 19:50
  • 2
    If you can disallow text after the JSON, can you move the text into the JSON? {"body": { ... }, "post": "text!"}? Commented May 23, 2016 at 19:55

2 Answers

2

Using the same approach as KevBot, but potentially eliminating a lot of the try {} catch (e) {} attempts:

var s = '{ "foo": { "bar": { "baz": 5 } } } Some mo}re }content';
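var offset = s.length + 1;
while (offset > 0) {
    // Step back to the previous '}' and try everything up to and including it.
    offset = s.lastIndexOf('}', offset - 1);
    var maybe_json = s.slice(0, offset + 1);
    try {
        var json = JSON.parse(maybe_json);
        break; // this prefix parsed, so it is the JSON part
    } catch (e) {
        // Not valid JSON; keep stepping back.
    }
}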
var rest = s.slice(offset + 1);

console.log(json); // { "foo": { "bar": { "baz": 5 } } }
console.log(rest); // " Some mo}re }content"
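
For reuse, the same backward scan could be wrapped in a function. This is only a sketch: the name splitLeadingJson is made up for illustration, and it assumes the leading JSON is an object or an array (so it ends in } or ]), not a bare number or string.

```javascript
// Hypothetical helper (the name is illustrative): splits a string that starts
// with a JSON object or array into [parsedValue, remainingText].
function splitLeadingJson(s) {
    // Candidates are searched at or before index end - 1.
    var end = s.length;
    while (end > 0) {
        // Closest closing character at or before end - 1.
        var brace = s.lastIndexOf('}', end - 1);
        var bracket = s.lastIndexOf(']', end - 1);
        var close = Math.max(brace, bracket);
        if (close === -1) break; // no closing character left to try
        try {
            return [JSON.parse(s.slice(0, close + 1)), s.slice(close + 1)];
        } catch (e) {
            end = close; // retry with the previous closing character
        }
    }
    return [null, s]; // no leading JSON object or array found
}
```

Called as splitLeadingJson(s), it returns the parsed value and the untouched remainder, or [null, s] when nothing at the start parses, so the caller does not have to check for an undefined json variable.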

2 Comments

Great solution, thanks. I'll refrain from the bad practice and promise to sanitize my input string ;-)
Ah! I now see what you meant. Very nice!
2

As stated in the comments, you really should look at cleaning up the string that you are working with so that it is only actual JSON.

Nevertheless, if you really want to move forward with the way you have things, here is a way you can do it with just vanilla JavaScript. This assumes, as you say, that the string always begins with valid JSON.

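function splitMyJsonString(string) {

    var jsonData = getValidJson(string);

    return [
        jsonData.json,
        // Everything from the index where parsing succeeded is the remainder.
        string.split('').slice(jsonData.errorIndex).join('')
    ];

    function getValidJson(s) {
        var newString = '';
        // Use <= so the complete string is tried as well; with < a string
        // that is nothing but JSON would never be parsed in full.
        for (var i = 0; i <= s.length; i++) {
            try {
                // Succeeds as soon as the first i characters are valid JSON.
                return {json: JSON.parse(newString), errorIndex: i};
            }
            catch (e) {
                newString += s.charAt(i);
            }
        }
        return {json: null, errorIndex: 0};
    }
}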

The function returns an array: the first element is your parsed JSON as an object, and the second element is the remaining string (since you asked for a split). If no valid JSON is found at the beginning of the string, the first element is null.

var s = '{ "foo": { "bar": { "baz": 5 } } } Some more content';
console.log(splitMyJsonString(s)); // logs: [Object, " Some more content"]

// If no valid JSON is at the start of the string...
var z = 'Some more content';
console.log(splitMyJsonString(z)); // logs [null, "Some more content"]

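The function first looks for valid JSON at the beginning of the string by trying to parse a growing prefix, one character at a time, until JSON.parse finally succeeds (sigh). It then returns the parsed value together with the rest of the string, which it gets by splitting the string into an array of characters, slicing from the index where parsing succeeded, and joining the pieces again. If nothing ever parses, the first element is null and the whole string comes back as the remainder. I left a few gaps in the explanation, but the main points have been covered.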
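
One way to cut down on the number of failed JSON.parse calls in this forward scan might be to attempt a parse only right after a } or ], since a JSON object or array can only end there. A minimal sketch under that assumption follows; the function name splitAtFirstJson is hypothetical, and a bare primitive such as 5 or "text" at the start of the string is not handled.

```javascript
// Hypothetical variant of the forward scan (the name is illustrative).
// Returns [parsedValue, remainingText], or [null, s] if nothing parses.
function splitAtFirstJson(s) {
    for (var i = 0; i < s.length; i++) {
        var c = s.charAt(i);
        // An object or array can only end at a closing brace or bracket.
        if (c !== '}' && c !== ']') continue;
        try {
            // The first prefix that parses is the leading JSON value.
            return [JSON.parse(s.slice(0, i + 1)), s.slice(i + 1)];
        } catch (e) {
            // This closing character was nested or inside a string literal;
            // keep scanning forward.
        }
    }
    return [null, s];
}
```

Because it stops at the first prefix that parses, closing braces in the text after the JSON are never examined, and a string that consists of nothing but JSON yields an empty remainder.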

5 Comments

What a hack... But at least you got him something to work with. Let's hope he normalizes his input :)
@jruts... I know :( Makes me sad. I hope he does so as well.
Might consider to slurp until last } and then try to parse. See: .lastIndexOf
@andlrc, I thought about that, it's a good idea, but I wasn't sure whether the string after the "json" could ever contain a closing bracket }. If it is a user-generated string, the chance will be higher.
OP said that the last string could contain anything. So slurping until the last } would not have been a "safe" option.
