JSON Package Standards Implementation
=====================================

The JSON Parser package provides three conversion functions: a JSON Parser, a JSON
Generator and a JSON Validator. There is also a rich set of JSON Support Functions
to help C developers use JSON data easily with the minimum of coding overhead.
Those are described in another document.

This document presents information related to standards compliance of this package
and provides any notes required to understand subtle compliance issues.


Background
----------

This JSON Parser package was written to provide not only a fully functional, fast
and accurate JSON Parser for C programmers, but also to provide the JSON data in a
form that is more generally useful to C developers. Most JSON parsers write the
parsed JSON data into arrays and similar structures, which causes numerous issues
because C does not natively support unbounded or variable sized data. The D-List
package for C does provide such support, with a rich and easy to use set of
functions to access, modify, add to, manipulate, manage and copy this variable
data. The D-List package is also extremely fast and efficient, and a great way to
manipulate varied data from different sources.

All functions described in this document that start with json_xxx are part of the
JSON Parser package, and details about their usage and arguments can be found in
the JSON Parser documentation. All functions that start with list_xxx are part of
the D-List package.

JSON is a data format focused on inter-system and inter-language interoperability. 
If you are unfamiliar with JSON grammar or syntax, please refer to ECMA-404 and 
RFC 8259 or 
	
	http://json.org 

before trying to use the JSON Support Functions.

Throughout this JSON Parser package, all JSON Objects and JSON Arrays are accessed
and manipulated using D-List list_objects, and all JSON value items are elements
within those list_objects. The list_objects are arranged hierarchically allowing
these data structures to represent the JSON syntax exactly.

If you are unfamiliar with D-List, list_objects, elements, or the rich set of
D-List functions, please consult the D-List reference documentation. It is
available in the D-List distribution Header Files, as a set of Doxy files that are
included in that distribution, or online at

	http://info.fwsentry.org/dlist/index.html

STANDARDS
=========

This implementation of the JSON Parser package is completely compliant with
RFC 8259 and ECMA-404. It conforms to the specifications of RFC 8259, but it
does not impose additional restrictions based on the RFC's guidelines for
interoperability, and it makes no judgement about what may or may not be
deemed fully interoperable between JSON peers sharing data.

The following notes follow the section numbering of RFC 8259, and should
help clarify the details.

	2. JSON Grammar 

	JSON grammar is strictly enforced by the validator and the parser. They
	enforce the same requirements that RFC 8259 Section 10 imposes on generators,
	and do not accept the non-standard variations that Section 9 says a parser
	MAY permit.

	3. Values

	All JSON values are kept intact, and in the order they were present in the
	original JSON data block. All literal values are converted to a JSON_value_type
	present in each D-List element that represents any data value or Object Member 
	name:value pair.

	All numbers are converted: integers to an integer type of at least 64 bits, and
	fractional numbers to a long double. Exponent numbers may be converted to long
	double or, optionally, left intact as text (string) representations for the
	caller to convert later. This option is available when calling the JSON Parser
	function, and allows callers with high precision number requirements to handle
	the conversions themselves to whatever standard of precision they wish.
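
	The conversion choices described above can be sketched with the standard C
	library alone. This is a minimal illustration, not the package's code: the
	names convert_number, num_value_t and num_kind_t are hypothetical, and the
	sketch assumes the token has already passed JSON grammar validation (strtoll
	and strtold on their own accept forms that JSON does not).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not the package's own type. */
typedef enum { NUM_INTEGER, NUM_FLOAT, NUM_FLOAT_TEXT, NUM_ERROR } num_kind_t;

typedef struct {
	num_kind_t kind;
	int64_t integer;        /* valid when kind == NUM_INTEGER */
	long double fraction;   /* valid when kind == NUM_FLOAT */
	const char *text;       /* valid when kind == NUM_FLOAT_TEXT */
} num_value_t;

/* Classify and convert one nul terminated, already validated number token.
 * A non-zero keep_exponent_text leaves exponent numbers as text. */
static num_value_t convert_number(const char *token, int keep_exponent_text)
{
	num_value_t out = { NUM_ERROR, 0, 0.0L, NULL };
	char *end = NULL;
	int has_exponent = (strpbrk(token, "eE") != NULL);
	int has_fraction = (strchr(token, '.') != NULL);

	if (has_exponent && keep_exponent_text) {
		out.kind = NUM_FLOAT_TEXT;      /* caller converts it later */
		out.text = token;
		return out;
	}
	errno = 0;
	if (!has_exponent && !has_fraction) {
		long long v = strtoll(token, &end, 10);   /* at least 64 bits */
		if (errno == 0 && end != token && *end == '\0') {
			out.kind = NUM_INTEGER;
			out.integer = (int64_t)v;
		}
		return out;                     /* NUM_ERROR on overflow */
	}
	long double f = strtold(token, &end);
	if (end != token && *end == '\0') {
		out.kind = NUM_FLOAT;
		out.fraction = f;
	}
	return out;
}
```

	Calling convert_number("1e3", 1) in this sketch returns the token untouched
	with kind NUM_FLOAT_TEXT, while convert_number("1e3", 0) yields a long double.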

	4. Objects

	Objects are represented as a D-List list_object, each member of the JSON Object
	being an element in that list. All JSON object members are converted in the 
	order they were presented in the original JSON data block. RFC 8259 states that
	member names should be unique. This implementation parses all object members,
	therefore it allows non-unique names in an object. If you use the D-List function
	list_simple_search() on such a list it will always return the last instance of
	a name:value member in that list. All the other list_search_xx() type functions
	in D-List can be used to find any or all of such duplicate members if needed.

	The JSON Support Functions provide a much richer set of tools to access JSON
	Objects and data. They were built on top of the D-List functions to simplify
	JSON data usage, and can, for example, find all JSON Object Members with
	equal names.

	However, for proper interoperability the author strongly suggests that users of
	JSON stick to the recommendations of the RFC, avoid duplicate member names,
	and use array values on a single instance of the member name instead. One area
	where non-unique member names are appropriate is where a JSON user wants to
	incorporate something in the JSON grammar that is not important to the
	consumer of that object, for example comments. Multiple object members with the
	name "comment" would be a sensible way of documenting a JSON data block without
	either violating the standards of JSON grammar, or confusing a parser or consumer
	of that data.

	When using any element in an object list, or iterating through that list, do not
	make any assumptions about what type of value the element value is. ALWAYS check
	the element JSON_value_type first before accessing the value item.

	JSON Object Member names can be any string (by definition of the standards), so
	that also means they can be a zero length string  ( "" ) which may not be very
	useful to data blocks, but is part of the standard. A JSON Object Member D-List
	element will always have a pointer to the Member name, which will always be a
	valid string, even zero length.

	5. Arrays

	Arrays are represented by a subordinate D-List list_object. The elements of that
	list_object can be of any legitimate JSON value type. As per the RFC there is no
	requirement for the elements to be of the same type.

	Just as with JSON Objects, when using an array list, or iterating through that 
	list, do not make any assumptions about what type of value the element value is.
	ALWAYS check the element JSON_value_type first before accessing the value item.

	JSON Array values have no names, (by definition of the standards), so that also
	means a JSON array value D-List element will always have a NULL object name 
	pointer and therefore no Member name. The JSON Object Member name, if there
	was one, will be in the superior list element that originally pointed to this
	array list_object.

	If your code has an element from a JSON data block, but for some reason does not
	know where it came from (an Object or an Array), checking whether the pointer
	to object_name is NULL or not will answer the question.
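
	The two rules above (check the type tag before touching the value, and use a
	NULL object_name to recognize an array value) can be sketched as follows. The
	struct and the SK_ constants are hypothetical stand-ins invented for this
	illustration; the real element layout and the real JSON_value_type constants
	are defined in the JSON Parser header files.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for this sketch only, not the package's types. */
typedef enum {
	SK_NULL, SK_BOOLEAN, SK_INTEGER, SK_FLOAT, SK_STRING, SK_OBJECT, SK_ARRAY
} sk_value_type_t;

typedef struct {
	sk_value_type_t value_type;  /* tag: check this before using value */
	const char *object_name;     /* NULL for array values, may be "" */
	union {
		int boolean;
		int64_t integer;
		long double fraction;
		const char *string;
		void *sub_list;          /* child Object or Array list */
	} value;
} sk_element_t;

/* Returns 1 if the element came from a JSON Object, 0 if from an Array. */
static int sk_is_object_member(const sk_element_t *e)
{
	return e->object_name != NULL;
}

/* Render one element into buf, dispatching on the type tag first. */
static int sk_describe(const sk_element_t *e, char *buf, size_t len)
{
	const char *name = sk_is_object_member(e) ? e->object_name
	                                          : "(array value)";
	switch (e->value_type) {
	case SK_NULL:    return snprintf(buf, len, "%s: null", name);
	case SK_BOOLEAN: return snprintf(buf, len, "%s: %s", name,
	                                 e->value.boolean ? "true" : "false");
	case SK_INTEGER: return snprintf(buf, len, "%s: %lld", name,
	                                 (long long)e->value.integer);
	case SK_FLOAT:   return snprintf(buf, len, "%s: %Lg", name,
	                                 e->value.fraction);
	case SK_STRING:  return snprintf(buf, len, "%s: \"%s\"", name,
	                                 e->value.string);
	case SK_OBJECT:  return snprintf(buf, len, "%s: {...}", name);
	case SK_ARRAY:   return snprintf(buf, len, "%s: [...]", name);
	}
	return -1;                   /* unknown tag: touch nothing */
}
```

	Note that a zero length member name ( "" ) is still a non-NULL pointer, so
	such an element is correctly reported as an Object Member.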

	6. Numbers

	Numbers were covered above in 3. Values.

	7. Strings

	All strings in the JSON Validator, Parser and Generator eco-system are stored
	and transferred as nul terminated character arrays. They are not kept with a
	specific size attached. This follows C language style semantics and formal
	UTF-8 encoding practices. It also means that it is not permitted in this JSON
	eco-system to embed nul (0x00) bytes in text values, as they will cause the
	premature termination of such string values. The parser and validator will
	ignore nul bytes in the input values. The generator will terminate the
	processing of a textual value when it encounters a nul byte.

	The JSON Parser preserves all string values as is. All string values are in 
	fact left in-situ in the original JSON data block that was presented for 
	parsing. This data block is extensively modified by the parser. It leaves 
	strings in place, but terminates them with a nul at the end of the string. 
	Additionally any escape sequences present in the string are converted to the
	actual character in-situ, and the escape character is removed. So for
	example "The test string\n\tNew line." would be converted to a character array
	of 27 bytes (26 characters plus the terminating nul): the \n would be changed
	to 0x0A and the \t to 0x09, all the following characters would be shifted to
	the left to fill the empty space left by the escape characters, and the string
	would be terminated at the end by 0x00. Therefore the char * pointer to that
	character array would act
	correctly, and be treated as a string by all C library functions, and still 
	be in-situ in the original JSON data block. The design idea of this JSON 
	Parser was to reduce the heap overhead required to represent everything in 
	that JSON data block, and greatly simplify the removal of all the objects
	(text strings etc) after the user has finished with the JSON data.
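
	A minimal sketch of this in-situ conversion is shown here. It is an
	illustration of the technique only, not the parser's actual code: the
	function name unescape_in_situ is hypothetical, and \uXXXX handling is left
	out to keep it short.

```c
#include <stddef.h>

/* Sketch only: collapse the simple JSON escapes in place and nul terminate.
 * 'start' points at the first character after the opening quote.
 * Returns the decoded length, or (size_t)-1 on a malformed string. */
static size_t unescape_in_situ(char *start)
{
	char *src = start;              /* read position */
	char *dst = start;              /* write position, never ahead of src */

	while (*src != '"') {
		if (*src == '\0')
			return (size_t)-1;      /* no closing quote found */
		if (*src != '\\') {
			*dst++ = *src++;        /* ordinary character, shift left */
			continue;
		}
		src++;                      /* drop the escape character */
		switch (*src++) {
		case '"':  *dst++ = '"';  break;
		case '\\': *dst++ = '\\'; break;
		case '/':  *dst++ = '/';  break;
		case 'b':  *dst++ = '\b'; break;
		case 'f':  *dst++ = '\f'; break;
		case 'n':  *dst++ = '\n'; break;    /* 0x0A */
		case 'r':  *dst++ = '\r'; break;
		case 't':  *dst++ = '\t'; break;    /* 0x09 */
		default:   return (size_t)-1;       /* not handled in this sketch */
		}
	}
	*dst = '\0';                    /* terminate inside the original block */
	return (size_t)(dst - start);
}
```

	Because every escape is longer than the character it stands for, the write
	position can never overtake the read position, which is what makes the
	conversion safe to do inside the original data block.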

	JSON \u escapes are similarly decoded in-situ, replacing the original hex escape
	sequence. However it should be noted that if any byte in a \uxxxx sequence is
	nul (0x00), that byte will be ignored and not written into the output character
	string. So for example if the serializer of the JSON had escaped a tab control
	character 0x09 as \u0009 instead of \t (which is permitted by the standards),
	the byte 0x00 will be ignored, and the byte 0x09 will be added to the string.
	If the original JSON serializer encoded \u0000, both bytes will be ignored by
	this parser. Although it is valid in JSON grammar to pass such a value, in 
	any C type language, those bytes would terminate a character array and render
	the rest of the string useless. No syntax errors are thrown for nul bytes
	encoded this way, they are just silently ignored to preserve C language 
	semantics.
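
	The effect of this rule can be sketched as below, under the assumption that a
	\uxxxx escape is written out as UTF-8 bytes. The function name
	encode_u_escape is hypothetical, surrogate pairs are not handled, and the
	parser's real implementation may differ in detail.

```c
#include <stddef.h>

/* Sketch only: encode one \uXXXX code point (0x0000 - 0xFFFF) as UTF-8 into
 * out[], which must have room for 3 bytes. Returns the bytes written.
 * U+0000 yields 0 bytes, mirroring the "silently ignored" rule above. */
static size_t encode_u_escape(unsigned int cp, unsigned char *out)
{
	cp &= 0xFFFFu;                  /* four hex digits at most */
	if (cp == 0)
		return 0;                   /* nul is dropped, not stored */
	if (cp < 0x80) {
		out[0] = (unsigned char)cp; /* \u0009 becomes the single byte 0x09 */
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
	return 3;
}
```

	A \uxxxx escape occupies six bytes of input and produces at most three bytes
	of output here, so the decoded text always fits in the space the escape
	occupied.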

	If the caller of the JSON Parser wishes to preserve (not convert) all exponent
	numbers, they will be left in-situ as strings as well. However instead of being
	described as JSON_string types, they will be JSON_float_str so the caller knows
	the difference when they find the elements.

	8.1 Character Encoding

	This JSON Parser makes no attempt to enforce or even recognize the encoding used.
	It parses the characters in-situ and allows the caller to decide what data is or
	is not there. It neither removes byte order marks (U+FEFF) nor tags them as errors.
	It passes all text as is. Similarly unpaired Unicode surrogates are passed as is
	by this parser. The consumer of these data items is responsible for safe handling 
	and usage of them. There is an underlying assumption that the data is in UTF-8
	form, but there is no such enforcement, and nothing changes or is damaged, if the
	data is not in UTF-8 format.

	8.3 String Comparison

	This JSON Parser provides a set of default search functions and compare
	functions which act on strings as-is. If the caller wishes to transform
	textual values into Unicode first before such searches or compares, they
	can provide their own search and compare functions to attach to the lists in
	place of the defaults. These search or compare functions are very simple and
	easy to write.

	9. Parsers

	This JSON Parser accepts all forms of text that conform to the JSON grammar. This
	parser sets no limits on the size of texts it accepts. If it was able to fit in the
	original JSON data block, it is accepted and used in-situ.

	This JSON Validator does set a limit of 2048 nested levels of JSON Objects and
	JSON Arrays. 
	
	This JSON Parser sets no limits at all, except those that may be enforced by
	the host operating system, such as available heap space. The parser is
	fully recursive and re-entrant, and it will parse any valid JSON data block,
	no matter how deeply it is nested.

	This JSON Validator and Parser set no limits on string length or contents.
	
	This JSON Generator imposes no limits, and acts as a mirror of the JSON
	Parser using the same recursive model.
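
	To check in advance whether a data block would exceed the validator's
	nesting limit, a depth count along the following lines could be used. This
	is a sketch only: the function name nesting_depth is hypothetical, it
	assumes the limit means "more than 2048 levels is rejected", and it makes no
	other grammar checks.

```c
#include <stddef.h>

#define SKETCH_MAX_DEPTH 2048   /* the validator limit quoted above */

/* Sketch only: return the deepest Object/Array nesting in a nul terminated
 * JSON text, or -1 if the nesting exceeds SKETCH_MAX_DEPTH or a closing
 * bracket has no opener. Brackets inside strings are skipped. */
static int nesting_depth(const char *text)
{
	int depth = 0, deepest = 0, in_string = 0;

	for (const char *p = text; *p != '\0'; p++) {
		if (in_string) {
			if (*p == '\\' && p[1] != '\0')
				p++;                /* skip the escaped character */
			else if (*p == '"')
				in_string = 0;
			continue;
		}
		if (*p == '"') {
			in_string = 1;
		} else if (*p == '{' || *p == '[') {
			if (++depth > SKETCH_MAX_DEPTH)
				return -1;
			if (depth > deepest)
				deepest = depth;
		} else if (*p == '}' || *p == ']') {
			if (--depth < 0)
				return -1;
		}
	}
	return deepest;
}
```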

	10. Generators

	This package includes a JSON Generator. It translates a valid D-List
	list_object tree representation of JSON data into a textual JSON data
	block, according to the standards of RFC 8259. The caller of the JSON
	Generator function gets a nul terminated character array back, which they
	can write to a file, or send over a socket, or whatever they want to do 
	with it. The output of this JSON Generator is strictly compliant with the
	standards set forth by RFC 8259 and ECMA-404.

	Generator Notes: 
		Strings

		The JSON Generator assumes all input strings, either member names or
		values, are ASCII or UTF-8. There is no checking beyond nul bytes, which
		have no place inside a UTF-8 string. The bytes FF, FE, C0 and C1, which
		are never valid in UTF-8, are not rejected. Both the parser and generator
		assume the callers will properly deal with byte encodings. It is the
		responsibility of the caller to enforce whatever encoding standards they
		wish for the text strings. All control characters (0x01 - 0x1F) are either
		directly escaped or rendered in \uxxxx form. nul 0x00 is ignored, as it
		signifies the end of the character array being generated, and any
		characters after the nul byte will be implicitly ignored.
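
		The escaping rule described above can be sketched as follows. This is
		an illustration under stated assumptions, not the generator's code: the
		function name escape_json_string is hypothetical, and the choice of
		which control characters get a short escape is taken from RFC 8259
		Section 7.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Sketch only: write the JSON escaped form of a nul terminated string into
 * out (capacity cap, including the terminating nul). Returns the escaped
 * length, or (size_t)-1 if out is too small. Bytes >= 0x80 pass through
 * unchecked, matching the "no encoding enforcement" note above. */
static size_t escape_json_string(const char *in, char *out, size_t cap)
{
	size_t used = 0;

	if (cap == 0)
		return (size_t)-1;
	for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; p++) {
		char tmp[7];                /* room for "\uXXXX" plus nul */
		const char *piece = tmp;
		size_t n;

		switch (*p) {
		case '"':  piece = "\\\""; break;
		case '\\': piece = "\\\\"; break;
		case '\b': piece = "\\b";  break;
		case '\f': piece = "\\f";  break;
		case '\n': piece = "\\n";  break;
		case '\r': piece = "\\r";  break;
		case '\t': piece = "\\t";  break;
		default:
			if (*p < 0x20) {        /* other control characters */
				snprintf(tmp, sizeof tmp, "\\u%04X", (unsigned)*p);
			} else {
				tmp[0] = (char)*p;
				tmp[1] = '\0';
			}
			break;
		}
		n = strlen(piece);
		if (used + n + 1 > cap)     /* keep room for the final nul */
			return (size_t)-1;
		memcpy(out + used, piece, n);
		used += n;
	}
	out[used] = '\0';
	return used;
}
```

		The loop stops at the first nul byte of the input, so anything after an
		embedded nul is never seen, which is the behaviour the note describes.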
	
		Style
	
		The human readable textual data output by the JSON Generator is in a
		particular style, which is in line with general conventions of both C
		and JSON. The author investigated a considerable number of JSON files
		from many different sources, and settled on a format style that seems
		to fit within the general ways people write or generate JSON text data.
	
		Content
	
		If you use the JSON Parser to create D-List data from a JSON data file,
		and then use the JSON Generator to create another textual
		representation of that original data, it will look very similar, with
		some style differences. The only substantial difference will be that
		any hex escape sequences will not be reproduced, but output instead as
		UTF-8 bytes, with the exception of any characters that are required by
		the standards to be escaped.
	
		Exponent Numbers.
	
		The production of exponent numbers will depend on how the caller of the
		JSON Generator prepared those numbers prior to calling. If those exponent
		numbers were put in the D-List elements as textual strings, they are
		reproduced exactly. If they were put in as converted floating point
		numbers, the results are only going to be as good as the basic C libraries
		that are available on your system, and the various precision and rounding
		issues they may have. In general it is best to create these as textual
		floating point representations prior to calling the JSON Generator if
		accuracy and precision for exponent numbers is important to your needs.
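
		The difference between the two approaches can be seen with the standard
		C library alone. In this sketch the helper names format_double and
		text_changed are hypothetical, and a plain double with the "%.17g"
		format is used purely for illustration; the generator's own formatting
		may differ.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: format a double with 17 significant digits, which is enough
 * to read an IEEE-754 binary64 value back unchanged. */
static int format_double(double value, char *out, size_t cap)
{
	return snprintf(out, cap, "%.17g", value);
}

/* Returns 1 if converting the token to a double and formatting it again
 * produces different text than the original token, 0 if it is identical. */
static int text_changed(const char *token)
{
	char out[64];
	double v = strtod(token, NULL);

	format_double(v, out, sizeof out);
	return strcmp(out, token) != 0;
}
```

		A token such as "1e-7" survives as a value but not as text: the number
		read back is the same double, yet the characters written out are not
		the ones the original author typed. Keeping the token as a string
		avoids that.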

	12. Security Considerations

	This parser uses the JSON data block provided by the caller. All other data
	fields and structures are provided by the D-List subsystem. If data leakage
	is considered an issue for the caller, they are responsible for managing the
	original JSON data block, and may take extra steps to scrub it after usage,
	or extra steps to allocate it in a random location if desired. The internal
	D-List data, the list_object, the elements, and the internal space management 
	data are released when the root list_object is erased by list_erase(). If
	extra security precautions are required by the caller, there are two further
	options. First, if they compile the D-List code that is to be used in this
	parser with the compile option DLIST_FLUSH_MEMORY, D-List will enforce a strict
	memory scrubbing system to purge all data before release. It makes D-List a
	little slower, but cleans everything. Second, the caller can provide their own
	remove_function to the root list_object and all the others. If you do this, be
	sure that your replacement also performs the same functions as the in-built
	one does, otherwise there will be memory leaks galore.
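
	For the original JSON data block, which remains the caller's responsibility,
	a scrub before release might look like the sketch below. The function name
	scrub_block is hypothetical. Where your platform provides a dedicated
	routine such as explicit_bzero() or memset_s(), that is usually the better
	choice.

```c
#include <stddef.h>

/* Sketch only: overwrite a buffer through a volatile pointer, so the compiler
 * is less likely to optimize the stores away just before the memory is freed. */
static void scrub_block(void *block, size_t len)
{
	volatile unsigned char *p = (volatile unsigned char *)block;

	while (len--)
		*p++ = 0;
}
```

	Call it on the data block only after the root list_object has been erased,
	since the parsed string values still point into that block until then.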

Extra Note
----------

It seems that the issue of whether or not comments

		/* blah blah blah */ or // more blah 

should be permitted in JSON data blocks is rather contentious among the user
community. It does seem clear that there is no contention at all in the
standards community: IETF and ECMA both clearly state that the JSON grammar
does not include comments.

This implementation of both the JSON Validator and JSON Parser sticks to the 
standards, and will reject any JSON data block that contains comments, or 
other non JSON grammar encoding. Although RFC 8259 section 9 states that 

	"A JSON parser MAY accept non-JSON forms or extensions."

This implementation does not accept any non-standard grammar forms or syntax,
not just for the obvious interoperability argument, but also because the next
section (10. Generators) states very clearly that

	"A JSON generator produces JSON text. The resulting text MUST 
	strictly conform to the JSON grammar."

There would be no point in having a JSON Parser that works on a different 
standard of grammar than that imposed on the JSON Generator of that grammar, 
when both are in the same JSON Package.

If at some point in the future the IETF and/or ECMA change the standard to 
accept comments or update the available JSON grammar and syntax, this package
will be updated accordingly.

If you as a JSON user feel strongly about incorporating comments into the
JSON grammar specification, I urge you to communicate with the IETF and ECMA
about the issue. I also implore you NOT to just go and start changing software
to accept what is not allowed by specification, just because you personally
don't like it.

Until then, just put your comments in a value field! It is not hard and 
works within the existing standards. It also allows everyone to properly
interoperate and exchange data, which was the whole point.

	{ 	"myobject" 	: "valuable data",
		"another" 	: 3.14159,					"//" : "was that pi ?",
		"comment"	: "simple enough isn't it",
		"more data"	: true,						"" : "Another style of comment",
		"label_b"	: "something important"		}


