EDIT:
As pozs points out, there are two typeof functions: one for JSON and one for SQL. This query is the one you're looking for:
SELECT
json_data.key,
jsonb_typeof(json_data.value),
count(*)
FROM x, jsonb_each(x.data) AS json_data
group by key, jsonb_typeof
order by key, jsonb_typeof;
Old Answer: (Hey, it works...)
This query will return the type of the keys:
SELECT
json_data.key,
pg_typeof(json_data.value),
json_data.value
FROM x, jsonb_each(x.data) AS json_data;
... unfortunately, you'll notice that Postgres doesn't differentiate between the different JSON types. it regards it all as jsonb, so the results are:
key1 | value1 | value
------+--------+-----------
a | jsonb | "test"
b | jsonb | 123
c | jsonb | null
d | jsonb | true
a | jsonb | "test"
b | jsonb | 123
c | jsonb | null
d | jsonb | "yay"
e | jsonb | "foo"
f | jsonb | [1, 2, 3]
(10 rows)
However, there aren't that many JSON primitive types, and the output seems to be unambiguous. So this query will do what you're wanting:
with jsontypes as (
SELECT
json_data.key AS key1,
CASE WHEN left(json_data.value::text,1) = '"' THEN 'String'
WHEN json_data.value::text ~ '^-?\d' THEN
CASE WHEN json_data.value::text ~ '\.' THEN 'Number'
ELSE 'Integer'
END
WHEN left(json_data.value::text,1) = '[' THEN 'Array'
WHEN left(json_data.value::text,1) = '{' THEN 'Object'
WHEN json_data.value::text in ('true', 'false') THEN 'Boolean'
WHEN json_data.value::text = 'null' THEN 'Null'
ELSE 'Beats Me'
END as jsontype
FROM x, jsonb_each(x.data) AS json_data -- Note that it won't work if we use jsonb_each_text here because the strings won't have quotes around them, etc.
)
select *, count(*) from jsontypes
group by key1, jsontype
order by key1, jsontype;
Output:
key1 | jsontype | count
------+----------+-------
a | String | 2
b | Integer | 2
c | Null | 2
d | Boolean | 1
d | String | 1
e | String | 1
f | Array | 1
(7 rows)
INSERTstatement isn't formatted quite right.json[b]_typeof(json[b])function(s) give exactly, what you need (requires PostreSQL 9.4+). But I'm not sure, how you want to aggregate your results, f.ex. how do you want to represent thed | boolean:1 string:1row?typeoffunction, I didn't think to go looking for a second.jsonb_typeofabout 10% faster, thanks guys