The BigQuery view INFORMATION_SCHEMA.TABLE_OPTIONS returns option_value as a string that contains an array of structs when option_name is "labels". More information is in the TABLE_OPTIONS documentation.

For example, I create the table:

CREATE OR REPLACE TABLE sample_dataset.sample_data (
    id INT64  -- placeholder column (assumed); the original schema is not shown
)
OPTIONS(
    labels=[("env","dev"),("dept","hr")]
)

I then query the table options for the dataset:

SELECT * FROM sample_dataset.INFORMATION_SCHEMA.TABLE_OPTIONS

This returns the following:

[
  {
    "table_catalog": "sample_project",
    "table_schema": "sample_dataset",
    "table_name": "sample_data",
    "option_name": "labels",
    "option_type": "ARRAY<STRUCT<STRING, STRING>>",
    "option_value": "[STRUCT(\"env\", \"dev\"), STRUCT(\"dept\", \"hr\")]"
  }
]

How can I transform this table to something more like:

[
  {
    "table_catalog": "sample_project",
    "table_schema": "sample_dataset",
    "table_name": "sample_data",
    "env": "dev",
    "dept": "hr"
  }
]

I have tried all the answers to this Stack Overflow question with no luck: Stringified array bigquery

2 Answers

Consider the approach below:

select table_catalog, table_schema, table_name, 
  array(
    select as struct arr[offset(0)] key, arr[offset(1)] value
    from unnest(regexp_extract_all(option_value, r'STRUCT\(("[^"]+", "[^"]+")\)')) kv, 
    unnest([struct(split(replace(kv, '"', ''), ', ') as arr)]) 
  ) options 
from sample_dataset.INFORMATION_SCHEMA.TABLE_OPTIONS
where table_name = 'sample_data'
and option_name = 'labels'       

Applied to the sample use case in your question, the output is a single row for sample_data whose options column holds two key/value pairs: (env, dev) and (dept, hr).
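
The query above returns the labels as an array of key/value structs, while the question asks for one column per label. A minimal sketch of that last step follows, assuming the label keys (env and dept here) are known in advance. The table_options CTE is a hypothetical stand-in for INFORMATION_SCHEMA.TABLE_OPTIONS, filled with the sample row from the question so the query runs on its own.

```sql
-- Stand-in for sample_dataset.INFORMATION_SCHEMA.TABLE_OPTIONS (assumption:
-- one sample row copied from the question, so no real table is needed).
WITH table_options AS (
  SELECT
    'sample_project' AS table_catalog,
    'sample_dataset' AS table_schema,
    'sample_data' AS table_name,
    'labels' AS option_name,
    '[STRUCT("env", "dev"), STRUCT("dept", "hr")]' AS option_value
),
labels AS (
  -- One row per label: kv looks like "env", "dev" (quotes included).
  SELECT
    table_catalog,
    table_schema,
    table_name,
    REGEXP_EXTRACT(kv, r'^"([^"]+)"') AS key,    -- first quoted token
    REGEXP_EXTRACT(kv, r'"([^"]+)"$') AS value   -- last quoted token
  FROM table_options,
    UNNEST(REGEXP_EXTRACT_ALL(option_value, r'STRUCT\(("[^"]+", "[^"]+")\)')) AS kv
  WHERE option_name = 'labels'
)
-- Pivot the known keys into columns with conditional aggregation.
SELECT
  table_catalog,
  table_schema,
  table_name,
  MAX(IF(key = 'env', value, NULL)) AS env,
  MAX(IF(key = 'dept', value, NULL)) AS dept
FROM labels
GROUP BY table_catalog, table_schema, table_name
```

To run it against real metadata, replace the table_options CTE with the INFORMATION_SCHEMA.TABLE_OPTIONS view of your dataset. Labels with an empty value would not match the `[^"]+` pattern and would be skipped.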

Another option, besides the nice solution offered by @Mikhail_Berlyant, would be to use the TO_JSON function once it is available.

JSON has been a supported data type in BigQuery since January 2022, but it is not yet Generally Available.

When it becomes GA (or if you are lucky enough to already have access; I think it's possible to apply), you can simply do:

SELECT * EXCEPT (option_value),
    TO_JSON(option_value) AS option_value_json
FROM sample_dataset.INFORMATION_SCHEMA.TABLE_OPTIONS

You will then be able to extract the information you need with the BigQuery JSON functions.

Some refs: here and here

1 Comment

This doesn't work, since the value is stored literally as "[STRUCT(\"env\", \"dev\"), STRUCT(\"dept\", \"hr\")]", which is not valid JSON. I wonder why Google would use such an atrocious format.
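
Following up on the comment above: one possible workaround is to rewrite the stored string into a JSON object first and then read it with JSON_VALUE, which accepts a JSON-formatted string. This is only a sketch. It assumes label keys and values never contain double quotes, and the table_options CTE is a hypothetical stand-in holding the sample value from the question.

```sql
-- Stand-in for the option_value column (assumption: sample value from the question).
WITH table_options AS (
  SELECT '[STRUCT("env", "dev"), STRUCT("dept", "hr")]' AS option_value
),
as_json AS (
  SELECT
    CONCAT(
      '{',
      REGEXP_REPLACE(
        -- Drop the surrounding [ and ] before rewriting the pairs.
        SUBSTR(option_value, 2, LENGTH(option_value) - 2),
        -- Turn each STRUCT("k", "v") into "k": "v".
        r'STRUCT\(("[^"]*"), ("[^"]*")\)',
        r'\1: \2'),
      '}') AS labels_json
  FROM table_options
)
SELECT
  labels_json,                                -- {"env": "dev", "dept": "hr"}
  JSON_VALUE(labels_json, '$.env') AS env,    -- dev
  JSON_VALUE(labels_json, '$.dept') AS dept   -- hr
FROM as_json
```

Unlike the conditional-aggregation style of pivot, this keeps one row per table without a GROUP BY, but the label keys still have to be named explicitly in the JSON paths.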