I'm trying to run a calculation over an array of aggregated data. It works when I wrap the logic in a SQL function:
CREATE TEMPORARY FUNCTION uniq_sum(cls array<struct<word string,word_count int64>>) AS (
  (
    select sum(word_count)
    from (
      -- number the rows within each word so that only one row per word is kept
      select row_number() over (partition by word) r, word_count
      from unnest(cls)
    )
    where r = 1
  )
);
select
corpus,
uniq_sum(array_agg(struct(word,word_count))) res
from `bigquery-public-data.samples.shakespeare`
group by corpus
When I try to run the same logic inline (the second query below), I get an error: Aggregate function ARRAY_AGG not allowed in UNNEST.
Is it possible to run inline calculations over an array created by array_agg? What I'm after is a kind of sum(distinct) where distinctness is determined by a string key: given many (word, word_count) pairs, I want sum(word_count) to count only one element per word.
select
corpus,
  (
    select sum(word_count)
    from (
      select row_number() over (partition by word) r, word_count
      from unnest(array_agg(struct(word, word_count)))
    )
    where r = 1
  )
from `bigquery-public-data.samples.shakespeare`
group by corpus
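To make the intended "sum one word_count per distinct word" semantics concrete, here is a minimal sketch that runs the same subquery over a literal array instead of array_agg. The three input pairs are made up for illustration and do not come from the Shakespeare table.

```sql
-- Toy input (hypothetical values): 'a' appears twice, 'b' once.
-- Only one row per word should contribute to the sum.
select
  (
    select sum(word_count)
    from (
      select row_number() over (partition by word) r, word_count
      from unnest([
        struct('a' as word, 1 as word_count),
        struct('a' as word, 1 as word_count),
        struct('b' as word, 2 as word_count)
      ])
    )
    where r = 1
  ) res
-- res = 3 (1 for 'a' + 2 for 'b'), whereas a plain sum(word_count) would give 4
```

Since row_number() has no order by here, which row is kept for each word is arbitrary; that is fine in my case only because I expect duplicate rows for a word to carry the same word_count.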