Automating averages and stdevs for large dataset in set intervals of rows

Question

I am currently trying to make a dataset which refers to a series of data in another tab, of which it takes 20 points (vertically) and averages them and gives a stdev then two open rows and then again average+stdev of the next 20 points. How can this be automated without having to manually insert all of the ranges?

Column A
'Blank'
'Avg1'
'StDev1'
'Blank'
'Blank'
'Avg2'
'StDev2'
'Blank'

This is the format used in an already setup datasheet to which I would like to be able to automate the above problem.

The first set is =Average('Series Data'!Q2:Q21) and if the next is set to =Average('Series Data'!Q22:Q41) and pulled down the next becomes =Average('Series Data'!Q10:Q11)

" =Average('Series Data'!Q2:Q20) and if the next is set to =Average('Series Data'!Q22:Q42) and pulled down the next becomes =Average('Series Data'!Q10:Q11)" - I don't see any logical pattern here - do you mean something like 2-20, 22-40, 42-60? — Michal
– Michal, Commented Jan 29 at 10:59
It should be Q2:Q21, Q22:Q41, Q42:Q61, My mistake. Its sets of 20 points beginning from Q2. — Jan van oest
– Jan van oest, Commented Jan 29 at 11:32

DjC · Accepted Answer · 2025-01-29 18:37:40Z

3

Another option for a large dataset (e.g. 100k+ rows):

=LET(
    rws, QUOTIENT(XMATCH(,DROP('Series Data'!Q:Q,2)),20),
    arr, INDEX('Series Data'!Q:Q,SEQUENCE(rws,20,2)),
    pad, EXPAND("",rws,,""),
    TOCOL(HSTACK(pad,BYROW(arr,AVERAGE),BYROW(arr,STDEV),pad))
)

The number of iterations (rws) will be determined automatically, assuming there are no blank cells in your data series... XMATCH will return the row number of the first blank cell found in the lookup_array, when the lookup_value is omitted.

answered Jan 29 at 18:37

DjC

1,4122 silver badges10 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

P.b Jan 29 at 18:42

"XMATCH will return the row number of the first blank cell found in the lookup_array, when the lookup_value is omitted." very nice

P.b · Accepted Answer · 2025-01-29 18:26:51Z

1

Reduce is perfect for this: =REDUCE("",SEQUENCE(3,,2,20),LAMBDA(a,b,LET(c,INDEX('Series Data'!Q:Q,SEQUENCE(20,,b)),VSTACK(a,AVERAGE(c),STDEV(c),""))))

Changing 3 will change the number of iterations of the formula

edited Jan 29 at 18:26

answered Jan 29 at 16:05

P.b

12k2 gold badges15 silver badges34 bronze badges

Comments

Ron Rosenfeld · Accepted Answer · 2025-01-29 18:34:41Z

Put your data into a Table. In the formula, the table name is dataTbl and the column header is Data.

=LET(
    d, dataTbl[Data],
    a, WRAPCOLS(d, 20),
    avg, BYCOL(a, LAMBDA(arr, AVERAGE(arr))),
    stdev, BYCOL(a, LAMBDA(arr, STDEV.S(arr))),
    x, HSTACK(TOCOL(avg), TOCOL(stdev)),
    b, MAKEARRAY(
        COUNT(avg) * 4,
        1,
        LAMBDA(r, c, IFERROR(INDEX(x, INT((r - 1) / 4) + 1, MOD(r - 1, 4) + 1), ""))
    ),
    b
)

The 20 is the number of items in each summarization.
The 4 refers to the number of rows for each summarization
- Avg
- Std Dev
- blank
- blank

Enter the formula in row 2 or lower so as to have a blank row above.

Note that the STDEV function is being replaced by STDEV.S and other variations. You should read the documentation to ensure you are using the one you want.

Black cat · Accepted Answer · 2025-01-29 14:35:29Z

0

In this formula adjust the
mul value (in the example 4) to the required count of blocks of the result.

=LET(first,'Series Data'!Q$2:Q$21,
mul,4,
cella,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,AVERAGE(OFFSET(first,INDEX(x,1,1),0)))),
cellb,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,STDEV(OFFSET(first,INDEX(x,1,1),0)))),
DROP(TOCOL(HSTACK(TOCOL(TEXTSPLIT(REPT("|",mul),"|")),cella,cellb,TOCOL(TEXTSPLIT(REPT("|",mul),"|")))),-4))

Copy it to cell A1, the formula will spill the result.

answered Jan 29 at 14:35

Black cat

7,6358 gold badges69 silver badges99 bronze badges

Comments

user22566114 · Accepted Answer · 2025-01-30 16:26:29Z

0

Another alternative could be this formula. For range 'Series Data'!Q1:Q1000000 the result was returned in ~ 1 second in my sample sheet.

=LET(_rng,WRAPROWS('Series Data'!Q1:Q120,20),
_avrg,BYROW(_rng,LAMBDA(r,AVERAGE(r))),
_stdev,BYROW(_rng,LAMBDA(r,STDEV(r))), 
_res,IFNA(EXPAND(HSTACK(_avrg,_stdev),,4),""),
VSTACK("",TOCOL(_res)))

answered Jan 30 at 16:26

user22566114

5464 silver badges6 bronze badges

Comments

desertnaut · Accepted Answer · 2025-02-05 22:53:11Z

0

Thank you all for your answers,

I've used this:

=LET(first,'Series Data'!Q$2:Q$21,
mul,108,
cella,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,AVERAGE(OFFSET(first,INDEX(x,1,1),0)))),
cellb,BYROW(SEQUENCE(mul,1,0,20),LAMBDA(x,STDEV(OFFSET(first,INDEX(x,1,1),0)))),
DROP(TOCOL(HSTACK(TOCOL(TEXTSPLIT(REPT("|",mul),"|")),cella,cellb,TOCOL(TEXTSPLIT(REPT("|",mul),"|")))),-4))

Because 2161 is the maximum amount of data sets of 20, 108 is sufficient to harbor all possibilities. It gives the data how I requested, dropping down the avg, then stdev, then two blank rows.

edited Feb 5 at 22:53

desertnaut

60.8k32 gold badges155 silver badges183 bronze badges

answered Jan 30 at 7:16

Jan van oest

112 bronze badges

Collectives™ on Stack Overflow

Automating averages and stdevs for large dataset in set intervals of rows

6 Answers 6

1 Comment

Comments

Comments

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

6 Answers 6

1 Comment

Comments

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related