1

I have many arrays of different lengths, and I want to bring them all to a fixed length, say 100 samples. The arrays contain time series, and I do not want to lose the shape of those series while reducing the size of the array. I think what I need here is a downsampling (undersampling) algorithm. Is there an easy way to reduce the number of samples in an array, for example by averaging groups of neighboring values?

Thanks

3 Answers

2

Here's a little script to do it without numpy. It picks evenly spaced elements rather than averaging, so it maintains the shape even if the required length is larger than the length of the array (elements are then repeated).

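from math import floor

def sample(input, count):
    output = []
    # Step between picked elements; it is below 1 when count > len(input),
    # in which case some elements are picked more than once.
    sample_size = float(len(input)) / count
    for i in range(count):
        # Take the element at the start of the i-th evenly spaced interval.
        output.append(input[int(floor(i * sample_size))])
    return output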
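
Since the question asks about averaging, here is a minimal sketch of a variant that averages each evenly spaced interval instead of picking its first element. The name `sample_mean` and the choice of integer bin boundaries are my own assumptions, not part of the answer above; when `count` is larger than the input length, each bin falls back to a single element, so values are repeated.

```python
def sample_mean(values, count):
    """Reduce (or stretch) values to count points by averaging evenly spaced bins.

    Hypothetical helper, written as a sketch: bin i covers the indices
    from i*n//count up to (but not including) (i+1)*n//count.
    """
    n = len(values)
    if n == 0 or count <= 0:
        raise ValueError("values must be non-empty and count must be positive")
    output = []
    for i in range(count):
        # Integer arithmetic avoids floating-point rounding at bin edges.
        start = i * n // count
        end = (i + 1) * n // count
        if end <= start:
            # Happens when count > n: use a single element for this bin.
            end = start + 1
        chunk = values[start:end]
        output.append(sum(chunk) / float(len(chunk)))
    return output


print(sample_mean([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 5))  # [0.5, 2.5, 4.5, 6.5, 8.5]
print(sample_mean([1, 2, 3], 6))                       # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

Averaging smooths out short spikes, which may or may not be what you want for a given series.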

2

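You can also index the array with randomly generated indices, keeping your original array (or only its shape, to reduce memory usage). The indices are drawn with replacement, so they are sorted here to keep the samples in their original time order:

import numpy as np

input_data = somearray  # placeholder: your own NumPy array
shape = input_data.shape
n_samples = 100
# Random indices along the first axis, sorted to preserve time order.
inds = np.sort(np.random.randint(0, shape[0], size=n_samples))
sub_samples = input_data[inds]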
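
Random indices are drawn independently, so some parts of the series may be covered densely and others not at all. A deterministic alternative, sketched here under the assumption that the data is a 1-D series, is to interpolate linearly onto evenly spaced positions with `np.interp`. The function name `resample_linear` is hypothetical.

```python
import numpy as np


def resample_linear(data, n_samples=100):
    """Resample a 1-D series to n_samples points by linear interpolation.

    Hypothetical helper, written as a sketch for 1-D input only.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim != 1 or data.size == 0:
        raise ValueError("expected a non-empty 1-D array")
    # Original sample positions: 0, 1, ..., n-1.
    old_x = np.arange(data.size)
    # Target positions, evenly spaced over the same range (endpoints included).
    new_x = np.linspace(0, data.size - 1, n_samples)
    return np.interp(new_x, old_x, data)


series = np.arange(250.0)
print(resample_linear(series).shape)  # (100,)
```

This works for both shortening and lengthening a series, but it interpolates between neighbors rather than averaging them, so very short spikes can still be missed when shrinking by a large factor.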


1

Here's a shorter version of Nick Fellingham's answer.

from math import floor

def sample(input, count):
    ss = float(len(input)) / count
    return [input[int(floor(i * ss))] for i in range(count)]
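
A quick usage check, as a sketch: the function is repeated so the snippet runs on its own, with the parameter renamed to `values` (my choice, to avoid shadowing the built-in `input`). It shows one shrinking case and one stretching case.

```python
from math import floor


def sample(values, count):
    # Same logic as the list comprehension above, with a renamed parameter.
    ss = float(len(values)) / count
    return [values[int(floor(i * ss))] for i in range(count)]


print(sample(list(range(10)), 5))  # [0, 2, 4, 6, 8]
print(sample([0, 1, 2], 6))        # [0, 0, 1, 1, 2, 2]
```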
