I'm looking for an efficient way to reduce the sum of a NumPy array a by a given number n, such that no value in a is below 0, and I can specify probabilities pvals for the different values in a. So if the signature for my function is:
> def removeRandom(a, n, pvals):
>     ...
Then it should do the following:
> a = np.array([2, 3, 5, 10])
> pvals = np.array([0.1, 0.1, 0.4, 0.4])
> removeRandom(a, 5, pvals)
array([2, 2, 3, 8])
Since the removal should be random, the output can look different next time:
> removeRandom(a, 5, pvals)
array([1, 3, 4, 7])
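To make the requirement concrete, here is a minimal sketch of a check that any valid result has to pass for n < np.sum(a). The helper name check_removal is hypothetical and only serves as an illustration; it is not part of any library.

```python
import numpy as np

def check_removal(before, after, n):
    """Return True if `after` is a valid result of removing n from `before`.

    Hypothetical helper, only meaningful for n < before.sum().
    """
    before = np.asarray(before)
    after = np.asarray(after)
    return bool(
        after.shape == before.shape
        and (after >= 0).all()               # no value may drop below 0
        and (after <= before).all()          # values may only shrink
        and after.sum() == before.sum() - n  # exactly n removed in total
    )

a = np.array([2, 3, 5, 10])
print(check_removal(a, np.array([2, 2, 3, 8]), 5))  # True
print(check_removal(a, np.array([1, 3, 4, 7]), 5))  # True
```

Both example outputs above satisfy it: they sum to 15, which is 20 - 5, and no entry is negative or larger than it was before.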
I currently have an approach that does the removal step, then checks whether any values in a have gone below 0, and if so, repeats the step until no value in a is below 0:
import numpy as np

def removeRandom(a, n, pvals=None):
    if n < np.sum(a):
        # work on a copy so the caller's array is left untouched
        a = a.copy()
        # remove a total of n at random indexes, taking the pvals into account
        aranged = np.arange(a.size)
        randomIndexes = np.random.choice(aranged, n, p=pvals)
        np.subtract.at(a, randomIndexes, 1)
        while a[a < 0].size > 0:
            # what's the sum of all cells below 0?
            sumBelowZero = np.abs(np.sum(a[a < 0]))
            # set them to 0
            a[a < 0] = 0
            # rinse and repeat the process, but only for the amount that was clipped
            randomIndexes = np.random.choice(aranged, sumBelowZero, p=pvals)
            np.subtract.at(a, randomIndexes, 1)
        return a
    else:
        return np.zeros_like(a)
That loop is obviously not very elegant, and the function can get stuck in it if every redraw keeps dropping at least one value below 0. The chance that this happens increases dramatically as n approaches np.sum(a).
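For illustration, one possible way to make such a retry loop always terminate is to redraw only among cells that are still above 0, with the probabilities renormalized over those cells. This is a rough sketch, not a tested solution: the name remove_random_retry and the rng parameter are made up for the example, and it assumes pvals is strictly positive for every cell that can still be reduced. Each round clips strictly less than it drew, so the outstanding amount shrinks until it reaches 0.

```python
import numpy as np

def remove_random_retry(a, n, pvals, rng=None):
    # Hypothetical variant of the retry approach; assumes pvals > 0
    # for every cell of `a` that is still above 0.
    rng = np.random.default_rng() if rng is None else rng
    a = np.array(a, dtype=np.int64)          # copy, so the input is untouched
    pvals = np.asarray(pvals, dtype=float)
    if n >= a.sum():
        return np.zeros_like(a)
    remaining = int(n)
    while remaining > 0:
        alive = np.flatnonzero(a > 0)        # cells that can still be reduced
        p = pvals[alive] / pvals[alive].sum()
        drawn = rng.choice(alive, size=remaining, p=p)
        np.subtract.at(a, drawn, 1)
        # whatever went below 0 has to be removed somewhere else
        remaining = int(-a[a < 0].sum())
        a[a < 0] = 0
    return a

a = np.array([2, 3, 5, 10])
pvals = np.array([0.1, 0.1, 0.4, 0.4])
print(remove_random_retry(a, 5, pvals))
```

Note that renormalizing changes the effective probabilities once a cell is exhausted, so this is only one of several reasonable interpretations of pvals.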
A very elegant solution to this question has been posted here, but it does not allow for the setting of probabilities:
def removeRandom(a, n):
    c = np.cumsum(np.r_[0, a])
    if n < c[-1]:
        r = np.random.choice(np.arange(c[-1]) + 1, n, replace=False)
        d = np.sum(r[:, None] <= c[None, :], axis=0)
        return np.diff(c - d)
    else:
        return np.zeros_like(a)
Since np.random.choice is also used here and accepts probabilities, I have looked for a way to utilize that (without success, obviously) – can this be done at all?
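To illustrate the kind of thing I have in mind, here is a rough sketch that expands a into individual units and samples n of them without replacement, passing per-unit weights to the p argument. The names remove_random_weighted and rng are hypothetical, and the weighting is an assumption: every unit simply inherits the pvals entry of its cell, so a cell's initial chance is proportional to pvals[i] * a[i] rather than pvals[i] alone. It also needs memory proportional to np.sum(a) and assumes that at least n units have a non-zero weight.

```python
import numpy as np

def remove_random_weighted(a, n, pvals, rng=None):
    # Hypothetical sketch: sample n single units without replacement,
    # where each unit carries the weight of the cell it belongs to.
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    if n >= a.sum():
        return np.zeros_like(a)
    # one entry per removable unit, labelled with the index of its cell
    owners = np.repeat(np.arange(a.size), a)
    # assumption: a unit's weight is the pvals entry of its cell
    unit_p = np.asarray(pvals, dtype=float)[owners]
    unit_p /= unit_p.sum()
    picked = rng.choice(owners, size=n, replace=False, p=unit_p)
    # count how many units were taken from each cell and subtract them
    return a - np.bincount(picked, minlength=a.size)

a = np.array([2, 3, 5, 10])
pvals = np.array([0.1, 0.1, 0.4, 0.4])
print(remove_random_weighted(a, 5, pvals))
```

Because a cell never contributes more units than it holds, no value can go below 0 and no retry loop is needed. Using pvals[i] / a[i] as the unit weight instead would make the first draw follow pvals exactly, which may be closer to what I actually want.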
I would also appreciate any other ideas to solve this, of course.