The following code does exactly what I want; however, the for loop is far too slow. On my machine, the wall time for the for loop is 1min 5s. I'm looking for an alternative to the for loop that is much faster.
# Imports
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq
# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15
# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]
# Generate weights
weights = []
for i in range(5723673):
    # How to handle data BEYOND decay_obs_window
    if i > decay_obs_window and target_decay == 0:
        # Record a weight of zero
        weights.append(0)
    elif i > decay_obs_window and target_decay > 0:
        # Record the final target weight
        weights.append(decayed_weight)
    # How to handle data WITHIN decay_obs_window
    else:
        # Calculate the new slightly decayed weight
        decayed_weight = 1 - (decay_rate * i)
        weights.append(decayed_weight)
weights[0:10]
I wrote this list comprehension with the hope of improving the execution time. While it works perfectly, it does not yield any appreciable runtime improvement over the for loop 😞:
weights = [0 if i > decay_obs_window and target_decay == 0 else decayed_weight if i > decay_obs_window and target_decay > 0 else (decayed_weight := 1 - (decay_rate * i)) for i in range(5723673)]
I'm interested in any approaches that would help speed this up. Thank you 🙏!
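One thing worth checking before vectorizing: `decay_rate` comes back from the solver as a sympy Float, so every `decay_rate * i` in the loop goes through sympy's arithmetic instead of plain float arithmetic, which is likely a large part of the runtime. Since the equation is linear in `r`, the rate can also be written in closed form. A minimal sketch, assuming the same values as above (the name `weight_at_window_end` is introduced here only for the check):

```python
initial_value = 1
decay_obs_window = 1480346
target_decay = .15

# initial_value - r * decay_obs_window = target_decay is linear in r, so
# r = (initial_value - target_decay) / decay_obs_window
decay_rate = (initial_value - target_decay) / decay_obs_window

# decay_rate is a native Python float here, so per-iteration
# arithmetic in a loop stays in plain floats
weight_at_window_end = initial_value - decay_rate * decay_obs_window
print(type(decay_rate).__name__)
print(abs(weight_at_window_end - target_decay) < 1e-12)
```

If the solver is kept, wrapping its result in `float(...)` before the loop should give the same effect.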
FINAL SOLUTION:
This was the final solution that I settled on. On my machine, the wall time to execute the entire thing is only 425 ms. It's a slightly modified version of Aaron's proposed solution.
import numpy as np
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq
# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15
# Instantiate weights array
weights = np.zeros(5723673)
# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]
# Convert the sympy Float to a native Python float, since numpy doesn't like sympy floats :(
decay_rate = float(decay_rate)
# How to weight observations WITHIN decay_obs_window
weights[:decay_obs_window + 1] = 1 - np.arange(decay_obs_window + 1) * decay_rate
# How to weight observations BEYOND decay_obs_window
weights[decay_obs_window + 1 : 5723673] = target_decay
weights
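To check that the vectorized version matches the original loop, the two can be compared on a small input. This is only a sketch under assumptions: `linear_decay_weights` and `loop_weights` are hypothetical helper names, the rate is computed in closed form rather than with sympy, and the sizes are shrunk so the comparison runs instantly.

```python
import numpy as np

def linear_decay_weights(n, window, target, initial=1.0):
    """Hypothetical helper: linear decay from `initial` to `target` over
    indices 0..window, then hold `target` for the remaining indices."""
    rate = (initial - target) / window
    weights = initial - rate * np.arange(n, dtype=float)
    # Indices beyond the window would keep decaying; pin them to target
    weights[window + 1:] = target
    return weights

def loop_weights(n, window, target, initial=1.0):
    """Reference implementation mirroring the original for loop."""
    rate = (initial - target) / window
    out = []
    for i in range(n):
        if i > window and target == 0:
            out.append(0)
        elif i > window and target > 0:
            out.append(decayed)
        else:
            decayed = initial - rate * i
            out.append(decayed)
    return out

small = linear_decay_weights(n=50, window=10, target=0.15)
reference = loop_weights(n=50, window=10, target=0.15)
print(np.allclose(small, reference))
```

The comparison uses `np.allclose` rather than exact equality because the loop carries forward the last computed weight, which can differ from `target` by floating-point rounding.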
for i in range(decay_obs_window): and for i in range(decay_obs_window, 5723673)
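Splitting the work into the two ranges above removes the branch from the loop body even without numpy. A rough sketch in plain Python, with assumptions stated: the sizes are reduced for a quick run, `n_obs` is a name introduced here, the rate is computed in closed form, and the first range runs to `decay_obs_window + 1` so that it matches the `i > decay_obs_window` test in the original loop.

```python
initial_value = 1
target_decay = .15
decay_obs_window = 1000   # reduced from 1480346 for a quick run
n_obs = 4000              # reduced from 5723673 for a quick run

decay_rate = (initial_value - target_decay) / decay_obs_window

# First range: indices 0..decay_obs_window inclusive, decaying linearly
weights = [1 - decay_rate * i for i in range(decay_obs_window + 1)]
# Second range: everything beyond the window holds the target weight
weights += [target_decay] * (n_obs - (decay_obs_window + 1))

print(len(weights))
```

The tail uses `target_decay` directly, which equals the last decayed weight up to floating-point rounding and is exactly 0 when `target_decay` is 0.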