From the course: Computer Vision for Data Scientists

Exponential moving average

- [Instructor] Getting trapped in a local minimum sucks. When you're training a neural network, chances are you are using mini-batches. There's nothing wrong with that. It just happens to introduce noise, so the gradients that gradient descent uses to update your model parameters between batches are less accurate. Exponential moving average is used in training deep neural networks to improve their stability and generalization. Exponential moving average, or EMA for short, is often applied to model weights. Instead of just using the raw weights that are directly learned during training, EMA maintains a running average of the model weights, which is updated at each training step. It combines the previous average and the current model weights, each of which is scaled by a factor that depends on some value alpha. And here's the formula for exponential moving average. The P here represents the model parameters. So P1 is the parameter at time step…
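The running average described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: it assumes the common convention `ema = alpha * previous_ema + (1 - alpha) * current_weights`, and the names `ema_update`, `track_ema`, and `noisy_steps` are hypothetical. The excerpt does not say which term alpha multiplies, so treat that choice as an assumption.

```python
def ema_update(ema_weights, current_weights, alpha=0.9):
    """Blend the previous average with the current weights.

    Assumed convention: alpha scales the previous average and
    (1 - alpha) scales the current weights.
    """
    return [
        alpha * ema + (1.0 - alpha) * weight
        for ema, weight in zip(ema_weights, current_weights)
    ]


def track_ema(weight_history, alpha=0.9):
    """Run the EMA over a sequence of weight snapshots, one per training step."""
    # Start the average at the first snapshot (one common initialization).
    ema = list(weight_history[0])
    for weights in weight_history[1:]:
        ema = ema_update(ema, weights, alpha)
    return ema


# A single hypothetical weight that bounces between 1.0 and 3.0 because of
# noisy mini-batch updates.
noisy_steps = [[1.0], [3.0], [1.0], [3.0]]
smoothed = track_ema(noisy_steps, alpha=0.5)
print(smoothed)
```

With a larger alpha the average moves more slowly and smooths more of the batch-to-batch noise; with a smaller alpha it follows the raw weights more closely.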
