From the course: Computer Vision for Data Scientists
Exponential moving average
- [Instructor] Getting trapped in a false local minimum sucks. When you're training a neural network, chances are you're using mini-batches. There's nothing wrong with that. It just happens to introduce noise, so the gradients that gradient descent uses to update your model parameters from batch to batch are less accurate. Exponential moving average is used in training deep neural networks to improve their stability and generalization. Exponential moving average, or EMA for short, is often applied to model weights. Instead of just using the raw weights that are directly learned during training, EMA maintains a running average of the model weights, which is updated at each training step. It combines the previous average and the current model weights, each of which is scaled by a weight set by some value alpha. And here's the formula for exponential moving average. The P here represents the model parameters. So P1 is the parameter at time step…
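As a rough sketch of the running average described above, here is one way the update could look in plain Python. It assumes one common convention, where the new average is alpha times the previous average plus (1 - alpha) times the current weights; the instructor's exact formula may place alpha differently. The function name `ema_update`, the value alpha = 0.9, and the toy weight values are all hypothetical and chosen only for illustration.

```python
def ema_update(ema_weights, weights, alpha=0.9):
    """Blend the previous average with the current weights.

    Assumed convention (not confirmed by the transcript):
        new_ema = alpha * previous_ema + (1 - alpha) * current_weights
    A larger alpha means the average changes more slowly.
    """
    return [alpha * e + (1.0 - alpha) * w for e, w in zip(ema_weights, weights)]


# Toy "weights" that bounce between two values, standing in for the noise
# that mini-batch updates introduce (made-up numbers, not real training data).
noisy_weights = [[1.0, 2.0], [3.0, 0.0], [1.0, 2.0], [3.0, 0.0]]

# Start the running average from the first set of weights.
ema = list(noisy_weights[0])

# Update the average once per training step.
for step_weights in noisy_weights[1:]:
    ema = ema_update(ema, step_weights, alpha=0.9)

print(ema)
```

In this toy run the raw weights jump by 2.0 on every step, while the averaged weights move only a fraction of that, which is the smoothing effect EMA is used for.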