RMS Prop#
An adaptive gradient technique that divides the current gradient over a rolling window of the magnitudes of recent gradients. Unlike AdaGrad, RMS Prop does not suffer from an infinitely decaying step size.
Parameters#
| # | Name | Default | Type | Description | 
|---|---|---|---|---|
| 1 | rate | 0.001 | float | The learning rate that controls the global step size. | 
| 2 | decay | 0.1 | float | The decay rate of the rms property. | 
Example#
use Rubix\ML\NeuralNet\Optimizers\RMSProp;
$optimizer = new RMSProp(0.01, 0.1);
References#
- 
T. Tieleman et al. (2012). Lecture 6e rmsprop: Divide the gradient by a running average of its recent magnitude. ↩ 
  
    
      Last update: 2021-03-03