An adaptive learning-rate optimiser that maintains an exponentially decaying moving average of the squared gradient and divides each parameter's update by the square root of that average, which makes it effective for training recurrent neural networks.
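
The scaling step can be illustrated with a minimal pure-Python sketch of an RMSProp-style update, which is the rule this description appears to match. The function name `rmsprop_step`, the hyperparameter values (`lr`, `decay`, `eps`) and the toy quadratic objective are assumptions made for illustration and do not come from any particular library.

```python
import math


def rmsprop_step(params, grads, avg_sq, lr=0.01, decay=0.9, eps=1e-8):
    """Apply one RMSProp-style update and return (new_params, new_avg_sq).

    avg_sq holds the running average of squared gradients, one per parameter.
    All default hyperparameters are illustrative assumptions.
    """
    new_params, new_avg_sq = [], []
    for p, g, v in zip(params, grads, avg_sq):
        # Exponentially decaying moving average of the squared gradient.
        v = decay * v + (1.0 - decay) * g * g
        # Divide the step by the root of that average, so parameters with
        # consistently large gradients take proportionally smaller steps.
        new_params.append(p - lr * g / (math.sqrt(v) + eps))
        new_avg_sq.append(v)
    return new_params, new_avg_sq


# Toy objective (assumed): f(x, y) = x^2 + 10 * y^2, whose gradient is
# ten times steeper along y than along x.
params = [1.0, 1.0]
avg_sq = [0.0, 0.0]
for _ in range(500):
    grads = [2.0 * params[0], 20.0 * params[1]]
    params, avg_sq = rmsprop_step(params, grads, avg_sq)
```

Because each step is normalised by the recent gradient magnitude, both coordinates move at a similar pace despite the tenfold difference in gradient scale. That insensitivity to gradient scale is one commonly cited reason the method suits recurrent networks, where gradient magnitudes can vary widely across time steps.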