Bibliography

Pol64

B. Polyak. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4:1–17, 1964.

Nes83

Y. Nesterov. A method of solving a convex programming problem with convergence rate O(1/k²). Soviet Mathematics Doklady, 27(2):372–376, 1983.

DHS11

J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, 2011.

HSS12

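G. Hinton, N. Srivastava, and K. Swersky. Neural networks for machine learning, lecture 6a: overview of mini-batch gradient descent. Lecture notes, 2012.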

KB15

D. Kingma and J. Ba. Adam: a method for stochastic optimization. In Int'l Conf. on Learning Representations (ICLR), 2015.

Yua08

Y. Yuan. Step-sizes for the gradient method. AMS/IP Studies in Advanced Mathematics, 42(2):785–796, 2008.

BB88

J. Barzilai and J. Borwein. Two-point step size gradient methods. IMA Journal of Numerical Analysis, 8(1):141–148, 1988. doi:10.1093/imanum/8.1.141.

LW19

T. Li and Z. Wan. New adaptive Barzilai-Borwein step size and its application in solving large-scale optimization problems. ANZIAM Journal, 61(1):76–98, 2019.

RS02

M. Raydan and B. Svaiter. Relaxed steepest descent and Cauchy-Barzilai-Borwein method. Computational Optimization and Applications, 21(2):155–167, 2002.

HCS06

G. Huang, L. Chen, and C. Siew. Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Trans. on Neural Networks, 17:879–892, 2006.