Bernstein, Jeremy; Vahdat, Arash; Yue, Yisong; Liu, Ming-Yu (2020-06-12). "On the distance between two neural networks and the stability of learning". arXiv:2002.03432 [cs.LG].
Combettes, P. L.; Pesquet, J.-C. (2011). "Proximal splitting methods in signal processing". In Bauschke, H. H.; Burachik, R. S.; Combettes, P. L.; Elser, V.; Luke, D. R.; Wolkowicz, H. (eds.). Fixed-Point Algorithms for Inverse Problems in Science and Engineering. New York: Springer. pp. 185–212. arXiv:0912.3522. ISBN 978-1-4419-9568-1.
Barzilai, Jonathan; Borwein, Jonathan M. (1988). "Two-Point Step Size Gradient Methods". IMA Journal of Numerical Analysis. 8 (1): 141–148. doi:10.1093/imanum/8.1.141.
Wolfe, Philip (April 1969). "Convergence Conditions for Ascent Methods". SIAM Review. 11 (2): 226–235. doi:10.1137/1011036.