signSGD: compressed optimisation for non-convex problems
Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar
@article{bernstein_signum,
  author = {Jeremy Bernstein and Yu-Xiang Wang and Kamyar Azizzadenesheli and Anima Anandkumar},
  title  = {sign{SGD}: compressed optimisation for non-convex problems},
  note   = {arXiv:1802.04434},
  year   = {2018}
}
We exploit the natural geometry of neural net error landscapes to develop an optimiser that converges as fast as SGD whilst providing cheap gradient communication for distributed training.
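As a rough illustration of why communication becomes cheap, the idea can be sketched as a sign-based update: each worker would send only the sign of every gradient component (one bit per component), and the signs could then be combined by an elementwise majority vote. This is a minimal NumPy sketch under those assumptions, not the authors' implementation; the function names `sign_sgd_step` and `majority_vote` and the default learning rate are hypothetical.

```python
import numpy as np


def sign_sgd_step(params, grad, lr=1e-3):
    """One sign-based update: move each coordinate by lr against the sign of its gradient."""
    # np.sign returns -1, 0, or +1 per element, so gradient magnitudes are discarded.
    return params - lr * np.sign(grad)


def majority_vote(worker_grads):
    """Combine per-worker gradient signs by elementwise majority vote (illustrative aggregation)."""
    # In a distributed setting each worker would transmit only these signs.
    signs = np.sign(np.stack(worker_grads))
    # The sign of the summed votes is the majority sign; exact ties give 0.
    return np.sign(signs.sum(axis=0))
```

For example, three workers might each compute a local gradient, the aggregate direction would be `majority_vote([g1, g2, g3])`, and the parameters would be updated with `sign_sgd_step(params, aggregate, lr)`.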