signSGD: compressed optimisation for non-convex problems
Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli & Anima Anandkumar
[arxiv] [poster] [slides] [cite]
ICML '18 long talk
@InProceedings{bernstein_signum,
  title     = {sign{SGD}: Compressed Optimisation for Non-Convex Problems},
  author    = {Bernstein, Jeremy and Wang, Yu-Xiang and Azizzadenesheli, Kamyar and Anandkumar, Animashree},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  address   = {Stockholmsmässan, Stockholm Sweden},
  month     = {10--15 Jul},
  publisher = {PMLR},
}
We exploit the natural geometry of neural net error landscapes to develop an optimiser that converges as fast as SGD whilst providing cheap gradient communication for distributed training.
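The core idea can be sketched in a few lines: each parameter moves by a fixed step in the direction given by the sign of its gradient, so only one bit per coordinate needs to be communicated. This is a minimal single-worker sketch of the sign-based update (the distributed majority-vote variant and momentum are omitted); the function name and toy objective are illustrative, not from the paper.

```python
import numpy as np

def signsgd_step(w, grad, lr=0.01):
    """One sign-based update: step each coordinate by a fixed
    amount opposite the sign of its gradient component."""
    return w - lr * np.sign(grad)

# Toy example: minimise f(w) = ||w||^2, whose gradient is 2w.
w = np.array([3.0, -2.0, 0.5])
for _ in range(1000):
    w = signsgd_step(w, 2.0 * w, lr=0.01)
# Each coordinate decays towards 0, then oscillates within one step size.
```

In a distributed setting each worker would send only the sign bits of its gradient, and the server could aggregate them by majority vote before broadcasting the result back.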