Optimization Methods For Large Scale Machine Learning