Efficient Large-Scale Language Model Training