Towards Efficient Generative Large Language