Speed Up Large Language Generation