How Do Large Language Models Learn