Data Preprocessing for LLMs