Direct Preference Optimization Formula
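
The title names the formula without stating it, so here is a minimal sketch of the per-example DPO loss as it is commonly written: the negative log-sigmoid of a scaled difference between two log-probability ratios, one for the preferred response and one for the rejected response, each taken between the policy being trained and a frozen reference model. The function names, argument names, and the default `beta=0.1` are illustrative assumptions, not taken from this document. The inputs are assumed to be summed token log-probabilities of each full response.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    # log sigma(z) = -log(1 + exp(-z)); branch on sign to avoid overflow.
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed default; beta scales the implicit reward
) -> float:
    """Per-example DPO loss (hypothetical helper for illustration).

    loss = -log sigma( beta * [ (log pi(y_w|x) - log pi_ref(y_w|x))
                              - (log pi(y_l|x) - log pi_ref(y_l|x)) ] )
    where y_w is the preferred response and y_l the rejected one.
    """
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_logratio - rejected_logratio)
    return -log_sigmoid(margin)


if __name__ == "__main__":
    # When the policy matches the reference, the margin is 0 and the loss is log 2.
    print(dpo_loss(-1.0, -2.0, -1.0, -2.0))
```

When the policy assigns relatively more probability to the preferred response than the reference does, the margin becomes positive and the loss falls below log 2. In practice the loss is averaged over a batch of preference pairs.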