Fine-Tuning Reinforcement Learning