Training Diffusion With RL