The Proximal Policy Optimization