Multi Policy Reinforcement Learning In Machine