Actor-Critic Reinforcement Learning