Q-Learning Solved Example