Does Reinforcement Learning Use Labeled Data?