Ensure adequate exploration in Monte Carlo control. Learn how exploring starts and epsilon-greedy policies keep an agent from getting stuck on a suboptimal policy.
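
As a rough illustration of the two ideas named here, the following is a minimal sketch, not a reference implementation. It assumes a small tabular setting where action values are stored in a plain list. The function names (`epsilon_greedy_probs`, `epsilon_greedy_action`, `exploring_start`) and their parameters are hypothetical, chosen only for this example.

```python
import random


def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities under an epsilon-greedy policy.

    Every action gets epsilon / n; the remaining 1 - epsilon is split
    evenly among the actions tied for the highest estimated value.
    """
    n = len(q_values)
    best = max(q_values)
    greedy = [a for a, q in enumerate(q_values) if q == best]
    probs = [epsilon / n] * n
    for a in greedy:
        probs[a] += (1.0 - epsilon) / len(greedy)
    return probs


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Sample one action index: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        # Explore: any action, chosen uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: a highest-valued action, with ties broken at random.
    best = max(q_values)
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def exploring_start(states, actions, rng=random):
    """Exploring starts: begin an episode from a random state-action pair,
    so that every pair has a nonzero chance of being the starting point."""
    return rng.choice(states), rng.choice(actions)
```

With `epsilon > 0`, every action keeps a nonzero probability in every state, which is the property that lets Monte Carlo control keep collecting returns for actions that currently look worse. Exploring starts reaches the same goal differently, by randomizing where each episode begins rather than how the agent acts afterwards.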