Learn the mathematics of Potential-Based Reward Shaping. Ensure your agent learns the optimal policy without falling into infinite positive-reward loops.
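As a rough illustration of why this works, assume the standard shaping term F(s, s') = gamma * Phi(s') - Phi(s). The discounted shaping reward along any path then telescopes to gamma^T * Phi(s_T) - Phi(s_0), so going around a loop cannot build up unbounded bonus reward. The sketch below checks this numerically. The state names, potential values, and helper functions are all hypothetical, chosen only for the example.

```python
# Minimal sketch: potential-based shaping on a hypothetical 3-state loop.
# The potentials and state names below are illustrative assumptions.

def shaping_reward(phi, s, s_next, gamma):
    """Shaping term F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi[s_next] - phi[s]

def discounted_shaping_return(phi, states, gamma):
    """Discounted sum of shaping rewards along a sequence of states."""
    total = 0.0
    for t in range(len(states) - 1):
        total += gamma ** t * shaping_reward(phi, states[t], states[t + 1], gamma)
    return total

phi = {"A": 0.0, "B": 5.0, "C": 2.0}  # hypothetical potential function
gamma = 0.9

# One trip around the loop, starting and ending in the high-potential state B.
loop = ["B", "C", "A", "B"]
loop_return = discounted_shaping_return(phi, loop, gamma)

# Closed form from telescoping: gamma^T * phi(s_T) - phi(s_0).
steps = len(loop) - 1
telescoped = gamma ** steps * phi[loop[-1]] - phi[loop[0]]

# Ten trips around the same loop: the shaping return stays bounded.
many_loops = ["B"] + ["C", "A", "B"] * 10
many_return = discounted_shaping_return(phi, many_loops, gamma)

print(loop_return, telescoped, many_return)
```

With gamma below 1, returning to the starting state yields a non-positive shaping return when the potential there is non-negative, so the agent gains nothing by circling. A naive bonus, such as a fixed reward each time B is entered, has no such cancellation.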