Understand and prevent reward hacking. Learn why agents find unintended loopholes in poorly designed reward functions and how to build robust objective metrics.