Understand the math behind Scaled Dot-Product Attention, the fundamental building block of the Transformer architecture.
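The formula at the heart of it is Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. As a minimal NumPy sketch (the function name and toy shapes here are illustrative, not from any particular library):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D arrays of row vectors."""
    d_k = Q.shape[-1]
    # Dot-product similarity of every query with every key,
    # scaled by sqrt(d_k) to keep the logits' variance in check.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a convex combination of the value rows.
    return weights @ V

# Toy example: 3 positions, head dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per query
```

The sqrt(d_k) scaling is the "scaled" part: without it, dot products grow with the head dimension and push the softmax into regions with vanishing gradients.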