A deep dive into the self-attention mechanism. Learn how queries, keys, and values let a model weigh how relevant each word in an input sequence is to every other word.
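As a rough illustration of how queries, keys, and values produce those weights, here is a minimal pure-Python sketch of scaled dot-product self-attention. The toy two-dimensional embeddings and the names `softmax` and `self_attention` are assumptions made for this example. Real models typically derive the queries, keys, and values from learned linear projections of the input, which this sketch skips by reusing the embeddings directly.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy embeddings for a three-word sequence (assumption).
tokens = ["the", "cat", "sat"]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Assumption: use the embeddings as Q, K, and V without learned projections.
outputs, weights = self_attention(x, x, x)

for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

Each printed row shows how strongly one word attends to every word in the sequence, and each row sums to 1 because of the softmax.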