A guide to the most common metrics for evaluating Large Language Models, from statistical scores to model-based evaluation.