Which metric or combination of metrics is most appropriate for evaluating a multi-class classification model?
- Accuracy, Precision, Recall, and F1-score ✓
- Only accuracy
- Only precision
- Only recall
Correct answer: Accuracy, Precision, Recall, and F1-score
Option A is correct because no single number describes how a multi-class classifier behaves on every class, so Accuracy, Precision, Recall, and F1-score are reported together. F1-score is the harmonic mean of precision and recall, and each of the three can be computed per class or summarized as a macro or weighted average. Option B is insufficient because accuracy alone can be misleading when classes are imbalanced: a model that mostly predicts the majority class can score high while failing on minority classes. Option C is incorrect because precision alone ignores false negatives, so it cannot show whether the model is missing true instances of a class. Option D is incorrect because recall alone ignores false positives, so it cannot show how often the model's predictions for a class are wrong.
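To make the explanation above concrete, here is a minimal sketch in plain Python that computes accuracy next to per-class precision, recall, and F1. The labels, the class counts, and the helper names (`per_class_metrics`, `accuracy`, `macro_f1`) are hypothetical and chosen only for illustration. Treating an undefined ratio (zero denominator) as 0.0 is an assumption of this sketch, not a universal convention.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: {"precision", "recall", "f1"}} using one-vs-rest counts."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed
    metrics = {}
    for c in labels:
        # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
        precision = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(metrics):
    """Unweighted mean of the per-class F1 scores."""
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Hypothetical imbalanced data: 90 samples of "a", 5 of "b", 5 of "c".
# The hypothetical model always predicts the majority class "a".
y_true = ["a"] * 90 + ["b"] * 5 + ["c"] * 5
y_pred = ["a"] * 100

acc = accuracy(y_true, y_pred)
metrics = per_class_metrics(y_true, y_pred)
macro = macro_f1(metrics)

print(f"accuracy = {acc:.3f}")
for label, m in metrics.items():
    print(label, {k: round(v, 3) for k, v in m.items()})
print(f"macro F1 = {macro:.3f}")
```

In this toy case accuracy is 0.9 while recall for classes "b" and "c" is 0, which pulls macro F1 down to roughly 0.32. That gap is the reason the metrics are read together rather than relying on accuracy alone.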
Topic: model evaluation, classification metrics, f1-score, machine learning