Which metric is most appropriate for evaluating a classification model on an imbalanced dataset?

Question

Accepted Answer

The correct answer is Option D, F1 Score. The F1 Score is the harmonic mean of precision and recall, so it is high only when both are high. That makes it well suited to imbalanced datasets, where a model can achieve high accuracy by always predicting the majority class while failing completely on the minority class.

Option A is incorrect because accuracy is misleading on imbalanced datasets: a classifier that predicts only the majority class can appear highly accurate while providing no useful predictions for the minority class.

Option B is incorrect because Mean Squared Error is a regression metric that measures the average squared prediction error, and it is not appropriate for evaluating predicted class labels.

Option C is incorrect because R-squared is also a regression metric. It measures the proportion of variance explained by the model and is not used to evaluate classification performance.
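To make the contrast between accuracy and F1 concrete, here is a minimal sketch in plain Python. The dataset (95 negatives, 5 positives), the two sets of predictions, and the helper names `accuracy` and `f1_score` are all made up for illustration. The sketch also assumes the common convention of reporting F1 as 0 when there are no true positives, false positives, or false negatives to count.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class.

    Uses the equivalent form F1 = 2*TP / (2*TP + FP + FN).
    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced dataset: 95 majority (0) and 5 minority (1) samples.
y_true = [0] * 95 + [1] * 5

# Classifier 1: always predicts the majority class.
majority_pred = [0] * 100

# Classifier 2: 6 false positives, 89 true negatives, 4 true positives, 1 false negative.
useful_pred = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

print(f"majority: accuracy={accuracy(y_true, majority_pred):.2f}, "
      f"F1={f1_score(y_true, majority_pred):.2f}")
# majority: accuracy=0.95, F1=0.00

print(f"useful:   accuracy={accuracy(y_true, useful_pred):.2f}, "
      f"F1={f1_score(y_true, useful_pred):.2f}")
# useful:   accuracy=0.93, F1=0.53
```

In this toy case accuracy ranks the majority-class predictor higher, even though it never finds a single minority sample, while F1 ranks the second classifier higher.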