A model performs well on training data but poorly on unseen data. What is this called?

Question

A. Bias
B. Underfitting
C. Overfitting
D. Data leakage

Accepted Answer

The correct answer is C, overfitting. Overfitting occurs when a model fits the training data too closely, learning its noise and random fluctuations along with the real signal. The result is high training accuracy but poor generalization to new, unseen data, which is exactly the scenario described.

Option A, bias, refers to systematic error caused by incorrect or overly simple assumptions in the learning algorithm. High bias typically leads to underfitting, so training performance would be weak rather than strong.

Option B, underfitting, is the opposite problem: the model is too simple to capture the underlying patterns, so it performs poorly on both training and test data.

Option D, data leakage, occurs when information from outside the training set (for example, test-set statistics or features derived from the target) is inadvertently used during model training. Leakage inflates evaluation metrics, so the model looks better during evaluation than it really is. It can also lead to poor results on truly new data, but the root cause is a flawed data pipeline, not a model that memorized its training set.
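
To make the train/test gap concrete, here is a minimal sketch in plain Python. It assumes a toy 1-D dataset with 20% label noise and uses a 1-nearest-neighbor classifier, which memorizes the training set by construction. The helper names (`make_data`, `predict_1nn`, `diagnose`) and the thresholds inside `diagnose` are illustrative assumptions, not standard values.

```python
import random


def make_data(n, rng, noise=0.2):
    """Points x in [0, 1); the true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise that a well-generalizing model should ignore
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the single closest training point (pure memorization)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def accuracy(train, data):
    """Fraction of points in `data` that the 1-NN model built on `train` labels correctly."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)


def diagnose(train_acc, test_acc, max_gap=0.1, min_train=0.7):
    """Hypothetical rule of thumb; both thresholds are illustrative assumptions."""
    if train_acc < min_train:
        return "underfitting"  # weak even on the data the model has seen
    if train_acc - test_acc > max_gap:
        return "overfitting"   # strong on training data, much weaker on unseen data
    return "ok"


rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

train_acc = accuracy(train, train)  # each training point is its own nearest neighbor
test_acc = accuracy(train, test)    # unseen points inherit memorized noisy labels
print(f"train={train_acc:.2f} test={test_acc:.2f} -> {diagnose(train_acc, test_acc)}")
```

Because every training point is its own nearest neighbor, training accuracy is perfect even though a fifth of the labels are noise, while accuracy on the fresh sample drops noticeably. An underfit model would instead score poorly on both sets, which is why comparing the two numbers separates options B and C.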