What is the purpose of cross-validation in machine learning?
- To balance class distribution
- To increase model accuracy
- To estimate model performance and reduce variance in performance estimates ✓
- To reduce training time
Correct answer: To estimate model performance and reduce variance in performance estimates
Option C is correct because cross-validation, such as k-fold cross-validation, estimates how well a model generalizes to unseen data by repeatedly training and evaluating on different partitions of the dataset. Averaging the scores across partitions reduces the variance of the performance estimate compared to a single train-test split. Option A is incorrect because class imbalance is addressed by techniques such as oversampling, undersampling, or class-weight adjustment, not by cross-validation. Option B is incorrect because cross-validation is an evaluation technique, not a training optimization: it does not directly increase model accuracy, it only provides a more reliable measurement of it. Option D is incorrect because cross-validation increases total compute time rather than reducing it, since the model is trained once per fold instead of once overall.
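The k-fold procedure described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the data is synthetic, the "model" is a toy one-dimensional threshold classifier, and the helper names (k_fold_indices, fit_threshold, accuracy, cross_validate) are hypothetical, not taken from any library.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy model (assumption): the midpoint between the two class means."""
    mean0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, xs, ys):
    """Fraction of samples where 'x > threshold' agrees with the label."""
    correct = sum((x > threshold) == bool(y) for x, y in zip(xs, ys))
    return correct / len(xs)


def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, once per fold."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        threshold = fit_threshold([xs[j] for j in train_idx],
                                  [ys[j] for j in train_idx])
        scores.append(accuracy(threshold,
                               [xs[j] for j in test_idx],
                               [ys[j] for j in test_idx]))
    return scores


# Synthetic data (assumption): class 1 is centred two units above class 0.
rng = random.Random(42)
ys = [i % 2 for i in range(100)]
xs = [rng.gauss(2.0 * y, 1.0) for y in ys]

folds = k_fold_indices(len(xs), 5)
scores = cross_validate(xs, ys, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", statistics.mean(scores))
print("std across folds:", statistics.stdev(scores))
```

The per-fold scores generally differ from one another, which is the variability a single train-test split would expose you to; the mean across folds is the steadier estimate. The loop also fits the model k times, which is why cross-validation costs more compute than a single split.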
Topic: cross-validation, model evaluation, generalization, variance reduction