What is the main challenge posed by data imbalance in classification problems?

  1. Improves prediction accuracy
  2. Reduces model complexity
  3. Model may be biased toward the majority class and perform poorly on minority classes ✓
  4. Increases training speed

Correct answer: Model may be biased toward the majority class and perform poorly on minority classes

Option 3 is correct because when a dataset is severely imbalanced, a classifier can achieve high overall accuracy by predicting the majority class for nearly every sample. The result is very poor recall on the minority class, which makes the model practically useless for detecting the rare but often critical minority events.

Option 1 is wrong because class imbalance harms prediction quality on the minority class rather than improving it; the inflated overall accuracy is misleading and masks poor minority-class performance.

Option 2 is wrong because data imbalance does not reduce model complexity; it introduces a bias problem that often requires additional techniques such as oversampling, undersampling, or adjusted class weights.

Option 4 is wrong because data imbalance does not increase training speed; training speed depends mainly on the total number of samples and the model architecture, not on the distribution of labels.
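The effect described above can be seen with a small, self-contained sketch. The dataset (990 majority samples, 10 minority samples) and the helper names `accuracy`, `recall`, and `balanced_class_weights` are made up for illustration and are not part of the question. The weighting formula `n_samples / (n_classes * class_count)` is one common inverse-frequency heuristic, shown here only as an example of the "adjusted class weights" idea.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical imbalanced dataset: label 0 is the majority, label 1 the minority.
y_true = [0] * 990 + [1] * 10

# A "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

print("accuracy:", accuracy(y_true, y_pred))          # high despite learning nothing
print("minority recall:", recall(y_true, y_pred))     # no minority sample is found
print("class weights:", balanced_class_weights(y_true))
```

The always-majority predictor scores 99% accuracy while finding none of the minority samples, which is why recall, precision, or F1 on the minority class is usually a better guide than overall accuracy here. The computed weights give the minority class a much larger weight, which is the intuition behind class-weighted training.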

Topic: class imbalance, classification, model bias, machine learning
