Imbalanced Dataset — Google Cloud ML Engineer Practice Questions
An imbalanced dataset is one where the distribution of target classes is highly skewed, so a naive model can reach misleadingly high accuracy simply by always predicting the majority class. For example, if 99% of the examples are negative, a model that always predicts negative scores 99% accuracy while detecting no positives at all. The Google Cloud ML Engineer exam covers strategies for addressing imbalance, including oversampling minority classes, undersampling majority classes, generating synthetic samples with techniques like SMOTE, and adjusting class weights in Vertex AI AutoML and custom training jobs. Choosing the right evaluation metric is also a key exam topic when imbalance is present, such as preferring the area under the precision-recall curve (AUC-PR) to the area under the ROC curve (AUC-ROC).
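The ideas above can be illustrated with a minimal, dependency-free Python sketch. It is not taken from the exam material or from any Vertex AI API: the function names and the 990/10 toy label split are assumptions chosen for illustration. The weighting formula, n_samples / (n_classes * class_count), is the common inverse-frequency heuristic (the same one scikit-learn uses for `class_weight="balanced"`), and the oversampler is plain random duplication, not SMOTE.

```python
import random
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a model that always predicts the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 990 negatives and 10 positives.
labels = [0] * 990 + [1] * 10
rows = list(range(len(labels)))

baseline = majority_baseline_accuracy(labels)
weights = balanced_class_weights(labels)
new_rows, new_labels = random_oversample(rows, labels)

print(f"majority-class baseline accuracy: {baseline:.2f}")
print(f"class weights: {weights}")
print(f"class counts after oversampling: {Counter(new_labels)}")
```

The baseline line shows why accuracy alone is misleading here: the always-negative model looks strong while its recall on the positive class is zero. In practice, any resampling should be applied to the training split only, so that evaluation still reflects the real class distribution.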
Free questions on imbalanced datasets
Which metric is most appropriate for evaluating a classification model on an imbalanced dataset?
Free question · medium · full answer + explanation