Handling Imbalanced Data: SMOTE, Class Weights, and Better Metrics

Many real datasets are imbalanced: think 99% “normal” transactions and 1% fraud. On such data, plain accuracy is misleading, because a model that predicts “normal” every time still scores 99%. Here’s how to build models that work when classes aren’t equal.

Why imbalance breaks Machine Learning

Problems:

  • Models can ignore the rare class and still reach 99% accuracy.
  • A default 0.5 decision threshold favors the majority class.
  • Aggregate metrics such as accuracy hide poor minority-class performance.
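
To make the first problem concrete, here is a minimal sketch in plain Python. The 990/10 label split and the always-"normal" predictor are assumptions chosen to match the 99%/1% example; no real model or dataset is involved.

```python
# Toy labels: 990 normal (0) and 10 fraud (1), an assumed 99/1 split.
y_true = [0] * 990 + [1] * 10

# A "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the fraud class: caught frauds / all frauds.
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(y_true)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")  # accuracy=0.99 recall=0.00
```

The accuracy looks excellent while not a single fraud case is caught, which is why recall on the minority class has to be reported alongside it.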

Three families of fixes: resampling, cost‑sensitive learning, and better metrics.

Method 1: Resampling strategies

Undersampling: Remove majority samples → balanced classes, but less training data.
Oversampling: Duplicate minority samples → balanced classes, but a risk of overfitting to the repeated points.
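
Both strategies can be sketched with the standard library alone. The toy data (95 majority rows, 5 minority rows, one Gaussian feature each) is hypothetical, and the variable names are illustrative only.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical toy data: (feature, label) pairs, 95 majority vs 5 minority.
majority = [(random.gauss(0.0, 1.0), 0) for _ in range(95)]
minority = [(random.gauss(3.0, 1.0), 1) for _ in range(5)]

# Undersampling: keep a random majority subset the same size as the minority.
under = random.sample(majority, k=len(minority)) + minority

# Oversampling: draw minority rows with replacement until the classes match.
over = majority + random.choices(minority, k=len(majority))

print(len(under), len(over))  # 10 190
```

The undersampled set throws away 90 of the 95 majority rows, and the oversampled set contains at most 5 distinct minority rows repeated many times. Resample only the training split, never the validation or test data.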

SMOTE (Synthetic Minority Over-sampling Technique):

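  • For each minority sample, find its k nearest minority-class neighbors.
  • Generate synthetic samples at random points along the line segments joining the sample to those neighbors.
  • Preserves local structure better than plain duplication.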
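
The interpolation idea can be sketched in a few lines. This is a simplified illustration, not the imbalanced-learn implementation: the function name `smote_sketch`, the choice of k=3, and the five 2-D points are all assumptions made for the example.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # k nearest minority neighbors of the base point (excluding itself).
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Random point on the segment from base to the chosen neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Hypothetical minority-class points in two dimensions.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 2.0), (1.8, 1.6)]
new_points = smote_sketch(minority, n_new=10)
print(len(new_points))  # 10
```

Every synthetic point lies between two real minority points, so the new samples fill in the minority region instead of stacking copies on the same coordinates. In practice a library implementation is the safer choice.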

Method 2: Algorithm tweaks

Class weights: Penalize errors on the minority class more heavily.

sklearn: class_weight='balanced'

XGBoost: scale_pos_weight = (number of negatives) / (number of positives)
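
To see what these two settings amount to numerically, the sketch below computes them by hand. It assumes a toy label vector with 980 negatives and 20 positives, and it reproduces the formula scikit-learn documents for the balanced mode, n_samples / (n_classes * count_per_class), without calling either library.

```python
from collections import Counter

# Assumed toy labels: 980 negatives (0) and 20 positives (1).
y = [0] * 980 + [1] * 20
counts = Counter(y)
n_samples, n_classes = len(y), len(counts)

# "balanced" heuristic: weight_c = n_samples / (n_classes * count_c).
class_weight = {c: n_samples / (n_classes * n) for c, n in counts.items()}

# Ratio commonly used for XGBoost's scale_pos_weight: negatives / positives.
scale_pos_weight = counts[0] / counts[1]

print(class_weight[1])    # 25.0
print(scale_pos_weight)   # 49.0
```

Here a mistake on a fraud row weighs 25.0 against roughly 0.51 for a normal row. These values would then be passed to the estimator, for example as `class_weight` in a scikit-learn classifier or `scale_pos_weight` in an XGBoost classifier.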

Ensemble: Train each base model on a different undersampled (balanced) subset, then combine them by bagging or boosting.
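
One way to read the ensemble idea is that every member sees all minority rows plus its own random slice of the majority. The helper `balanced_subsets` and the placeholder rows below are hypothetical; fitting and combining the models is left out.

```python
import random

def balanced_subsets(majority, minority, n_models, seed=0):
    """Yield one balanced training set per ensemble member."""
    rng = random.Random(seed)
    for _ in range(n_models):
        # Each member gets all minority rows plus a fresh majority sample.
        yield rng.sample(majority, k=len(minority)) + list(minority)

# Placeholder rows: 100 majority ("neg") and 5 minority ("pos").
majority = [("neg", i) for i in range(100)]
minority = [("pos", i) for i in range(5)]

subsets = list(balanced_subsets(majority, minority, n_models=4))
print([len(s) for s in subsets])  # [10, 10, 10, 10]
```

Because each subset draws different majority rows, the ensemble as a whole uses far more of the majority data than a single undersampled model would.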

Method 3: Threshold Tuning + Metrics

Key metrics:

  • Precision/Recall trade‑off (the PR curve is usually more informative than ROC under heavy imbalance).
  • F1 score: Harmonic mean of precision and recall; it is low whenever either one is low.
  • AUC‑PR: Area under the precision‑recall curve.
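
For reference, precision, recall, and F1 can be computed directly from the confusion counts. The helper name `precision_recall_f1` and the ten toy labels are assumptions for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean: dragged toward whichever of the two is smaller.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy example: 4 positives, 6 negatives; 3 hits, 1 miss, 2 false alarms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.60 recall=0.75 f1=0.67
```

None of the three numbers uses the true negatives, which is why they stay informative when negatives dominate the dataset.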

Tune the decision threshold on a validation set against the business cost of false positives (FP) vs. false negatives (FN).
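
A cost-based threshold search might look like the sketch below. The function `best_threshold`, the ten validation scores, and the $100/$10 costs are illustrative assumptions; the scores stand in for a model's predicted fraud probabilities.

```python
def best_threshold(y_true, scores, cost_fn, cost_fp):
    """Return the (threshold, cost) pair with the lowest total cost."""
    best = None
    for threshold in sorted(set(scores)):
        # Predict fraud when score >= threshold.
        fn = sum(t == 1 and s < threshold for t, s in zip(y_true, scores))
        fp = sum(t == 0 and s >= threshold for t, s in zip(y_true, scores))
        cost = fn * cost_fn + fp * cost_fp
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best

# Hypothetical validation labels and predicted fraud probabilities.
y_val = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

threshold, cost = best_threshold(y_val, scores, cost_fn=100, cost_fp=10)
print(threshold, cost)  # 0.4 10
```

On this toy data a 0.5 cutoff would miss the fraud scored 0.40 and still flag the normal row scored 0.60, for a cost of 110. Lowering the threshold to 0.40 catches every fraud at the price of one false alarm.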

Example: Fraud Detection

Dataset: 98% normal, 2% fraud.

Baseline: Predict all normal → Accuracy 98%, Recall 0%

Class weights → Recall 75%, Precision 60%

SMOTE + threshold → Recall 85%, Precision 55%

Pick based on cost, e.g. $100 per missed fraud (FN) vs. $10 per false alarm (FP).
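
The comparison can be worked through numerically. The sketch below assumes a batch of 10,000 transactions (so 200 frauds at 2%), takes the recall and precision figures from the example as given, and applies $100 per false negative and $10 per false positive. The helper `expected_cost` is hypothetical.

```python
def expected_cost(n_fraud, recall, precision, cost_fn=100, cost_fp=10):
    """Total cost of misses and false alarms implied by recall and precision."""
    tp = recall * n_fraud
    fn = n_fraud - tp
    # precision = tp / (tp + fp)  =>  fp = tp / precision - tp
    fp = tp / precision - tp if tp else 0.0
    return fn * cost_fn + fp * cost_fp

n_fraud = 200  # assumed: 2% of 10,000 transactions

costs = {
    # Baseline flags nothing, so precision is undefined; 1.0 is a placeholder.
    "baseline": expected_cost(n_fraud, recall=0.0, precision=1.0),
    "class weights": expected_cost(n_fraud, recall=0.75, precision=0.60),
    "SMOTE + threshold": expected_cost(n_fraud, recall=0.85, precision=0.55),
}
for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f}")
```

Under these assumptions the totals come to roughly $20,000, $6,000, and $4,400, so the higher-recall option wins despite its lower precision. With a different cost ratio the ranking could flip.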

Try this: Grab a fraud/credit dataset. Fit 3 models: baseline, class weights, SMOTE. Plot PR curves.
