• Doctorate / Master Degree Program
  • Fellowship Program
  • Advanced Certificate Medical Program
  • PG Diploma
  • SAP Program
  • Digital Marketing
  • Data Science & AI
  • Salesforce Training
  • HR & Finance Training

Doctorate / Master Degree Program

Digital Marketing

Regularization (L1/L2) and Why Your Models Overfit

Machine​‍​‌‍​‍‌​‍​‌‍​‍‌ learning models can be excellent when tested on the data used for training, but quite often they fail to live up to expectations when real-world data are used. The reason for this gap is mostly overfitting: the model ends up learning noise and idiosyncrasies rather than general patterns. Regularization (L1/L2) may be a rather modest in appearance, but a very effective method for keeping models under control and for improving generalization.

What is overfitting in practice?

Overfitting is a situation where a model is so flexible relative to the number and the quality of data that it starts to memorize training examples instead of learning robust rules. Typical indicators of overfitting are: very low training error coupled with much higher validation/test error, predictions that are not stable, and the model being sensitive to small data changes.

As an instance, a complex model used for predicting customer churn might perfectly follow every tiny fluctuation of the past behavior but when the customer behavior changes even a bit it can fail terribly.

 

How regularization helps

Regularization imposes a penalty on large weights in the loss function, thus it encourages simpler models which do not depend heavily on any particular feature. The concept is: “If a slightly less perfect fit on the training data leads to a more robust model on new data, then this should be preferred.”

L2 regularization (Ridge) imposes a penalty that is proportional to the square of the weights. It usually moves weights toward zero in a smooth manner but hardly ever makes them exactly zero.

L1 regularization (Lasso) imposes a penalty that is proportional to the absolute value of the weights. Through that, some weights can be completely zeroed thus feature selection is implicitly done.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *

GTR Placement Ecosystem

    GTR Academy Logo


    Download Your Brochure







    https://youtu.be/_KW9ZKQYtNY?si=wrMtMBnFXZk5IJ3c

    https://youtu.be/IoG1WxAKXwg

    https://www.youtube.com/watch?v=l9XB4Gwt0H4

    https://www.youtube.com/watch?v=71Y_1M0NSoo

    https://www.youtube.com/watch?v=yjGQ1g9S-dU

    https://www.youtube.com/watch?v=Q_BixayJrHk

    https://www.youtube.com/watch?v=jqOVYf7ESh0