While training machine learning models, we often encounter the problem of overfitting: the model learns the mapping from the training features to the target values too closely. The training loss may be very low (it can even reach 0 when the model overfits), but the model performs poorly on the test data (the unseen data), because it has memorized the training examples rather than learning patterns that generalize.
Why does overfitting occur?
- Insufficient data: If the training set is too small, the model learns patterns specific to that small sample, and those patterns may not hold in the test data or at inference time.
- Model is too complex: If you use a more complex model than the problem requires, it will memorize the noise in the training set, noise that will not be present in the test data or at inference.
- Overtraining the model: If the validation loss starts rising again and you keep training, the model will overfit.
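The first two causes can be seen in a small experiment. This is an illustrative sketch (the data and the degree-9 choice are my own assumptions, not from any particular library recipe): a high-degree polynomial fitted to a handful of noisy samples drives the training error to essentially zero while doing much worse on unseen points.

```python
import numpy as np

# Hypothetical demo: fit a high-degree polynomial to a few noisy samples
# of y = sin(x) and compare the error on the training points vs. unseen points.
rng = np.random.default_rng(0)

x_train = np.linspace(0, 3, 10)
y_train = np.sin(x_train) + rng.normal(0, 0.1, x_train.shape)

x_test = np.linspace(0.1, 2.9, 50)
y_test = np.sin(x_test)

# Degree 9 gives 10 coefficients for 10 points, so the fit can pass
# through every training point, noise included.
coeffs = np.polyfit(x_train, y_train, deg=9)
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

print(f"train MSE: {train_mse:.2e}")  # essentially zero: the noise is memorized
print(f"test  MSE: {test_mse:.2e}")   # larger on unseen points
```

Lowering the degree (a simpler model) or adding more training points both shrink the gap between the two errors.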
How do you avoid/fix overfitting?
- Increase the training data size (use data augmentation techniques, GANs, ...)
- Make the model simpler (use a lower-degree polynomial, remove a few layers from the neural network, ...)
- Use early stopping (stop training when the validation loss starts increasing)
- Use dropout (in neural networks) and/or regularization (L1 or L2)
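The early-stopping idea from the list above can be sketched in a few lines. This is a minimal illustration, not any framework's API; the `EarlyStopping` class, its `patience` parameter, and the loss values are all made up for the example.

```python
# Minimal early-stopping sketch: stop training once the validation loss
# has failed to improve for `patience` consecutive epochs.
class EarlyStopping:
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")   # best validation loss seen so far
        self.bad_epochs = 0        # epochs since the last improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Simulated validation-loss curve: it improves, then starts rising
# again as the model begins to overfit.
losses = [1.0, 0.7, 0.5, 0.45, 0.46, 0.48, 0.52, 0.60]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch:", stopped_at)  # halts a few epochs after the minimum
```

Real frameworks offer the same behaviour as a callback; the point here is only that the loop watches the validation loss, not the training loss.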
When a model overfits, its output varies a lot even for very close values of the input features. Such a model is therefore said to have high variance.
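A tiny deterministic example of this jumpy behaviour (the data points here are made up): a 1-nearest-neighbour regressor memorizes its training set perfectly, so its prediction jumps wherever the nearest training point changes, even between two almost identical inputs.

```python
# Hypothetical training data with noisy-looking targets.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.0, 5.0, 1.0, 6.0]

def predict_1nn(x):
    # Return the target of the single closest training point:
    # zero training error, but sharp jumps between neighbouring regions.
    i = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[i]

# Two very close inputs straddling the midpoint between x=1 and x=2:
print(predict_1nn(1.49))  # 5.0 (nearest training point is x=1)
print(predict_1nn(1.51))  # 1.0 (nearest training point is x=2)
```

Averaging over more neighbours (larger k) smooths these jumps, which is exactly the "make the model simpler" remedy from the list above.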