Outlier Detection in Machine Learning. Techniques, Methods, and How to Handle Overfitting and Underfitting

5 min read · Oct 25, 2024

Outlier detection plays a crucial role in machine learning, especially in data preprocessing, anomaly detection, and improving model performance.

Outliers can degrade a model's predictive accuracy, contribute to overfitting or underfitting, and reduce its overall reliability.

This article explores the fundamentals of outlier detection, reviews popular techniques, and addresses methods to manage overfitting and underfitting in machine learning models.

What Are Outliers?

Outliers are data points that differ significantly from the majority of the dataset. They can arise from measurement errors, data entry mistakes, or genuine but rare events. Outliers can skew the training process and produce models that either fit the extreme points and fail to generalize (overfitting) or are too simplistic to capture the underlying patterns (underfitting).
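To make "significantly different from the majority" concrete, here is a minimal sketch that flags points with Tukey's 1.5 × IQR rule. The rule, the function name iqr_outliers, the factor k, and the sample values are all assumptions chosen for illustration; the article does not prescribe a specific method here.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is the conventional
    Tukey fence, not a value taken from the article.
    """
    # statistics.quantiles (Python 3.8+) returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample: seven readings near 10-13 and one extreme value
data = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(data))  # [95]
```

Because the fences are built from quartiles rather than the mean and standard deviation, the extreme value itself has little influence on where the cutoff lands.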

Types of Outliers

Global Outliers: Data points that are far removed from the overall distribution of the dataset.
Contextual Outliers: Points that are only considered outliers in a specific context, such as time-series…
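The difference between the two types listed above can be sketched with a small example. It assumes hypothetical temperature readings labeled by season and reuses the 1.5 × IQR rule as the detector; both the data and the choice of rule are illustrative assumptions, not the article's method. A winter reading of 28 is unremarkable against the whole dataset but stands out within its own context.

```python
import statistics
from collections import defaultdict


def iqr_outliers(values, k=1.5):
    """Values outside the Tukey fences (illustrative helper, k is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Hypothetical (season, temperature) readings, made up for this sketch
readings = [
    ("winter", 2), ("winter", 3), ("winter", 1),
    ("winter", 4), ("winter", 2), ("winter", 28),
    ("summer", 27), ("summer", 29), ("summer", 30),
    ("summer", 28), ("summer", 31), ("summer", 29),
]

# Global check: ignore the context and test every temperature together
global_flags = iqr_outliers([temp for _, temp in readings])

# Contextual check: apply the same rule within each season separately
by_season = defaultdict(list)
for season, temp in readings:
    by_season[season].append(temp)
contextual = {season: iqr_outliers(temps) for season, temps in by_season.items()}

print(global_flags)  # []
print(contextual)    # {'winter': [28], 'summer': []}
```

Grouping by an explicit label is the simplest way to encode context; for a real time series, a rolling window or a seasonal decomposition would play the same role.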

Written by Ismat Samadov