Bias-Variance Trade-off

In machine learning, evaluating the quality of predictions is critical, and many measures have been devised to evaluate the performance of models.

All these performance measures express the difference between the actual value and the predicted value. This error is the sum of three components:

  1. Bias Error
  2. Variance Error
  3. Irreducible Error

Irreducible error can't be reduced because it comes from noise in the data or from the absence of some variable that is entirely unknown to the model. In this section, we will focus on bias and variance errors.
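For squared-error loss, this decomposition has a standard form, where f is the true function, f̂ is the model fitted on a random training set, and σ² is the noise variance:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\operatorname{Var}\big(\hat{f}(x)\big)}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```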


Bias


Bias means having an inclination or assumption about something, and in machine learning it means much the same: a biased model makes strong assumptions about the data and uses few, simple variables. It tries to keep itself simple and straightforward.

High Bias Model - 

This is a model that has few, less complex variables. These are generally simple models, like linear and logistic regression.

Low Bias Model -

This is a model that has more, and more complex, variables. Tree-based and SVC algorithms generally create this type of model. Models with low bias have better prediction capability but tend to overfit.
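To make this concrete, here is a minimal sketch using scikit-learn (the dataset is made up for demonstration, and a linear model and a decision tree stand in for the two model families described above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)  # noisy non-linear target

# High-bias model: a straight line makes a strong assumption about the
# data's shape and underfits the sine curve.
linear = LinearRegression().fit(X, y)

# Low-bias model: an unconstrained tree can fit the training data almost
# perfectly, noise included -- the overfitting tendency mentioned above.
tree = DecisionTreeRegressor(random_state=0).fit(X, y)

print("linear R^2 on training data:", round(linear.score(X, y), 3))
print("tree   R^2 on training data:", round(tree.score(X, y), 3))
```

The tree's near-perfect training score is not good news: it has memorized the noise, which is exactly the overfitting risk that comes with low bias.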


Variance


Variance measures how much the model changes when the training data is changed; in other words, how close the model's predictions are to each other when it is trained on different datasets.

High Variance Model -

This type of model has more, and more complex, variables. Non-parametric models like decision trees have high variance: when you change the data, the model also changes.

Low Variance Model -

Simple models have low variance. Parametric models like linear/logistic regression have low variance: they are, to some extent, insensitive to the particular training data and don't change with it much.
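One way to see variance directly (again a sketch with invented data) is to train the same model class on several freshly drawn datasets and compare its predictions at a fixed point:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)

def spread_at_zero(make_model, n_runs=50):
    """Train a fresh model on each newly drawn dataset; predict at x = 0."""
    preds = []
    for _ in range(n_runs):
        X = rng.uniform(-3, 3, size=(100, 1))
        y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)
        preds.append(make_model().fit(X, y).predict([[0.0]])[0])
    return np.std(preds)

# A wider spread of predictions across retrainings means higher variance.
print("linear model spread:", round(spread_at_zero(LinearRegression), 3))
print("decision tree spread:", round(spread_at_zero(DecisionTreeRegressor), 3))
```

The tree's predictions should scatter much more across retrainings than the linear model's, which is the high-variance behavior described above.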


By now, you must have realized that bias and variance are inversely related to each other.

  • When model complexity is increased > Bias error decreases & Variance error increases
  • When model complexity is decreased > Bias error increases & Variance error decreases

The best-case scenario is to have both bias and variance errors as low as possible so that the total error is minimized.
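The sketch below shows this trade-off in action (the data is made up, and tree depth stands in for model complexity): as depth grows, training error keeps falling, but test error falls and then rises again.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=400)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

# Deeper trees = more complexity: training error keeps falling (bias
# shrinks) while test error eventually rises again (variance grows).
for depth in [1, 2, 4, 8, 16]:
    model = DecisionTreeRegressor(max_depth=depth, random_state=7).fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"depth={depth:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

The depth at which test error bottoms out is the sweet spot where bias and variance are balanced.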


Let's start with an example to make the concept easy and entertaining. Let's say there are two models which have to predict the correct animal from the image below. These two models have been trained on slightly different data. The image is actually of an orangutan (one type of ape); other species of apes are the chimpanzee and the gorilla.

[Image: photo of an orangutan]

Now, consider four scenarios representing the prediction results of these two models trained on different data.

[Image: 2×2 grid showing the four bias-variance scenarios]

Let's analyze each scenario by moving anti-clockwise.

1. Low Bias - Low Variance

This is the sweet spot, where models trained on different data give results which are pretty correct (high accuracy) and pretty close to each other (results don't vary much with different data). In the above example, both models predicted an orangutan, and the predictions are close to each other (two types of orangutan: Bornean and Sumatran).

2. Low Bias - High Variance

Models trained on different datasets give results which are pretty correct but different from each other. In the above example, one model predicted a chimpanzee and the other a gorilla, which are still close to an orangutan but different from each other.

3. High Bias - High Variance

Models trained on different datasets predict wrongly, and their predictions also differ from each other. In the above example, one model predicted a tree and the other a building, which are neither correct nor close to each other.

4. High Bias - Low Variance

Models trained on different datasets predict wrongly but give close results. In the above example, one model predicted a polar white Mercedes and the other a black Mercedes; both predictions are wrong but pretty close to each other.

As you saw in the above example, our target should always be to reduce both bias and variance.
