Evaluation Metrics - Linear Regression

A conceptual understanding of Linear Regression Analysis is a prerequisite for this article. Thus, we recommend you to visit this link if you need more clarity. Let’s begin!

Statistical Concepts — Evaluation metrics – Linear Regression

Evaluation Metrics of the Classical Linear Regression Model:

Before understanding the metrics to assess model performance, let us understand why do we even need them?

A performance metric is an important part of regression because it checks the reliability of a model. It gives you a value that can be used to describe the error associated with the predictions made by a model.

In this article, we will be discussing 3 metrics for linear regression:

Mean Absolute Error, MAE
Mean Squared Error and Root Mean Squared Error, MSE and RMSE
Coefficient of Determination, the concept of R² and adjusted R²

Mean Absolute Error:

Have you heard about mean deviation? The concept of MAE is analogous to the mean deviation. In MAE, we take the average of absolute deviations of the predicted values from the actual values.

Thus, MAE gives you a measure of how much the predictions made by your model deviate on an average in absolute terms.

Calculation of MAE:

MAE and RMSE — Which Metric is Better? | by JJ | Human in a Machine World | Medium

Where y_j is the j^th observation, y_j cap is the j^th observation as predicted by the model and n is the sample size.

Mean Squared Error and Root Mean Squared Error:

Mean Squared Error (MSE) is analogous to the standard deviation. MSE is calculated as the average of squared differences between the actual value and the predicted values of the model. MSE penalizes large differences even more by squaring them.

MSE Cost Function for Training Neural Network - Stack Overflow

However, MSE is not on the same scale as the dependent variable (y). To overcome this drawback, there exists the concept of Root Mean Squared Error (RMSE). RMSE is calculated by taking a square root of MSE thus bringing back the metric to the same scale. Hence, RMSE is more widely used.

What does RMSE really mean?. Root Mean Square Error (RMSE) is a… | by James Moody | Towards Data Science

Coefficient of Determination:

Coefficient of determination measures the proportion of variability explained by the model. The idea is to partition the total variability in the dependent variable into two parts: variability due to the regression model and variability due to the error (residual).

i.e. SS(Total) = SS(Regression) + SS(Residual)

Looking at R-Squared. In data science we create regression… | by Erika D | Medium

On observing closely you would notice that SS(Total) is nothing but the variance of the dependent variable (y).

However, the addition of an explanatory variable almost always leads to an increase in the value of R², that is, R² is a non-decreasing function of a number of regressors. Therefore, we use adjusted R² which takes into account the number of predictors in the model. We divide both SSE and TSS with their degrees of freedom. Thus, the adjusted R²adjusts for the number of parameters in the model.

GraphPad Prism 7 Curve Fitting Guide - Interpreting the adjusted R2

Here, n is the sample size and k is the number of estimated parameters (number of regressors+1, since we estimate an intercept term as well)

For any query, suggestion or feedback, please reach out to us on LinkedIn or schedule a meeting by vising the Contact page. You can find some of the resources that helped us here.

If you can contribute by talking about your interview experience, it will definitely create an impact. Please fill this form to help students get a perspective about the interview structure and questions.

You can read other articles here.

Cheers and Best!