Predictors need not be normally distributed but the errors do.

A place where discrimination is legal.

Damini Vadrevu
3 min read · Feb 6, 2025

It’s nice that a lot of people know what the assumptions of linear regression are, but not many dig into why they have to be taken into account. And yes, part of the answer is mathematical convenience, but I like reasoning more than math. It’s funny: as a data scientist, I’ve never really been ‘good’ at math, or even liked it for that matter. So this is an article where I choose to skip all the ugly formulas.

What is Linear Regression?

Linear regression finds the best-fitting straight line relating a predictor (X) to an outcome (Y). “Best” here means the line that minimizes the sum of squared differences between the actual and predicted values of Y. Once fitted, that line lets you predict Y from X.
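To make "minimizing the difference" concrete, here is a minimal sketch of a least-squares fit for a single predictor, using only the Python standard library. The function name `fit_line` and the toy data (hours studied vs. exam score) are made up for illustration; they are not from any particular dataset or library.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares fit for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    # How X and Y move together, relative to how much X moves on its own.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical toy data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted = [intercept + slope * h for h in hours]

# Residuals are the "errors": actual value minus predicted value.
residuals = [s - p for s, p in zip(scores, predicted)]

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

The residuals computed at the end are the quantities the assumptions are about, which is why the sketch keeps them around rather than stopping at the slope and intercept.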

Linear Regression Assumptions

  • Linear Relationship: The relationship between the predictor and outcome is linear.
  • Homoscedasticity: The variability of residuals is consistent across all levels of the predictor variable.
  • Normality: The residuals are normally distributed.
  • Independence of Errors: Residuals are not correlated with one another and are independent of predictor variables.
  • Autocorrelation: Residuals should be…
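As a rough illustration of the point in the title, the sketch below simulates a predictor that is far from normal, fits a line, and then looks at the residuals rather than the predictor. Everything here is an assumption made for the example: the simulated data, the true slope and intercept, and the helper names `skewness` and `lag1_correlation`. The checks are crude stand-ins for proper diagnostic plots and tests, not a definitive procedure.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated data (assumed for illustration): a strongly skewed predictor
# and normally distributed errors with constant spread.
n = 2000
x = [random.expovariate(1.0) for _ in range(n)]
y = [3.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

# Least-squares fit for one predictor.
x_bar, y_bar = mean(x), mean(y)
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

def skewness(values):
    """Sample skewness: close to 0 for a symmetric, bell-shaped sample."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

def lag1_correlation(values):
    """Correlation between each value and the one that follows it."""
    m = mean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    return num / sum((v - m) ** 2 for v in values)

# Homoscedasticity: compare residual spread for small vs. large x.
paired = sorted(zip(x, residuals))
low = [r for _, r in paired[: n // 2]]
high = [r for _, r in paired[n // 2:]]
spread_ratio = pstdev(high) / pstdev(low)

print(f"skewness of predictor:  {skewness(x):.2f}")
print(f"skewness of residuals:  {skewness(residuals):.2f}")
print(f"lag-1 residual corr.:   {lag1_correlation(residuals):.2f}")
print(f"residual spread ratio:  {spread_ratio:.2f}")
```

In this setup the predictor is clearly skewed while the residuals are roughly symmetric, roughly uncorrelated with their neighbors, and similarly spread out across small and large values of the predictor. That is the situation the assumptions describe: they constrain the errors, not the shape of X.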
