Assumptions of Linear Regression

Brain Glitch
2 min readFeb 17, 2023

Linear regression is a statistical method that is used to model the linear relationship between a dependent variable and one or more independent variables. The assumptions of linear regression are important to ensure the validity and reliability of the results obtained from the analysis. Here are the main assumptions of linear regression:

1.Linearity: The linearity assumption in linear regression assumes that there is a linear relationship between the independent variable(s) and the dependent variable. This means that a change in the independent variable(s) is associated with a constant change in the dependent variable.

For example, in a study that examines the relationship between a person’s height and their weight, we assume that there is a linear relationship between these two variables. That is, as a person’s height increases, their weight will also increase at a constant rate.

2. No Multicollinearity: The multicollinearity assumption in linear regression requires that there is no high correlation among the independent variables. This means that the independent variables should not be so highly correlated that they provide redundant information about the dependent variable.

For example, if we were examining the factors that influence a person’s income and we included both their education level and their years of work experience as independent variables, we would want to ensure that these two variables are not highly correlated with each other, as they both capture information about a person’s level of skill and knowledge.

3. Normality of Residual: The normality assumption in linear regression requires that the errors or residuals are normally distributed. This means that the distribution of the residuals should follow a bell-shaped curve, with most of the residuals clustered around zero and fewer residuals at the extreme ends.

For example, if we were examining the relationship between a person’s level of physical activity and their risk of heart disease, we would want to ensure that the residuals are normally distributed to ensure that the regression model is accurately capturing the relationship between the two variables.

4. Homoscedasticity: The homoscedasticity assumption in linear regression requires that the variance of the residuals is constant across all levels of the independent variable(s). This means that the variability of the residuals should be the same for all values of the independent variable(s).

For example, if we were examining the relationship between a person’s age and their level of physical activity, we would want to ensure that the variance of the residuals is constant across all age groups to ensure that the regression model is accurately capturing the relationship between the two variables. If the variance of the residuals changes as the age of the individuals in the study changes, then we may have violated the homoscedasticity assumption.

--

--

Brain Glitch

Artificial Intelligence | Spirituality | Manifestation | Conspiracy | Technology