Multiple Linear Regression (MLR) with Python: A Hands-on Guide
In our previous discussion, we learned about Simple Linear Regression, where a single independent variable (X) is used to predict a dependent variable (Y). But what happens when the outcome depends on more than one factor? That’s where Multiple Linear Regression (MLR) comes into play.
🔍 What is Multiple Linear Regression?
Multiple Linear Regression is an extension of Simple Linear Regression. It models the linear relationship between a single continuous dependent variable and two or more independent variables (which can be either continuous or categorical).
✅ Real-Life Example:
Let’s say you want to predict CO₂ emissions of a car. It’s not enough to look only at the engine size. Other factors, like the number of cylinders, also affect emissions. That’s a classic case for Multiple Linear Regression.
📌 Key Points About MLR
- The dependent variable (Y) must be continuous.
- The independent variables (X) can be continuous or categorical.
- Each independent variable should have a linear relationship with the dependent variable.
- With more than one independent variable, the model fits a plane (or hyperplane) through a multidimensional space rather than a line.
📐 MLR Mathematical Equation
The MLR model can be represented as:
Y = b₀ + b₁x₁ + b₂x₂ + b₃x₃ + … + bₙxₙ
Where:
- Y = Target variable
- b₀ = Intercept
- b₁, b₂, …, bₙ = Coefficients
- x₁, x₂, …, xₙ = Independent variables
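To make the equation concrete, here is a minimal sketch that evaluates it for one observation. The coefficient values are hypothetical, picked only for illustration (they are not fitted to any dataset), and the two features stand in for the engine size and cylinder count from the CO₂ example.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration (not fitted values)
b0 = 50.0                    # intercept
b = np.array([10.0, 5.0])    # b1 (engine size), b2 (number of cylinders)

# One observation: engine size = 2.0, cylinders = 4
x_obs = np.array([2.0, 4.0])

# Y = b0 + b1*x1 + b2*x2, written as a dot product
y_hat = b0 + b @ x_obs
print(y_hat)  # 90.0
```

The dot product is just a compact way of writing the sum b₁x₁ + b₂x₂ + … + bₙxₙ, so the same line works for any number of features.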
🧠 Assumptions in MLR
- Linear relationship between dependent and independent variables.
- Residuals (errors) are independent, have constant variance, and are approximately normally distributed.
- No or minimal multicollinearity between independent variables.
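One common way to check the multicollinearity assumption is the variance inflation factor (VIF): regress each feature on the others and compute 1 / (1 - R²). The sketch below uses only NumPy and synthetic data; the helper name `vif` and the generated features are assumptions made for this illustration. Values near 1 suggest little collinearity, while large values (a threshold of 5 or 10 is a frequently quoted rule of thumb) suggest a feature is nearly a linear combination of the others.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + remaining features
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic features: x3 is almost a copy of x1, x2 is unrelated
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)

vif_independent = vif(np.column_stack([x1, x2]))      # both close to 1
vif_collinear = vif(np.column_stack([x1, x2, x3]))    # x1 and x3 very large
print(vif_independent)
print(vif_collinear)
```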
🛠 Implementing Multiple Linear Regression in Python
Let’s explore a practical example to predict company profits using Python.
🎯 Problem Statement:
We have a dataset of 50 startup companies. It includes:
- R&D Spend
- Administration Spend
- Marketing Spend
- State
- Profit (Target Variable)
🔎 Step 1: Data Pre-processing
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing dataset
dataset = pd.read_csv('50_CompList.csv')
# Extracting independent (X) and dependent (Y) variables
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values
📦 Step 2: Encoding Categorical Data
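The “State” column is categorical, so we convert it to dummy variables with OneHotEncoder. Current scikit-learn versions encode string categories directly, so a separate LabelEncoder pass is not needed.
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
# Creating dummy variables for column 3 (State); the other columns pass through unchanged
# sparse_threshold=0 makes the transformer always return a dense array
ct = ColumnTransformer([("State", OneHotEncoder(), [3])], remainder='passthrough', sparse_threshold=0)
x = np.array(ct.fit_transform(x), dtype=float)
# Avoiding the Dummy Variable Trap: drop the first dummy column
x = x[:, 1:]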
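Why drop one dummy column? When a model has an intercept, the dummy columns for a category always sum to 1, so they duplicate the intercept and the design matrix loses rank. The sketch below shows this with NumPy on a handful of made-up state labels; the label values are hypothetical and are not taken from the dataset.

```python
import numpy as np

# Hypothetical state labels, for illustration only
states = np.array(["New York", "California", "Florida",
                   "New York", "Florida", "California"])

# One-hot encode by comparing each label against the sorted unique categories
categories = np.unique(states)
dummies = (states[:, None] == categories[None, :]).astype(float)

intercept = np.ones((len(states), 1))
full = np.hstack([intercept, dummies])            # intercept + all 3 dummies
reduced = np.hstack([intercept, dummies[:, 1:]])  # intercept + 2 dummies

rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)
print(full.shape[1], rank_full)        # 4 columns, rank 3: one column is redundant
print(reduced.shape[1], rank_reduced)  # 3 columns, rank 3: no redundancy
```

The dropped category becomes the baseline: the coefficients of the remaining dummies are read as differences relative to it.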
✂ Step 3: Splitting the Dataset
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
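Note: Feature scaling is not required here. LinearRegression fits ordinary least squares, and its predictions do not change when the features are rescaled; only the coefficient values change to compensate.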
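A quick way to convince yourself that scaling does not change least squares predictions is to fit the same data twice, once raw and once standardized, and compare. This is a NumPy-only sketch on synthetic data; the helper `fit_predict` and all values are assumptions for illustration, not part of the startup dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three synthetic features on very different scales
X = rng.normal(size=(40, 3)) * np.array([1.0, 100.0, 10000.0])
y = 2.0 + X @ np.array([3.0, 0.05, 0.0002]) + rng.normal(scale=0.1, size=40)

def fit_predict(X_train, y_train, X_new):
    """Fit least squares with an intercept and return predictions for X_new."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

# Standardize each feature to zero mean and unit variance
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

pred_raw = fit_predict(X, y, X)
pred_scaled = fit_predict(X_scaled, y, X_scaled)
same_predictions = np.allclose(pred_raw, pred_scaled)
print(same_predictions)
```

Scaling still matters for other methods, such as regularized regression or gradient-based training, where the penalty or the step size depends on the magnitude of each feature.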
⚙ Step 4: Fitting the MLR Model
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train, y_train)
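Under the hood, fitting a linear regression means solving a least squares problem for the intercept and coefficients. The sketch below does that directly with NumPy on synthetic data generated from known values, so you can see the fit recover them. The data and the "true" parameters are assumptions for illustration. On the scikit-learn model, the corresponding fitted values are available as `regressor.intercept_` and `regressor.coef_`.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Synthetic data from a known linear model plus noise
X = rng.uniform(0, 10, size=(n, 2))
true_intercept = 4.0
true_coefs = np.array([2.5, -1.0])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.5, size=n)

# Prepend a column of ones so the first parameter acts as the intercept b0
A = np.column_stack([np.ones(n), X])
params, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, coefs = params[0], params[1:]

print(intercept)  # close to 4.0
print(coefs)      # close to [2.5, -1.0]
```

Each coefficient is read as the expected change in Y for a one-unit increase in that feature while the other features are held fixed.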
🔮 Step 5: Predicting the Results
y_pred = regressor.predict(x_test)
# Comparing predictions with actual results
comparison = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(comparison)
✅ Output:
You’ll get a table comparing the predicted profits to the actual ones from the test set, helping you see how well the model performs.
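Reading the table by eye only goes so far, so it helps to summarize the comparison with a couple of numbers. Below is a minimal sketch that computes RMSE and R² by hand with NumPy. The `actual` and `predicted` arrays are made-up values for illustration, not output from the startup dataset; in the walkthrough you would pass `y_test` and `y_pred` instead.

```python
import numpy as np

# Made-up values for illustration; replace with y_test and y_pred
actual = np.array([100.0, 150.0, 200.0, 250.0])
predicted = np.array([110.0, 140.0, 210.0, 240.0])

residuals = actual - predicted

# Root mean squared error: typical size of a prediction error, in units of Y
rmse = np.sqrt(np.mean(residuals ** 2))

# R^2: share of the variance in Y explained by the model
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(round(rmse, 3))  # 10.0
print(round(r2, 3))    # 0.968
```

scikit-learn offers equivalent helpers in `sklearn.metrics`, such as `r2_score` and `mean_squared_error`, if you prefer not to compute these by hand.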
💡 Final Thoughts
Multiple Linear Regression is one of the most widely used techniques in machine learning and statistics. It’s simple yet powerful when multiple factors are involved in making a prediction.
By mastering MLR, you can:
- Make better business predictions.
- Understand how different features affect the output.
- Develop more accurate forecasting models.