Multiple Linear Regression

Multiple Linear Regression (MLR) with Python: A Hands-on Guide

In our previous discussion, we learned about Simple Linear Regression, where a single independent variable (X) is used to predict a dependent variable (Y). But what happens when the outcome depends on more than one factor? That’s where Multiple Linear Regression (MLR) comes into play.

🔍 What is Multiple Linear Regression?

Multiple Linear Regression is an extension of Simple Linear Regression. It models the linear relationship between a single continuous dependent variable and two or more independent variables, which can be continuous or categorical (categorical variables must be encoded as numbers before fitting).

✅ Real-Life Example:

Let’s say you want to predict CO₂ emissions of a car. It’s not enough to look only at the engine size. Other factors, like the number of cylinders, also affect emissions. That’s a classic case for Multiple Linear Regression.

📌 Key Points About MLR

  • The dependent variable (Y) must be continuous.
  • The independent variables (X) can be continuous or categorical.
  • Each independent variable should have a linear relationship with the dependent variable.
  • With several independent variables, the model fits a plane or hyperplane through multidimensional space rather than a single line.

📐 MLR Mathematical Equation

The MLR model can be represented as:

Y = b₀ + b₁x₁ + b₂x₂ + b₃x₃ + … + bₙxₙ

Where:

  • Y = Target variable
  • b₀ = Intercept
  • b₁, b₂, …, bₙ = Coefficients
  • x₁, x₂, …, xₙ = Independent variables
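To make the equation concrete, here is a minimal sketch that evaluates it with NumPy. The coefficient values and the two observations are made up purely for illustration; they do not come from any fitted model.

```python
import numpy as np

# Hypothetical coefficients for a model with three independent variables.
b0 = 10.0                        # intercept (b0)
b = np.array([2.0, -1.0, 0.5])   # b1, b2, b3

# Two illustrative observations; each row is (x1, x2, x3).
X = np.array([
    [1.0, 2.0, 4.0],
    [3.0, 0.0, 2.0],
])

# Y = b0 + b1*x1 + b2*x2 + b3*x3, evaluated for every row at once.
Y = b0 + X @ b
print(Y)  # [12. 17.]
```

The matrix product `X @ b` computes the weighted sum of the features for each row, so the same line of code works for any number of independent variables.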

🧠 Assumptions in MLR

  1. Linear relationship between dependent and independent variables.
  2. Residuals (errors) are normally distributed.
  3. Residuals have constant variance (homoscedasticity) and are independent of each other.
  4. No or minimal multicollinearity between independent variables.
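One common way to check for multicollinearity is the variance inflation factor (VIF). The sketch below is only an illustration on synthetic data: the helper name `variance_inflation_factors` is hypothetical, and it relies on the fact that the VIFs are the diagonal of the inverse of the feature correlation matrix. A VIF near 1 suggests little collinearity, and values above roughly 10 are often treated as a warning sign.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows are observations).

    Hypothetical helper: uses the identity that the VIFs equal the
    diagonal of the inverse of the correlation matrix of the features.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic features: x3 is almost a copy of x1, x2 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vifs)  # x1 and x3 get large VIFs, x2 stays close to 1
```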

🛠 Implementing Multiple Linear Regression in Python

Let’s explore a practical example to predict company profits using Python.

🎯 Problem Statement:

We have a dataset of 50 startup companies. It includes:

  • R&D Spend
  • Administration Spend
  • Marketing Spend
  • State
  • Profit (Target Variable)

🔎 Step 1: Data Pre-processing

# Importing necessary libraries
import numpy as np  
import matplotlib.pyplot as plt  
import pandas as pd  

# Importing dataset
dataset = pd.read_csv('50_CompList.csv')

# Extracting independent (X) and dependent (Y) variables
x = dataset.iloc[:, :-1].values  
y = dataset.iloc[:, 4].values

📦 Step 2: Encoding Categorical Data

The “State” column is categorical, so we convert it into dummy (one-hot) variables with OneHotEncoder. Versions of scikit-learn that provide ColumnTransformer can one-hot encode string categories directly, so a separate LabelEncoder step is not needed.

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

# One-hot encode column 3 ("State"); the dummy columns are placed first,
# followed by the untouched numeric columns
ct = ColumnTransformer([("State", OneHotEncoder(), [3])], remainder='passthrough')
x = np.array(ct.fit_transform(x), dtype=float)

# Avoiding the Dummy Variable Trap: drop the first dummy column
x = x[:, 1:]
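Why drop one dummy column? If every category has its own dummy, the dummies in each row always sum to 1, which duplicates the intercept term and makes the columns linearly dependent. The following sketch shows this with NumPy on a small made-up column of labels (the labels "A", "B", "C" are placeholders, not values from the dataset).

```python
import numpy as np

# Hypothetical categorical column with three categories.
states = np.array(["A", "B", "C", "B", "A", "C"])
categories = np.unique(states)

# One dummy column per category (full one-hot encoding).
dummies = (states[:, None] == categories).astype(float)
intercept = np.ones((len(states), 1))

# Intercept plus all three dummies: 4 columns, but one is redundant.
full = np.hstack([intercept, dummies])
# Intercept plus two dummies (first dummy dropped): 3 independent columns.
reduced = np.hstack([intercept, dummies[:, 1:]])

rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)
print(full.shape[1], rank_full)        # 4 columns, rank 3
print(reduced.shape[1], rank_reduced)  # 3 columns, rank 3
```

Dropping one dummy loses no information: a row with zeros in the remaining dummy columns belongs to the dropped category.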

✂ Step 3: Splitting the Dataset

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

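Note: Feature scaling is not required here. LinearRegression does not scale the data internally, but ordinary least squares gives the same predictions whether or not the features are rescaled; only the size of the coefficients changes.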
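As a rough check on the scaling question, the sketch below fits ordinary least squares twice on synthetic data, once on raw features and once on standardized features, and compares the predictions. The data and the helper `fit_predict` are invented for this illustration; it uses `np.linalg.lstsq` rather than scikit-learn so that nothing is hidden inside a library.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic features on very different scales (illustrative only).
X = rng.normal(size=(50, 3)) * np.array([1.0, 100.0, 10000.0])
y = 5.0 + X @ np.array([2.0, 0.03, 0.0004]) + rng.normal(scale=0.1, size=50)

def fit_predict(X_train, y_train, X_new):
    """Fit least squares with an intercept, then predict for X_new."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

# Standardize each feature to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

pred_raw = fit_predict(X, y, X)
pred_scaled = fit_predict(X_scaled, y, X_scaled)

# The two sets of predictions agree up to floating-point error.
print(np.allclose(pred_raw, pred_scaled))
```

Scaling can still matter for other reasons, for example when comparing coefficient magnitudes or when using regularized models, but it does not change plain least-squares predictions.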

⚙ Step 4: Fitting the MLR Model

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train, y_train)
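After fitting, the learned intercept (b₀) and coefficients (b₁ … bₙ) are available on the model as `intercept_` and `coef_`. The sketch below demonstrates this on noiseless synthetic data with known coefficients, so the fitted values can be compared with the true ones; the names `X_demo`, `y_demo`, and `demo_model` are placeholders and are separate from the startup dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data generated from a known linear rule (no noise):
# y = 3.0 + 1.5 * x1 - 2.0 * x2
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))
y_demo = 3.0 + 1.5 * X_demo[:, 0] - 2.0 * X_demo[:, 1]

demo_model = LinearRegression().fit(X_demo, y_demo)

# The fitted parameters recover the rule used to generate the data.
print(demo_model.intercept_)  # approximately 3.0
print(demo_model.coef_)       # approximately [ 1.5 -2. ]
```

On the real model, `regressor.coef_` lists one coefficient per column of `x_train`, in the same column order.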

🔮 Step 5: Predicting the Results

y_pred = regressor.predict(x_test)

# Comparing predictions with actual results
comparison = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(comparison)

✅ Output:

You’ll get a table comparing the predicted profits to the actual ones from the test set, helping you see how well the model performs.
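A table is useful for eyeballing results, but a single score makes the comparison easier. The sketch below computes the mean absolute error and the R² score with scikit-learn on a few made-up numbers; the `actual` and `predicted` arrays are placeholders, not results from the startup dataset. With the real model you would pass `y_test` and `y_pred` instead.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Made-up values standing in for y_test and y_pred.
actual = np.array([100.0, 150.0, 200.0, 250.0])
predicted = np.array([110.0, 140.0, 210.0, 240.0])

# Average size of the prediction error, in the units of the target.
mae = mean_absolute_error(actual, predicted)

# Share of the variance in the target explained by the predictions.
r2 = r2_score(actual, predicted)

print(mae)  # 10.0
print(r2)   # approximately 0.968
```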

💡 Final Thoughts

Multiple Linear Regression is one of the most widely used techniques in machine learning and statistics. It’s simple yet powerful when multiple factors are involved in making a prediction.

By mastering MLR, you can:

  • Make better business predictions.
  • Understand how different features affect the output.
  • Develop more accurate forecasting models.

