Fake Review Detection System using NLP and ML

Introduction

Product reviews strongly influence what people buy online. Some reviews, however, are fake, written either to make a product look better or to harm others. To address this problem, we built a Fake Review Detection System using NLP and machine learning. It comes with an easy-to-use web app built with Streamlit. Users upload a CSV file of product reviews, and the system classifies each one. It then offers two files for download: one with the reviews judged real and one with the reviews judged fake.

What You Will Learn

  • How to preprocess review data
  • How to train an NLP-based ML model
  • How to classify fake vs. real reviews
  • How to create a web interface using Streamlit
  • How to handle file upload and download in a web app

Tech Stack

  • Frontend: Streamlit (Python-based web framework)
  • Backend: Logistic Regression with TF-IDF Vectorizer
  • Language: Python
  • Libraries: Pandas, NumPy, scikit-learn, re, string

Streamlit App Flow

1. Upload the CSV file

Users upload a CSV containing product review data.

2. NLP Model Processes Reviews

The system preprocesses the text (lowercasing, then removing punctuation and digits) and uses a TF-IDF + Logistic Regression model, trained beforehand on labeled reviews, to classify each review as real or fake.
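The cleaning step described above can be sketched with the standard library alone. This is a minimal illustration, not the app's actual code: the function name clean_text is hypothetical, and collapsing repeated whitespace is an extra assumption beyond the three operations the article lists.

```python
import re
import string

# Translation table that deletes every ASCII punctuation character and digit.
_STRIP_TABLE = str.maketrans("", "", string.punctuation + string.digits)


def clean_text(text) -> str:
    """Lowercase the text, drop punctuation and digits, and collapse whitespace."""
    text = str(text).lower()
    text = text.translate(_STRIP_TABLE)
    # Deleting characters can leave runs of spaces behind; normalize them.
    # (Whitespace collapsing is an assumption, not a step named in the article.)
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("Love this!! 5 stars, well-made & sturdy."))
# love this stars wellmade sturdy
```

Note that deleting hyphens joins the two halves of a hyphenated word ("well-made" becomes "wellmade"). Replacing punctuation with a space instead would keep them as separate tokens.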

3. Download the Results

Two downloadable CSVs are generated: real_reviews.csv and fake_reviews.csv.
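One way to produce the two result files is to attach the predictions to the uploaded table and filter on them. The sketch below assumes predictions use 1 for real and 0 for fake. The function name split_results, the prediction column, and the hard-coded predictions are illustrative only.

```python
import pandas as pd


def split_results(df: pd.DataFrame, predictions) -> tuple[bytes, bytes]:
    """Return (real_csv, fake_csv) as UTF-8 bytes, given 1 = real and 0 = fake."""
    labeled = df.assign(prediction=list(predictions))
    real = labeled[labeled["prediction"] == 1]
    fake = labeled[labeled["prediction"] == 0]
    return (
        real.to_csv(index=False).encode("utf-8"),
        fake.to_csv(index=False).encode("utf-8"),
    )


# Placeholder reviews and predictions, invented for the example.
reviews = pd.DataFrame({"text_": ["Love this!", "Best product ever ever ever"]})
real_csv, fake_csv = split_results(reviews, [1, 0])

# The byte strings can be saved as real_reviews.csv and fake_reviews.csv,
# or handed to a download widget.
print(real_csv.decode("utf-8"))
```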

Required CSV Format

Ensure your file follows this structure:

category            | rating | label | text_
Home_and_Kitchen_5  | 5      | CG    | Love this! Well made, sturdy.
Home_and_Kitchen_5  | 1      | OR    | Missing information on how to use it.

  • category: Product category
  • rating: Star rating (1-5)
  • label: CG for genuine, OR for other
  • text_: The review content
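A column check along these lines could reject malformed uploads before any classification runs. It is a sketch under assumptions: the helper name load_reviews and the choice to drop rows with empty review text are not taken from the original app.

```python
import io

import pandas as pd

REQUIRED_COLUMNS = ["category", "rating", "label", "text_"]


def load_reviews(csv_source) -> pd.DataFrame:
    """Read a review CSV and fail early if a required column is missing."""
    df = pd.read_csv(csv_source)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")
    # Rows without review text cannot be classified, so drop them (assumption).
    return df.dropna(subset=["text_"]).reset_index(drop=True)


# The two example rows from the table above, as CSV text.
sample = io.StringIO(
    "category,rating,label,text_\n"
    'Home_and_Kitchen_5,5,CG,"Love this! Well made, sturdy."\n'
    "Home_and_Kitchen_5,1,OR,Missing information on how to use it.\n"
)
reviews = load_reviews(sample)
print(reviews.shape)  # (2, 4)
```

Review text that contains commas must be quoted in the CSV, as in the first sample row.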

How Fake/Real Is Determined

We used TF-IDF (Term Frequency-Inverse Document Frequency) to transform text data into numerical vectors. Then, we trained a Logistic Regression model using labeled data:

  • Label CG is treated as real (1)
  • Every other label, including OR, is treated as fake (0)
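The training step can be sketched with a scikit-learn pipeline. The six reviews below are invented toy data, far too few for a useful model, and the hyperparameters are library defaults apart from max_iter. None of this is taken from the original app. Text cleaning is left out to keep the example short.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data in the article's column format (invented for illustration).
train = pd.DataFrame({
    "text_": [
        "Love this! Well made, sturdy.",
        "Works exactly as described and feels solid.",
        "Great value, my family uses it daily.",
        "Missing information on how to use it.",
        "Stopped working after two days.",
        "Arrived late and the box was damaged.",
    ],
    "label": ["CG", "CG", "CG", "OR", "OR", "OR"],
})

# The article's convention: CG -> 1 (real), anything else -> 0 (fake).
# Check this mapping against your dataset's documentation before relying on it.
y = (train["label"] == "CG").astype(int)

# TF-IDF turns each review into a sparse vector; Logistic Regression classifies it.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train["text_"], y)

new_reviews = ["Sturdy and well made", "No instructions included"]
preds = model.predict(new_reviews)               # array of 0/1 labels
proba = model.predict_proba(new_reviews)[:, 1]   # estimated probability of class 1
print(list(zip(new_reviews, preds, proba.round(2))))
```

Keeping the vectorizer and the classifier in one pipeline means new reviews are transformed with the vocabulary learned during training.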

Full Streamlit Code Overview

The app is a single Python file:

  • Loads and trains a model using a sample dataset
  • Allows CSV upload
  • Validates column structure
  • Preprocesses text
  • Classifies reviews
  • Generates download buttons for real and fake reviews
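The list above could be wired together roughly as follows. This is a sketch, not the original fake_review_app.py: the sample dataset path sample_reviews.csv, the helper names, and the use of st.cache_resource are all assumptions. Streamlit is imported inside main() only so that the helper functions can be used without it.

```python
import string

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

REQUIRED_COLUMNS = ["category", "rating", "label", "text_"]
SAMPLE_DATA_PATH = "sample_reviews.csv"  # hypothetical labeled sample dataset
_STRIP_TABLE = str.maketrans("", "", string.punctuation + string.digits)


def clean_text(text) -> str:
    """Lowercase the text and delete punctuation and digits."""
    return str(text).lower().translate(_STRIP_TABLE)


def train_model(labeled: pd.DataFrame):
    """Fit TF-IDF + Logistic Regression; CG -> 1 (real), others -> 0 (fake)."""
    y = (labeled["label"] == "CG").astype(int)
    model = make_pipeline(
        TfidfVectorizer(preprocessor=clean_text),
        LogisticRegression(max_iter=1000),
    )
    return model.fit(labeled["text_"].fillna(""), y)


def load_model():
    """Train on the sample dataset (the file name is an assumption)."""
    return train_model(pd.read_csv(SAMPLE_DATA_PATH))


def main() -> None:
    import streamlit as st  # local import keeps the helpers usable without Streamlit

    st.title("Fake Review Detection System")
    uploaded = st.file_uploader("Upload review CSV", type="csv")
    if uploaded is None:
        return

    df = pd.read_csv(uploaded)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        st.error(f"Missing required columns: {missing}")
        return

    # Cache the fitted model so it is not retrained on every rerun.
    model = st.cache_resource(load_model)()
    df["prediction"] = model.predict(df["text_"].fillna(""))

    for file_name, flag in (("real_reviews.csv", 1), ("fake_reviews.csv", 0)):
        subset = df[df["prediction"] == flag]
        st.download_button(
            label=f"Download {file_name} ({len(subset)} rows)",
            data=subset.to_csv(index=False).encode("utf-8"),
            file_name=file_name,
            mime="text/csv",
        )


if __name__ == "__main__":
    main()
```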

How to Run the App

  1. Save the app as fake_review_app.py
  2. Install dependencies:
     pip install streamlit pandas numpy scikit-learn
  3. Run the Streamlit app:
     streamlit run fake_review_app.py
  4. Open your browser at http://localhost:8501
  5. Upload your review CSV and download the results.

Report

The report will include:

✅ Abstract
✅ Introduction (Overview, Problem Statement, Motivation)
✅ Literature Review
✅ Existing System & Drawbacks
✅ Proposed System
✅ System Architecture (Diagrams)
✅ System Specifications
✅ Experimental Design Diagrams
✅ Implementation (Setup, Modules, Sample Code)
✅ System Testing
✅ Results & Screenshots
✅ Conclusion & Future Scope
✅ References

[Screenshots: Fake Review Detection System]
