A real-time Intrusion Detection System that combines machine learning with packet sniffing, built with Python, Scapy, RandomForest, and a Streamlit dashboard.
This project captures live network packets, extracts features, classifies traffic as normal or intrusive, logs intrusions, and visualizes them on a real-time dashboard.
- Overview
- Why This Project?
- Features
- Architecture
- Tech Stack
- Dataset
- Project Structure
- How It Works
- Installation
- How to Run
- Demo Workflow
- Future Enhancements
Intrusion Detection Systems (IDS) are essential in cybersecurity for monitoring network traffic and detecting malicious activities.
This project implements an IDS that runs on a single host, inspects the network packets that host can see, and detects anomalies using:
- Live Packet Capture (Scapy)
- Machine Learning Classification (RandomForest)
- Real-time Data Visualization (Streamlit)
- Logging & Analysis (CSV Log Storage)
The system is lightweight and suitable for academic, training, and small-scale enterprise environments.
Most IDS solutions are large, complex, and not beginner-friendly.
This project aims to:
- Simplify how IDS works internally
- Demonstrate real packet sniffing
- Connect ML with live network features
- Provide an interactive analytics dashboard
This project is also fully reproducible for training, internships, and demonstrations.
Uses Scapy to capture network packets (IP/TCP/UDP).
Classifies traffic using a RandomForest model trained on a subset of NSL-KDD dataset features.
Every suspicious event is logged into:
logs/ids_alerts.csv
Stored fields:
- Timestamp
- Source IP
- Destination IP
- Packet Length
- TTL
- Flags
- Prediction result
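The logging step can be sketched with the standard library alone. This is a minimal illustration, not the project's actual logger: the function name `log_alert` and the column names in `FIELDS` are assumptions chosen to mirror the stored fields listed above.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column names mirroring the stored fields; the real
# ids_alerts.csv header may differ.
FIELDS = ["timestamp", "src_ip", "dst_ip", "packet_length", "ttl", "flags", "prediction"]


def log_alert(log_path, src_ip, dst_ip, packet_length, ttl, flags, prediction):
    """Append one alert row, writing the header first if the file is new or empty."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "src_ip": src_ip,
            "dst_ip": dst_ip,
            "packet_length": packet_length,
            "ttl": ttl,
            "flags": flags,
            "prediction": prediction,
        })
```

Opening the file in append mode on every call keeps the logger stateless, so a dashboard process can read the same file while the sniffer is still writing to it.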
Visualizations include:
- Number of intrusions
- Top attacker IPs
- Intrusion timeline chart
- Latest events table
- Traffic statistics
The codebase is separated into independent modules:
- Training
- Sniffing
- Logging
- Dashboard
┌──────────────────────────┐
│ Packet Sniffer │
│ (Scapy) │
└─────────────┬────────────┘
│ Extract 3 features
▼
┌──────────────────────────┐
│ ML Classifier (RF) │
│ normal / intrusion │
└─────────────┬────────────┘
▼
┌──────────────────────────┐
│ Intrusion Logger │
│ writes logs → CSV │
└─────────────┬────────────┘
▼
┌──────────────────────────┐
│ Streamlit Dashboard │
│ reads & visualizes logs │
└──────────────────────────┘
| Component | Technology |
|---|---|
| Packet Capture | Scapy |
| ML Model | RandomForestClassifier |
| Dashboard | Streamlit |
| Data Processing | Pandas |
| Environment Mgmt | uv (Python package manager) |
| Dataset | NSL-KDD |
Dataset used: NSL-KDD (KDDTest.csv)
Source: Kaggle
Converted from .arff → .csv for easier ML processing.
We extracted 3 lightweight numerical features for real-time matching:
| Feature | Meaning |
|---|---|
| src_bytes | proxy for packet length |
| dst_bytes | proxy for TTL |
| wrong_fragment | proxy for TCP flags |
These match the 3 features the live sniffer extracts.
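A training script built on this feature selection might look like the sketch below. It is an illustration under assumptions, not the contents of train_sniffer_model.py: the label column name `class`, the helper names, and the hyperparameters are all hypothetical, and any label other than "normal" is treated as an intrusion.

```python
# Columns named in the feature table above.
FEATURE_COLUMNS = ["src_bytes", "dst_bytes", "wrong_fragment"]
LABEL_COLUMN = "class"  # assumption: the converted CSV may name this column differently


def to_binary_labels(labels):
    """Map dataset labels to 0 (normal) or 1 (any other label, i.e. an attack)."""
    return [0 if str(label).strip().lower() == "normal" else 1 for label in labels]


def train_sniffer_model(csv_path, model_path):
    """Train a RandomForest on the three selected columns and save it with joblib."""
    # Imported here so the helpers above stay usable without the ML stack installed.
    import joblib
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    frame = pd.read_csv(csv_path)
    # Use a plain array so the model can later be fed bare [length, ttl, flags] lists.
    features = frame[FEATURE_COLUMNS].to_numpy()
    labels = to_binary_labels(frame[LABEL_COLUMN])

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(features, labels)
    joblib.dump(model, model_path)
    return model
```

A call such as `train_sniffer_model("dataset/KDDTest.csv", "models/sniffer_model.joblib")` would then produce the model file shown in the project structure.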
IDS/
│
├── dataset/
│ └── KDDTest.csv
│
├── models/
│ └── sniffer_model.joblib
│
├── logs/
│ └── ids_alerts.csv
│
├── src/
│ ├── load_dataset.py
│ ├── train_sniffer_model.py
│ ├── live_sniffer.py
│ └── dashboard.py
│
└── README.md
- Dataset is converted to CSV.
- Only 3 numerical features are selected.
- RandomForestClassifier is trained.
- The model is saved to models/sniffer_model.joblib.
- Scapy captures IP/TCP/UDP packets.
- Extracts 3 real-time features:
- packet_length
- ttl
- tcp_flags
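The capture and feature-extraction steps above can be sketched as follows. This is a hedged illustration rather than the code in live_sniffer.py: `extract_features`, `start_sniffing`, and the `handle_features` callback are hypothetical names, and non-TCP packets are assumed to get a flags value of 0. Scapy accepts layer names as strings, which is what the lookups below rely on.

```python
def extract_features(packet):
    """Return [packet_length, ttl, tcp_flags] for an IP packet, or None otherwise."""
    if not packet.haslayer("IP"):
        return None
    ttl = int(packet["IP"].ttl)
    # Assumption: UDP and other non-TCP packets carry no TCP flags, so use 0.
    tcp_flags = int(packet["TCP"].flags) if packet.haslayer("TCP") else 0
    return [len(packet), ttl, tcp_flags]


def start_sniffing(handle_features, count=0):
    """Capture IP packets and pass each feature vector to handle_features.

    count=0 means capture until interrupted.
    """
    # Imported here so extract_features can be exercised without Scapy installed.
    from scapy.all import sniff

    def on_packet(packet):
        features = extract_features(packet)
        if features is not None:
            handle_features(packet, features)

    # store=False avoids keeping every captured packet in memory.
    sniff(filter="ip", prn=on_packet, store=False, count=count)
```

Live capture usually needs elevated privileges (root on Linux, an administrator shell with Npcap on Windows), so `start_sniffing` may fail with a permission error when run as a normal user.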
The model predicts:
- 0 → Normal
- 1 → Intrusion
If the prediction is 1 (intrusion), the IDS logs the event to:
logs/ids_alerts.csv
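The predict-then-log decision can be expressed in a few lines. The sketch below assumes a model object with a scikit-learn style `predict` method (for example one loaded with `joblib.load("models/sniffer_model.joblib")`); the function names and the `on_intrusion` callback are hypothetical and stand in for whatever logging routine the sniffer uses.

```python
def classify_packet(model, features):
    """Return 0 (normal) or 1 (intrusion) for one [length, ttl, flags] vector."""
    # scikit-learn estimators expect 2-D input: one row per sample.
    return int(model.predict([features])[0])


def handle_packet(model, features, on_intrusion):
    """Classify one packet and invoke on_intrusion only when it is flagged."""
    prediction = classify_packet(model, features)
    if prediction == 1:
        on_intrusion(features, prediction)
    return prediction
```

Passing the logging routine in as a callback keeps classification separate from file I/O, which matches the modular split between sniffing and logging described earlier.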
Dashboard reads the CSV log every few seconds and updates:
- Intrusion timeline graph
- Recent alerts
- Top attacker IPs
- Traffic statistics
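The aggregation behind these dashboard panels can be sketched with the standard library. This is an illustration under assumptions, not the code in dashboard.py: the column names `timestamp` and `src_ip` are hypothetical, timestamps are assumed to be ISO 8601 strings, and the real dashboard most likely does the same work with Pandas.

```python
import csv
from collections import Counter


def load_alerts(log_path):
    """Read the alert log into a list of dicts, one per logged intrusion."""
    with open(log_path, newline="") as handle:
        return list(csv.DictReader(handle))


def summarize_alerts(rows, top_n=5, latest_n=10):
    """Compute the figures the dashboard displays from the logged rows."""
    rows = list(rows)
    attackers = Counter(row["src_ip"] for row in rows)
    # The first 16 characters of an ISO timestamp are YYYY-MM-DDTHH:MM,
    # so slicing buckets the alerts by minute.
    timeline = Counter(row["timestamp"][:16] for row in rows)
    return {
        "total_intrusions": len(rows),
        "top_attackers": attackers.most_common(top_n),
        "timeline": sorted(timeline.items()),
        "latest": rows[-latest_n:],
    }
```

In a Streamlit app, the resulting values could be rendered with calls such as `st.metric`, `st.bar_chart`, `st.line_chart`, and `st.dataframe`, with the script re-reading the CSV on each refresh.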
Install the dependencies:
uv add scapy pandas scikit-learn joblib streamlit matplotlib
Run each stage in order:
uv run src/load_dataset.py
uv run src/train_sniffer_model.py
uv run src/live_sniffer.py
uv run streamlit run src/dashboard.py
Demo workflow:
- Start live_sniffer.py
- Start dashboard.py
- Generate traffic: ping google.com
- Watch the dashboard update in real time
- Check the logs at logs/ids_alerts.csv
- Use deep learning (LSTM, CNN)
- Add more packet-level features
- Retrain using real captured traffic
- Automatic IP blocking (Windows/Linux firewall)
- Email or SMS alert integration
- Push alerts to mobile devices
- GeoIP Mapping (attacker locations)
- Advanced heatmaps
- Exportable PDF reports
This project demonstrates how to integrate machine learning with real-time network monitoring to create a practical intrusion detection system. It is fully modular, extendable, and ideal for cybersecurity internships, training labs, and academic submissions.
Developed as part of a Cyber Security Internship Training Project.