This repository hosts an exploratory analysis of Netflix movie data, covering trends in movie duration, genre distribution, and how Netflix's content strategy has evolved over time. It starts from foundational data analysis techniques and extends them to look for patterns that a first pass over the data would miss.
The core objective of this project is to analyze a dataset of Netflix movies to understand the characteristics of the content library. This includes:
- Identifying the types of content Netflix produces/acquires.
- Analyzing the distribution of movie durations.
- Exploring the popularity of different genres.
- Investigating how content characteristics have changed over the years.
While initially guided by a DataCamp project, this repository showcases my further independent exploration and expanded analysis, including:
- Deeper Temporal Analysis
- Advanced Genre Deconstruction
- Statistical Hypothesis Testing (Conceptual/Basic)
- Enhanced Visualization Storytelling
- Outlier Analysis
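
Two of the extensions listed above, the temporal analysis and the outlier analysis, could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, the column names `release_year` and `duration` are assumptions about the dataset, and the 1.5 × IQR rule is just one common way to flag outliers.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
# Column names (title, release_year, duration) are assumptions.
movies = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D", "E", "F"],
        "release_year": [2010, 2010, 2015, 2015, 2020, 2020],
        "duration": [95, 105, 90, 100, 88, 240],  # minutes
    }
)


def mean_duration_by_year(df):
    """Return the average movie duration (minutes) per release year."""
    return df.groupby("release_year")["duration"].mean()


def iqr_outliers(df, column="duration", k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    mask = (df[column] < q1 - k * iqr) | (df[column] > q3 + k * iqr)
    return df[mask]


trend = mean_duration_by_year(movies)
outliers = iqr_outliers(movies)

print(trend)
print(outliers[["title", "duration"]])
```

On this toy data the 240-minute title is the only row flagged. A single long title like this also pulls up the yearly mean, so it can help to look at the trend both with and without flagged rows.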
The dataset used for this project is `netflix_data.csv`, containing information about movies available on Netflix, including titles, release years, durations, and genres.
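
A loading step for this file might look something like the sketch below. It is written against an in-memory stand-in so it runs on its own. The column names, the `type` column used to keep only movies, and the helper name `load_movies` are all assumptions made for illustration; the real file's layout may differ.

```python
import io

import pandas as pd

# Stand-in for netflix_data.csv so the sketch is self-contained.
# The header row here is an assumption about the file's layout.
SAMPLE_CSV = io.StringIO(
    "title,type,release_year,duration,genre\n"
    "Film A,Movie,2012,101,Dramas\n"
    "Show B,TV Show,2016,1,Comedies\n"
    "Film C,Movie,2019,,Action\n"
    "Film D,Movie,2020,88,Dramas\n"
)


def load_movies(source):
    """Read the CSV, keep movie rows, and drop rows without a numeric duration."""
    df = pd.read_csv(source)
    movies = df[df["type"] == "Movie"].copy()
    # Coerce non-numeric durations to NaN so they can be dropped.
    movies["duration"] = pd.to_numeric(movies["duration"], errors="coerce")
    return movies.dropna(subset=["duration"]).reset_index(drop=True)


movies = load_movies(SAMPLE_CSV)
genre_counts = movies["genre"].value_counts()

print(movies[["title", "release_year", "duration"]])
print(genre_counts)
```

With the real data, the call would be `load_movies("netflix_data.csv")`, adjusted to whatever columns the file actually contains.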
- Python 3
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Jupyter Notebook
This project originated from a guided exploration on DataCamp, which provided an excellent foundation. The extended analysis and insights presented here are my own independent contributions.