A comprehensive exploratory data analysis (EDA) of Amazon Prime Video's content catalog, covering 9,871 titles to uncover strategic insights about content diversity, regional production, temporal trends, and quality metrics.
- Analyze content type distribution (Movies vs TV Shows)
- Identify dominant genres and gaps
- Examine regional production patterns
- Track content growth over time
- Evaluate quality through IMDb/TMDB ratings
- titles.csv: 9,871 titles with 15 features
- credits.csv: 124,235 cast/crew records
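
As a rough illustration of how the two files might be loaded and sanity-checked, here is a minimal sketch. The helper names `load_catalog` and `describe_frame` are hypothetical, and the column names in the in-memory sample (`id`, `title`, `type`, `imdb_score`, `person_id`, `name`, `role`) are assumptions for demonstration only; the actual headers in titles.csv and credits.csv may differ.

```python
import io

import pandas as pd


def load_catalog(titles_src, credits_src):
    """Read both CSVs (paths or file-like objects) into DataFrames."""
    titles = pd.read_csv(titles_src)
    credits = pd.read_csv(credits_src)
    return titles, credits


def describe_frame(df):
    """Return row count, column count, and missing values per column."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing": df.isna().sum().to_dict(),
    }


# Tiny in-memory stand-ins for the real files; the column names are assumed.
titles_csv = io.StringIO(
    "id,title,type,imdb_score\n"
    "t1,Example A,MOVIE,7.1\n"
    "t2,Example B,SHOW,\n"
)
credits_csv = io.StringIO(
    "person_id,id,name,role\n"
    "1,t1,Person One,ACTOR\n"
    "2,t1,Person Two,DIRECTOR\n"
    "3,t2,Person Three,ACTOR\n"
)

titles, credits = load_catalog(titles_csv, credits_csv)
summary = describe_frame(titles)
print(summary["rows"], summary["columns"], summary["missing"]["imdb_score"])
```

With the real data, passing `"titles.csv"` and `"credits.csv"` as the two arguments should work the same way, and the reported row counts can then be compared against the figures listed above.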
Python | Pandas | NumPy | Matplotlib | Seaborn | Jupyter Notebook
- Content Split: 80% movies, 20% TV shows
- Top Genres: Drama (3,000+), Comedy (2,500+)
- Peak Growth: 2018 with 700+ titles added
- Regional: 70% US-produced content
- Quality: Average IMDb rating 7.2/10
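
Figures like the ones above could be derived along the following lines. This is a sketch on toy data, not the notebook's actual code: the function `catalog_metrics` is hypothetical, the column names (`type`, `genres`, `production_countries`, `release_year`, `imdb_score`) are assumed, list-valued columns are assumed to be stored as strings such as `"['drama', 'comedy']"`, and `release_year` is used here as a stand-in for when a title was added.

```python
import ast

import pandas as pd


def catalog_metrics(titles):
    """Compute headline catalog metrics from a titles DataFrame."""
    # Percentage of movies vs. TV shows.
    type_share = titles["type"].value_counts(normalize=True).mul(100).round(1)

    # Parse list-like strings, then count one row per (title, genre) pair.
    genres = titles["genres"].apply(ast.literal_eval).explode().dropna()
    genre_counts = genres.value_counts()

    # Share of titles with the US among their production countries.
    countries = titles["production_countries"].apply(ast.literal_eval)
    us_share = countries.apply(lambda c: "US" in c).mean() * 100

    # Titles per year (assumption: release_year approximates growth).
    per_year = titles["release_year"].value_counts().sort_index()

    # Mean IMDb score; missing scores are skipped by default.
    mean_imdb = titles["imdb_score"].mean()

    return {
        "type_share": type_share,
        "genre_counts": genre_counts,
        "us_share": us_share,
        "per_year": per_year,
        "mean_imdb": mean_imdb,
    }


# Toy data only; these values are illustrative, not from the real catalog.
toy = pd.DataFrame(
    {
        "type": ["MOVIE", "MOVIE", "MOVIE", "SHOW"],
        "genres": ["['drama']", "['drama', 'comedy']", "['comedy']", "['drama']"],
        "production_countries": ["['US']", "['US', 'GB']", "['IN']", "['US']"],
        "release_year": [2017, 2018, 2018, 2019],
        "imdb_score": [6.0, 7.0, 8.0, None],
    }
)

metrics = catalog_metrics(toy)
print(metrics["type_share"])
print(metrics["genre_counts"])
```

Exploding the genre lists means a title with several genres is counted once per genre, so genre totals can exceed the number of titles.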
- Rebalance toward more TV shows for engagement
- Invest in underrepresented genres (Thriller, Sci-Fi)
- Diversify regional production beyond US
- Focus on premium quality content (8.0+ ratings)
- Implement data-driven content curation
pip install pandas numpy matplotlib seaborn
jupyter notebook Amazon_Prime_EDA_RaviShankarKumar.ipynb
Ravi Shankar Kumar
BTech CSE (AI/ML)
Email: rshankarkumar906@gmail.com
MIT License