Train a Deep Q-Network (DQN) agent to master the classic Flappy Bird game through reinforcement learning.
This project implements a Deep Q-Network (DQN) agent designed to learn how to play the iconic Flappy Bird game. Utilizing the principles of reinforcement learning, the agent autonomously discovers optimal strategies for navigating through the pipes. The project features a custom-built Flappy Bird environment, a PyTorch-based DQN model, and an interactive Jupyter Notebook for training and evaluation. It serves as an excellent educational resource for understanding fundamental concepts in deep reinforcement learning.
- Deep Q-Network (DQN) Architecture: Implements a neural network to estimate Q-values, enabling intelligent decision-making for the agent.
- Experience Replay Buffer: Incorporates a replay buffer to store and sample past experiences, stabilizing and improving the learning process.
- Target Network: Utilizes a separate target network to further enhance training stability by providing consistent Q-value targets.
- Epsilon-Greedy Exploration: Employs an epsilon-greedy policy for balancing exploration (trying new actions) and exploitation (using learned actions).
- Custom Flappy Bird Environment: A lightweight game environment, built with Pygame, designed for seamless integration with the RL agent.
- Configurable Hyperparameters: Easily adjust training parameters and agent settings via the `parameters.yaml` configuration file.
- Interactive Training & Evaluation: A comprehensive Jupyter Notebook (`All_in_One.ipynb`) guides users through the entire process, from setup and training to demonstrating the trained agent's performance.
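The experience replay buffer can be pictured as a fixed-capacity deque of transitions that the agent samples from uniformly at random. The sketch below is a minimal, framework-free illustration of that store-and-sample pattern; the class and method names are hypothetical and need not match `experience_replay.py` exactly:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # deque with maxlen evicts the oldest transition automatically
        self.memory = deque(maxlen=capacity)

    def append(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive frames
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)
```

Sampling mini-batches from this buffer, rather than learning from each frame as it arrives, is what stabilizes the DQN updates.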
Watch the trained agent in action!
Reinforcement Learning:
Game Environment:
Development & Configuration:
Follow these steps to get the Flappy Bird RL project up and running on your local machine.
- Python 3.8+: Recommended version for compatibility with PyTorch.
- pip: Python package installer.
- Clone the repository

      git clone https://github.com/Mayank-Kumar-Maurya/Flapping_Game-Reinforcement.git
      cd Flapping_Game-Reinforcement

- Install dependencies

  It is recommended to create a virtual environment first:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

  Then, install the required packages:

      pip install torch numpy pygame pyyaml jupyter
- Launch Jupyter Notebook

      jupyter notebook All_in_One.ipynb

  This will open the Jupyter interface in your browser.

- Run the notebook

  Once the Jupyter Notebook is open, execute the cells in `All_in_One.ipynb` sequentially to train and evaluate the agent.
Flapping_Game-Reinforcement/
├── All_in_One.ipynb         # Jupyter Notebook with all code in one place; runnable on Google Colab
├── agent.py # Implements the DQN agent logic
├── dqn.py # Defines the Deep Q-Network (neural network) model
├── experience_replay.py # Implements the experience replay buffer
├── flappy.mp4 # Video demonstration of the trained agent
├── game_flappy_bird.py # Custom Flappy Bird game environment using Pygame
├── parameters.yaml # Configuration file for hyperparameters
└── runs/ # Directory for storing training logs, checkpoints (automatically created during training)
The project's hyperparameters and training settings are managed through `parameters.yaml`:
    # Agent Parameters
    MINI_BATCH_SIZE: 32        # Number of experiences to sample for training
    GAMMA: 0.99                # Discount factor for future rewards
    network_sync_rate: 10      # How often the target network is synced with the policy network

    # Epsilon Decay
    EPSILON_START: 1.0         # Starting value of epsilon for exploration
    EPSILON_END: 0.01          # Minimum value of epsilon
    EPSILON_DECAY: 0.995       # Decay rate for epsilon per episode

    # Game Parameters (if applicable, or inferred from game_flappy_bird.py)
    # Example:
    # SCREEN_WIDTH: 400
    # SCREEN_HEIGHT: 600
    # FPS: 30

You can modify these values to experiment with different training dynamics and agent behaviors.
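With the multiplicative decay above, epsilon after episode *n* is roughly `EPSILON_START * EPSILON_DECAY ** n`, floored at `EPSILON_END`. A quick sketch of the schedule (the function name is illustrative, not from the repository):

```python
EPSILON_START = 1.0    # values mirror parameters.yaml
EPSILON_END = 0.01
EPSILON_DECAY = 0.995

def epsilon_at(episode):
    # Multiplicative decay per episode, clipped at the exploration floor
    return max(EPSILON_END, EPSILON_START * EPSILON_DECAY ** episode)
```

At these settings epsilon reaches its floor of 0.01 after roughly 900 episodes, so early training is exploration-heavy while later episodes mostly exploit the learned policy.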
The `All_in_One.ipynb` notebook is the primary interface for development and training.
- Load Parameters: The notebook loads parameters from `parameters.yaml`.
- Initialize Environment & Agent: Sets up the `FlappyBirdGame` and the `Agent` with the configured DQN model and replay buffer.
- Training Loop: Iterates through a specified number of episodes, where the agent interacts with the environment, collects experiences, and learns from them using the DQN algorithm.
- Model Saving: During training, checkpoints of the agent's policy network are typically saved (e.g., to the `runs/` directory), allowing you to resume training or evaluate specific models.
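At the heart of each learning step, DQN regresses the policy network toward the Bellman target r + γ · maxₐ Q_target(s′, a), computed with the target network for stability. A framework-free sketch of the target computation, assuming a hypothetical `target_q` function that returns a list of Q-values for a state:

```python
GAMMA = 0.99  # matches the discount factor in parameters.yaml

def bellman_targets(batch, target_q):
    """batch: iterable of (state, action, reward, next_state, done) tuples;
    target_q: maps a state to a list of Q-values from the target network."""
    targets = []
    for state, action, reward, next_state, done in batch:
        if done:
            target = reward  # no future reward after a terminal transition
        else:
            target = reward + GAMMA * max(target_q(next_state))
        targets.append(target)
    return targets
```

In the actual PyTorch implementation this is done in batched tensor form, but the per-transition logic is the same.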
To evaluate a trained agent:
- Load a Trained Model: Within `All_in_One.ipynb`, you can load a pre-trained model checkpoint (e.g., from the `runs/` directory) into the agent's policy network.
- Run in Evaluation Mode: Disable exploration (set `epsilon` to 0 or a very small value) and run the agent in the environment to observe its learned behavior. The game environment will render the agent's performance.
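Setting epsilon to 0 reduces action selection to a pure argmax over the policy network's Q-values. A minimal sketch of the epsilon-greedy rule (function name is illustrative; the agent's actual method may differ):

```python
import random

def select_action(q_values, epsilon=0.0):
    """q_values: list of Q-values for each action in the current state."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))  # explore: pick a random action
    # exploit: pick the action with the highest estimated Q-value
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

During training, the same function is called with the current decayed epsilon; during evaluation, `epsilon=0.0` makes the agent fully greedy.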
We welcome contributions to enhance this project! If you have ideas for improvements, new features, or bug fixes, please feel free to:
- Fork the repository.
- Create a new branch for your feature or fix.
- Make your changes and ensure the code adheres to existing style.
- Submit a pull request with a clear description of your changes.
- PyTorch: For providing a powerful and flexible deep learning framework.
- Pygame: For enabling the creation of the custom Flappy Bird game environment.
- Reinforcement Learning Community: For the foundational research and open-source contributions that make projects like this possible.
- 🐛 Issues: GitHub Issues
⭐ Star this repo if you find it helpful!
Made with ❤️ by Mayank Kumar Maurya