Teaching an agent to play Asteroids with Unity ML-Agents. See a web demo.
The purpose of this project is to serve as a learning resource for Unity ML-Agents, highlighting how different methods can be applied to try to beat a classic game.
- The agent can travel within a set, square area.
- It can move forwards, turn left or right, and fire at asteroids.
The agent has several discrete actions:
- Stay still = `0` and move = `1`.
- Don't turn = `0`, turn left = `1`, and turn right = `2`.
- Don't fire = `0` and fire = `1`.
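As a rough illustration of the action layout above, here is a minimal Python sketch that decodes one value per action into a readable command. It assumes each bullet is its own discrete branch (sizes 2, 3, and 2); `AgentCommand` and `decode_action` are hypothetical names for this example and are not part of ML-Agents or this project.

```python
from dataclasses import dataclass

# Branch sizes assumed from the action list: move (2), turn (3), fire (2).
BRANCH_SIZES = (2, 3, 2)


@dataclass(frozen=True)
class AgentCommand:
    """Hypothetical container for a decoded action."""
    move: bool
    turn: int  # -1 = left, 0 = no turn, +1 = right
    fire: bool


def decode_action(branches):
    """Convert one integer per branch into an AgentCommand."""
    if len(branches) != len(BRANCH_SIZES):
        raise ValueError("expected one value per action branch")
    for value, size in zip(branches, BRANCH_SIZES):
        if not 0 <= value < size:
            raise ValueError(f"value {value} is out of range for a branch of size {size}")
    move, turn, fire = branches
    # 0 = don't turn, 1 = turn left, 2 = turn right.
    turn_direction = {0: 0, 1: -1, 2: 1}[turn]
    return AgentCommand(move=(move == 1), turn=turn_direction, fire=(fire == 1))
```

For example, `decode_action([1, 2, 0])` describes an agent that moves forwards while turning right without firing.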
The agent's observations of the environment, all stacked across two frames, consist of:
- Agent position - Both the agent's previous and current positions in the playable area are given along the horizontal and vertical axes, each in the range `[0, 1]`.
- Agent rotation - The agent's rotation scaled to `[0, 1]`.
- Raycast sensor - A 2D raycast sensor is attached to the agent, firing 30 rays on each side of the agent.
- A reward of `0.5` is given for every asteroid destroyed.
- A penalty of `-1` is given for being eliminated.
- A penalty of `-0.1` is given for every shot fired.
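To make the trade-off in these rewards concrete, the sketch below adds up the reward for a single step. The constants come from the list above; the function name and its arguments are hypothetical and only serve the example.

```python
ASTEROID_REWARD = 0.5      # per asteroid destroyed
ELIMINATED_PENALTY = -1.0  # when the agent is eliminated
SHOT_PENALTY = -0.1        # per shot fired


def step_reward(asteroids_destroyed, shots_fired, eliminated):
    """Sum the rewards and penalties earned in one step."""
    reward = ASTEROID_REWARD * asteroids_destroyed + SHOT_PENALTY * shots_fired
    if eliminated:
        reward += ELIMINATED_PENALTY
    return reward
```

Under these values a shot that destroys an asteroid nets roughly `0.4`, so the agent can afford up to four misses per hit before firing becomes a net loss.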
The agent was trained with Proximal Policy Optimization (PPO), a training curriculum, a curiosity reward signal to encourage exploration, and imitation learning in the form of both Behavioral Cloning (BC) and Generative Adversarial Imitation Learning (GAIL). The demonstrations for imitation learning were recorded using the heuristic agent.
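To show roughly where these pieces sit in an ML-Agents trainer configuration, the sketch below mirrors the usual YAML layout as a Python dictionary. The key names follow the ML-Agents configuration format, but the behavior name, demonstration path, and every numeric value are placeholders rather than this project's actual settings, and the curriculum is omitted for brevity.

```python
# Placeholder values throughout; only the key names follow ML-Agents.
DEMO_PATH = "Demonstrations/Asteroids.demo"  # hypothetical path

trainer_config = {
    "behaviors": {
        "Asteroids": {  # hypothetical behavior name
            "trainer_type": "ppo",
            "reward_signals": {
                # Rewards and penalties coming from the environment.
                "extrinsic": {"strength": 1.0, "gamma": 0.99},
                # Intrinsic reward that encourages exploration.
                "curiosity": {"strength": 0.02, "gamma": 0.99},
                # Reward for behaving like the recorded demonstrations.
                "gail": {"strength": 0.01, "demo_path": DEMO_PATH},
            },
            # Supervised imitation of the same demonstrations.
            "behavioral_cloning": {"strength": 0.01, "demo_path": DEMO_PATH},
        }
    }
}


def imitation_strengths(config, behavior):
    """Return the GAIL and BC strengths for one behavior."""
    settings = config["behaviors"][behavior]
    return {
        "gail": settings["reward_signals"]["gail"]["strength"],
        "behavioral_cloning": settings["behavioral_cloning"]["strength"],
    }
```

The two `strength` values under `gail` and `behavioral_cloning` control how strongly the demonstrations influence training.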
The heuristic agent tries to aim at the "best" asteroid. This is determined by first seeing if any asteroids are on a collision path with the agent. If there are, the nearest asteroid on a collision path is chosen. Otherwise, the nearest asteroid not on a collision path is chosen. Then, the agent rotates to face the selected asteroid, firing if it is facing said asteroid. The agent does not move on its own, but human keyboard controls can take over, allowing for manual movement and firing using the arrow keys or WASD alongside space to fire.
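The target selection described above might look like the following sketch. The collision test is an assumption (a stationary agent and asteroids moving in straight lines), as are the `Asteroid` class, the heading convention (degrees, counterclockwise from the positive x axis, with "left" meaning counterclockwise), and the aiming tolerance; the project's actual implementation may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Asteroid:
    """Hypothetical stand-in for the game's asteroid objects."""
    x: float
    y: float
    vx: float
    vy: float
    radius: float


def on_collision_path(agent_pos, agent_radius, asteroid):
    """Assumed test: does the asteroid's straight path pass through the agent?"""
    rel_x, rel_y = agent_pos[0] - asteroid.x, agent_pos[1] - asteroid.y
    speed_sq = asteroid.vx ** 2 + asteroid.vy ** 2
    if speed_sq == 0:
        return False  # A stationary asteroid never reaches the agent.
    # Time at which the asteroid is closest to the agent.
    t = (rel_x * asteroid.vx + rel_y * asteroid.vy) / speed_sq
    if t < 0:
        return False  # The asteroid is moving away.
    closest_x = asteroid.x + asteroid.vx * t
    closest_y = asteroid.y + asteroid.vy * t
    miss = math.hypot(agent_pos[0] - closest_x, agent_pos[1] - closest_y)
    return miss <= agent_radius + asteroid.radius


def pick_target(agent_pos, agent_radius, asteroids):
    """Nearest asteroid on a collision path, else the nearest asteroid."""
    if not asteroids:
        return None
    threats = [a for a in asteroids if on_collision_path(agent_pos, agent_radius, a)]
    candidates = threats or asteroids
    return min(candidates, key=lambda a: math.hypot(a.x - agent_pos[0], a.y - agent_pos[1]))


def aim_action(agent_pos, heading_deg, target, tolerance_deg=5.0):
    """Return (turn, fire) using the action values 0/1/2 and 0/1."""
    desired = math.degrees(math.atan2(target.y - agent_pos[1], target.x - agent_pos[0]))
    error = (desired - heading_deg + 180.0) % 360.0 - 180.0  # in [-180, 180)
    if abs(error) <= tolerance_deg:
        return 0, 1  # Facing the target: don't turn, fire.
    return (1 if error > 0 else 2), 0  # Turn left or right, hold fire.
```

Because this heuristic only turns and fires, demonstrations recorded from it contain no movement.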
The trained model does not perform particularly well. It has likely overfit to the Behavioral Cloning (BC) demonstrations from the heuristic agent, despite the BC strength being configured to be very low, which leads to it rarely moving out of the way.
If you just wish to see the agent in action, you can run the web demo.
To train the agent, you can either read the Unity ML-Agents documentation to learn how to install and run Unity ML-Agents, or use the provided helper functions to train the agent.
The helper files have been made for Windows and you must install uv. Once installed, from the top menu of the Unity editor, you can select ML-Asteroids followed by the desired command to run.
- Train - Run training.
- TensorBoard - This will open your browser to see the TensorBoard logs of the training of all models.
- Install - If you have uv installed for Python, this will set up your environment for running all other commands. Note: This assumes you have NVIDIA CUDA support. You will need to remove the `--index-url https://download.pytorch.org/whl/cu121` from the PyTorch installation line if you do not have an NVIDIA GPU with CUDA support.
- Activate - This will open a terminal in your uv Python virtual environment for this project, allowing you to run other commands.
Assets are from Asteroids by Zigurous.