StevenRice99/ML-Asteroids

ML-Asteroids

Teaching an agent to play Asteroids with Unity ML-Agents. See a web demo.

Purpose

This project is intended as a learning resource for Unity ML-Agents, highlighting how different training methods can be applied to try to beat a classic game.

Game Overview

  • The agent can travel within a set, square area.
  • It can move forward, turn left or right, and fire at asteroids.

Agent Design

The agent has three discrete action branches:

  1. Stay still = 0 and move = 1.
  2. Don't turn = 0, turn left = 1, and turn right = 2.
  3. Don't fire = 0 and fire = 1.
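The three branches above can be read as one integer per branch. The following is a minimal Python sketch of that decoding, not the project's actual Unity code; the `Command` type, the `decode_action` name, and the sign convention for turning (-1 for left, +1 for right) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Branch sizes taken from the list above: move (2), turn (3), fire (2).
BRANCH_SIZES = (2, 3, 2)


@dataclass(frozen=True)
class Command:
    """Hypothetical, readable form of one decoded action."""
    move: bool
    turn: int  # Assumed convention: -1 = left, 0 = no turn, +1 = right.
    fire: bool


def decode_action(branches):
    """Translate one integer per branch into a Command."""
    if len(branches) != len(BRANCH_SIZES):
        raise ValueError("expected one value per action branch")
    for value, size in zip(branches, BRANCH_SIZES):
        if not 0 <= value < size:
            raise ValueError(f"action {value} is out of range for a branch of size {size}")
    move, turn, fire = branches
    turn_direction = {0: 0, 1: -1, 2: 1}[turn]
    return Command(move=(move == 1), turn=turn_direction, fire=(fire == 1))
```

For example, `decode_action((1, 2, 0))` describes an agent that moves forward while turning right without firing.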

The agent's observations of the environment, all stacked across two frames, consist of:

  1. Agent position - The agent's previous and current positions in the playable area, each given along the horizontal and vertical axes and scaled to [0, 1].
  2. Agent rotation - The agent's rotation scaled to [0, 1].
  3. Raycast sensor - A 2D raycast sensor is attached to the agent, firing 30 rays on each side of the agent.
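The position and rotation scaling described above could look roughly like the Python sketch below. It is illustrative only: the function names, the area parameters, and the assumption that rotation is measured in degrees are not taken from the project. The raycast readings and the two-frame stacking are left out, since in ML-Agents those are typically handled by sensor components and behavior settings rather than hand-written code.

```python
def normalize_position(x, y, min_x, min_y, size):
    """Map a world position inside a square area to [0, 1] on each axis."""
    nx = (x - min_x) / size
    ny = (y - min_y) / size
    # Clamp so a position slightly outside the area still stays in range.
    return (min(max(nx, 0.0), 1.0), min(max(ny, 0.0), 1.0))


def normalize_rotation(degrees):
    """Scale a rotation in degrees (assumed unit) into the range [0, 1)."""
    return (degrees % 360.0) / 360.0


def build_observation(previous_pos, current_pos, rotation_degrees, area_min, area_size):
    """Return [prev_x, prev_y, cur_x, cur_y, rotation], each scaled to [0, 1]."""
    px, py = normalize_position(*previous_pos, *area_min, area_size)
    cx, cy = normalize_position(*current_pos, *area_min, area_size)
    return [px, py, cx, cy, normalize_rotation(rotation_degrees)]
```

With a hypothetical 20-unit square area whose lower-left corner is at (-10, -10), `build_observation((0, 0), (5, -5), 90, (-10, -10), 20)` places the previous position at the center of the area and the current position toward the lower right.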

Agent Rewards

  • A reward of 0.5 is given for every asteroid destroyed.
  • A penalty of -1 is given for being eliminated.
  • A penalty of -0.1 is given for every shot fired.
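To make the trade-off between these values concrete, here is a small Python sketch that totals the reward for one step. The constants come from the list above; the function name and its parameters are hypothetical.

```python
ASTEROID_DESTROYED_REWARD = 0.5
ELIMINATED_PENALTY = -1.0
SHOT_FIRED_PENALTY = -0.1


def step_reward(asteroids_destroyed, shots_fired, eliminated):
    """Sum the reward terms listed above for a single step."""
    reward = asteroids_destroyed * ASTEROID_DESTROYED_REWARD
    reward += shots_fired * SHOT_FIRED_PENALTY
    if eliminated:
        reward += ELIMINATED_PENALTY
    return reward
```

Under these values, a shot that destroys an asteroid nets 0.4 overall, while a miss costs 0.1, so the agent can afford up to four misses per hit before firing stops paying off.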

Agent Training

The agent was trained with Proximal Policy Optimization (PPO), a training curriculum, a curiosity reward signal to encourage exploration, and imitation learning in the form of both Behavioral Cloning (BC) and Generative Adversarial Imitation Learning (GAIL). The demonstrations for imitation learning were recorded using the heuristic agent.
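As a rough picture of how these pieces fit together, the Python sketch below builds a dictionary that mirrors the layout of an ML-Agents trainer configuration, which is normally written in YAML. It is not the project's configuration: the behavior name, the demonstration path, and every numeric value are placeholders, and the curriculum is omitted.

```python
import json

BEHAVIOR_NAME = "Asteroids"                   # Hypothetical behavior name.
DEMO_PATH = "Demonstrations/heuristic.demo"   # Hypothetical demonstration file.


def build_trainer_config(bc_strength=0.1, gail_strength=0.1, curiosity_strength=0.02):
    """Return a dict shaped like an ML-Agents trainer config (placeholder values)."""
    return {
        "behaviors": {
            BEHAVIOR_NAME: {
                "trainer_type": "ppo",
                "reward_signals": {
                    # Reward from the environment itself.
                    "extrinsic": {"gamma": 0.99, "strength": 1.0},
                    # Intrinsic reward that encourages exploration.
                    "curiosity": {"gamma": 0.99, "strength": curiosity_strength},
                    # Reward for resembling the recorded demonstrations.
                    "gail": {"strength": gail_strength, "demo_path": DEMO_PATH},
                },
                # Supervised imitation of the same demonstrations.
                "behavioral_cloning": {"demo_path": DEMO_PATH, "strength": bc_strength},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_trainer_config(), indent=2))
```

Note that BC and GAIL both draw on the same demonstrations here, so the heuristic agent's behavior influences training through two separate channels.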

Heuristic Agent

The heuristic agent tries to aim at the "best" asteroid. This is determined by first seeing if any asteroids are on a collision path with the agent. If there are, the nearest asteroid on a collision path is chosen. Otherwise, the nearest asteroid not on a collision path is chosen. Then, the agent rotates to face the selected asteroid, firing if it is facing said asteroid. The agent does not move on its own, but human keyboard controls can take over, allowing for manual movement and firing using the arrow keys or WASD alongside space to fire.
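The target selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: the source does not say how "on a collision path" is determined, so the sketch assumes a stationary agent and straight-line asteroid motion, and checks whether the asteroid's closest approach falls within a hypothetical `hit_radius`.

```python
import math


def on_collision_path(agent_pos, asteroid_pos, asteroid_vel, hit_radius):
    """Assumed test: does the asteroid's straight-line path pass within hit_radius?"""
    rx = asteroid_pos[0] - agent_pos[0]
    ry = asteroid_pos[1] - agent_pos[1]
    vx, vy = asteroid_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0:
        return math.hypot(rx, ry) <= hit_radius
    # Time of closest approach, clamped so only the future is considered.
    t = max(-(rx * vx + ry * vy) / speed_sq, 0.0)
    return math.hypot(rx + vx * t, ry + vy * t) <= hit_radius


def select_target(agent_pos, asteroids, hit_radius):
    """Pick the nearest threatening asteroid, else the nearest asteroid overall.

    `asteroids` is a list of (position, velocity) pairs; returns None if it is empty.
    """
    if not asteroids:
        return None
    threats = [a for a in asteroids
               if on_collision_path(agent_pos, a[0], a[1], hit_radius)]
    candidates = threats or asteroids
    return min(candidates, key=lambda a: math.dist(agent_pos, a[0]))
```

A distant asteroid heading straight for the agent is therefore chosen over a closer one that will pass by harmlessly.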

Results

The trained model does not perform particularly well: it rarely moves out of the way of asteroids. This suggests it has likely overfit to the Behavioral Cloning (BC) demonstrations from the heuristic agent, which does not move on its own, even though the BC strength was configured to be very low.

Running

If you just wish to see the agent in action, you can run the web demo.

Run Training

To train the agent, you can either read the Unity ML-Agents documentation to learn how to install and run Unity ML-Agents, or use the provided helper functions to train the agent.

Helper Functions

The helper files were written for Windows and require uv to be installed. Once it is installed, select ML-Asteroids from the top menu of the Unity editor, followed by the desired command to run.

  • Train - Run training.
  • TensorBoard - This will open your browser to see the TensorBoard logs of the training of all models.
  • Install - If you have uv installed for Python, this will set up your environment for running all other commands. Note: This assumes you have NVIDIA CUDA support. You will need to remove the --index-url https://download.pytorch.org/whl/cu121 from the PyTorch installation line if you do not have an NVIDIA GPU with CUDA support.
  • Activate - This will open a terminal in your uv Python virtual environment for this project, allowing you to run other commands.

Resources

Assets are from Asteroids by Zigurous.