A toolkit for visualizing MAG quality, taxonomy, clustering, abundance patterns, and functional annotation.
This tool is distributed as a Python package with a command-line interface (CLI).
There are two main ways to install and use the tool:
- Recommended (users): install the package from source and use the command-line tool
- Alternative (developers): clone the repository and work on the code base

Requirements:
- Python ≥ 3.9
- Conda (Miniconda, Miniforge, Mambaforge)
- Git

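
The Python requirement above can be checked before installing. The following is a minimal sketch; the helper name `python_is_supported` is made up for illustration and is not part of the package.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the requirements list


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report whether the interpreter running this script is new enough.
print("Python version OK" if python_is_supported() else "Python is too old")
```
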
This is the recommended way to install and use the tool.
Clone the repository and change into the project directory:

    git clone https://github.com/usegalaxy-eu/MAGs-visualization.git
    cd MAGs-visualization

Install the package using pip:

    pip install .

After installation, the command-line tool is available as:

    mags-visualization --help

This method works independently of the repository structure.

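
To confirm from Python that the installed command is reachable, one option is to look it up on `PATH`. This is a small sketch using only the standard library; `find_cli` is a hypothetical helper name, and the hint in the message is an assumption about a common cause, not a diagnosis.

```python
import shutil


def find_cli(name="mags-visualization"):
    """Return the full path of the command, or None if it is not on PATH."""
    return shutil.which(name)


path = find_cli()
if path is None:
    # A frequent cause is that the environment used for `pip install` is not active.
    print("mags-visualization not found on PATH")
else:
    print(f"mags-visualization found at {path}")
```
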
This option is intended for development, testing, or extending the code.

    git clone https://github.com/alexandrah1704/MAGs-visualization.git
    cd MAGs-visualization

Create the conda environment, activate it, and install the package in editable mode:

    conda env create -f environment.yml
    conda activate mags
    pip install -e .

Alternatively, on Windows (PowerShell), use a Python virtual environment:

    # Change into project directory
    cd MAGs-visualization
    # Create virtual environment
    python -m venv .venv
    # Allow script execution for this session
    Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
    # Activate virtual environment
    .\.venv\Scripts\Activate.ps1
    # Install the package and its dependencies in editable mode
    pip install -e .

After installation, the command-line tool is available as:

    mags-visualization --help

This tool generates a variety of visualizations for MAGs, including:
- Taxonomic Sankey diagrams
- Completeness/contamination plots
- Heatmaps
- dRep cluster visualization with taxonomic annotation
- dRep cluster visualization with functional annotation (KEGG pathway completeness)
- Standalone KEGG pathway module heatmaps
- Rank distribution diagram...
All plots are saved in a user-defined output directory.
Below are the inputs for a visualization run:
| Argument | Description |
|---|---|
| --coverm | CoverM table |
| --checkm | CheckM result file |
| --checkm2 | CheckM2 result file |
| --gtdb | GTDB annotation table |
| --drep | dRep cluster table |
| -o | Output folder for all generated plots |
Optional:
| Argument | Description |
|---|---|
| --quast | QUAST assembly statistics |
| --bakta | Bakta annotation table |
| --metadata | Metadata table for coloring plots and for heatmap visualization |
| --amber | CAMI Amber binning evaluation |
| --pathways | KEGG pathway completeness |

Each plot type uses its own subset of these inputs:

- `--coverm` (required), `--gtdb` (required), `--metadata` (optional), `--meta_cols` (optional)
- `--checkm` (required), `--checkm2` (required), `--gtdb` (required for `--mode tax`), `--metadata` + `--meta_col` (required for `--mode meta`)
- `--drep` (required), `--gtdb` (required), `--checkm2`, `--quast`, `--bakta` (required for annotated heatmap)
- `--drep` (required), `--gtdb` (required), `--pathways` (required for functional annotation heatmap)
- `--drep` (required), `--gtdb` (required), `--pathways` (required)
- `--gtdb` (required)

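
The mode-dependent requirements listed above (`--gtdb` only for `--mode tax`, `--metadata` plus `--meta_col` only for `--mode meta`) can be expressed as a small pre-flight check. This is a hypothetical helper, not the tool's own validation code; the function name and the dictionary layout are assumptions, and any mode other than `tax` or `meta` is assumed to need only the two CheckM inputs.

```python
def missing_inputs(mode, provided):
    """Return the required argument names that are absent from `provided`.

    `provided` maps argument names (without leading dashes) to their values.
    """
    required = ["checkm", "checkm2"]  # listed as required regardless of mode
    if mode == "tax":
        required.append("gtdb")
    elif mode == "meta":
        required += ["metadata", "meta_col"]
    # An argument counts as missing if it is absent or has an empty value.
    return [name for name in required if not provided.get(name)]


print(missing_inputs("tax", {"checkm": "checkm.tsv", "checkm2": "checkm2.tsv"}))
```

With the inputs shown, the check reports `gtdb` as the only missing argument.
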
The command-line interface is organized into subcommands. Each subcommand generates exactly one type of plot and only shows the parameters relevant for that plot.

    mags-visualization <subcommand> [OPTIONS]

Available subcommands:

- `sample-heatmap` - MAG detection heatmap per sample
- `drep-cluster-annot` - dRep cluster visualization with taxonomic/assembly annotation
- `drep-cluster-func` - dRep cluster overview with taxonomy and functional module heatmap
- `pathway-module-heatmap` - heatmap of KEGG pathway module completeness across MAGs
- `comp-conta` - completeness/contamination plots
- `taxa-sankey` - GTDB taxonomy Sankey plots
- `all` - legacy mode running multiple plots in one command

The `all` subcommand is mainly intended for testing.
For Galaxy integration, the dedicated subcommands are recommended.
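
When calling a subcommand from a script or a workflow wrapper, it can help to assemble the argument list programmatically. The sketch below only builds the list; `build_command` is a made-up helper, and it assumes every option is a long option that takes one or more values (flags without a value are not covered).

```python
import shlex


def build_command(subcommand, **options):
    """Build an argument list for `mags-visualization <subcommand> [OPTIONS]`.

    Keyword names become long options (coverm -> --coverm); list or tuple
    values are expanded into several arguments after the option name.
    """
    argv = ["mags-visualization", subcommand]
    for name, value in options.items():
        argv.append(f"--{name}")
        if isinstance(value, (list, tuple)):
            argv.extend(str(v) for v in value)
        else:
            argv.append(str(value))
    return argv


cmd = build_command(
    "sample-heatmap",
    coverm="test-data/coverm.tsv",
    gtdb="test-data/gtdb.tsv",
    output="out/sample-heatmap",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list instead of a single shell string avoids quoting problems with values that contain spaces, such as metadata column names.
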

    mags-visualization --help

    mags-visualization sample-heatmap --help
    mags-visualization drep-cluster-annot --help

    mags-visualization sample-heatmap \
      --coverm test-data/coverm.tsv \
      --gtdb test-data/gtdb.tsv \
      --output out/sample-heatmap

    mags-visualization all \
      --coverm test-data/coverm.tsv \
      --checkm test-data/checkm.tsv \
      --checkm2 test-data/checkm2.tsv \
      --gtdb test-data/gtdb.tsv \
      --drep test-data/drep.csv \
      --quast test-data/quast.tsv \
      --bakta test-data/bakta.tsv \
      --pathways test-data/kegg_pathway_completeness.tsv \
      --metadata test-data/metadata.tsv \
      --meta_cols "Infection by Nosema ceranae" "Chronic exposure to neonicotinoid" "Treatment with probiotic" \
      --color_by tax \
      --tax_level phylum \
      --top_n 30 \
      --top_bar_spacer -0.5 \
      --spacer_meta 2.5 \
      -o test-plots-run

    python scripts/test-script.py

    --rank phylum

Available ranks: domain, phylum, class, order, family, genus, species

    --top_n_counts 10

Minimum and default: 5

    --fig_size WIDTH HEIGHT

    --format png             # png, pdf or svg

    --quality                # color points by quality categories hq, mq, lq
    # or
    --color_by quality
    --tax                    # color by taxonomy
    --color_by tax
    --tax_level genus
    --color_by meta          # color by metadata
    --meta_col temperature   # weather or others
    --meta_bin_width 5       # for numeric columns

To show more than one metadata column in the heatmap:

    --meta_cols weather temp ground   # example columns

    --top_bar_height 0.8     # Height of top bar
    --hspace 0.25            # Gap between top bar and heatmap
    --heatmap_width 11.0
    --spacer_legend 0.3      # Gap between legend and meta_bar
    --spacer_meta 2.0        # Gap between meta_bar and heatmap
    --spacer_heatmap         # Gap between heatmap and histogram
    --legend 2.5             # Size of legend
    --meta_bar_add 1.5       # Additional width for meta_bar
    --top_bar_spacer 0.0     # Gap between header and top bar
    --max_col 10             # How many taxonomy names are shown (top 10)

    --top_n 30               # show top 30 clusters with most cluster members

Full examples can be found in `use-cases/README.md`.
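
To illustrate what `--top_n` selects, the sketch below ranks clusters by their number of members and keeps the largest ones. It is not the tool's implementation: the function name, the genome-to-cluster mapping, and the cluster identifiers are invented for the example, and how the tool itself breaks ties is not specified here.

```python
from collections import Counter


def top_clusters(cluster_of_genome, top_n=30):
    """Return the `top_n` cluster IDs with the most member genomes."""
    counts = Counter(cluster_of_genome.values())
    # most_common sorts by member count, largest first.
    return [cluster for cluster, _ in counts.most_common(top_n)]


# Made-up assignments: genome name -> cluster ID.
assignments = {
    "g1": "1_1", "g2": "1_1",
    "g3": "2_1",
    "g4": "3_1", "g5": "3_1", "g6": "3_1",
}
print(top_clusters(assignments, top_n=2))
```

Here cluster `3_1` has three members and `1_1` has two, so those two are kept and `2_1` is dropped.
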