GEMCAT GUI (Gene Expression-based Metabolic Context Analysis Tool) is a user-friendly graphical application for performing gene-expression to metabolite-concentration analysis. It maps gene expression data (RNA sequencing, proteomics) to genome-scale metabolic models and predicts changes in metabolite concentrations.
- Gene Expression Analysis - Load and analyze RNA-seq or proteomics data
- Metabolic Model Mapping - Support for Human (Recon3D), Mouse, and Rat models
- Interactive Visualization - Plot and explore top metabolite changes
- Export Results - Save analysis results to CSV or Excel format
- Modern Interface - Clean PyQt5-based GUI with intuitive workflow
- Pathway Filtering - Filter metabolites by specific metabolic pathways
- Python 3.10 or higher
- pip package manager
# Clone the repository
git clone https://github.com/MolecularBioinformatics/gemcat_gui.git
cd gemcat_gui
# Install dependencies
pip install -r requirements.txt
# Install the package in development mode
pip install -e .
After installation, you can launch GEMCAT GUI in several ways:
# Method 1: Using the command-line entry point
gemcat-gui
# Method 2: Using Python module
python -m gemcat_gui.gui.main_window
# Method 3: Using the convenience script
python run_gemcat.py
Run the application using one of the methods above. The GUI has three main tabs:
- Configuration - Load data, select models, and configure analysis
- Results - View and export analysis results
- Help - Access in-app documentation
- Navigate to the Configuration tab
- Click Load RNA Data and select your gene expression file (CSV, TSV, or XLSX)
- Select the appropriate Data Separator (comma, tab, etc.)
- Choose the RNA Index Column containing gene identifiers
- Select a COBRA Model (Human_Recon3D, Mouse, Rat)
- Choose the Mapping ID Column that matches your gene identifiers
- Click Map Genes to link expression data to the metabolic model
- Select Comparison Column (treatment/experimental group)
- Select Baseline Column (control group)
- Click Run GEMCAT to perform the analysis
- View results in the Results tab
This tab is where you prepare and configure your data for GEMCAT analysis.
This section is for loading your RNA gene expression data.
- Load RNA Data: Click this button to open a file dialog. Select your gene expression data file (e.g., CSV, TSV, XLSX).
- Note: Your RNA data file should have gene identifiers (e.g., Ensembl IDs, Gene Symbols) in one column and expression values for different samples/conditions in other columns. Make sure the gene identifiers are consistent and accurate, as they are crucial for successful gene mapping.
- Show Advanced Settings: Check this box to reveal advanced options, including the data separator.
- Data Separator: Select the delimiter used in your data file (e.g., comma ',' for CSV, tab '\t' for TSV). This setting is essential for correctly parsing your input file.
- RNA Index Column: After loading your RNA data, this dropdown will populate with all columns from your file. Select the column that contains your gene identifiers (e.g., 'Gene_ID', 'Ensembl_ID').
- Hint: This column typically contains unique identifiers for each gene. Common examples include gene symbols (e.g., 'GAPDH'), Ensembl IDs (e.g., 'ENSG00000111640'), or Entrez IDs. Ensure the identifiers in this column are clean and consistent, as they will be used to match with the gene mapping file.
- RNA Data Preview (First 10 Rows): This table will display the first 10 rows of your loaded RNA data, allowing you to verify that it has been loaded correctly.
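The loading settings above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard `csv` module, not the loader GEMCAT GUI actually uses; the function name `load_expression_table` and the sample data are hypothetical.

```python
import csv
import io


def load_expression_table(handle, separator=",", index_column="Gene_ID"):
    """Parse a delimited expression table into {gene_id: {sample: value}}.

    Hypothetical helper mirroring the 'Data Separator' and
    'RNA Index Column' settings; not part of GEMCAT GUI.
    """
    reader = csv.DictReader(handle, delimiter=separator)
    if reader.fieldnames is None or index_column not in reader.fieldnames:
        # A wrong separator usually shows up as one fused header column.
        raise ValueError(
            f"Index column {index_column!r} not found; check the separator"
        )
    table = {}
    for row in reader:
        gene_id = row.pop(index_column).strip()
        table[gene_id] = {sample: float(value) for sample, value in row.items()}
    return table


# Tab-separated sample data, made up for illustration.
sample_tsv = (
    "Gene_ID\tcontrol\ttreated\n"
    "ENSG00000111640\t120.5\t240.0\n"
    "ENSG00000141510\t30.0\t15.0\n"
)
expression = load_expression_table(io.StringIO(sample_tsv), separator="\t")
preview = list(expression.items())[:10]  # analogous to the 10-row preview
```

Parsing the same text with `separator=","` raises the `ValueError`, which is the typical symptom of a mismatched Data Separator.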
In this section, you will configure the COBRA model and gene mapping settings.
- COBRA Model: Select the COBRA metabolic model you wish to use for the GEMCAT analysis. The available models are pre-configured within the application.
- Note: Selecting a model will automatically load the corresponding gene mapping file required for that model. If a mapping file is not found, an error message will appear. The gene number column used for mapping (e.g., 'gene_number' for Human_Recon3D, 'SYMBOL' for Rat and Mouse) is automatically determined by the selected model and does not require manual selection.
- Mapping ID Column: After selecting a COBRA model, this dropdown will populate with columns from the loaded gene mapping file. Select the column in the mapping file that contains gene identifiers matching those in your RNA data (e.g., 'Ensembl_ID', 'Locus_Tag'). This column will be used to link your RNA data to the metabolic model.
- Hint: This column should contain gene identifiers that correspond exactly to the identifiers in your 'RNA Index Column' (e.g., if your RNA data uses Ensembl IDs, select the Ensembl ID column from the mapping file). Mismatches here will lead to failed gene mapping.
- Map Genes: Click this button after you have loaded RNA data, selected an RNA Index Column, chosen a COBRA model, and specified the Mapping ID Column. This step performs the gene mapping, translating your gene expression data into a format compatible with the GEMCAT model.
- Important: You must successfully map genes before you can run the GEMCAT analysis.
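Conceptually, gene mapping re-keys each expression row from the identifier in your file to the identifier the model uses. The sketch below illustrates that idea under stated assumptions: `map_genes` is a hypothetical helper, the identifiers are made up, and summing values when several input genes map to the same model gene is a choice made for this example only, not necessarily what GEMCAT GUI does.

```python
def map_genes(expression, mapping_rows, mapping_id_column, model_gene_column):
    """Re-key expression values from user gene IDs to model gene IDs.

    Returns (mapped, unmapped): `mapped` is {model_gene: {sample: value}},
    `unmapped` lists user gene IDs that are absent from the mapping table.
    Hypothetical helper; summing duplicates is an assumption of this sketch.
    """
    lookup = {
        row[mapping_id_column]: row[model_gene_column] for row in mapping_rows
    }
    mapped, unmapped = {}, []
    for gene_id, samples in expression.items():
        model_gene = lookup.get(gene_id)
        if model_gene is None:
            unmapped.append(gene_id)
            continue
        target = mapped.setdefault(model_gene, {})
        for sample, value in samples.items():
            target[sample] = target.get(sample, 0.0) + value
    return mapped, unmapped


# Made-up identifiers: two input genes share one model gene, one has no match.
expression = {
    "ENSG_A": {"control": 10.0, "treated": 20.0},
    "ENSG_B": {"control": 5.0, "treated": 2.5},
    "ENSG_X": {"control": 1.0, "treated": 1.0},
}
mapping_rows = [
    {"Ensembl_ID": "ENSG_A", "gene_number": "G1"},
    {"Ensembl_ID": "ENSG_B", "gene_number": "G1"},
]
mapped, unmapped = map_genes(expression, mapping_rows, "Ensembl_ID", "gene_number")
```

A long `unmapped` list is a quick sign that the Mapping ID Column does not use the same identifier type as the RNA Index Column.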
Once all configurations are complete, you can initiate the GEMCAT analysis.
- Comparison Column: Select the column from your RNA data that represents your experimental or 'treatment' group.
- Baseline Column: Select the column from your RNA data that represents your control or 'baseline' group. GEMCAT will calculate changes relative to this column.
- Warning: The Comparison and Baseline columns cannot be the same.
- Run GEMCAT: Click this button to start the GEMCAT analysis. This process can take some time, depending on the size of your data and the complexity of the model.
- Note: Although the analysis runs in a separate thread, the GUI may appear temporarily unresponsive during this calculation. A 'Wait Cursor' will indicate that a process is running.
- Note: Upon completion, the application will automatically switch to the 'Results' tab.
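The column checks described above can be sketched as a small validation step that runs before any analysis. This example does not call GEMCAT itself; `select_conditions` is a hypothetical helper, and the input layout (`{gene: {sample: value}}`) is an assumption made for illustration.

```python
def select_conditions(expression, comparison, baseline):
    """Validate the chosen columns and split them into two per-gene dicts.

    Hypothetical helper; `expression` is assumed to be
    {gene: {sample_name: value}}.
    """
    if comparison == baseline:
        raise ValueError("Comparison and baseline columns cannot be the same.")
    comparison_values, baseline_values = {}, {}
    for gene, samples in expression.items():
        for column in (comparison, baseline):
            if column not in samples:
                raise KeyError(f"Column {column!r} is missing for gene {gene!r}")
        comparison_values[gene] = samples[comparison]
        baseline_values[gene] = samples[baseline]
    return comparison_values, baseline_values


# Made-up mapped expression data.
mapped_expression = {
    "G1": {"control": 15.0, "treated": 22.5},
    "G2": {"control": 8.0, "treated": 4.0},
}
treated, control = select_conditions(mapped_expression, "treated", "control")
```

Failing early with a clear message is cheaper than starting a long-running analysis on an invalid selection.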
This tab displays the computed GEMCAT results and allows for visualization.
- This table will display the predicted metabolite changes resulting from the GEMCAT analysis. The primary column will show the predicted change (e.g., fold change) for each metabolite.
- Search: Use the search box above the table to instantly filter results by any value or keyword.
- Sort: Click any column header to sort the table. Numeric columns (including log2FoldChange) sort correctly.
- Save Results: Click this button to save the entire results table to a CSV (.csv) or Excel (.xlsx) file.
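The search, sort, and save behaviour described above can be approximated outside the GUI. The sketch below uses only the standard library and made-up result rows; the helper names are hypothetical, and only the CSV case is shown. It also shows why numeric columns need a numeric sort key: as text, "10.0" would sort before "9.0".

```python
import csv
import io


def search_rows(rows, keyword):
    """Keep rows in which any cell contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        row for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]


def sort_rows(rows, column, descending=False):
    """Sort rows by a numeric column, converting cell text to float first."""
    return sorted(rows, key=lambda row: float(row[column]), reverse=descending)


def write_csv(rows, handle):
    """Write a non-empty list of row dicts as CSV, header first."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Made-up results for illustration.
results = [
    {"metabolite": "atp_c", "log2FoldChange": "-1.5"},
    {"metabolite": "nadh_c", "log2FoldChange": "10.0"},
    {"metabolite": "glc_c", "log2FoldChange": "9.0"},
]
ascending = sort_rows(results, "log2FoldChange")
matches = search_rows(results, "NAD")
buffer = io.StringIO()
write_csv(ascending, buffer)
```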
- Pathway: Use this dropdown to filter the metabolites displayed in the plot by a specific metabolic pathway. Select "All Pathways" to view metabolites from all pathways.
- Top N: Use the spin box to specify how many of the top metabolites (based on absolute predicted change) you want to display in the plot.
- Plot Top N Metabolites: Click this button to generate or update the bar plot.
- Red bars indicate a predicted decrease in metabolite concentration.
- Green bars indicate a predicted increase in metabolite concentration.
- Plot Navigation Toolbar: Below the plot, you'll find a toolbar with standard matplotlib navigation controls (Pan, Zoom, Home, Save, etc.).
- Save Plot: The plot can be saved as an image (e.g., PNG, JPG, PDF) directly from the plot navigation toolbar by clicking the disk icon.
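The selection behind the plot (pathway filter, top N by absolute change, and bar colors) can be sketched as follows. This is an illustration under stated assumptions: `top_n_changes` is a hypothetical helper, the predicted changes are assumed to be signed values (for example a log2 fold change) so that a negative value means a decrease, and treating an exact zero as an increase is an arbitrary choice of this sketch.

```python
def top_n_changes(changes, n, pathways=None, pathway="All Pathways"):
    """Return the n largest changes by absolute value with a bar color each.

    `changes` maps metabolite -> signed predicted change; `pathways` maps
    metabolite -> pathway name. Hypothetical helper, not GEMCAT GUI code.
    """
    if pathway != "All Pathways":
        pathways = pathways or {}
        changes = {
            metabolite: value
            for metabolite, value in changes.items()
            if pathways.get(metabolite) == pathway
        }
    ranked = sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)
    return [
        (metabolite, value, "green" if value >= 0 else "red")
        for metabolite, value in ranked[:n]
    ]


# Made-up predictions and pathway labels.
changes = {"atp_c": -2.0, "nadh_c": 0.5, "glc_c": 1.5, "pyr_c": -0.1}
pathways = {"atp_c": "Glycolysis", "glc_c": "Glycolysis", "nadh_c": "TCA"}
top_two = top_n_changes(changes, 2)
tca_only = top_n_changes(changes, 5, pathways=pathways, pathway="TCA")
```

The resulting tuples can be passed to any plotting library, for example as labels, bar lengths, and colors for a matplotlib bar chart.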
This tab provides access to this help documentation directly within the application. You can also use the Open Log File button to view the application's log file (gemcat_gui.log) in your default text editor for troubleshooting.
- "Error loading RNA data..." / "Error mapping genes..."
- Check your file path and ensure the file exists.
- Verify the selected 'Data Separator' matches your file.
- Ensure the 'RNA Index Column' and 'Mapping ID Column' accurately reflect your gene identifiers.
- Confirm your data is correctly formatted and not empty.
- "Comparison and baseline columns cannot be the same."
- Select different columns for your 'Comparison' and 'Baseline' conditions.
- "No results to display or dataframe is empty."
- Ensure GEMCAT analysis completed successfully. Check the log file (gemcat_gui.log) for errors.
- Log File: A log file named gemcat_gui.log is generated in the application's directory. You can open it directly from the Help tab using the "Open Log File" button for detailed information about application events, warnings, and errors.
- Model Files and Gene Mappings: Ensure that the necessary COBRA model files and their corresponding gene mapping files (as defined in config.py) are correctly placed and accessible by the application.
gemcat_gui/
├── src/
│   └── gemcat_gui/          # Main package
│       ├── config/          # Configuration and constants
│       ├── core/            # Core data processing logic
│       └── gui/             # GUI components and widgets
├── tests/                   # Test suite
├── examples/                # Example scripts and notebooks
│   ├── notebooks/           # Jupyter notebooks
│   └── data/                # Sample datasets
├── src/models/              # Metabolic model files (139 MB)
├── README.md                # This file
├── setup.py                 # Package setup
├── pyproject.toml           # Modern Python packaging config
└── requirements.txt         # Dependencies
| Model | Organism | Reactions | Metabolites | Genes |
|---|---|---|---|---|
| Recon3D | Human | 13,543 | 4,140 | 2,248 |
| Rat-GEM | Rat | 8,104 | 2,773 | 1,855 |
| Mouse-GEM | Mouse | 8,104 | 2,773 | 1,835 |
# Run all tests
pytest tests/ -v
# Run specific test suite
pytest tests/test_basic.py -v
# Run with coverage
pytest tests/ --cov=src/gemcat_gui --cov-report=html
- Version: 1.4.0
- Python: 3.9+
- License: MIT
- Repository: https://github.com/MolecularBioinformatics/gemcat_gui
- pandas - Data manipulation
- numpy - Numerical computations
- cobra - Metabolic modeling
- gemcat - Core GEMCAT analysis
- PyQt5 - GUI framework
- matplotlib - Plotting and visualization
- seaborn - Enhanced visualizations
- openpyxl - Excel file support
Import Errors
# Ensure the package is installed
pip install -e .
Missing Dependencies
# Reinstall all dependencies
pip install -r requirements.txt
PyQt5 Issues
# Install PyQt5 explicitly
pip install PyQt5
Model Files Not Found
- Ensure the src/models/ directory contains the metabolic model files
- Check that model paths in configuration are correct
The application generates a log file gemcat_gui.log in the working directory. Check this file for detailed error messages and debugging information.
Contributions are welcome! Please feel free to submit issues or pull requests.
If you use GEMCAT GUI in your research, please cite:
[Citation information to be added]
Maintainer: Suraj Sharma
Email: suraj.sharma@uib.no
Institution: University of Bergen
For bug reports and feature requests, please use the GitHub Issues page.
This project is licensed under the MIT License - see the LICENSE file for details.