Server Assistant AI Agent for Bio Lab

[ English | 简体中文 ]

An intelligent server management assistant based on DeepSeek model and OpenAI SDK, specifically designed for biology laboratories.

An intelligent server management assistant specifically designed for bioinformatics laboratories, developed based on the DeepSeek model. Capable of answering questions about server software paths, genome locations, server configurations based on the knowledge base, and providing analysis code as needed. Supports natural language interaction, suitable for beginners to quickly get started with bioinformatics analysis.

Core functionality

Intelligent Q&A System - Natural language queries based on knowledge base
Structured Knowledge Base Query - Intelligent retrieval of multiple information categories including software details, genomic data, server configurations, etc.
Cross-Platform Compatibility - Supports macOS and Ubuntu systems
AI-Enhanced Responses - Integrated DeepSeek model for professional answers
Modular Design - Easily extensible and maintainable code structure

Currently Supported Knowledge Base Types

Software Information - Paths, versions, descriptions
Genomic Data - Reference genome locations, versions, file paths
Dataset Information - Storage locations, sample information
Server Configuration - Server information, storage capacity
Can add knowledge base types as needed

Project Structure

├── api/ # API client modules
├── config/ # Configuration files
├── core/ # Core functionality modules
├── handlers/ # Knowledge base handlers
├── knowledge/ # Knowledge base data files, modify according to lab specific needs
├── utils/ # Utility functions
├── main.py # Main program entry
└── requirements.txt # Dependency list

Quick Start

Environment Requirements

Python 3.7+
Ubuntu or macOS
Deepseek API key

Apply for Deepseek API key

https://api-docs.deepseek.com

Installation

Clone the Project

git clone https://github.com/helianfeixing/lab-assistant-ai.git
cd lab-assistant-ai

Install Dependencies

pip install openai requests

Set API Key

# For Ubuntu
echo 'export DEEPSEEK_API_KEY="your_Deepseek_API_key"' >> ~/.bashrc
source ~/.bashrc

# For macOS
echo 'export DEEPSEEK_API_KEY="your_Deepseek_API_key"' >> ~/.zshrc
source ~/.zshrc

Run the Program

python main.py

Operation Interface

🤖 === 欢迎使用 ** Lab服务器助手 ===
正在初始化AI助手...
✅ 初始化完成!

🤖  您好，我是 ** Lab服务器助手 ===
我可以帮您处理以下类型的问题：

🔧 **软件管理**
  - '如何运行samtools？'

🧬 **基因组数据**
  - '人类参考基因组存放在哪里？'

📊 **分析工具**
  - 'star比对索引在哪里？'

📋 **服务器规则**
  - '数据应该存放在哪里？'
  - 'home目录的使用限制'
  - '下载文件的存储规范'

💬 **对话命令**:
  - reset/重置: 开始新的对话
  - status/状态: 查看对话状态
  - help/帮助: 显示此帮助
  - quit/退出: 退出程序

💡 提示：可以用自然语言提问，如有疑问请联系管理员

💭 您有什么问题？

Usage Example

💭 您有什么问题？ 我希望比对人类RNA-seq数据，提供代码
🤔 正在思考...

当然可以！针对人类RNA-seq数据的比对，我推荐使用 **STAR** 或 **HISAT2**，这两个工具在RNA-seq比对中都很常用。以下是基于你的需求的代码示例和说明：

---

### 1. **使用 STAR 进行比对**
STAR 适合处理剪接位点，速度快且准确。索引路径为 `/data/data/index/STAR_index/hg38`。

**示例代码：**
STAR --genomeDir /data/data/index/STAR_index/hg38 \
     --readFilesIn sample_R1.fastq.gz sample_R2.fastq.gz \
     --readFilesCommand zcat \
     --outFileNamePrefix sample_ \
     --outSAMtype BAM SortedByCoordinate \
     --runThreadN 8

**参数说明：**
- `--genomeDir`：指定人类基因组索引路径。
- `--readFilesIn`：输入测序文件（如果是单端数据，只用一个文件）。
- `--readFilesCommand zcat`：如果输入是 `.gz` 压缩文件，用 `zcat` 解压。
- `--outSAMtype BAM SortedByCoordinate`：输出排序后的BAM文件。
- `--runThreadN`：线程数，根据服务器资源调整。

**注意事项：**
- 确保输入文件路径正确，如果是单端数据，移除 `sample_R2.fastq.gz`。
- 输出文件会以 `sample_` 为前缀，包括BAM文件和日志。

---

### 2. **使用 HISAT2 进行比对**
HISAT2 内存占用较小，适合资源有限的环境。索引路径为 `/data/data/index/hisat2_index/hg38`。

**示例代码：**
hisat2 -x /data/data/index/hisat2_index/hg38 \
       -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz \
       -S sample.sam \
       --threads 8

**参数说明：**
- `-x`：指定索引路径。
- `-1` 和 `-2`：双端测序文件。
- `-S`：输出SAM文件。
- `--threads`：线程数。

**后续处理（将SAM转为BAM并排序）：**
samtools view -bS sample.sam | samtools sort -o sample_sorted.bam

---

### 选择建议：
- 如果数据量大或需要高精度剪接位点检测，用 **STAR**。
- 如果服务器内存有限或数据量较小，用 **HISAT2**。

### 需要确认的细节：
- 你的数据是单端还是双端测序？
- 文件是否是压缩格式（如 `.gz`）？
- 是否需要其他输出格式（如BAM索引）？

如果有其他需求或细节，随时告诉我，我可以帮你调整代码！

Contact

Yunlong Zhang 12131328@mail.sustech.edu.cn

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Server Assistant AI Agent for Bio Lab

Core functionality

Currently Supported Knowledge Base Types

Project Structure

Quick Start

Environment Requirements

Apply for Deepseek API key

Installation

Contact

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
api		api
config		config
core		core
handlers		handlers
knowledge		knowledge
test		test
utils		utils
README.md		README.md
README_CN.md		README_CN.md
main.py		main.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Server Assistant AI Agent for Bio Lab

Core functionality

Currently Supported Knowledge Base Types

Project Structure

Quick Start

Environment Requirements

Apply for Deepseek API key

Installation

Contact

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages