Multi-agent precision medicine pipeline for novel drug discovery: generating molecules, predicting selectivity, matching clinical trials, and persisting discoveries in under 60 seconds.
Drug Discovery AI is an 18-agent pipeline that takes a gene mutation query (e.g. EGFR T790M) and:
- Parses the mutation with LLM + regex fallback
- Fetches literature (PubMed), proteins (UniProt), structures (RCSB), compounds (PubChem) in parallel
- Downloads PDB structures and detects binding pockets (fpocket / centroid fallback)
- Generates novel molecules via RDKit scaffold hopping + bioisostere SMARTS mutations
- Docks molecules (Gnina → Vina → AI hash fallback)
- Dual-docks vs off-target proteins to compute selectivity ratio (the key differentiator)
- Screens ADMET (Lipinski + PAINS + toxicophore images)
- Optimizes leads with scaffold hopping, bioisostere replacement, and fragment growing, building an evolution tree
- Forecasts resistance mutations with LLM
- Matches active clinical trials via ClinicalTrials.gov API v2
- Builds a knowledge graph and reasoning trace
- Saves discoveries to Neon PostgreSQL and exposes a full REST API
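
The regex fallback in the first step could look roughly like the sketch below. The pattern, the function name `parse_mutation`, and the returned keys are illustrative assumptions, not the project's actual parser; it only handles simple missense notation such as `EGFR T790M`.

```python
import re
from typing import Optional

# Hypothetical pattern: a gene symbol, whitespace, then a missense change
# written as reference residue + position + substituted residue (e.g. T790M).
MUTATION_RE = re.compile(
    r"\b(?P<gene>[A-Z][A-Z0-9]{1,9})\s+(?P<ref>[A-Z])(?P<pos>\d{1,5})(?P<alt>[A-Z])\b"
)


def parse_mutation(query: str) -> Optional[dict]:
    """Return gene/ref/position/alt for the first match, or None if nothing matches."""
    match = MUTATION_RE.search(query.upper())
    if match is None:
        return None
    return {
        "gene": match["gene"],
        "ref": match["ref"],
        "position": int(match["pos"]),
        "alt": match["alt"],
    }


print(parse_mutation("EGFR T790M"))
print(parse_mutation("lung cancer"))
```

A `None` result is the natural point to hand the query to the LLM parser or to report an unparseable input.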
drug-discovery-ai/
├── backend/ # Python 3.11 + FastAPI + 18 agents
│   ├── agents/ # 18 pipeline agents
│   ├── pipeline/ # LangGraph state + orchestrator
│   ├── utils/ # LLM router, DB, ADMET, molecule utils
│   ├── routers/ # FastAPI route handlers
│   ├── data/ # JSON data files (curated profiles, resistance, etc.)
│   └── evaluation/ # Benchmark runner
│
├── frontend/ # Next.js 16 + TypeScript + Tailwind v4
│   ├── app/
│   ├── components/ # All UI components (analysis, landing, settings)
│   ├── hooks/ # useSSEStream, useAnalysis, useDiscoveries, useTheme
│   ├── lib/ # api.ts, types.ts, utils.ts, theme.ts
│   ├── analysis/ # [sessionId] analysis page (10 tabs)
│   ├── discoveries/ # Discovery library browser
│   └── settings/ # Theme customizer, pipeline config, API key checker
│
├── .github/
│   └── workflows/ # backend-ci.yml + frontend-ci.yml
├── start.sh # Unix quick-start
├── start.bat # Windows quick-start
└── README.md
- Python 3.11+
- Node.js 20+
- (Optional) AutoDock Vina or Gnina for real docking
- (Optional) fpocket for pocket detection
cd backend
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Fill in at least GROQ_API_KEY (free at console.groq.com)
uvicorn main:app --reload --port 8000

cd frontend
npm install
cp .env.local.example .env.local
npm run dev

Open http://localhost:3000.
| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | Optional | GPT-4o-mini (primary LLM) |
| `GROQ_API_KEY` | Recommended | Llama 3.3 70B, free at console.groq.com |
| `TOGETHER_API_KEY` | Optional | Mistral 7B fallback |
| `NCBI_API_KEY` | Optional | Higher PubMed rate limits |
| `LANGCHAIN_API_KEY` | Optional | LangSmith observability |
| `DATABASE_URL` | Optional | Neon PostgreSQL connection string |
| `AUTO_SAVE_DISCOVERIES` | Optional | Set `true` to auto-persist every run |
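
Before starting the backend it can help to check which LLM keys are actually set. The snippet below is a minimal sketch under assumptions: the helper `configured_llms` and the label strings are hypothetical and not part of the repository; only the variable names come from the table above.

```python
import os
from typing import List, Mapping

# Variable names come from the table above; the labels are for display only.
LLM_KEYS = [
    ("OPENAI_API_KEY", "OpenAI GPT-4o-mini"),
    ("GROQ_API_KEY", "Groq Llama 3.3 70B"),
    ("TOGETHER_API_KEY", "Together Mistral 7B"),
]


def configured_llms(env: Mapping[str, str] = os.environ) -> List[str]:
    """List the providers whose API key is present and non-empty."""
    return [label for key, label in LLM_KEYS if env.get(key, "").strip()]


available = configured_llms()
if available:
    print("LLM providers configured:", ", ".join(available))
else:
    print("No LLM key set; only non-LLM fallbacks would be available.")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise in tests.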
NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_APP_NAME="Drug Discovery AI"

- FastAPI 0.115 + uvicorn + SSE streaming
- LangGraph 0.2 pipeline orchestration
- LangSmith observability (enterprise trace dashboard)
- RDKit: molecule generation, ADMET, depiction
- Multi-LLM fallback: OpenAI GPT-4o-mini → Groq Llama 3.3 70B → Together Mistral 7B → deterministic template
- Neon PostgreSQL via SQLAlchemy asyncio + asyncpg
- External APIs: PubMed, UniProt, RCSB, PubChem, ClinicalTrials.gov
- Next.js 16 App Router + TypeScript strict
- Tailwind CSS v4 + amber minimal theme system
- GSAP 3.12 + ScrollTrigger: hero animations, counter reveals
- Framer Motion 11: agent status stagger, 2D/3D crossfade
- D3.js 7: knowledge graph force-directed, evolution tree
- Recharts: ADMET radar chart, docking score chart
- shadcn/ui (Radix primitives): accessible components
- NGL: 3D molecular viewer
- Biome: linting + formatting (no ESLint, no Prettier)
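
The multi-LLM fallback listed in the backend stack (OpenAI → Groq → Together → deterministic template) can be sketched as an ordered list of callables tried in turn. Everything named here (`complete_with_fallback`, `deterministic_template`, the stub providers) is a hypothetical illustration rather than the project's LLM router, and the stubs stand in for real API clients.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]


def deterministic_template(prompt: str) -> str:
    """Last-resort answer that needs no network call."""
    return f"[template] No LLM available for: {prompt}"


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, text) from the first success."""
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            # Broad catch is deliberate in this sketch: any failure moves to the next provider.
            continue
    return "template", deterministic_template(prompt)


# Stub providers standing in for real clients (assumptions for the demo).
def flaky_openai(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def stub_groq(prompt: str) -> str:
    return f"groq says: {prompt}"


used, text = complete_with_fallback(
    "Summarize EGFR T790M", [("openai", flaky_openai), ("groq", stub_groq)]
)
print(used, "|", text)
```

Returning the provider name alongside the text keeps it visible in logs or traces which tier actually answered.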
| Feature | Description |
|---|---|
| Selectivity Ratio | Dual-docks top leads vs. off-target proteins. ratio = target / off-target. ≥3.0 = High selectivity |
| Evolution Tree | D3 visualization of how seed molecules transform through scaffold hopping + bioisostere operations |
| LangSmith Tracing | Enterprise observability: every agent call, token count, latency, auditable in real time |
| Clinical Trial Matching | Live ClinicalTrials.gov API that links each discovery to active patient trials |
| Resistance Forecasting | LLM predicts which secondary mutations will emerge under treatment pressure |
| Discovery Database | Neon PostgreSQL persists all discoveries, browseable in the Discoveries Library |
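
As a worked illustration of the selectivity ratio row, the sketch below computes `ratio = target / off-target` and applies the ≥3.0 threshold. It assumes both inputs are docking scores with the same sign (for example binding energies in kcal/mol); the function names and the label for ratios under 3.0 are placeholders, since only the "High" tier is defined here.

```python
def selectivity_ratio(target_score: float, off_target_score: float) -> float:
    """ratio = target / off-target, for two docking scores of the same sign."""
    if off_target_score == 0:
        raise ValueError("off-target score must be non-zero")
    return target_score / off_target_score


def selectivity_label(ratio: float, high_threshold: float = 3.0) -> str:
    """'High' at or above the threshold; the other label is a placeholder."""
    return "High" if ratio >= high_threshold else "Below high threshold"


# Example with made-up scores: -9.0 kcal/mol on target, -3.0 kcal/mol off target.
ratio = selectivity_ratio(-9.0, -3.0)
print(ratio, selectivity_label(ratio))
```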
- "Meet Sarah. 52 years old. Lung cancer. EGFR T790M. Erlotinib stopped working."
- Type `EGFR T790M` → Launch Analysis
- Watch 18 agents stream live in the PipelineStatus panel
- Top Leads tab: "Our AI generated molecules that don't exist in any database. This one binds 3.2× harder to the cancer target than to healthy ABL1 kinase."
- Evolution Tree: "Here's exactly how the seed molecule was transformed: scaffold hop +1.4 kcal/mol, bioisostere +0.8 kcal/mol."
- Clinical Trials: "3 active Phase II trials targeting EGFR T790M. Our molecule targets the same pocket."
- Open LangSmith tab: "Every decision, auditable. Enterprise ready."
- Click Save Discovery → "Permanently saved to our Neon database."
# Backend
cd backend
ruff check .
mypy .
# Frontend
cd frontend
npm run check # Biome check
npm run check:fix # Auto-fix
npm run typecheck # TypeScript

cd backend
python -c "
from evaluation.benchmark_runner import run_benchmark_cases
import asyncio
r = asyncio.run(run_benchmark_cases())
print(f'Accuracy: {r[\"accuracy\"]*100:.0f}%')
"

cd backend
python -c "
from agents.SelectivityAgent import SelectivityAgent
import asyncio
r = asyncio.run(SelectivityAgent().run({
'docking_results': [{'smiles': 'CC(=O)Nc1ccc(O)cc1', 'binding_energy': -8.5, 'compound_name': 'Test'}],
'mutation_context': {'gene': 'EGFR'},
'analysis_plan': type('P', (), {'run_selectivity': True})(),
}))
print('SelectivityAgent OK:', r['selectivity_results'][0]['selectivity_label'])
"

| Branch | Owner | Scope |
|---|---|---|
| `main` | - | Protected. Merge via PR only |
| `feature/backend-agents` | Backend dev | Python agents, pipeline, bioinformatics |
| `feature/frontend-ui` | Frontend dev | Next.js pages, components, GSAP |
| `feature/testing-validation` | QA dev | Pytest, benchmark, E2E validation |
MIT. Built for a hackathon. Not for clinical use. All results are computational predictions only.