An AI data analyst agent that explores a semantic layer in a sandbox environment to answer natural language questions with SQL.
OSS Data Analyst uses a sandboxed exploration approach: instead of hardcoding schema knowledge into prompts, the agent is given shell access to a sandbox containing your semantic layer files. It discovers the schema dynamically using cat, grep, and ls commands, then builds and executes SQL queries based on what it finds.
This architecture means the agent can:
- Adapt to any schema without prompt changes
- Explore relationships between entities naturally
- Handle schema updates without redeployment
- Reason about data the same way a human analyst would
- Sandbox Creation - A Vercel Sandbox is spun up and populated with your semantic layer YAML files
- Schema Exploration - The agent uses shell commands to browse the catalog and entity definitions
- Query Building - Based on discovered schema, the agent constructs SQL queries
- Execution - Queries run against your SQLite database
- Reporting - Results are formatted with a narrative explanation
User Question
↓
┌─────────────────────────────────────┐
│ Vercel Sandbox │
│ ┌─────────────────────────────┐ │
│ │ semantic/ │ │
│ │ ├── catalog.yml │ │
│ │ └── entities/ │ │
│ │ ├── companies.yml │ │
│ │ ├── people.yml │ │
│ │ └── accounts.yml │ │
│ └─────────────────────────────┘ │
│ │
│ Agent explores with: │
│ • cat semantic/catalog.yml │
│ • grep -r "keyword" semantic/ │
│ • cat semantic/entities/*.yml │
└─────────────────────────────────────┘
↓
SQL Query → Database → Results → Narrative
- Node.js 20+
- pnpm
- Vercel AI Gateway API key
git clone https://github.com/vercel-labs/oss-data-analyst.git
cd oss-data-analyst
pnpm installcp env.local.example .env.localAdd your Vercel AI Gateway key to .env.local.
pnpm initDatabaseCreates a SQLite database with sample data (Companies, People, Accounts).
pnpm devThe semantic layer lives in src/semantic/ and defines your data model:
src/semantic/
├── catalog.yml # Entity index with descriptions
└── entities/
├── companies.yml # Company entity definition
├── people.yml # People entity definition
└── accounts.yml # Accounts entity definition
Each entity YAML includes:
sql_table_name- The underlying tablefields- Available columns with SQL expressionsjoins- Relationships to other entities- Example questions the entity can answer
The agent reads these files at runtime to understand your schema.
- "How many companies are in the Technology industry?"
- "What is the average salary by department?"
- "Show me the top 5 accounts by monthly value"
- "Which companies have the most employees?"
Stack: Next.js, Vercel AI SDK, Vercel Sandbox, SQLite
Key Files:
src/lib/agent.ts- Agent definition and system promptsrc/lib/tools/sandbox.ts- Sandbox creation with semantic filessrc/lib/tools/shell.ts- Shell command tool for explorationsrc/lib/tools/execute-sqlite.ts- SQL execution tool
- Add entity YAML files to
src/semantic/entities/ - Update
src/semantic/catalog.ymlwith the new entity - The agent will automatically discover and use the new schema
No code changes required—the sandbox approach means schema changes are picked up at runtime.
Database Not Found
pnpm initDatabaseBuild Errors
pnpm type-check