
Add AIREV-Agent-0.8B v2: Sub-billion parameter model for BFCL V4#1319

Open
mk42-ai wants to merge 1 commit into ShishirPatil:main from mk42-ai:airev-agent-0.8b-v2

Conversation


@mk42-ai mk42-ai commented Apr 3, 2026

Model

AIREV-Agent-0.8B: a 752M-parameter model fine-tuned for agentic tool calling.

Training Pipeline

  1. SFT on 50K Claude Opus 4.6-generated BFCL-format samples with chain-of-thought reasoning
  2. AutoResearch — Karpathy-style automated hyperparameter discovery (112 experiments, 4 GPUs) found optimal GRPO config: lr=2e-6, 24 generations, temp=0.6, format_bonus=0.1
  3. GRPO with AutoResearch-optimized config on 43K clean training samples (14 hours, single H100)
  4. Targeted SFT on multi-turn, memory, and web_search categories using real BFCL function schemas
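The GRPO reward with the `format_bonus=0.1` term found by AutoResearch can be sketched as follows. This is a hypothetical reconstruction, not the PR's actual reward code: the correctness signal, the `reward` function name, and the exact bracket regex are assumptions based on the bracket call format described under "Prompt Mode" below.

```python
import re

# Hypothetical bracket-call pattern, assuming the [func_name(params)]
# format this PR describes for prompt-based function calling.
BRACKET_CALL = re.compile(r"^\[\w+\(.*\)\]$")

def reward(completion: str, is_correct: bool, format_bonus: float = 0.1) -> float:
    """Sketch of a GRPO reward: 1.0 for a correct tool call, plus a small
    bonus when the completion already matches the expected bracket format."""
    score = 1.0 if is_correct else 0.0
    if BRACKET_CALL.match(completion.strip()):
        score += format_bonus
    return score
```

Under this shaping, a correct, well-formatted call scores 1.1, a correct but unbracketed call 1.0, and a wrong but well-formatted call only the 0.1 bonus, nudging the policy toward parseable output without rewarding format over correctness.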

Evaluation

All 20 BFCL V4 categories were evaluated. Results were generated using transformers inference at temperature=0.6, with chain-of-thought reasoning emitted via the model's reasoning tokens.

Prompt Mode

This model uses prompt-based function calling (not native FC mode), with the BFCL system prompt and the bracket call format [func_name(params)].
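A bracket-format completion like the above can be recovered into a function name and arguments with a small parser. This is a minimal sketch using Python's `ast` module; BFCL's own prompt-mode parser may differ, and the single-call, keyword-arguments-only assumption is mine.

```python
import ast

def parse_bracket_call(text: str):
    """Parse a bracket-format call such as [func_name(params)] into
    (name, kwargs). Hypothetical helper; assumes one call with literal
    keyword arguments only."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError(f"not a bracket call: {text!r}")
    # Parse the call expression inside the brackets.
    call = ast.parse(inner[1:-1], mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a function call")
    name = call.func.id
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return name, kwargs
```

Using `ast.literal_eval` on the argument values keeps the parser safe against arbitrary code in model output, unlike a plain `eval`.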

Hardware

Trained on a single NVIDIA H100 80GB GPU. Total training time: ~24 hours (SFT + GRPO + targeted SFT).

Model: airev-ai/AIREV-Agent-0.8B (0.8B params, Qwen3.5-0.8B base)
Training: SFT on 50K Claude Opus data + GRPO with AutoResearch-optimized hyperparameters
Architecture: Gated Delta Network (GDN), 262K context
License: Apache 2.0
