AiDotNet Roadmap: High-Impact Backlog and Tracking #293

@ooples

Description

Overview

This issue is the central, living roadmap for AiDotNet's development. It organizes the high-impact backlog into thematic areas, tracks their status, and links to the relevant, detailed implementation plans. Each epic listed below contains a granular, checklist-style breakdown suitable for a junior developer.


Theme 1: Retrieval-Augmented Generation (RAG) & Search

Status: Complete

Goal: Build a state-of-the-art, in-house RAG framework and the persistent storage backends required to support it at scale.

Current State (March 2026): 165+ RAG source files. Knowledge graph system with 5 embedding methods (TransE, RotatE, ComplEx, DistMult, TemporalTransE), link prediction, Leiden community detection, GraphRAG (Local/Global/DRIFT modes), temporal KG support, and KG construction from text. File-based document store with HNSW indexing, WAL crash recovery, and inverted metadata index. Full AiModelBuilder facade integration. 40+ tests.

Epic: [#306] In-House Graph Database - CLOSED

Epic: [#305] In-House Document Store - CLOSED

Epic: [#303] RAG Framework Finalization - CLOSED


Theme 2: Advanced Generative AI (Diffusion Models)

Status: Complete

Goal: Build a comprehensive suite of tools for generative AI, centered around diffusion models for image generation.

Current State (March 2026): 411 source files, ~102,000 lines of code. Covers: 35 noise schedulers (DDPM, DDIM, DPM-Solver, Euler, PNDM, LCM, Flow Matching, etc.), 17 VAE implementations, 12 noise predictor architectures (U-Net, DiT, MMDiT, UViT, Flux), 34 text-to-image models (SD 1.5/2/3/3.5/XL, Flux 1/2, DALL-E 2/3, Imagen 2/3, etc.), 60 video generation models (Sora, Hunyuan, Mochi, Kling, etc.), 35 fast generation models (Consistency, LCM, Turbo, Lightning), 25 ControlNet/conditioning variants, 32 image editing models, 15 audio/music models, 14 3D generation models, 11 text conditioners (CLIP, T5, SigLIP), 8 attention mechanisms, 7 guidance methods, plus distillation, super-resolution, style transfer, panorama, virtual try-on, and alignment modules.

Epic: [#298] Advanced Diffusion Models & Schedulers - CLOSED


Theme 3: Meta-Learning

Status: Substantially Complete - Core algorithms implemented, sub-issues re-scoped for 2024-2026 research extensions

Goal: Implement state-of-the-art meta-learning algorithms to enable models that can learn new tasks rapidly from a small number of examples.

Current State (March 2026): 65+ meta-learning algorithms implemented. Core suite: MAML, MAML++, iMAML, ANIL, BOIL, MetaSGD, WarpGrad, CAVIA, Reptile, SEAL, OpenMAML, HyperMAML. Metric-based: ProtoNets, MatchingNetworks, RelationNetwork, TADAM, SimpleShot, DeepEMD, LaplacianShot, FEAT, TIM. Memory-based: MANN, NTM. Plus: CNAP, GNNMeta, MetaOptNet, SNAIL, MCL, DKT, DPGN, EPNet, FRN, ConstellationNet, PMF, and more. 5 episodic data loader variants (Uniform, Balanced, Stratified, Curriculum). 321 passing tests. Full enum-based algorithm selection.

Epic: [#290] Episodic Data Abstractions - OPEN (re-scoped for extensions)

  • AC 1.1: N-way K-shot DataLoader: EpisodicDataLoaderBase + UniformEpisodicDataLoader, BalancedEpisodicDataLoader, StratifiedEpisodicDataLoader, CurriculumEpisodicDataLoader all implemented. Issue re-scoped (Feb 2026) for additional PyTorch/learn2learn pattern alignment.

Epic: [#289] Implement SEAL - CLOSED

  • AC 1.1: SEAL Algorithm: SEALAlgorithm implemented with temperature scaling, entropy regularization, and adaptive learning rates. Full unit and integration tests.

Epic: [#291] Implement MAML - OPEN (re-scoped for extensions)

  • AC 1.1: MAML Algorithm: Full MAML family implemented: MAMLAlgorithm (first/second order), MAMLPlusPlusAlgorithm, iMAMLAlgorithm, ANILAlgorithm, BOILAlgorithm, MetaSGDAlgorithm, WarpGradAlgorithm, CAVIAAlgorithm, OpenMAMLAlgorithm. Issue re-scoped (Feb 2026) for additional 2024-2026 research extensions.

Epic: [#292] Implement Reptile - CLOSED

  • AC 1.1: Reptile Algorithm: ReptileAlgorithm implemented with first-order gradient updates and parameter interpolation. Full unit and integration tests.

Theme 4: Core Infrastructure & Productionization

Status: Complete

Goal: Build the cross-cutting infrastructure required to make the library robust, efficient, and easy to use in production.

Current State (March 2026): 348+ data loader files covering vision (ImageNet, COCO, CelebA, 30+ benchmarks), audio (LibriSpeech, CommonVoice, 10+ loaders), video (Kinetics-400, UCF101), text (GLUE, SQuAD), graph (OGB), and 3D (ShapeNet, ModelNet40). Full YAML training configuration with source-generated validation. Production-grade KV cache (874 lines) with FP16/INT8 backends, sliding window, and paged attention. 5 PTQ strategies (GPTQ, AWQ, SmoothQuant, QuIP#, SpinQuant) plus QAT, NF4, FP8. ONNX export (Opset 17) and runtime with CPU/CUDA/TensorRT/DirectML providers.

Epic: [#282] Datasets and DataLoaders - CLOSED

  • AC 1.1: ImageFolderLoader: ImageFolderDataset implemented with PyTorch-compatible directory structure, bilinear resizing, channel conversion, train/val/test split.
  • AC 1.2: AudioLoader: AudioFileDataset base + 10+ benchmark-specific loaders (LibriSpeech, CommonVoice, AudioSet, ESC-50, MAESTRO, etc.). FLAC/WAV support.

Epic: [#283] Training Recipes & Config System - CLOSED

  • AC 1.1: YAML Configuration: YamlConfigLoader, YamlConfigApplier, YamlConfigSourceGenerator (compile-time validation), TrainingRecipeConfig with Model/Dataset/Optimizer/Loss/Trainer sections.
  • AC 1.2: Configurable Trainer: Trainer class with TrainerBase<T>, supports YAML file or TrainingRecipeConfig initialization, full training pipeline.
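A recipe in the YAML format described above might look like the following sketch. The five top-level sections mirror the Model/Dataset/Optimizer/Loss/Trainer layout of TrainingRecipeConfig, but the individual field names and values here are illustrative assumptions, not the library's exact schema:

```yaml
# Hypothetical training recipe; field names are illustrative only.
model:
  type: ResNet18
dataset:
  name: ImageFolder
  path: ./data/train
optimizer:
  type: Adam
  learningRate: 0.001
loss:
  type: CrossEntropy
trainer:
  epochs: 10
  batchSize: 32
```

With source-generated validation, schema errors in a file like this would surface at load time rather than mid-training.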

Epic: [#277] Inference Optimizations - CLOSED

  • AC 1.1: KV Cache: KVCache<T> (874 lines) with native/FP16/INT8 storage backends, sliding window eviction, batch support, cache statistics. Integrated with CachedMultiHeadAttention, CachedGroupedQueryAttention, and PagedCachedMultiHeadAttention.

Epic: [#278] Quantization - CLOSED

  • AC 1.1: Post-Training Quantization (PTQ): 5 strategies: GPTQ, AWQ, SmoothQuant, QuIP#, SpinQuant. Quantization-Aware Training (QAT) with EfficientQATOptimizer. Format-specific: FP8, NF4, MXFP4. Inference layers: QuantizedDenseLayer, QuantizedAttentionLayer. Per-tensor/channel/group/block granularity.

Epic: [#280] ONNX Export & Runtime - CLOSED

  • AC 1.1: ONNX Exporter: OnnxExporter with automatic input shape inference, Opset 17 support, graph building via OnnxGraph/OnnxNode. Export to file or byte array.
  • AC 1.2: ONNX Runtime Executor: OnnxModel<T> wrapper with automatic execution provider selection (CPU, CUDA, TensorRT, DirectML), multi-input/output, warm-up, async inference. OnnxModelDownloader for pre-trained models. CoreML conversion support.

Progress Summary (March 2026)

| Theme | Status | Sub-Issues | Closed | Open |
|---|---|---|---|---|
| RAG & Search | Complete | #303, #305, #306 | 3/3 | 0 |
| Diffusion Models | Complete | #298 | 1/1 | 0 |
| Meta-Learning | Substantially Complete | #289, #290, #291, #292 | 2/4 | 2 (re-scoped) |
| Core Infrastructure | Complete | #277, #278, #280, #282, #283 | 5/5 | 0 |
| Total | | 13 epics | 11/13 | 2 |

Note: The 2 open meta-learning issues (#290, #291) have their original requirements fully implemented. They were re-scoped in Feb 2026 to include additional extensions from 2024-2026 research papers.


CRITICAL ARCHITECTURAL REQUIREMENTS

Before implementing any remaining work, you MUST review:

Mandatory Implementation Checklist

1. INumericOperations Usage (CRITICAL)

  • Include protected static readonly INumericOperations<T> NumOps = MathHelper.GetNumericOperations<T>(); in base class
  • NEVER hardcode double, float, or specific numeric types - use generic T
  • NEVER use default(T) - use NumOps.Zero instead
  • Use NumOps.Zero, NumOps.One, NumOps.FromDouble() for values
  • Use NumOps.Add(), NumOps.Multiply(), etc. for arithmetic
  • Use NumOps.LessThan(), NumOps.GreaterThan(), etc. for comparisons
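As an illustration of the rules above, here is a minimal sketch of a base-class method written against INumericOperations&lt;T&gt;. The class name RegularizerBase and method L2Penalty are invented for this example; only the NumOps pattern itself comes from the checklist:

```csharp
public abstract class RegularizerBase<T>   // hypothetical class
{
    // Checklist item: shared NumOps field in the base class.
    protected static readonly INumericOperations<T> NumOps =
        MathHelper.GetNumericOperations<T>();

    // Computes lambda * sum(w_i^2) without hardcoding double or float.
    public T L2Penalty(Vector<T> weights, T lambda)
    {
        T sum = NumOps.Zero;                 // NumOps.Zero, never default(T)
        for (int i = 0; i < weights.Length; i++)
        {
            sum = NumOps.Add(sum, NumOps.Multiply(weights[i], weights[i]));
        }
        return NumOps.Multiply(lambda, sum); // NumOps arithmetic throughout
    }
}
```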

2. Inheritance Pattern (REQUIRED)

  • Create I{FeatureName}.cs in src/Interfaces/ (root level, NOT subfolders)
  • Create {FeatureName}Base.cs in src/{FeatureArea}/ inheriting from interface
  • Create concrete classes inheriting from Base class (NOT directly from interface)
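The three layers fit together as in this sketch, using a hypothetical "Smoother" feature (all type names here are invented for illustration):

```csharp
// File: src/Interfaces/ISmoother.cs  (root-level Interfaces folder)
public interface ISmoother<T>
{
    Vector<T> Smooth(Vector<T> input);
}

// File: src/Smoothing/SmootherBase.cs  (implements the interface)
public abstract class SmootherBase<T> : ISmoother<T>
{
    protected static readonly INumericOperations<T> NumOps =
        MathHelper.GetNumericOperations<T>();

    public abstract Vector<T> Smooth(Vector<T> input);
}

// File: src/Smoothing/MovingAverageSmoother.cs
// Concrete class inherits the Base class, NOT the interface directly.
public class MovingAverageSmoother<T> : SmootherBase<T>
{
    public override Vector<T> Smooth(Vector<T> input)
    {
        // Averaging logic elided for brevity.
        return input;
    }
}
```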

3. PredictionModelBuilder Integration (REQUIRED)

  • Add a private field to PredictionModelBuilder.cs: private I{FeatureName}<T>? _{featureName};
  • Add a Configure method that takes ONLY the interface (no additional parameters)
  • In Build(), fall back to a sensible default implementation when the feature was not configured
  • Verify the feature is ACTUALLY USED in the execution flow
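Steps 1-4 above can be sketched as follows. The feature name (ISmoother), the Configure method name, and the BuildCore helper are hypothetical; only the wiring pattern is prescribed by the checklist:

```csharp
public partial class PredictionModelBuilder<T>   // sketch, not the real file
{
    private ISmoother<T>? _smoother;   // step 1: private interface-typed field

    // Step 2: the Configure method takes ONLY the interface.
    public PredictionModelBuilder<T> ConfigureSmoother(ISmoother<T> smoother)
    {
        _smoother = smoother;
        return this;
    }

    public PredictionModel<T> Build()
    {
        // Step 3: fall back to a beginner-friendly default when unset.
        var smoother = _smoother ?? new MovingAverageSmoother<T>();

        // Step 4: the feature must actually participate in the pipeline,
        // e.g. applied to the data before fitting (details elided).
        return BuildCore(smoother);   // BuildCore is a hypothetical helper
    }
}
```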

4. Beginner-Friendly Defaults (REQUIRED)

  • Constructor parameters with defaults from research/industry standards
  • Document WHY each default was chosen (cite papers/standards)
  • Validate parameters and throw ArgumentException for invalid values
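A constructor following these three rules might look like the sketch below (MovingAverageSmoother and its window-size default are invented for the example):

```csharp
public class MovingAverageSmoother<T>   // hypothetical class
{
    private readonly int _windowSize;

    // The default of 5 would be documented with the research or
    // industry source that motivated it (checklist: explain WHY).
    public MovingAverageSmoother(int windowSize = 5)
    {
        // Validate and throw ArgumentException for invalid values.
        if (windowSize < 1)
        {
            throw new ArgumentException(
                "Window size must be at least 1.", nameof(windowSize));
        }

        _windowSize = windowSize;
    }
}
```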

5. Property Initialization (CRITICAL)

  • NEVER use default! operator
  • String properties: = string.Empty;
  • Collections: = new List<T>(); or = new Vector<T>(0);
  • Numeric properties: appropriate default or NumOps.Zero
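An options class obeying all four rules could look like this (TrainingOptions and its property names are illustrative):

```csharp
public class TrainingOptions<T>   // hypothetical class
{
    private static readonly INumericOperations<T> NumOps =
        MathHelper.GetNumericOperations<T>();

    // String properties: string.Empty, never default!.
    public string RunName { get; set; } = string.Empty;

    // Collections: empty instances, never null.
    public List<string> Tags { get; set; } = new List<string>();
    public Vector<T> ClassWeights { get; set; } = new Vector<T>(0);

    // Generic numeric properties: NumOps.Zero or a documented default.
    public T WeightDecay { get; set; } = NumOps.Zero;
}
```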

6. Class Organization (REQUIRED)

  • One class/enum/interface per file
  • ALL interfaces in src/Interfaces/ (root level)
  • Namespace mirrors folder structure

7. Documentation (REQUIRED)

  • XML documentation for all public members
  • <b>For Beginners:</b> sections with analogies and examples
  • Document all <param>, <returns>, <exception> tags
  • Explain default value choices
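Put together, a fully documented public member would resemble the sketch below (the method and its analogy are invented; the tag set matches the checklist):

```csharp
/// <summary>
/// Applies a simple moving average to a sequence of values.
/// </summary>
/// <remarks>
/// <b>For Beginners:</b> Smoothing is like averaging your last few
/// daily step counts: a single unusual day matters less, and the
/// overall trend becomes easier to see.
/// </remarks>
/// <param name="input">The sequence of values to smooth.</param>
/// <returns>A new vector of the same length with smoothed values.</returns>
/// <exception cref="ArgumentNullException">
/// Thrown when <paramref name="input"/> is null.
/// </exception>
public Vector<T> Smooth(Vector<T> input)
{
    if (input is null) throw new ArgumentNullException(nameof(input));
    // Averaging logic elided for brevity.
    return input;
}
```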

8. Testing (REQUIRED)

  • Minimum 80% code coverage
  • Test with multiple numeric types (double, float)
  • Test default values are applied correctly
  • Test edge cases and exceptions
  • Integration tests for PredictionModelBuilder usage
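An xUnit test class covering these requirements might be structured like this sketch (the class under test, MovingAverageSmoother, is the hypothetical example from earlier items; assertion details are elided):

```csharp
public class MovingAverageSmootherTests   // hypothetical tests
{
    [Fact]
    public void Constructor_AppliesDocumentedDefaultWindowSize()
    {
        var smoother = new MovingAverageSmoother<double>();
        // Assert against the documented default value here.
    }

    [Fact]
    public void Constructor_RejectsInvalidWindowSize()
    {
        Assert.Throws<ArgumentException>(
            () => new MovingAverageSmoother<double>(0));
    }

    [Theory]
    [InlineData(3)]
    [InlineData(5)]
    public void Smooth_ProducesSameTrendForDoubleAndFloat(int window)
    {
        var asDouble = new MovingAverageSmoother<double>(window);
        var asFloat = new MovingAverageSmoother<float>(window);
        // Run both on equivalent inputs and compare within tolerance,
        // satisfying the multiple-numeric-types requirement.
    }
}
```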

Labels: docs (Documentation and examples), roadmap (Roadmap-tracked item)