Commit 37f47e3

docs: add scientific references to blog posts

Author: Your Name
1 parent a9e1e94 commit 37f47e3

6 files changed: +47 -8 lines changed

docs/_posts/2025-11-14-context-engineering-for-real-codebases.md

Lines changed: 7 additions & 1 deletion
@@ -165,7 +165,7 @@ Chuchu's multi-agent architecture is designed around this principle:
 - Routes to appropriate specialized agent
 
 **Query Agent** (reasoning model)
-- Research and codebase analysis
+- Research and codebase analysis[^1]
 - Reads files, searches patterns
 - Compacts findings into structured output
 - Fresh context for each analysis
@@ -322,4 +322,10 @@ But the foundation is always the same: **manage your context window like your pr
 
 ---
 
+## References
+
+[^1]: Lewis, P., Perez, E., Piktus, A., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. *NeurIPS 2020*. https://arxiv.org/abs/2005.11401
+
+---
+
 *Have questions about context engineering? Join the discussion in [GitHub Discussions](https://github.com/jadercorrea/chuchu/discussions)*

docs/_posts/2025-11-19-model-performance-benchmarks.md

Lines changed: 10 additions & 1 deletion
@@ -11,7 +11,8 @@ tags: [benchmarks, performance, models, comparison]
 
 *Updated January 2025*
 
-**Important**: AI models evolve rapidly.
+**Important**: AI models evolve rapidly. Benchmark your models using established coding benchmarks like HumanEval[^1], SWE-Bench[^2], and LiveCodeBench[^3].
+
 1. Testing models with your specific workload
 2. Checking [Groq configurations]({% post_url 2025-11-15-groq-optimal-configs %}) for current recommendations
 3. Exploring [OpenRouter guide]({% post_url 2025-11-16-openrouter-multi-provider %}) for latest models
@@ -102,3 +103,11 @@ chu models search --agent editor openrouter
 ```
 
 See our [detailed configuration guides]({% post_url 2025-11-15-groq-optimal-configs %}) for setup instructions and cost breakdowns.
+
+## References
+
+[^1]: Chen, M., Tworek, J., Jun, H., Yuan, Q., et al. (2021). Evaluating large language models trained on code. *arXiv preprint arXiv:2107.03374*. https://arxiv.org/abs/2107.03374
+
+[^2]: Jimenez, C. E., Yang, J., Wettig, A., et al. (2024). SWE-bench: Can Language Models Resolve Real-World GitHub Issues? *ICLR 2024*. https://arxiv.org/abs/2310.06770
+
+[^3]: Jain, N., Han, K., Gu, A., et al. (2024). LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code. *arXiv preprint arXiv:2403.07974*. https://arxiv.org/abs/2403.07974

docs/_posts/2025-11-20-advanced-context-management.md

Lines changed: 5 additions & 1 deletion
@@ -13,7 +13,7 @@ One of the biggest challenges in AI coding is the **Context Window**.
 
 ## How Chuchu Manages Context
 
-Chuchu uses **Retrieval-Augmented Generation (RAG)** to fetch only relevant information:
+Chuchu uses **Retrieval-Augmented Generation (RAG)**[^1] to fetch only relevant information:
 
 1. **Project Map**: The `project_map` tool generates a tree-like view of your project structure in ~500 tokens, giving the model a "mental map" of where things are.
 
@@ -91,6 +91,10 @@ Each command starts with fresh context, preventing pollution.
 
 **Adaptive context**: Dynamic context window management based on task complexity and available token budget.
 
+## References
+
+[^1]: Lewis, P., Perez, E., Piktus, A., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. *NeurIPS 2020*. https://arxiv.org/abs/2005.11401
+
 ## Related Posts
 
 - [Context Engineering for Real Codebases]({% post_url 2025-11-14-context-engineering-for-real-codebases %})

docs/_posts/2025-11-22-ml-powered-intelligence.md

Lines changed: 8 additions & 2 deletions
@@ -84,7 +84,7 @@ User Input
 
 TF-IDF Vectorization (1-3 grams)
 
-Logistic Regression
+Logistic Regression[^1]
 
 Confidence Score
 
@@ -292,7 +292,7 @@ $ chu chat
 Pure ML would be faster but less accurate.
 Pure LLM would be more accurate but slower and expensive.
 
-**Hybrid ML + LLM** gives you the best of both worlds:
+**Hybrid ML + LLM**[^2] gives you the best of both worlds:
 - Fast path for confident decisions (80-90% of requests)
 - Smart fallback for edge cases
 - Configurable balance between speed and accuracy
@@ -355,6 +355,12 @@ But the foundation is here today: fast, cheap, accurate routing powered by embed
 
 *Have questions about the ML system? Check out the [full documentation](../ml-features) or ask in [GitHub Discussions](https://github.com/jadercorrea/chuchu/discussions)!*
 
+## References
+
+[^1]: Fan, R. E., Chang, K. W., Hsieh, C. J., Wang, X. R., & Lin, C. J. (2008). LIBLINEAR: A library for large linear classification. *Journal of Machine Learning Research*, 9(Aug), 1871-1874. https://www.jmlr.org/papers/v9/fan08a.html
+
+[^2]: Teerapittayanon, S., McDanel, B., & Kung, H. T. (2016). BranchyNet: Fast inference via early exiting from deep neural networks. *ICPR 2016*. https://arxiv.org/abs/1709.01686
+
 ## See Also
 
 - [Full ML Features Documentation](../ml-features) - Technical deep dive
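
The hybrid routing this post describes (a classifier produces a confidence score, confident inputs take the fast path, and everything else falls back to an LLM) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Chuchu's implementation: `route`, `toy_classify`, `toy_llm`, the agent labels, and the 0.7 threshold are all hypothetical, and the keyword matcher only stands in for the TF-IDF + logistic regression model.

```python
from typing import Callable, Tuple

# Hypothetical types: a classifier returns (label, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[str, float]]
Fallback = Callable[[str], str]


def route(text: str, classify: Classifier, llm_fallback: Fallback,
          threshold: float = 0.7) -> Tuple[str, str]:
    """Return (label, path), where path is "ml" or "llm"."""
    label, confidence = classify(text)
    if confidence >= threshold:
        # Fast path: the classifier is confident enough to decide alone.
        return label, "ml"
    # Fallback: defer low-confidence inputs to the slower LLM.
    return llm_fallback(text), "llm"


def toy_classify(text: str) -> Tuple[str, float]:
    # Stand-in for TF-IDF + logistic regression (assumption): keyword match.
    lowered = text.lower()
    if "fix" in lowered or "implement" in lowered:
        return "editor", 0.92
    if "explain" in lowered or "where" in lowered:
        return "query", 0.85
    return "query", 0.40


def toy_llm(text: str) -> str:
    # Stand-in for an LLM call (assumption): always picks "research".
    return "research"


print(route("fix the failing test", toy_classify, toy_llm))  # ('editor', 'ml')
print(route("hmm, not sure", toy_classify, toy_llm))         # ('research', 'llm')
```

Lowering `threshold` sends more requests down the fast path at the cost of accuracy; raising it does the opposite. That is one way to read the "configurable balance between speed and accuracy" bullet.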

docs/_posts/2025-11-23-future-of-ai-pair-programming.md

Lines changed: 7 additions & 1 deletion
@@ -38,9 +38,15 @@ Imagine this workflow:
 
 ## Chuchu's Roadmap
 
-We are building towards Phase 3.
+We are building towards Phase 3, inspired by recent advances in multi-agent systems[^1][^2].
 - **Memory**: Long-term memory of your coding style and architectural decisions.
 - **Proactivity**: Agents that run in the background, running tests and fixing lint errors before you even see them.
 - **Collaboration**: Agents that can comment on PRs and discuss architecture with other agents.
 
 The goal is not to replace the developer, but to elevate them. You become the **Architect**, and AI becomes your **Engineering Team**.
+
+## References
+
+[^1]: Qian, C., Cong, X., Yang, C., et al. (2023). Communicative Agents for Software Development. *arXiv preprint arXiv:2307.07924*. https://arxiv.org/abs/2307.07924
+
+[^2]: Hong, S., Zheng, X., Chen, J., et al. (2023). MetaGPT: Meta Programming for Multi-Agent Collaborative Framework. *arXiv preprint arXiv:2308.00352*. https://arxiv.org/abs/2308.00352

docs/_posts/2025-11-24-complete-workflow-guide.md

Lines changed: 10 additions & 2 deletions
@@ -28,10 +28,10 @@ Traditional AI coding assistants give you code immediately. Sometimes that works
 ❌ No incremental verification
 ❌ No way to course-correct
 
-Chuchu's workflow solves this:
+Chuchu's workflow[^1] solves this:
 
 ✅ Research phase builds context
-✅ Planning ensures coherent approach
+✅ Planning ensures coherent approach[^2]
 ✅ Implementation is incremental and verified
 ✅ You control the pace (interactive or autonomous)
 
@@ -240,4 +240,12 @@ Implementation itself works for any language (LLM-based), but build/test verific
 
 ---
 
+## References
+
+[^1]: Beck, K. (2003). *Test-Driven Development: By Example*. Addison-Wesley Professional. ISBN: 978-0321146533
+
+[^2]: Fowler, M. (2018). *Refactoring: Improving the Design of Existing Code* (2nd ed.). Addison-Wesley Professional. ISBN: 978-0134757599
+
+---
+
 **Questions or issues?** [Open an issue on GitHub](https://github.com/jadercorrea/chuchu/issues)
