
Pinned

  1. aroc (Public)

    Agentic Read-Only Chat - A rich terminal chat interface powered by locally installed llama.cpp and g023/g023-Qwen3.5-9B-GGUF:IQ2_M

    Python

  2. harnessharvest (Public)

    A self-learning, self-correcting, LLM-powered harness creation and management system with FAISS-powered RAG, sandboxed execution, and autonomous improvement modes. Powered by Ollama and offline mod…

    Python

  3. localmodelrouter (Public)

    A local LLM server that provides drop-in API compatibility with both Ollama and OpenAI, using your locally installed [llama.cpp](https://github.com/ggerganov/llama.cpp)'s `llama-server` as the infer…

    Python

  4. xinf (Public)

    g023's TurboXInf 🚀: 2x+ faster inference for Qwen3-1.77B or Qwen3.5-2B on RTX 3060! Custom Triton INT8 GEMV kernels halve memory traffic by fusing dequantization, paired with torch.compile. Hits 11…

    Python

  5. turboquant (Public)

    Standalone TurboQuant KV Cache Inference for https://huggingface.co/g023/Qwen3-1.77B-g023

    Python

  6. g023-OllamaMan (Public)

    A concept Ollama server management OS that runs in a web browser.

    PHP