docs: document local model inference feature in README
Add "Local Model Inference" section covering supported models (Z-Image Turbo/Base, Dreamshaper, Realistic Vision, Anything v5, SDXL), auxiliary file requirements for Z-Image, step-by-step usage, and hardware notes for Metal GPU on Apple Silicon. Also add Local Inference bullet to the Features list.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
README.md (41 additions, 0 deletions)
@@ -120,9 +120,50 @@ For a deep dive into the technical architecture and the philosophy behind the "I
## ⚡ Local Model Inference (Desktop App Only)

The desktop app includes a built-in **local generation engine** powered by [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) — generate images entirely on your own machine with no API key and no internet connection required.
- The system may slow during generation — the process uses all available CPU cores while running
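The engine described above wraps stable-diffusion.cpp, whose example `sd` binary is driven by command-line flags. As a minimal sketch, here is how an app might assemble that invocation before spawning the process — the model path and prompt are hypothetical, and the flag names follow the stable-diffusion.cpp example documentation:

```python
def build_sd_command(model, prompt, out="output.png",
                     steps=20, cfg=7.0, width=512, height=512):
    """Assemble an argument list for the stable-diffusion.cpp `sd` CLI.

    Flag names (-m, -p, -o, --steps, --cfg-scale, -W, -H) follow the
    upstream examples; the model file below is a hypothetical path.
    """
    return [
        "sd",
        "-m", model,            # path to the .safetensors/.gguf model file
        "-p", prompt,           # text prompt
        "-o", out,              # output image path
        "--steps", str(steps),  # sampling steps
        "--cfg-scale", str(cfg),
        "-W", str(width),
        "-H", str(height),
    ]

cmd = build_sd_command("models/dreamshaper_8.safetensors",
                       "a lighthouse at dusk")
print(" ".join(cmd))
```

In practice the app would pass this list to `subprocess.run` (or an equivalent process spawner) and wait for the output image to appear, which is also where the CPU-saturation note above applies.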
---
## ✨ Features
- **Image Studio** — Generate images from text prompts (50+ text-to-image models) or transform existing images (55+ image-to-image models). Switches model set automatically based on whether a reference image is provided. Quality and resolution controls are visible for models that support them.
- **Local Inference** — Generate images on-device with no API key using Z-Image Turbo/Base, Dreamshaper, Realistic Vision, Anything v5, or SDXL — powered by stable-diffusion.cpp with Metal GPU acceleration on Apple Silicon.
- **Multi-Image Input** — Upload up to 14 reference images for compatible edit models (Nano Banana 2 Edit, Flux Kontext Dev, GPT-4o Edit, and more). Multi-select picker with order badges, batch upload, and a "Use Selected" confirmation flow.
- **Video Studio** — Generate videos from text prompts (40+ text-to-video models) or animate a start-frame image (60+ image-to-video models). Same intelligent mode switching as Image Studio.
- **Lip Sync Studio** — Animate portrait images or sync lips on existing videos using audio. 9 dedicated models across two modes: portrait image + audio → talking video, and video + audio → lipsync video.