This project is an iOS application built on the MNN engine that supports multimodal conversations with large models running locally.
It operates fully offline, preserving privacy: once the models are downloaded to the device, all conversations take place locally, with nothing uploaded to or processed on a network.
- Local Models
  - Display locally downloaded models
  - Support custom pinning
- Model Market
  - Get the list of models supported by MNN
  - Model management: download and delete models
  - Support switching between the Hugging Face, ModelScope, and Modeler download sources
  - Model search: support keyword search and tag search
- Benchmark Testing
  - Support automated benchmark testing, reporting prefill speed, decode speed, and memory usage
  - Support batch testing for text, image, and audio inputs
- Multimodal Chat: supports full Markdown output
  - Text-to-text
  - Audio-to-text (supports audio output for Omni models)
  - Image-to-text: images can be captured or selected from the gallery
  - Video-to-text: supports video input processing
  - Sana Diffusion: supports image style transfer (e.g., Ghibli style)
- Model Configuration
  - mmap
  - Sampling strategy
  - Diffusion settings
  - Backend type (CPU/Metal)
  - Precision (low/normal/high)
  - Thread count
  - Multimodal prompt API
  - Audio output for Omni models
- Chat History
  - Per-model conversation history list, with restoration of past conversation contexts
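The prefill and decode figures reported by the benchmark are essentially tokens-per-second ratios over the two inference phases. A minimal Python sketch of that arithmetic (the `BenchmarkRun` type is invented here for illustration; the app's actual benchmark code is Swift):

```python
# Illustrative sketch of benchmark-metric arithmetic; names are hypothetical.
from dataclasses import dataclass

@dataclass
class BenchmarkRun:
    prompt_tokens: int      # tokens processed during prefill
    prefill_seconds: float  # wall time of the prefill phase
    output_tokens: int      # tokens generated during decode
    decode_seconds: float   # wall time of the decode phase

    @property
    def prefill_speed(self) -> float:
        """Prefill throughput in tokens/second."""
        return self.prompt_tokens / self.prefill_seconds

    @property
    def decode_speed(self) -> float:
        """Decode throughput in tokens/second."""
        return self.output_tokens / self.decode_seconds

run = BenchmarkRun(prompt_tokens=128, prefill_seconds=0.5,
                   output_tokens=256, decode_seconds=4.0)
print(f"prefill: {run.prefill_speed:.1f} tok/s")  # 256.0 tok/s
print(f"decode:  {run.decode_speed:.1f} tok/s")   # 64.0 tok/s
```

Batch testing then amounts to collecting one such record per input (text, image, or audio) and aggregating.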
Click here to download the original resolution introduction video
Demo screenshots (images omitted): Text To Text, Image To Text, Audio To Text, Model Filter, Local Model, Model Market, Benchmark, History.
Additionally, the app supports on-device use of DeepSeek with its Think mode.

1. Clone the repository:

   ```shell
   git clone https://github.com/alibaba/MNN.git
   ```

2. Build MNN.framework:

   ```shell
   sh package_scripts/ios/buildiOS.sh " -DMNN_ARM82=ON -DMNN_LOW_MEMORY=ON -DMNN_SUPPORT_TRANSFORMER_FUSE=ON -DMNN_BUILD_LLM=ON -DMNN_CPU_WEIGHT_DEQUANT_GEMM=ON -DMNN_METAL=ON -DMNN_BUILD_DIFFUSION=ON -DMNN_OPENCL=OFF -DMNN_SEP_BUILD=OFF -DLLM_SUPPORT_AUDIO=ON -DMNN_BUILD_AUDIO=ON -DLLM_SUPPORT_VISION=ON -DMNN_BUILD_OPENCV=ON -DMNN_IMGCODECS=ON -DMNN_BUILD_LLM_OMNI=ON "
   ```

3. Copy the framework into the iOS project:

   ```shell
   mv MNN-iOS-CPU-GPU/Static/MNN.framework apps/iOS/MNNLLMChat
   ```

   Ensure the `Link Binary With Libraries` section includes `MNN.framework`. If it is missing, add it manually.

4. Update iOS signing and build the project:

   ```shell
   cd apps/iOS/MNNLLMChat
   open MNNLLMiOS.xcodeproj
   ```

   In Xcode, go to `Signing & Capabilities > Team` and enter your Apple ID and Bundle Identifier. Wait for the Swift Packages to finish downloading before building.
Due to memory limitations on iPhones, it is recommended to use models with 7B parameters or fewer to avoid memory-related crashes.
For local debugging, simply drag the model files into the LocalModel folder and run the project:

1. Download the MNN-related models from Hugging Face or ModelScope.

2. Drag the downloaded model folder into the project's LocalModel folder.

3. For models placed in the bundle root, configure them in `ModelListViewModel.swift`, for example to declare whether the model supports thinking mode:

   ```swift
   // MARK: Config the Local Model here
   let modelName = "Qwen3-0.6B-MNN-Inside" // Model name
   let localModel = ModelInfo(
       modelName: modelName,
       tags: [
           // MARK: if you know the model supports think mode, uncomment the next line
           // NSLocalizedString("tag.deepThinking", comment: "Deep thinking tag for local model"),
           NSLocalizedString("tag.localModel", comment: "Local model inside the app")],
       categories: ["Local Models"],
       vendor: "Local",
       sources: ["local": "bundle_root/\(modelName)"],
       isDownloaded: true
   )
   localModels.append(localModel)
   ModelStorageManager.shared.markModelAsDownloaded(modelName)
   ```

4. Run the project, navigate to the chat page, and interact with and debug the model.
The app will automatically detect and load models from the LocalModel folder without requiring additional configuration.
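For the download step, MNN-converted weights can also be fetched programmatically. A hedged sketch using the `huggingface_hub` Python package; the org/repo id and the helper names below are illustrative, not part of the project:

```python
# Sketch (assumption): fetching MNN-converted weights with the
# `huggingface_hub` package. The repo id below is illustrative;
# browse Hugging Face or ModelScope for the actual model list.
from pathlib import Path

def local_model_dir(repo_id: str, base: str = "LocalModel") -> Path:
    """Target folder inside the iOS project, named after the repo."""
    return Path(base) / repo_id.split("/")[-1]

def download_model(repo_id: str) -> None:
    """Download a model snapshot into the project's LocalModel folder."""
    # Imported here so the path helper above stays dependency-free.
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id=repo_id, local_dir=str(local_model_dir(repo_id)))

# Example (requires network and huggingface_hub installed):
# download_model("taobao-mnn/Qwen3-0.6B-MNN")
```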
- Added Sana Diffusion support for image style transfer (e.g., Ghibli style)
- Added audio output support for Omni models
- Added video input support for multimodal conversations
- Added multimodal prompt API configuration
- Added backend type configuration (CPU/Metal)
- Added precision configuration (low/normal/high)
- Added thread count configuration
- Added batch testing support for text, image, and audio inputs
- Added three major project modules: Local Models, Model Market, and Benchmark Testing
- Added benchmark testing to test different model performance
- Added settings page, accessible from the history sidebar
- Added Ali CDN for getting model lists
- Added model market filtering functionality
- Added support for model parameter configuration

New Features:
- Added support for downloading from the Modeler source
- Added support for Stable Diffusion text-to-image generation

New Features:
- Added support for mmap configuration and manual cache clearing
- Added support for downloading models from the ModelScope source




