Offline AI Chat Assistant for Android
Features • Screenshots • Installation • Building • Tech Stack • License
Xirea is a fully offline AI chat assistant that runs lightweight language models directly on your Android device. No internet required, no API keys, no data leaving your phone — your conversations stay completely private.
Powered by llama.cpp for efficient on-device inference with GGUF models.
- **100% Offline** — All AI processing happens on-device
- **Fast Inference** — Optimized for mobile with dynamic RAM scaling
- **Chat History** — Persistent local storage with Room database
- **Model Management** — Download, switch, and delete AI models
- **Dark Mode** — Beautiful Material3 light and dark themes
- **Modern UI** — Built with Jetpack Compose
- **Privacy First** — No data collection, no servers, no tracking
| Home | Chat | Models | Settings |
|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() |
- Android 8.0+ (API 26)
- ARM64 device (arm64-v8a)
- At least 4GB RAM recommended
- Storage space for AI models (500MB - 4GB per model)
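As a rough illustration of the RAM recommendation above, total device memory can be read from `/proc/meminfo` on Android (a hypothetical helper for this README, not code from the app):

```kotlin
import java.io.File

// Parse the "MemTotal:" line from /proc/meminfo (value is in kB)
// and convert it to gibibytes. Returns null if the line is absent.
fun parseMemTotalGb(meminfo: String): Double? {
    val kb = meminfo.lineSequence()
        .firstOrNull { it.startsWith("MemTotal:") }
        ?.split(Regex("\\s+"))
        ?.getOrNull(1)
        ?.toLongOrNull() ?: return null
    return kb / (1024.0 * 1024.0)
}

fun main() {
    val gb = parseMemTotalGb(File("/proc/meminfo").readText())
    println("Device RAM: $gb GB")
}
```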
Or browse all versions on the Releases page.
- Android Studio Hedgehog or newer
- Android NDK 29.0.14206865
- CMake 3.22.1
- JDK 17
1. **Clone the repository**

   ```bash
   git clone https://github.com/Danyalkhattak/Xirea.git
   cd xirea
   ```

2. **Open in Android Studio**
   - Open the project folder in Android Studio
   - Wait for Gradle sync to complete

3. **Build Debug APK**

   ```bash
   ./gradlew assembleDebug
   ```

4. **Build Release APK** (requires a signing keystore)

   ```bash
   ./gradlew assembleRelease
   ```

   Set signing properties in `local.properties`:

   ```properties
   RELEASE_STORE_FILE=path/to/keystore.jks
   RELEASE_STORE_PASSWORD=your_store_password
   RELEASE_KEY_ALIAS=your_key_alias
   RELEASE_KEY_PASSWORD=your_key_password
   ```
The APK will be generated at `app/build/outputs/apk/`.
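One way a release build might consume those signing properties is a `signingConfigs` block in `app/build.gradle.kts`. This is a sketch of the common Android Gradle pattern, not the project's actual build script:

```kotlin
// app/build.gradle.kts (sketch): load signing credentials from local.properties
import java.util.Properties

val localProps = Properties().apply {
    val f = rootProject.file("local.properties")
    if (f.exists()) f.inputStream().use { load(it) }
}

android {
    signingConfigs {
        create("release") {
            // Property names match those listed above
            storeFile = file(localProps.getProperty("RELEASE_STORE_FILE") ?: "keystore.jks")
            storePassword = localProps.getProperty("RELEASE_STORE_PASSWORD")
            keyAlias = localProps.getProperty("RELEASE_KEY_ALIAS")
            keyPassword = localProps.getProperty("RELEASE_KEY_PASSWORD")
        }
    }
    buildTypes {
        getByName("release") {
            signingConfig = signingConfigs.getByName("release")
        }
    }
}
```

Keeping credentials in `local.properties` keeps them out of version control, since that file is gitignored by default.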
Xirea works with GGUF format models. Recommended models for mobile:
| Model | Size | RAM Required |
|---|---|---|
| Qwen2.5 0.5B Q4 | ~400MB | 4GB |
| Qwen2.5 1.5B Q4 | ~1GB | 6GB |
| Llama 3.2 1B Q4 | ~700MB | 4GB |
| Phi-3 Mini Q4 | ~2GB | 8GB |
| Gemma 2B Q4 | ~1.5GB | 6GB |
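A downloaded model file can be sanity-checked before loading: every GGUF file begins with the 4-byte ASCII magic `GGUF`. A minimal check (an illustrative helper, not part of the app):

```kotlin
import java.io.File

// A valid GGUF file starts with the 4-byte ASCII magic "GGUF".
fun looksLikeGguf(file: File): Boolean {
    if (file.length() < 4) return false
    val magic = ByteArray(4)
    file.inputStream().use { it.read(magic) }
    return magic.contentEquals("GGUF".toByteArray())
}
```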
Xirea automatically optimizes for your device:
| Device RAM | Context Size | Batch Size |
|---|---|---|
| 4GB | 512 | 128 |
| 6GB | 768 | 256 |
| 8GB | 1024 | 256 |
| 12GB+ | 2048 | 512 |
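The tiers above can be sketched as a simple mapping. The values come straight from the table; the names `InferenceParams` and `paramsForRam` are illustrative, not the app's actual API:

```kotlin
// Pick llama.cpp context and batch sizes from total device RAM,
// following the tiers in the table above.
data class InferenceParams(val contextSize: Int, val batchSize: Int)

fun paramsForRam(ramGb: Int): InferenceParams = when {
    ramGb >= 12 -> InferenceParams(2048, 512)
    ramGb >= 8  -> InferenceParams(1024, 256)
    ramGb >= 6  -> InferenceParams(768, 256)
    else        -> InferenceParams(512, 128)
}
```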
- CPU-only inference for maximum compatibility
- Memory-mapped model loading for reduced RAM usage
- Pre-allocated batch buffers for zero-allocation generation
- Near-greedy sampling for faster token generation
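"Near-greedy sampling" presumably means picking the highest-scoring token most of the time, which skips a full softmax and random draw over the vocabulary. A bare argmax sketch of the greedy case (not the app's actual sampler):

```kotlin
// Greedy (argmax) token selection over raw logits: no softmax or
// random sampling needed, which keeps per-token cost low on mobile CPUs.
fun greedyToken(logits: FloatArray): Int {
    require(logits.isNotEmpty()) { "logits must be non-empty" }
    var best = 0
    for (i in 1 until logits.size) {
        if (logits[i] > logits[best]) best = i
    }
    return best
}
```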
```
xirea/
├── app/
│   ├── src/main/
│   │   ├── java/com/dannyk/xirea/
│   │   │   ├── ai/              # AI engine & llama.cpp wrapper
│   │   │   ├── data/            # Room database & repositories
│   │   │   ├── service/         # Download service
│   │   │   └── ui/              # Compose UI screens
│   │   ├── cpp/                 # Native C++ code
│   │   │   ├── llama.cpp/       # llama.cpp library
│   │   │   └── llama_jni.cpp    # JNI bridge
│   │   └── res/                 # Resources
│   └── build.gradle.kts
├── gradle/
│   └── libs.versions.toml       # Version catalog
└── build.gradle.kts
```
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Danyal Khattak
- llama.cpp — Excellent C++ inference engine
- Jetpack Compose — Modern Android UI toolkit
- Material3 — Design system