Describe the Feature Request
Google recently released Gemma 3n, which includes two small yet capable models, E2B and E4B, specifically optimized to run quickly on smartphones. I have a phone with a mid-range chip (CMF Phone 1), and the E2B version runs great on it. It would be nice if LastChat let us download these two models and then use them the same way we use models from inference providers. Google also provides LiteRT, which LastChat could use to integrate the models into the app, since it handles actually running the model on-device.
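To make the idea concrete, here is a rough sketch of what on-device inference could look like using Google's MediaPipe LLM Inference API, which runs LiteRT-format models under the hood. This is only an illustration: the model path is a placeholder, the function name is hypothetical, and the actual integration point in LastChat would be up to the maintainers.

```kotlin
// Sketch only: assumes the MediaPipe Tasks GenAI dependency
// (com.google.mediapipe:tasks-genai) and a Gemma model file
// that has already been downloaded to device storage.
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Hypothetical helper; LastChat would wire this into its chat pipeline.
fun runLocalGemma(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/gemma-model.task") // placeholder path
        .setMaxTokens(512)
        .build()

    // Loads the model via LiteRT and runs inference fully on-device,
    // so no network connection is needed.
    val llm = LlmInference.createFromOptions(context, options)
    val response = llm.generateResponse(prompt)
    llm.close()
    return response
}
```

Everything stays local, so chats would work offline and nothing would leave the phone.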
Usage Scenario
It would be a nice, private alternative to cloud-based APIs, and it would even work without an internet connection, which is awesome!
Alternatives Considered
None