Describe the Feature Request
Google recently released Gemma 3n, which includes two small yet capable models, E2B and E4B, specifically optimized to run quickly on smartphones. I have a phone with a mid-range chip (CMF Phone 1), and the E2B version runs great on it. It would be nice if LastChat let us download these two models and then use them the same way we use models from inference providers. Google also provides LiteRT, which LastChat could use to integrate the models into the app, since it handles actually running the model on-device.
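To make the idea concrete, here is a rough sketch of what on-device inference could look like using Google's MediaPipe LLM Inference API, which runs LiteRT-format models under the hood. This is only an illustration: the model path is a placeholder, the function name is hypothetical, and the actual integration point in LastChat would be up to the maintainers.

```kotlin
// Sketch only: assumes the MediaPipe Tasks GenAI dependency
// (com.google.mediapipe:tasks-genai) and a Gemma model file
// that has already been downloaded to device storage.
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Hypothetical helper; LastChat would wire this into its chat pipeline.
fun runLocalGemma(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/gemma-model.task") // placeholder path
        .setMaxTokens(512)
        .build()

    // Loads the model via LiteRT and runs inference fully on-device,
    // so no network connection is needed.
    val llm = LlmInference.createFromOptions(context, options)
    val response = llm.generateResponse(prompt)
    llm.close()
    return response
}
```

Everything stays local, so chats would work offline and nothing would leave the phone.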
Usage Scenario
It would be a nice, private alternative to cloud-based APIs, and it would even work without an internet connection, which is awesome!
Alternatives Considered
None