Summary
When using Grounding with Google Search, the docs require developers to link model statements to their sources inline and in aggregate. However, Segment.startIndex and Segment.endIndex are documented as UTF-8 byte offsets.
On Android/Kotlin (and also JavaScript), standard string APIs index strings by UTF-16 code units, not UTF-8 bytes. That means developers cannot safely use Segment offsets directly to insert inline citations.
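A minimal Kotlin sketch of the mismatch, using only the standard library. The helper name utf8Length is ours and purely illustrative; it is not part of the SDK.

```kotlin
// Illustrative helper (not an SDK API): the length of a string measured in UTF-8 bytes.
fun utf8Length(text: String): Int = text.toByteArray(Charsets.UTF_8).size

// String.length counts UTF-16 code units, so the two measures diverge
// as soon as the text contains a non-ASCII character.
fun lengthsDiffer(text: String): Boolean = utf8Length(text) != text.length
```

For example, "Caf\u00E9" (Café) has a String.length of 4 but a utf8Length of 5, so any byte offset past the "é" no longer lines up with a Kotlin string index.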
Problem
To implement inline citations correctly, the SDK user currently has to:
- read candidate.content.parts
- respect segment.partIndex
- convert UTF-8 byte offsets to native string indices
- handle multi-byte characters / emoji correctly
- guard against invalid or mismatched offsets
- insert citations carefully to avoid shifting later indices
This is easy to get wrong and can lead to misplaced citations or crashes.
In our Android app, using Segment.endIndex directly as a StringBuilder index caused a StringIndexOutOfBoundsException because the SDK offset was valid in UTF-8 bytes but not as a Kotlin string index.
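For reference, the steps listed above can be sketched roughly as follows. This is a simplified illustration, not SDK code: CitationSpan is a hypothetical stand-in for the SDK's Segment plus a citation label, the function names are ours, and the text is assumed to be well-formed (no unpaired surrogates).

```kotlin
// Hypothetical stand-in for an SDK Segment plus its citation label; not the SDK type.
data class CitationSpan(val partIndex: Int, val endByteOffset: Int, val label: String)

// Maps a UTF-8 byte offset to the matching UTF-16 index in [text].
// Returns null when the offset is negative, past the end of the text,
// or falls inside a multi-byte character.
fun utf8OffsetToUtf16Index(text: String, byteOffset: Int): Int? {
    if (byteOffset < 0) return null
    var bytes = 0
    var i = 0
    while (i < text.length) {
        if (bytes == byteOffset) return i
        if (bytes > byteOffset) return null
        val cp = text.codePointAt(i)
        // Number of bytes this code point occupies in UTF-8.
        bytes += when {
            cp < 0x80 -> 1
            cp < 0x800 -> 2
            cp < 0x10000 -> 3
            else -> 4
        }
        i += Character.charCount(cp)
    }
    return if (bytes == byteOffset) text.length else null
}

// Inserts each label at the end of its span, one part at a time.
// Spans with invalid offsets are skipped rather than crashing, and insertion
// runs from the highest index down so earlier insertions do not shift later ones.
fun applyCitations(parts: List<String>, spans: List<CitationSpan>): List<String> =
    parts.mapIndexed { partIndex, text ->
        val sb = StringBuilder(text)
        spans.filter { it.partIndex == partIndex }
            .mapNotNull { span ->
                utf8OffsetToUtf16Index(text, span.endByteOffset)?.let { it to span.label }
            }
            .sortedByDescending { it.first }
            .forEach { (index, label) -> sb.insert(index, label) }
        sb.toString()
    }
```

With the part text "Caf\u00E9 ok. Bye." and a span ending at byte offset 9, the label is inserted at string index 8, directly after "ok.". Even this small version has to deal with code points, surrogate pairs, offset validation, and insertion order, which is exactly the work we would prefer the SDK to own.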
Why this is difficult for app developers
- Segment uses UTF-8 byte offsets, but Kotlin/Java strings use UTF-16 indexing.
- The SDK already knows the original content parts and grounding metadata, so it is in a much better position than the app to format grounded text correctly.
- The grounding docs require inline attribution, so developers are pushed into solving a subtle encoding/indexing problem just to comply.
Requested improvement
Any of the following would solve the problem:
- Provide grounded text with inline citations already applied.
- Add an opt-in SDK helper/formatter, for example something like textWithInlineSources().
- Expose platform-native indices alongside the existing UTF-8 byte offsets (for example UTF-16 indices on Android/JavaScript).