Why
The current app uses a manual retrieval on/off switch and injects retrieved context before generation. With Gemma 4 + LiteRT-LM, the app can potentially move to a tool-driven workflow where the model decides when to call retrieval.
This should be treated as a controlled migration, not an unmeasured refactor.
Scope
Prototype a first tool-driven workflow for guideline search, for example:
search_guidelines(query, top_k) -> returns retrieved chunks / metadata
Keep the first tool set narrow and read-only. Do not let the model autonomously trigger UI actions like opening PDFs.
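A minimal sketch of what a narrow, read-only `search_guidelines` contract could look like. It is written in Python only to keep the logic compact; the real tool would live in the Android runtime. The `Chunk` type, the `CORPUS` placeholder data, the `MAX_TOP_K` cap, and the keyword-overlap scoring are all assumptions for illustration and stand in for the existing offline retrieval stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrieved chunk plus minimal metadata (hypothetical shape)."""
    doc_id: str
    text: str
    score: float


# Placeholder corpus; the real app would query its offline vector store.
CORPUS = {
    "g1": "Placeholder guideline text about hand hygiene",
    "g2": "Placeholder guideline text about cold storage",
}

# Assumed upper bound so the model cannot request unbounded context.
MAX_TOP_K = 5


def search_guidelines(query: str, top_k: int = 3) -> list[Chunk]:
    """Read-only lookup: returns ranked chunks and never touches UI or state."""
    top_k = max(1, min(top_k, MAX_TOP_K))
    terms = set(query.lower().split())
    results = []
    for doc_id, text in CORPUS.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            # Naive keyword overlap, used here instead of vector similarity.
            results.append(Chunk(doc_id, text, overlap / len(terms)))
    results.sort(key=lambda chunk: chunk.score, reverse=True)
    return results[:top_k]
```

Clamping `top_k` inside the tool, rather than trusting the model's argument, keeps the amount of injected context bounded even when the model asks for more.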
Implementation direction
- Use LiteRT-LM tool support in the Android runtime
- Start behind a feature flag
- Keep the current manual retrieval path as fallback
- Reuse the existing offline retrieval stack / SQLite vector store
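The flag-plus-fallback routing described above could be sketched roughly as follows. This is Python pseudologic, not the Android implementation; `answer`, `tool_flow`, and `manual_flow` are hypothetical names, and the two flows are passed in as plain callables so the sketch does not assume anything about the LiteRT-LM API.

```python
from typing import Callable


def answer(
    question: str,
    tool_flow_enabled: bool,
    tool_flow: Callable[[str], str],
    manual_flow: Callable[[str], str],
) -> tuple[str, str]:
    """Return (path_used, answer_text).

    The tool-driven path runs only when the feature flag is on. Any
    exception on that path falls back to the existing manual retrieval flow.
    """
    if tool_flow_enabled:
        try:
            return "tool", tool_flow(question)
        except Exception:
            # Deliberately broad: the prototype should never fail harder
            # than the current behaviour does.
            pass
    return "manual", manual_flow(question)
```

Returning the path that was actually used makes it possible to log how often the fallback fires, which feeds directly into the evaluation.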
Dependencies
Evaluation requirements
Compare against the current workflow on the same benchmark line:
- retrieval off
- current manual retrieval-on flow
- tool-triggered retrieval flow
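One way to keep the three conditions comparable is to run the same cases through each flow and score them identically. The sketch below assumes a deliberately simple metric (the expected string appears in the answer) and hypothetical names (`compare_flows`, `Flow`); the real benchmark metric may well differ.

```python
from typing import Callable

# A flow maps a question to an answer string (assumed interface).
Flow = Callable[[str], str]


def compare_flows(
    cases: list[tuple[str, str]],
    flows: dict[str, Flow],
) -> dict[str, float]:
    """Score every flow on the same (question, expected) cases.

    The score is the fraction of cases whose expected string occurs in the
    answer, case-insensitively. It is a placeholder for the real metric.
    """
    scores: dict[str, float] = {}
    for name, flow in flows.items():
        hits = sum(
            1 for question, expected in cases
            if expected.lower() in flow(question).lower()
        )
        scores[name] = hits / len(cases) if cases else 0.0
    return scores
```

Usage would look like `compare_flows(cases, {"retrieval_off": f1, "manual_on": f2, "tool_triggered": f3})`, with each flow wrapping one of the three configurations listed above.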
Acceptance criteria
- There is a minimal tool registry in the Android pipeline.
- The first tool is retrieval, not arbitrary app actions.
- Behavior is benchmarked against the current manual search workflow before any default switch.
- Failure modes are documented, especially around safety and unsupported questions.
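A minimal tool registry satisfying the first two criteria might look like the sketch below. It is an illustration in Python under assumed names (`ToolRegistry`, `register`, `call`), not the LiteRT-LM interface: only explicitly registered tools can run, and unknown tools or bad arguments come back as structured errors rather than exceptions.

```python
from typing import Any, Callable


class ToolRegistry:
    """Allow-list of tools the model may call (hypothetical design)."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> dict[str, Any]:
        """Dispatch a model-issued tool call and always return a dict."""
        fn = self._tools.get(name)
        if fn is None:
            # Anything not registered (e.g. a UI action) is refused.
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": fn(**kwargs)}
        except Exception as exc:
            # Bad arguments or tool failures become data, not crashes.
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

Because every outcome is a plain dict, refused and failed calls can be logged and counted alongside successful ones when documenting failure modes.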