Conversation


@ansh-info ansh-info commented Dec 11, 2025

Summary

Issue - #473

  • Add a configurable context_window to RULER (ruler and ruler_score_group), propagated to LiteLLM as num_ctx/max_input_tokens so Ollama and other long-context models aren't capped at the default 8k.
  • Preserve caller overrides: num_ctx/max_input_tokens are only set when not already provided via extra_litellm_params.

Motivation

  • With Ollama or other long-context backends, LiteLLM defaults to roughly an 8k context unless num_ctx/max_input_tokens are passed. RULER didn't expose a way to set these, leading to "token count exceeded 8192" errors even when the model itself was configured with a larger context.

Details

  • ruler now accepts context_window and merges it into LiteLLM params using setdefault.

  • ruler_score_group exposes the same knob and forwards it to ruler.

Testing

  • Not run (please run RULER/LiteLLM smoke tests against a long-context Ollama model with context_window set, and verify no 8k cap errors).
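The merge logic described in Details can be sketched as follows. This is a minimal illustration, not the actual ART/RULER source: the function name `build_litellm_params` and its signature are assumptions, but the key behavior matches the PR, using `dict.setdefault` so that values already present in `extra_litellm_params` win over the new `context_window` knob.

```python
from typing import Optional


def build_litellm_params(
    context_window: Optional[int] = None,
    extra_litellm_params: Optional[dict] = None,
) -> dict:
    """Sketch of how context_window could be merged into LiteLLM params.

    Hypothetical helper; names are illustrative, not from the ART codebase.
    """
    params = dict(extra_litellm_params or {})
    if context_window is not None:
        # setdefault only fills in missing keys, so explicit caller
        # overrides passed via extra_litellm_params are preserved.
        params.setdefault("num_ctx", context_window)
        params.setdefault("max_input_tokens", context_window)
    return params
```

With this shape, `build_litellm_params(32768)` sets both keys to 32768, while `build_litellm_params(32768, {"num_ctx": 4096})` keeps the caller's 4096 for `num_ctx` and only fills in `max_input_tokens`.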

…or RULER to avoid the implicit 8k cap when using Ollama or other long-context backends

Co-authored-by: Apoorva Gupta <[email protected]>
@ansh-info ansh-info changed the title feat: Implemented an explicit, configurable context-window override f… feat: Implemented an explicit, configurable context-window override Dec 11, 2025
