inference-platforms/archgw
3 files changed, +8 -3 lines changed
@@ -64,7 +64,11 @@ and anything added in Arch Gateway's [wasm filter][archgw-wasm].
   instructions to run from Docker (to avoid nested docker).
 * Traces come from Envoy, whose configuration is written by `archgw`. At the
   moment, this hard-codes aspects including default ports.
+* Prometheus metrics show the cluster as "ollama_host" - the provider_interface
+  plus the first segment of the hostname (dots truncate the rest). The "host"
+  comes from "host.docker.internal".
 * Until [this][openai-responses] resolves, don't use `--use-responses-api`.
+* Until [this][docker-env] resolves, make sure your PATH has /usr/local/bin.

 The chat prompt was designed to be idempotent, but the results are not. You may
 see something besides 'South Atlantic Ocean.'.
@@ -78,3 +82,4 @@ Just run it again until we find a way to make the results idempotent.
 [uv]: https://docs.astral.sh/uv/getting-started/installation/
 [openai-responses]: https://github.com/katanemo/archgw/issues/476
 [otel-tui]: https://github.com/ymtdzzz/otel-tui
+[docker-env]: https://github.com/katanemo/archgw/issues/573
@@ -8,8 +8,9 @@ listeners:
     timeout: 30s

 llm_providers:
+  # Use ollama directly, since we can't inherit OPENAI_BASE_URL etc and need
+  # to hard-code the model anyway.
   - model: ollama/qwen3:0.6b
-    provider_interface: openai
     # This configuration is converted to Envoy and run inside Docker.
     base_url: http://host.docker.internal:11434
     default: true
@@ -2,8 +2,7 @@
   # Configuration is simplified from archgw here:
   # https://github.com/katanemo/archgw/blob/main/docs/source/guides/observability/monitoring.rst
   #
-  # Note: The prometheus cluster name for qwen3:0.65b will shows up as '6b'
-  # See https://github.com/katanemo/archgw/issues/504
+  # Note: The cluster name for ollama + host.docker.internal = ollama_host
   prometheus-pump-config:
     content: |
       receivers:
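
Reviewer note: the cluster naming described in the README hunk and in the comment above (provider_interface plus the first dot-separated segment of the hostname) can be sketched as follows. This is only an illustration of the rule as the diff states it, not archgw's actual code. The function name `cluster_name` is hypothetical, and passing `ollama` as the provider_interface assumes it is taken from the provider prefix of `ollama/qwen3:0.6b`.

```python
from urllib.parse import urlparse


def cluster_name(provider_interface: str, base_url: str) -> str:
    """Hypothetical helper mirroring the naming rule described in the diff."""
    # Extract the hostname without scheme or port, e.g. "host.docker.internal".
    hostname = urlparse(base_url).hostname or ""
    # Keep only the text before the first dot; the rest is truncated.
    first_segment = hostname.split(".", 1)[0]
    return f"{provider_interface}_{first_segment}"


if __name__ == "__main__":
    print(cluster_name("ollama", "http://host.docker.internal:11434"))
```

With the `base_url` from the config hunk, this yields `ollama_host`, matching the name the README says to expect in Prometheus.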