Commit 36dcfe6

archgw: address drift in prometheus cluster name (#87)
Signed-off-by: Adrian Cole <[email protected]>
1 parent ddaadd1 commit 36dcfe6

3 files changed: +8 -3 lines changed

inference-platforms/archgw/README.md

Lines changed: 5 additions & 0 deletions
@@ -64,7 +64,11 @@ and anything added in Arch Gateway's [wasm filter][archgw-wasm].
   instructions to run from Docker (to avoid nested docker).
 * Traces come from Envoy, whose configuration is written by `archgw`. At the
   moment, this hard-codes aspects including default ports.
+* Prometheus metrics show the cluster as "ollama_host" - the provider_interface
+  plus the first segment of the hostname (dots truncate the rest). The "host"
+  comes from "host.docker.internal".
 * Until [this][openai-responses] resolves, don't use `--use-responses-api`.
+* Until [this][docker-env] resolves, make sure your PATH has /usr/local/bin.
 
 The chat prompt was designed to be idempotent, but the results are not. You may
 see something besides 'South Atlantic Ocean.'.
@@ -78,3 +82,4 @@ Just run it again until we find a way to make the results idempotent.
 [uv]: https://docs.astral.sh/uv/getting-started/installation/
 [openai-responses]: https://github.com/katanemo/archgw/issues/476
 [otel-tui]: https://github.com/ymtdzzz/otel-tui
+[docker-env]: https://github.com/katanemo/archgw/issues/573
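
The naming rule in the README note (provider_interface plus the first dot-separated segment of the hostname) can be illustrated with a small sketch. This is not archgw's code: the function name `cluster_name` and the underscore join are assumptions inferred only from the "ollama_host" example in this commit.

```python
from urllib.parse import urlsplit


def cluster_name(provider_interface: str, base_url: str) -> str:
    """Hypothetical helper mirroring the naming rule described in the README.

    Assumption: the name is "<provider_interface>_<first hostname segment>",
    as suggested by the "ollama_host" example. Not taken from archgw itself.
    """
    hostname = urlsplit(base_url).hostname or ""
    # Everything after the first dot is dropped ("dots truncate the rest").
    first_segment = hostname.split(".", 1)[0]
    return f"{provider_interface}_{first_segment}"


if __name__ == "__main__":
    print(cluster_name("ollama", "http://host.docker.internal:11434"))
```

With the `base_url` from `arch_config.yaml`, the hostname `host.docker.internal` reduces to `host`, which is where the "host" half of "ollama_host" comes from.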

inference-platforms/archgw/arch_config.yaml

Lines changed: 2 additions & 1 deletion
@@ -8,8 +8,9 @@ listeners:
     timeout: 30s
 
 llm_providers:
+  # Use ollama directly, since we can't inherit OPENAI_BASE_URL etc and need
+  # to hard-code the model anyway.
   - model: ollama/qwen3:0.6b
-    provider_interface: openai
     # This configuration is converted to Envoy and run inside Docker.
     base_url: http://host.docker.internal:11434
     default: true

inference-platforms/archgw/docker-compose-elastic.yml

Lines changed: 1 addition & 2 deletions
@@ -2,8 +2,7 @@ configs:
   # Configuration is simplified from archgw here:
   # https://github.com/katanemo/archgw/blob/main/docs/source/guides/observability/monitoring.rst
   #
-  # Note: The prometheus cluster name for qwen3:0.65b will shows up as '6b'
-  # See https://github.com/katanemo/archgw/issues/504
+  # Note: The cluster name for ollama + host.docker.internal = ollama_host
   prometheus-pump-config:
     content: |
       receivers:
