@fionaliao fionaliao commented Jan 7, 2026

Manual backport of #13944 (due to merge conflicts).


Note

Ensures MQE remote execution maps ring/storage errors to consistent query errors.

  • Changes Dispatcher to accept storage.SampleAndChunkQueryable and wraps it with NewErrorTranslateSampleAndChunkQueryable to translate storage errors
  • Adds TestDispatcher_RingErrorTranslation validating translation of ring errors (e.g., ErrTooManyUnhealthyInstances, ErrEmptyRing) to internal query errors with expected messages
  • Updates CHANGELOG.md with the bugfix entry

Written by Cursor Bugbot for commit dc3298b. This will update automatically on new commits. Configure here.


The querier API wraps its queryable with
`NewErrorTranslateSampleAndChunkQueryable`, which applies Mimir's error
mappings and, by default, maps errors to `promql.ErrStorage` (which is
later mapped to a 500).

https://github.com/grafana/mimir/blob/f5d064968c732ac49f72ac2551b4d98f596d21ed/pkg/api/handlers.go#L228

https://github.com/grafana/mimir/blob/da237be9f86efbd4a1daa38cb7d0db37dfe80daa/pkg/querier/error_translate_queryable.go#L105

https://github.com/grafana/mimir/blob/da237be9f86efbd4a1daa38cb7d0db37dfe80daa/pkg/querier/error_translate_queryable.go#L91

https://github.com/grafana/mimir/blob/068f3d023248d572b929234940cf981705cf8d82/pkg/api/error/error.go#L65-L68

MQE remote execution does not use the querier API. It uses the
`Dispatcher` instead, which did not use the error-mapping queryable. As
a result, some errors coming back from storage (e.g.
`"too many unhealthy instances in the ring"`) were incorrectly mapped
to 422s: as a fallback, the dispatcher returns an `apierror.TypeExec`
error, which maps to 422, and there is no custom mapping for storage
errors.

https://github.com/grafana/mimir/blob/1fdbf332931c920fbf5ff0503003328761b74a74/pkg/querier/dispatcher.go#L193-L196

Fixed by wrapping the queryable used by the dispatcher with
`NewErrorTranslateSampleAndChunkQueryable`, the same as for the querier
API.

Fixes #<issue number>

- [x] Tests updated.
- [ ] Documentation added.
- [x] `CHANGELOG.md` updated - the order of entries should be
`[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry
is not needed, please add the `changelog-not-needed` label to the PR.
- [ ] [`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md) updated with experimental features.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **BUGFIX: MQE remote execution error mapping**
>
> - Wraps `Dispatcher` queryable with
`NewErrorTranslateSampleAndChunkQueryable` and changes type to
`storage.SampleAndChunkQueryable` to ensure storage errors map correctly
(e.g., HTTP 500).
> - Adds `TestDispatcher_RingErrorTranslation` covering ring errors like
`ErrTooManyUnhealthyInstances` and `ErrEmptyRing` (including wrapped
cases).
> - Updates `CHANGELOG.md` with the MQE bugfix entry.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
897a85f. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

(cherry picked from commit 0d601c4)
@fionaliao fionaliao requested a review from a team as a code owner January 7, 2026 20:00
@fionaliao fionaliao changed the title Map remote execution storage errors correctly (#13944) [r376Map remote execution storage errors correctly (#13944) Jan 7, 2026
@fionaliao fionaliao changed the title [r376Map remote execution storage errors correctly (#13944) [r376] Map remote execution storage errors correctly (#13944) Jan 7, 2026
@fionaliao fionaliao merged commit 636dc37 into r376 Jan 8, 2026
39 checks passed
@fionaliao fionaliao deleted the backport-13944-to-r376 branch January 8, 2026 09:18