Skip to content

feat: Surface Flow task report on Task API#2288

Merged
kunzhao-nv merged 11 commits into
NVIDIA:mainfrom
kunzhao-nv:feat/rest-api-task-report
Jun 9, 2026
Merged

feat: Surface Flow task report on Task API#2288
kunzhao-nv merged 11 commits into
NVIDIA:mainfrom
kunzhao-nv:feat/rest-api-task-report

Conversation

@kunzhao-nv

@kunzhao-nv kunzhao-nv commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Description

  • Adds the structured execution report (already produced by Flow) to the rack Task endpoints.
  • Single-task GET /rack/task/{id} and POST /rack/task/{id}/cancel always include the report. The two list endpoints (/rack/{id}/task, /tray/{id}/task) take a new withReport=true opt-in.
  • IncludeReport is plumbed all the way down to flow.ListTasksRequest, so Flow drops the multi-KB blob server-side when the caller hasn't asked for it; the savings cover the full Temporal workflow / activity payload chain, not just the wire response.

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@kunzhao-nv kunzhao-nv requested a review from a team as a code owner June 7, 2026 18:44
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 93e3f9ce-1f25-493c-9810-b4e0adde0fdc

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-06-07 18:46:44 UTC | Commit: fe67fc1

Task.report bodies can be several KB. Always returning them on
ListTasks would persist the blob in every caller-side Temporal
activity / workflow payload along the call path, even when the
caller never reads it. Opt callers in via with_report, defaulting
to off for list endpoints; single-task RPCs always return it.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Pulls in the new ListTasksRequest.with_report field along with prior
upstream drift (Task.report, override_assignment_check on the rack
power / firmware / bring-up requests, and refreshed doc comments)
that earlier PRs landed without running flow-proto + flow-protogen.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Surfaces the structured execution report Flow already produces on
Task. Single-task GET and Cancel always include it; the rack/tray
list endpoints accept withReport=true and forward the flag down to
Flow's ListTasksRequest so the multi-KB payload is not paid for on
the default list path.

The report is exposed as an opaque JSON document with a typed
version anchor; clients must branch on version before reading
anything else, so Flow's internal schema can evolve without
breaking REST consumers.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-flow 116 13 50 41 4 8
nico-nsm 133 11 45 66 11 0
nico-psm 118 13 52 41 4 8
nico-rest-api 182 16 84 67 7 8
nico-rest-cert-manager 95 5 47 32 3 8
nico-rest-db 116 13 50 41 4 8
nico-rest-site-agent 115 13 50 41 3 8
nico-rest-site-manager 102 6 48 37 3 8
nico-rest-workflow 118 13 52 41 4 8
TOTAL 1095 103 478 407 43 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

Covers 82626a9 (the task report + withReport spec change).

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv force-pushed the feat/rest-api-task-report branch from fe67fc1 to 3aa6e8b Compare June 7, 2026 19:01
Comment thread rest-api/api/pkg/api/handler/task.go Outdated
Both list-task handlers built the same APIRackTaskOption slice from
APIGetTasksRequest. Move the translation into
model.BuildAPIRackTaskOptions so new opt-in fields wire up in one
place. Addresses review feedback on PR NVIDIA#2288.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Comment thread rest-api/api/pkg/api/model/task.go Outdated
Comment thread rest-api/api/pkg/api/model/task.go Outdated
type: string
format: date-time
description: Timestamp when the task was last updated.
report:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to establish some kind of contract with the user. If this schema is tied to version 1 of the report then we should provide a version 1 schema for the user.

The same model serves tray-scoped task endpoints in addition to
rack ones; "RackTask" was misleading. Rename in lockstep with the
OpenAPI schema Task. Handler / route / telemetry-label names
keep "Rack" where they describe a specific endpoint scope rather
than the response model.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Aligns the public query param with the REST naming convention. The
Flow proto field stays with_report to match the existing with_rack
/ with_components flags on adjacent RPCs.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Replace the opaque report blob with a typed APITaskReportV1 schema
that mirrors flow/internal/task/report. Clients get a stable v1
contract (status enum, stage / step nesting, RFC3339 timestamps);
malformed or non-v1 payloads decode to nil so the response never
carries an off-contract shape.

Wire keys are camelCase (componentType, startedAt, ...) to match
the rest of the REST API surface; conversion from Flow's snake_case
emission happens inside parseTaskReportV1.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Covers 0f7cb00 (APITask rename), 9001abb (includeReport query
param), and a583f98 (typed TaskReportV1 schema).

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>

@thossain-nv thossain-nv left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes @kunzhao-nv, left a few more notes.

Comment thread rest-api/api/pkg/api/model/task.go Outdated
Comment thread rest-api/api/pkg/api/model/task.go Outdated
Comment thread rest-api/api/pkg/api/model/task.go Outdated
// APITaskReportV1. Without this option, Report is left nil and is
// omitted from the JSON response. A malformed or non-v1 payload also
// yields nil so the response never carries an off-contract shape.
func WithReport() APITaskOption {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package is model so WithReport needs to be more specific e.g. APITaskOptionWithReport

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to keep WithReport here:

  1. Call-site already disambiguates: model.NewAPITask(t, model.WithReport()) — the second-arg type (APITaskOption) tells which "Report" this is.

  2. Go convention is bare pkg.WithX(): grpc.WithInsecure(), aws.WithRegion(), redis.WithMaxRetries(). The type comes from the constructor that consumes the option, not from the option name.

  3. Repo precedent matches:

    • HTTPMiddlewareOptionWithTelemetry / WithRequestMetrics / WithHandlerTimeout
    • SharedInformerOptionWithNamespace / WithCustomResyncConfig / WithTweakListOptions
    • cert-manager/pkg/core/context.go (multi-concern, like model) → WithLogger / WithAppName / WithSignalHandler / WithClock

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference here is that model is a large general package. The example you provided all have specific package scope e.g. Core.

By defining WithReport we're claiming WithReport function name for all models. Another model Instance won't be able to define func WithReport() APIInstanceOption because Go doesn't allow function overloading.

If you want to keep WithX convention, then I'd use WithTaskReport

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Addressed in 7e2d079

// (snake_case JSON keys). Used solely as an unmarshal target so the
// REST-facing APITaskReportV1 can keep camelCase keys per the rest of
// the API surface. Kept private; clients consume APITaskReportV1.
type flowTaskReportV1 struct {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not have proto objects for report?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it is string type in flow proto.

kunzhao-nv added a commit to kunzhao-nv/ncx-infra-controller-core that referenced this pull request Jun 9, 2026
Three follow-ups from thossain-nv on PR NVIDIA#2288:

- Inline BuildAPITaskOptions as APIGetTasksRequest.TaskOptions, since
  the function only ever consumed a single request and was already
  conceptually a getter on it.
- Move parseTaskReportV1 into APITaskReportV1.UnmarshalJSON so the
  custom Flow snake_case decoder is reachable via the standard
  encoding/json entry point. The marshal path stays camelCase via
  struct tags; the doc comment flags the resulting asymmetry.
- Rename WithReport to WithTaskReport so future model option
  constructors for other API types (machine, rack, ...) can coexist
  in this shared package without colliding on a generic "Report" name.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
kunzhao-nv added a commit to kunzhao-nv/ncx-infra-controller-core that referenced this pull request Jun 9, 2026
Three follow-ups from PR NVIDIA#2288 review:

- Inline BuildAPITaskOptions as APIGetTasksRequest.TaskOptions, since
  the function only ever consumed a single request and was already
  conceptually a getter on it.
- Move parseTaskReportV1 into APITaskReportV1.UnmarshalJSON so the
  custom Flow snake_case decoder is reachable via the standard
  encoding/json entry point. The marshal path stays camelCase via
  struct tags; the doc comment flags the resulting asymmetry.
- Rename WithReport to WithTaskReport so future model option
  constructors for other API types (machine, rack, ...) can coexist
  in this shared package without colliding on a generic "Report" name.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv force-pushed the feat/rest-api-task-report branch from 8bfe293 to b02af2e Compare June 9, 2026 04:09
kunzhao-nv added a commit to kunzhao-nv/ncx-infra-controller-core that referenced this pull request Jun 9, 2026
Two follow-ups from PR NVIDIA#2288 review:

- Inline BuildAPITaskOptions as APIGetTasksRequest.TaskOptions, since
  the function only ever consumed a single request and was already
  conceptually a getter on it.
- Move parseTaskReportV1 into APITaskReportV1.UnmarshalJSON so the
  custom Flow snake_case decoder is reachable via the standard
  encoding/json entry point. The marshal path stays camelCase via
  struct tags; the doc comment flags the resulting asymmetry.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv force-pushed the feat/rest-api-task-report branch from b02af2e to 5988d28 Compare June 9, 2026 04:12

@thossain-nv thossain-nv left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding +1, consider the suggestion on option chaining.

kunzhao-nv added a commit to kunzhao-nv/ncx-infra-controller-core that referenced this pull request Jun 9, 2026
Three follow-ups from PR NVIDIA#2288 review:

- Inline BuildAPITaskOptions as APIGetTasksRequest.TaskOptions, since
  the function only ever consumed a single request and was already
  conceptually a getter on it.
- Move parseTaskReportV1 into APITaskReportV1.UnmarshalJSON so the
  custom Flow snake_case decoder is reachable via the standard
  encoding/json entry point. The marshal path stays camelCase via
  struct tags; the doc comment flags the resulting asymmetry.
- Rename WithReport to WithTaskReport so the bare "WithReport" name
  stays available for option constructors on other model types.
  Go has no function overloading, so claiming WithReport in this
  shared package would permanently preclude e.g. an APIInstance
  WithReport() down the line.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv force-pushed the feat/rest-api-task-report branch from 5988d28 to c4223e7 Compare June 9, 2026 21:38
- Inline BuildAPITaskOptions as APIGetTasksRequest.TaskOptions; it
  only ever consumed a single request.
- Move parseTaskReportV1 into APITaskReportV1.UnmarshalJSON so the
  Flow snake_case decoder is reachable via encoding/json. Marshal
  stays camelCase via struct tags.
- Rename WithReport to WithTaskReport so the shared model package
  keeps the bare WithReport name available for option constructors
  on other types.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv force-pushed the feat/rest-api-task-report branch from c4223e7 to 7e2d079 Compare June 9, 2026 21:43
@kunzhao-nv kunzhao-nv enabled auto-merge (squash) June 9, 2026 21:49
@kunzhao-nv kunzhao-nv merged commit a4a8507 into NVIDIA:main Jun 9, 2026
89 checks passed
kunzhao-nv added a commit to kunzhao-nv/ncx-infra-controller-core that referenced this pull request Jun 10, 2026
Resolve task conflicts with NVIDIA#2288 (Task report API):
- APITask gains both RuleID (ours) and Report (theirs) plus the
  APITaskOption / WithTaskReport plumbing
- spec.yaml interleaves Task example (ruleId + report) and keeps
  both schema sets (OperationRule chain + TaskReportV1 chain)
- regenerate docs/index.html and sdk/standard/

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants