feat: Add Flow Operation Rule REST APIs#2286
Conversation
Wire 5 site-scoped endpoints (POST/GET/LIST/PATCH/DELETE /v2/org/{org}/nico/rule)
backed by Flow's OperationRule gRPC, with matching Temporal workflows/activities,
typed OpenAPI schemas, and unit tests across model/handler/workflow/activity layers.
Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Adds optional ruleId to power/firmware/bring-up request models (single + batch, rack + tray) and forwards it to Flow's rule_id proto field via the shared Execute* helpers. UUID-validated at the model layer. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
🌿 Preview your docs: https://nvidia-preview-pull-request-2286.docs.buildwithfern.com/infra-controller |
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-07 05:09:58 UTC | Commit: dd435f1 |
Keep the surface minimal; can be reintroduced alongside default-rule management. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Covers 8464a1e (drop isDefault list filter); also adds SPDX headers to the rule SDK files that the previous regen missed. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
|
@kunzhao-nv Rule also feels like too generic of a high level endpoint and it seems best to contextualize it. Should we consider |
Rules govern how tasks execute, so colocating the rule endpoints under
the /task namespace matches how callers think about them and keeps the
top-level nico path uncluttered.
POST /v2/org/{org}/nico/task/rule
GET /v2/org/{org}/nico/task/rule
GET /v2/org/{org}/nico/task/rule/{id}
PATCH /v2/org/{org}/nico/task/rule/{id}
DELETE /v2/org/{org}/nico/task/rule/{id}
Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Flow records the OperationRule it resolves for each task (proto field Task.applied_rule_id, populated whether the caller pinned a ruleId on the originating request or Flow picked the default). Expose it on the REST RackTask schema so clients can answer "which rule ran for this task" by reading the task itself instead of inferring it. Field name is ruleId — same as the request-side field on rack/tray operation APIs — so request and response use the same vocabulary for the same concept. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
9817d24 to
c23abc7
Compare
|
@kunzhao-nv I've suggested using |
Rules apply to tasks; the type, file, and function names now reflect that domain context so readers don't have to infer the relationship from prose comments. The change is mechanical — no behavior change. * Files: model / handler / workflow / activity rule.go (+ tests) → taskrule.go. * Handler types: CreateRuleHandler → CreateTaskRuleHandler (and the matching Get / GetAll / Update / Delete variants), with receiver variables updated to the codebase's first-letter-abbreviation convention (ctrh / gtrh / gatrh / utrh / dtrh). * List handler renamed to GetAll to match how every other listing handler in this package is spelled. * Model types: APIOperationRule / APIRuleDefinition / APISequenceStep / APIActionConfig / APIRetryPolicy → APITaskRule[…]. Request structs switch to the APIResourceActionRequest form used elsewhere in this package (APITaskRuleCreateRequest / GetRequest / UpdateRequest / DeleteRequest / GetAllRequest). * Workflow + activity entry points: CreateRule → CreateTaskRule, and the activity counterparts (CreateOperationRuleOnFlow → CreateTaskRuleOnFlow, ManageRule interface → ManageTaskRule, etc.). * Internal helpers: errRuleResponseSent → errTaskRuleResponseSent, ruleHandlerPrepare → prepareTaskRuleHandler. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Match the convention every other enum-like string on the REST surface already follows (TaskStatus, ComponentType, DiffType, BMCType). The wire form changes from "power_control" / "firmware_control" to "PowerControl" / "FirmwareControl". Flow's internal YAML rule files keep their existing snake_case spelling; conversion now happens at the REST boundary via the existing ProtoToAPIOperationTypeName / apiToProtoOperationType maps. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
…tion The nested SequenceStep / ActionConfig / RetryPolicy schemas previously used snake_case JSON keys (component_type, max_parallel, …) to round- trip 1:1 with Flow's stored rule_definition_json blob. Snake_case is out of step with every other model in this package, which is uniformly camelCase. Switch the REST surface to camelCase and bridge to Flow's blob via a parallel set of unexported flow*Wire structs whose tags preserve snake_case. * APITaskRuleSequenceStep, APITaskRuleActionConfig, APITaskRuleRetry- Policy json tags now use componentType / maxParallel / preOperation / mainOperation / postOperation / delayAfter / pollInterval / maxAttempts / initialInterval / backoffCoefficient / maxInterval. * New flowRuleDefinitionWire / flowSequenceStepWire / flowAction- ConfigWire / flowRetryPolicyWire mirror the API types with snake_ case tags, plus toFlowJSON / fromFlowJSON helpers used by APITaskRule.FromProto and APITaskRuleCreateRequest/UpdateRequest's ToProto. Field-by-field conversion is mechanical; keep the wire structs in lock-step when adding fields. * OpenAPI spec (SequenceStep, ActionConfig, RetryPolicy) updated to match. ActionConfig.parameters keys remain free-form pass-through (Flow vocabulary). Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
The type and its OpenAPI schema previously read as if tasks only ever belong to racks, but the same shape is returned by both rack-scoped and tray-scoped task endpoints (and will cover further task sources in the future). Drop the misleading prefix. * APIRackTask → APITask, NewAPIRackTask → NewAPITask, ProtoToAPIRackTaskStatusName → ProtoToAPITaskStatusName. * OpenAPI schema `RackTask` → `Task`; four `$ref` sites updated. * Description broadened to reflect that tasks model any asynchronous, site-scoped operation. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Follow the convention set by allocation.go and other existing models: optional fields render as explicit null / "" rather than disappearing from the JSON payload, so clients always see the expected shape. omitempty remains on the unexported flow*Wire structs because those serialise to Flow's stored blob, not to clients. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Replace the bespoke validateOptionalUUID helper and the manual
if-checks scattered across the TaskRule create/update/get/delete/
list requests, plus the rack and firmware operation requests, with
ozzo-validation chains so every model in this package speaks the
same validation idiom (see allocation.go for the long-standing
convention).
* firmware.go: APIUpdateFirmwareRequest, APIBatchRackFirmwareUpdate-
Request, APIBatchTrayFirmwareUpdateRequest now use validation.
ValidateStruct with validation.When + validationis.UUID for ruleId.
validateOptionalUUID is removed.
* rack.go: APIBringUpRackRequest, APIBatchBringUpRackRequest follow
the same pattern.
* taskrule.go: APITaskRuleCreateRequest / UpdateRequest / GetRequest
/ DeleteRequest / GetAllRequest move off manual ifs to ozzo,
including validation.In for operationType against a shared
validOperationTypesAny enum slice.
* Tests updated to match ozzo's error format
("operationType must be one of …" / "X: must be a valid UUID.").
Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
The helper has always been a thin pointer-to-Flow-UUID wrapper that has nothing rule-specific about it. Lift it out of the rule-id niche so future call sites (any optional API ID forwarded to Flow) can reach for the same primitive instead of cloning it. Behavior is unchanged. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Covers spec changes from 74b3d16..fc054ca: 74b3d16 rename Rule API surface to TaskRule 5a68262 PascalCase operationType enum values 6a473ae camelCase ruleDefinition keys + Flow blob translation 546a65f rename APIRackTask to APITask Generated by `make publish-openapi` (docs) and `make generate-sdk` (SDK). model_rack_task.go removed as the schema is now Task; the generator dropped an unrelated import-grouping change to model_tenant_identity_config_create_or_update_request.go which was reverted. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
|
Pushed
Notes:
PTAL when you have a moment. |
Trailing blank line left over from fb08d14 (validator cleanup) trips the Style Check job's gofmt check. No code change. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Finishes the Rule → TaskRule rename for the layer that was missed in 74b3d16: the five Temporal workflow functions still carried the Operation* spelling, leaving the codebase inconsistent (handler / model / activity all on TaskRule; workflow alone on OperationRule). * site-workflow/pkg/workflow/taskrule.go: CreateOperationRule → CreateTaskRule (+ matching Get / GetAll / Update / Delete variants). Log fields now follow the package convention used by tray.go and ibpartition.go — Str("Workflow", "TaskRule").Str("Action", "Create") instead of bundling the resource into the Action label. * site-workflow/pkg/workflow/taskrule_test.go: test-suite types and Test_* method names move to the new spelling. * api/pkg/api/handler/taskrule.go and *_test.go: the workflow-name string passed to ExecuteWorkflow / mocked via testify and the objectType handed to SetupHandler / TerminateWorkflowOnTimeOut all switch to "TaskRule" + the new workflow names. Temporal looks workflows up by name, so the registered name (subscriber.go) and the dispatch name (handler) have to agree. * site-agent/.../flowgrpc/subscriber.go: RegisterWorkflow now binds the new function identifiers; the accompanying log messages move with them. Flow's gRPC method names (flowv1.CreateOperationRule* and the matching proto types) intentionally stay as-is — they're Flow's contract, not ours; the activity layer (CreateTaskRuleOnFlow et al.) is the bridge. Drive-by: drop two stray "MVP" mentions from doc comments per the project's "no MVP / phase vocabulary in code comments" preference. Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Per PR review (model/rule.go:141): an additional NewX constructor
isn't needed when FromProto already exists. Callers create the
zero-value struct and call FromProto directly, matching the simpler
shape used elsewhere in the package.
* model/taskrule.go: remove NewAPITaskRule.
* handler/taskrule.go (Get + GetAll): inline &model.APITaskRule{} +
FromProto at the two call sites; error handling is unchanged.
* model/taskrule_test.go: switch the two FromProto test cases to
invoke the method directly.
Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
Description
1. Operation Rule CRUD API — 5 site-scoped endpoints, backed by Flow's
OperationRulegRPC. Rules govern how tasks execute, so they live under the/tasknamespace alongside task endpoints:/v2/org/{org}/nico/task/rule/v2/org/{org}/nico/task/rulesiteId,operationType)/v2/org/{org}/nico/task/rule/{id}/v2/org/{org}/nico/task/rule/{id}/v2/org/{org}/nico/task/rule/{id}2. Optional
ruleIdon rack/tray operation APIs — added to 8 request schemas (power / firmware / bring-up, single + batch, rack + tray). When set, the value is forwarded to Flow'srule_idproto field via the three sharedExecute*helpers inhandler/util/common/common.go. When omitted, Flow falls back to its default rule resolution — no behavior change for existing callers.3.
ruleIdon task responses — theRackTaskschema now carries the Operation Rule Flow resolved for the task (sourced fromTask.applied_rule_idin the Flow proto), whether the caller pinned it viaruleIdon the originating request or Flow's default resolution picked it. Callers reading a task can now answer "which rule ran for this task" by reading the task itself, no extra lookup.Type of Change
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes
Followups: