-
Notifications
You must be signed in to change notification settings - Fork 8.5k
[One Workflow] fix: http step error handling lacks structured error data #243395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[One Workflow] fix: http step error handling lacks structured error data #243395
Conversation
…-step-execution-displays-skeletons-for-not-executed-steps
…ns-for-not-executed-steps
…ns-for-not-executed-steps
…ns-for-not-executed-steps
…ration-will-cancel-only-when-delay-resolves
…ration-will-cancel-only-when-delay-resolves
…-cancel-only-when-delay-resolves' into 14737-HTTP-Step-Error-Handling-Lacks-Structured-Error-Data
…-Structured-Error-Data
…ata' of https://github.com/skynetigor/kibana into 14737-HTTP-Step-Error-Handling-Lacks-Structured-Error-Data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances error handling in the One Workflow execution engine by replacing string-based error storage with structured ExecutionError objects. The changes enable better error introspection and conditional error handling in retry and continue steps.
Key Changes:
- Introduced
ExecutionErrorschema withtype,message, and optionaldetailsfields - Added conditional error handling support for retry and continue steps via KQL expressions
- Enhanced HTTP step error handling to include detailed response information
- Updated all error-related code paths to use structured error objects
Reviewed Changes
Copilot reviewed 30 out of 31 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/platform/packages/shared/kbn-workflows/spec/schema.ts |
Added ExecutionError schema and condition field to retry/continue configurations |
src/platform/packages/shared/kbn-workflows/types/v1.ts |
Updated error fields to use ExecutionError type |
src/platform/plugins/shared/workflows_execution_engine/server/utils/map_error/map_error.ts |
New utility to convert errors to ExecutionError format |
src/platform/plugins/shared/workflows_execution_engine/server/step/http_step/http_step_impl.ts |
Enhanced HTTP error handling with structured error details |
src/platform/plugins/shared/workflows_execution_engine/server/step/on_failure/retry_step/enter_retry_node_impl.ts |
Added conditional retry logic based on error properties |
src/platform/plugins/shared/workflows_execution_engine/server/step/on_failure/continue_step/enter_continue_node_impl.ts |
Added conditional continue logic based on error properties |
src/platform/plugins/shared/workflows_execution_engine/server/workflow_context_manager/workflow_context_manager.ts |
Added evaluateBooleanExpressionInContext method and error context injection |
| Multiple test files | Updated assertions to match new structured error format |
Comments suppressed due to low confidence (1)
src/platform/plugins/shared/workflows_execution_engine/integration_tests/tests/on_failure_continue.test.ts:1
- The comment is misleading in the context of the continue test. Since the condition is not met and retries don't occur, the duration comment doesn't accurately describe this test case. The comment should reflect that no retries are expected here.
/*
...shared/workflows_execution_engine/server/step/on_failure/retry_step/enter_retry_node_impl.ts
Outdated
Show resolved
Hide resolved
...hared/workflows_execution_engine/server/workflow_context_manager/workflow_context_manager.ts
Show resolved
Hide resolved
src/platform/plugins/shared/workflows_execution_engine/server/step/http_step/http_step_impl.ts
Show resolved
Hide resolved
💔 Build Failed
Failed CI StepsTest Failures
Metrics [docs]Public APIs missing comments
Async chunks
Unknown metric groupsAPI count
ESLint disabled line counts
Total ESLint disabled count
History |
Summary
closes: https://github.com/elastic/security-team/issues/14737
🔧 HTTP Step Error Handling Enhancement
Structured Error Response
type: Error type classification (e.g.,HttpRequestError,ConnectionRefused,HttpRequestCancelledError)message: Human-readable error messagedetails: Additional context (status code, headers, response body when available)HTTP Step Improvements
mapAxiosError()method to transform Axios errors into structured execution errorsError Type Schema
ExecutionErrorschema with fields:type,message,detailsEsWorkflowExecution.errorandEsWorkflowStepExecution.errortypes fromstring | nulltoExecutionError | null⚙️ Error Handling Infrastructure
Error Mapping Utility
mapError()utility function to standardize error conversionErrorobjects, string errors, and existingExecutionErrorobjectsConditional Error Recovery
Retry logic now supports conditional retries:
conditionfield toWorkflowRetrySchema(KQL expression;, liquid expression, or boolean)truecondition: "${{error.type == 'NetworkError'}}"Continue on failure now supports conditions:
continuefield accepts boolean, liquidjs or KQL expression stringContext Manager Enhancement
evaluateBooleanExpressionInContext()method for evaluating conditional expressionssteps.<stepName>.error🔄 Error Propagation Improvements
Node Error Catching Interface
NodeWithErrorCatching.catchError()signature to acceptStepExecutionRuntimeparameterRetry Behavior
Workflow Context
steps.<stepName>.errorin fallback scope📝 Logging & Observability
ExecutionErrorobjects✅ Test Coverage
Checklist
Check the PR satisfies following conditions.
Reviewers should verify this PR satisfies this list as well.
release_note:breakinglabel should be applied in these situations.release_note:*label is applied per the guidelinesbackport:*labels.Identify risks
Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss.
Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.