-
Notifications
You must be signed in to change notification settings - Fork 393
feat: add manual trigger for lakefs actions #9338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
repoRecord := &graveler.RepositoryRecord{ | ||
RepositoryID: graveler.RepositoryID(repository.Name), | ||
Repository: &graveler.Repository{ | ||
StorageID: graveler.StorageID(repository.StorageID), | ||
StorageNamespace: graveler.StorageNamespace(repository.StorageNamespace), | ||
CreationDate: repository.CreationDate, | ||
DefaultBranchID: graveler.BranchID(repository.DefaultBranch), | ||
State: graveler.RepositoryState_ACTIVE, | ||
InstanceUID: "", // Not available from catalog API | ||
ReadOnly: repository.ReadOnly, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There must be a better way to get RepositoryRecord
that I could not find. I asked cursor and failed, got this response instead:
This pattern is used throughout the codebase when converting between the catalog API layer and the internal graveler layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually how we do it. This is kinda a pattern - the different layers use different data object types. The most common use of this pattern outside of lakeFS is probably in DTOs.
@nopcoder for more context on why we do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah cursor is wrong :)
} | ||
|
||
// For async execution, just return 202 Accepted | ||
writeResponse(w, r, http.StatusAccepted, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this return something? If not, should this return 204?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"202 Accepted" is actually a good choice! MDN says about 202:
indicates that a request has been accepted for processing, but processing has not been completed or may not have started
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GitHub does 204 on workflow_dispatch
, and does not return a body.
So, I was trying to imitate that. If we return a body, 202 makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect manually triggering an action to be able to trigger a particular action, rather than all action configured with a manual trigger. So it is rather different from the other "hooks" - clearly a post-commit hook should run all hooks configured for post commit.
I am requesting changes chiefly because I expected a different feature than this defines. I might very well be (very) wrong - and then we should pretty much ignore my review.
api/swagger.yml
Outdated
@@ -5597,6 +5597,37 @@ paths: | |||
default: | |||
$ref: "#/components/responses/ServerError" | |||
|
|||
/repositories/{repository}/refs/{ref}/actions/trigger: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the feature: which action will this run?
graveler.EventTypePreCherryPick, | ||
graveler.EventTypePostCherryPick, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an unrelated bugfix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like a bug fix for my pr, can take it as a different pr.
and I need to think of a better way to validate next time I'll not miss this one.
repoRecord := &graveler.RepositoryRecord{ | ||
RepositoryID: graveler.RepositoryID(repository.Name), | ||
Repository: &graveler.Repository{ | ||
StorageID: graveler.StorageID(repository.StorageID), | ||
StorageNamespace: graveler.StorageNamespace(repository.StorageNamespace), | ||
CreationDate: repository.CreationDate, | ||
DefaultBranchID: graveler.BranchID(repository.DefaultBranch), | ||
State: graveler.RepositoryState_ACTIVE, | ||
InstanceUID: "", // Not available from catalog API | ||
ReadOnly: repository.ReadOnly, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually how we do it. This is kinda a pattern - the different layers use different data object types. The most common use of this pattern outside of lakeFS is probably in DTOs.
@nopcoder for more context on why we do this.
pkg/api/controller.go
Outdated
_, err = c.Actions.TriggerAction(ctx, repo, refParam) | ||
if c.handleAPIError(ctx, w, r, err) { | ||
return | ||
} | ||
|
||
// For async execution, just return 202 Accepted | ||
writeResponse(w, r, http.StatusAccepted, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might kind-of almost agree with copilot here. But of course we would need to define the body payload in swagger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple and solid implementation. I think I would choose a slightly different approach with the triggering, i.e. choose the specific action to be triggered. Lacks tests and documentation, but there's a limit to what can be achieved in a day. Good job!
pkg/actions/service.go
Outdated
@@ -210,6 +211,7 @@ type Service interface { | |||
GetTaskResult(ctx context.Context, repositoryID string, runID string, hookRunID string) (*TaskResult, error) | |||
ListRunResults(ctx context.Context, repositoryID string, branchID, commitID string, after string) (RunResultIterator, error) | |||
ListRunTaskResults(ctx context.Context, repositoryID string, runID string, after string) (TaskResultIterator, error) | |||
TriggerAction(ctx context.Context, repository *catalog.Repository, ref string) (string, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it better to leverage the existing Run
and tweak the HookRecord
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the idea. I have changed to do as you suggested.
Run()
runs synchronously, so I adjusted that to use asyncRun()
.
repoRecord := &graveler.RepositoryRecord{ | ||
RepositoryID: graveler.RepositoryID(repository.Name), | ||
Repository: &graveler.Repository{ | ||
StorageID: graveler.StorageID(repository.StorageID), | ||
StorageNamespace: graveler.StorageNamespace(repository.StorageNamespace), | ||
CreationDate: repository.CreationDate, | ||
DefaultBranchID: graveler.BranchID(repository.DefaultBranch), | ||
State: graveler.RepositoryState_ACTIVE, | ||
InstanceUID: "", // Not available from catalog API | ||
ReadOnly: repository.ReadOnly, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah cursor is wrong :)
SourceRef: graveler.Ref(ref), | ||
Repository: repoRecord, | ||
} | ||
s.asyncRun(ctx, record) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we're basically running all actions that are on that ref and registered to manual trigger event trigger, right?
This is slightly different than Github API where you pass the action you want to trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have fixed this in the new iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good code, but I'm worried whether it's the desired capability. I would like something that more closely mirrors what GitHub does. On GitHub, I get to run manually some particular action, and even pick parameters such as "branch". I don't need to run all actions for a branch on a repo.
If we do it this way, for instance, we might be painting ourselves into a corner. When we add a hook that is triggered by other parameters (think of a post-upload hook perhaps), it will be very weird to trigger it like this.
} | ||
|
||
// For async execution, just return 202 Accepted | ||
writeResponse(w, r, http.StatusAccepted, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"202 Accepted" is actually a good choice! MDN says about 202:
indicates that a request has been accepted for processing, but processing has not been completed or may not have started
@arielshaqed, that was what cursor generated in 4c3290f, but I asked it to simplify as it was not clear to me what was expected. I can revert but need to figure out how to run those in async way (Cursor's code ran tasks synchronously). |
01972ae
to
f7cc263
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a manual trigger functionality for lakeFS actions, allowing users to trigger actions on-demand via API calls rather than only through repository events. The feature enables async execution with immediate 202 Accepted responses.
Key changes include:
- Added
manual-trigger
event type to the actions system with async execution capabilities - New API endpoint for triggering actions manually with proper authorization
- Client SDK updates across multiple languages (Rust, Python, Java) with comprehensive documentation
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
pkg/permissions/actions.go | Adds new TriggerActionsAction permission constant for authorization |
pkg/graveler/hooks_handler.go | Defines EventTypeManualTrigger event type and ActionName field in HookRecord |
pkg/actions/service.go | Implements core TriggerAction method with action validation and async execution |
pkg/api/controller.go | Adds HTTP endpoint handler with authorization and error handling |
api/swagger.yml | Defines OpenAPI specification for the new trigger endpoint |
clients/* | Generated client SDK code and documentation updates |
CreationDate: repository.CreationDate, | ||
DefaultBranchID: graveler.BranchID(repository.DefaultBranch), | ||
State: graveler.RepositoryState_ACTIVE, | ||
InstanceUID: "", // Not available from catalog API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting InstanceUID to an empty string with a comment indicating it's not available could lead to issues if this field is required downstream. Consider either fetching this value properly or documenting the implications of leaving it empty.
Copilot uses AI. Check for mistakes.
EventType: graveler.EventTypeManualTrigger, | ||
Repository: repoRecord, | ||
SourceRef: graveler.Ref(ref), | ||
BranchID: graveler.BranchID(ref), // For manual triggers, we use the ref as branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directly casting ref to BranchID assumes the ref is always a branch, but refs can also be tags or commit hashes. This could lead to incorrect behavior when triggering actions on non-branch refs. Consider validating that the ref is actually a branch or handling different ref types appropriately.
Copilot uses AI. Check for mistakes.
EventType: record.EventType, | ||
BranchID: record.BranchID, | ||
} | ||
_, err := s.loadMatchedActions(ctx, record, spec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code calls loadMatchedActions twice - once for validation and again in asyncRun. This duplicates the action loading and filtering logic. Consider refactoring to load actions once and pass them to asyncRun, or cache the result to avoid redundant processing.
Copilot uses AI. Check for mistakes.
f7cc263
to
4094bd1
Compare
Overview
This PR implements a new manual trigger event type for lakeFS actions, allowing users to trigger actions on-demand via API calls. The manual trigger executes actions asynchronously in the background and returns immediately with a 202 Accepted status.
Changes
Core Implementation
manual-trigger
event type to the actions systemasyncRun
for non-blocking executionPOST /repositories/{repository}/refs/{ref}/actions/{action}/triggers
endpoint202 Accepted
for async operations with no response bodyUsage
Action Definition
API Call
Response
Testing
Not yet.
Related Issues
Closes #5778