Skip to content

Conversation

skshetry
Copy link
Contributor

@skshetry skshetry commented Jul 23, 2025

Overview

This PR implements a new manual trigger event type for lakeFS actions, allowing users to trigger actions on-demand via API calls. The manual trigger executes actions asynchronously in the background and returns immediately with a 202 Accepted status.

Changes

Core Implementation

  • Manual Trigger Event Type: Added manual-trigger event type to the actions system
  • Async Execution: Manual triggers run in the background using asyncRun for non-blocking execution
  • API Endpoint: New POST /repositories/{repository}/refs/{ref}/actions/{action}/triggers endpoint
  • Response Status: Returns 202 Accepted for async operations with no response body

Usage

Action Definition

name: manual-action
on:
  manual-trigger:
hooks:
  - id: manual-hook
    type: lua
    properties:
      script: |
        print("Manual trigger executed!")

API Call

curl -X POST \
  "http://localhost:8000/api/v1/repositories/my-repo/refs/main/actions/manual-action/triggers" \
  -H "Authorization: Bearer <token>"

Response

HTTP/1.1 202 Accepted

Testing

Not yet.

Related Issues

Closes #5778

@skshetry skshetry changed the title initial changes add manual trigger Jul 23, 2025
@skshetry skshetry marked this pull request as ready for review July 23, 2025 13:03
@skshetry skshetry changed the title add manual trigger feat: add manual trigger for lakefs actions Jul 23, 2025
Copilot

This comment was marked as outdated.

Comment on lines +612 to +645
repoRecord := &graveler.RepositoryRecord{
RepositoryID: graveler.RepositoryID(repository.Name),
Repository: &graveler.Repository{
StorageID: graveler.StorageID(repository.StorageID),
StorageNamespace: graveler.StorageNamespace(repository.StorageNamespace),
CreationDate: repository.CreationDate,
DefaultBranchID: graveler.BranchID(repository.DefaultBranch),
State: graveler.RepositoryState_ACTIVE,
InstanceUID: "", // Not available from catalog API
ReadOnly: repository.ReadOnly,
},
}
Copy link
Contributor Author

@skshetry skshetry Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must be a better way to get RepositoryRecord that I could not find. I asked cursor and failed, got this response instead:

This pattern is used throughout the codebase when converting between the catalog API layer and the internal graveler layer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually how we do it. This is kinda a pattern - the different layers use different data object types. The most common use of this pattern outside of lakeFS is probably in DTOs.
@nopcoder for more context on why we do this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah cursor is wrong :)

}

// For async execution, just return 202 Accepted
writeResponse(w, r, http.StatusAccepted, nil)
Copy link
Contributor Author

@skshetry skshetry Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return something? If not, should this return 204?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"202 Accepted" is actually a good choice! MDN says about 202:

indicates that a request has been accepted for processing, but processing has not been completed or may not have started

Copy link
Contributor Author

@skshetry skshetry Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub does 204 on workflow_dispatch, and does not return a body.

https://docs.github.com/en/rest/actions/workflows?apiVersion=2022-11-28#create-a-workflow-dispatch-event

So, I was trying to imitate that. If we return a body, 202 makes sense.

Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect manually triggering an action to be able to trigger a particular action, rather than all action configured with a manual trigger. So it is rather different from the other "hooks" - clearly a post-commit hook should run all hooks configured for post commit.

I am requesting changes chiefly because I expected a different feature than this defines. I might very well be (very) wrong - and then we should pretty much ignore my review.

api/swagger.yml Outdated
@@ -5597,6 +5597,37 @@ paths:
default:
$ref: "#/components/responses/ServerError"

/repositories/{repository}/refs/{ref}/actions/trigger:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the feature: which action will this run?

Comment on lines +88 to +89
graveler.EventTypePreCherryPick,
graveler.EventTypePostCherryPick,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an unrelated bugfix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @nopcoder as I see you implemented this in #9263.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like a bug fix for my pr, can take it as a different pr.
and I need to think of a better way to validate next time I'll not miss this one.

Comment on lines +612 to +645
repoRecord := &graveler.RepositoryRecord{
RepositoryID: graveler.RepositoryID(repository.Name),
Repository: &graveler.Repository{
StorageID: graveler.StorageID(repository.StorageID),
StorageNamespace: graveler.StorageNamespace(repository.StorageNamespace),
CreationDate: repository.CreationDate,
DefaultBranchID: graveler.BranchID(repository.DefaultBranch),
State: graveler.RepositoryState_ACTIVE,
InstanceUID: "", // Not available from catalog API
ReadOnly: repository.ReadOnly,
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually how we do it. This is kinda a pattern - the different layers use different data object types. The most common use of this pattern outside of lakeFS is probably in DTOs.
@nopcoder for more context on why we do this.

Comment on lines 2726 to 2732
_, err = c.Actions.TriggerAction(ctx, repo, refParam)
if c.handleAPIError(ctx, w, r, err) {
return
}

// For async execution, just return 202 Accepted
writeResponse(w, r, http.StatusAccepted, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I might kind-of almost agree with copilot here. But of course we would need to define the body payload in swagger.

Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple and solid implementation. I think I would choose a slightly different approach with the triggering, i.e. choose the specific action to be triggered. Lacks tests and documentation, but there's a limit to what can be achieved in a day. Good job!

@@ -210,6 +211,7 @@ type Service interface {
GetTaskResult(ctx context.Context, repositoryID string, runID string, hookRunID string) (*TaskResult, error)
ListRunResults(ctx context.Context, repositoryID string, branchID, commitID string, after string) (RunResultIterator, error)
ListRunTaskResults(ctx context.Context, repositoryID string, runID string, after string) (TaskResultIterator, error)
TriggerAction(ctx context.Context, repository *catalog.Repository, ref string) (string, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it better to leverage the existing Run and tweak the HookRecord?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the idea. I have changed to do as you suggested.

Run() runs synchronously, so I adjusted that to use asyncRun().

Comment on lines +612 to +645
repoRecord := &graveler.RepositoryRecord{
RepositoryID: graveler.RepositoryID(repository.Name),
Repository: &graveler.Repository{
StorageID: graveler.StorageID(repository.StorageID),
StorageNamespace: graveler.StorageNamespace(repository.StorageNamespace),
CreationDate: repository.CreationDate,
DefaultBranchID: graveler.BranchID(repository.DefaultBranch),
State: graveler.RepositoryState_ACTIVE,
InstanceUID: "", // Not available from catalog API
ReadOnly: repository.ReadOnly,
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah cursor is wrong :)

SourceRef: graveler.Ref(ref),
Repository: repoRecord,
}
s.asyncRun(ctx, record)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're basically running all actions that are on that ref and registered to manual trigger event trigger, right?
This is slightly different than Github API where you pass the action you want to trigger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this in the new iteration.

Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good code, but I'm worried whether it's the desired capability. I would like something that more closely mirrors what GitHub does. On GitHub, I get to run manually some particular action, and even pick parameters such as "branch". I don't need to run all actions for a branch on a repo.
If we do it this way, for instance, we might be painting ourselves into a corner. When we add a hook that is triggered by other parameters (think of a post-upload hook perhaps), it will be very weird to trigger it like this.

}

// For async execution, just return 202 Accepted
writeResponse(w, r, http.StatusAccepted, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"202 Accepted" is actually a good choice! MDN says about 202:

indicates that a request has been accepted for processing, but processing has not been completed or may not have started

@skshetry
Copy link
Contributor Author

It's good code, but I'm worried whether it's the desired capability. I would like something that more closely mirrors what GitHub does. On GitHub, I get to run manually some particular action, and even pick parameters such as "branch". I don't need to run all actions for a branch on a repo. If we do it this way, for instance, we might be painting ourselves into a corner. When we add a hook that is triggered by other parameters (think of a post-upload hook perhaps), it will be very weird to trigger it like this.

@arielshaqed, that was what cursor generated in 4c3290f, but I asked it to simplify as it was not clear to me what was expected.

I can revert but need to figure out how to run those in async way (Cursor's code ran tasks synchronously).

@skshetry skshetry force-pushed the feat/manual-trigger branch from 01972ae to f7cc263 Compare July 23, 2025 16:44
@skshetry skshetry requested a review from Copilot July 23, 2025 16:56
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a manual trigger functionality for lakeFS actions, allowing users to trigger actions on-demand via API calls rather than only through repository events. The feature enables async execution with immediate 202 Accepted responses.

Key changes include:

  • Added manual-trigger event type to the actions system with async execution capabilities
  • New API endpoint for triggering actions manually with proper authorization
  • Client SDK updates across multiple languages (Rust, Python, Java) with comprehensive documentation

Reviewed Changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pkg/permissions/actions.go Adds new TriggerActionsAction permission constant for authorization
pkg/graveler/hooks_handler.go Defines EventTypeManualTrigger event type and ActionName field in HookRecord
pkg/actions/service.go Implements core TriggerAction method with action validation and async execution
pkg/api/controller.go Adds HTTP endpoint handler with authorization and error handling
api/swagger.yml Defines OpenAPI specification for the new trigger endpoint
clients/* Generated client SDK code and documentation updates

CreationDate: repository.CreationDate,
DefaultBranchID: graveler.BranchID(repository.DefaultBranch),
State: graveler.RepositoryState_ACTIVE,
InstanceUID: "", // Not available from catalog API
Copy link
Preview

Copilot AI Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting InstanceUID to an empty string with a comment indicating it's not available could lead to issues if this field is required downstream. Consider either fetching this value properly or documenting the implications of leaving it empty.

Copilot uses AI. Check for mistakes.

EventType: graveler.EventTypeManualTrigger,
Repository: repoRecord,
SourceRef: graveler.Ref(ref),
BranchID: graveler.BranchID(ref), // For manual triggers, we use the ref as branch
Copy link
Preview

Copilot AI Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directly casting ref to BranchID assumes the ref is always a branch, but refs can also be tags or commit hashes. This could lead to incorrect behavior when triggering actions on non-branch refs. Consider validating that the ref is actually a branch or handling different ref types appropriately.

Copilot uses AI. Check for mistakes.

EventType: record.EventType,
BranchID: record.BranchID,
}
_, err := s.loadMatchedActions(ctx, record, spec)
Copy link
Preview

Copilot AI Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code calls loadMatchedActions twice - once for validation and again in asyncRun. This duplicates the action loading and filtering logic. Consider refactoring to load actions once and pass them to asyncRun, or cache the result to avoid redundant processing.

Copilot uses AI. Check for mistakes.

@skshetry skshetry force-pushed the feat/manual-trigger branch from f7cc263 to 4094bd1 Compare July 24, 2025 03:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Action new trigger-type: manual-trigger
4 participants