Create model for async AQL tasks #79
Conversation
Also create migration for the new AqlQuery model.
Also fix some linting issues.
When a query task is first created, it won't have any results. The best representation for this is `null`, because a placeholder or default value like an empty dictionary could imply that the query ran and returned no results.
Creation of tasks representing mutating queries is allowed, but no results will be stored for them; they will finish with an error message from Arango.
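For reference, a minimal sketch of what such a model might look like, assuming Django's `JSONField`; the field and relation names here are illustrative, not necessarily what the migration in this PR creates:

```python
from django.db import models


class AqlQuery(models.Model):
    """Sketch of the async AQL query task model described above (names assumed)."""

    workspace = models.ForeignKey('Workspace', on_delete=models.CASCADE)  # assumed relation
    query = models.TextField()
    status = models.CharField(max_length=16, default='PENDING')  # assumed status representation
    # Null until the Celery task finishes; an empty dict or list would wrongly
    # suggest the query ran and returned nothing.
    results = models.JSONField(null=True, blank=True, default=None)
    # Populated with the Arango error when e.g. a mutating query is rejected.
    error_messages = models.JSONField(null=True, blank=True, default=None)
```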
We probably want a
Sounds good to me. How hard will it be to adjust these limits per, e.g., user in the future?
Is there a difficulty involved with removing it now? If you don't remove it now, please file an issue to remind us to do so in the future.
Why would you need pagination to view a single query?
You're absolutely right. We do actually get this for free, which is why I didn't mention it in my PR. I'll clarify that.
I don't imagine it would be too difficult, although I haven't spent any time looking into it. What would this look like? Would owners of workspaces be able to control query limits for readers/writers via some kind of setting? I can create an issue for this if you'd like.
I believe the current deployment of the
Maybe that was the wrong question. We currently limit query result sizes to 20MB. If a user runs a query and their result set is 20MB, is it okay to send the entire set of data in one request, or should there be a way to limit that?
No, we don't even need to file an issue yet. I was just idly thinking about it. 60 seconds should definitely cover us for now and if that becomes a hindrance we can revisit what to do here.
Oh yes that makes sense. I agree that the old endpoint shouldn't stop this one from moving forward.
Ah, I see. I think for right now, there's no need for special handling (because most typical requests we're working with won't send 20MB responses). More generally, I think maybe the query model could report how big the result is, and there could be a But this is a design discussion to have if and when we start to see unmanageably large query results in practice.
Overall looks great! Just a few suggestions, mostly around naming.
> For these queries, the time limit has been bumped to 60 seconds
👍
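For context, bumping a per-task limit like this is usually just a Celery option on the task; a minimal sketch, assuming the task is registered with `shared_task` (the task name `execute_aql_query` is hypothetical):

```python
from celery import shared_task


@shared_task(soft_time_limit=60)  # whether a soft or hard limit is used here is an assumption
def execute_aql_query(query_id: int) -> None:
    """Hypothetical task: run the stored AQL query and save its results."""
    ...
```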
> For now, it may be best to leave the `/workspaces/{workspace}/aql` endpoint intact, but it should be removed in the future in favor of these new endpoints. The old endpoint is still used by our client applications. Once those are updated, there will be no need for the original endpoint.
As long as we track this with an issue as @waxlamp suggested, I'm okay with it for the time being.
> Should requests to the `GET` endpoint add some kind of pagination for the results?
This is a good question. I'm kind of inclined to think that we shouldn't include the `results` field in `AqlQueryTaskSerializer`, and should instead have a `/results` endpoint (I believe @waxlamp also suggested this), which would return the paginated JSON.
My reasoning for this is that the `results` field can be large in many cases, and if you just want to retrieve the status of a task, you might not want all that data. If we kept the `results` field where it is now and did want to support pagination, we'd need to do it in some nonstandard way; whereas if we separated results out as suggested above, we could use the same pagination we use everywhere else.
On the other hand, I'm not sure if pagination is necessary yet. We could just keep that at the back of our minds, and add in the result separation with pagination at a later point, if it seems necessary.
This is a +1 for a separate `/results` endpoint. (I had been worried that AQL queries can return any JSON value, but it turns out that they do indeed always return lists: https://www.arangodb.com/docs/stable/aql/fundamentals-query-results.html.)
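If we do split results out, a rough sketch of the shape in DRF; the viewset name, base class, and pagination setup are assumptions, not the code in this PR:

```python
from rest_framework import viewsets
from rest_framework.decorators import action
from rest_framework.response import Response


class AqlQueryViewSet(viewsets.ReadOnlyModelViewSet):  # assumed base class
    # queryset / serializer_class omitted; only the results action is sketched.

    @action(detail=True, url_path='results')
    def results(self, request, pk=None):
        query = self.get_object()
        # AQL results are always a list, so standard list pagination applies.
        page = self.paginate_queryset(query.results or [])
        if page is None:  # no paginator configured
            return Response(query.results or [])
        return self.get_paginated_response(page)
```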
Also remove custom exception handling from that task, and instead leverage MultinetCeleryTask to handle any exceptions during task execution.
@AlmightyYakob I've addressed the comments on your initial review. I've also implemented a `MultinetCeleryTask` to handle any exceptions during task execution.
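For anyone unfamiliar with the pattern, a custom base task like `MultinetCeleryTask` typically hooks Celery's `on_failure`; a hedged sketch only, since what the real class records is an assumption:

```python
from celery import Task


class MultinetCeleryTask(Task):
    """Illustration of the base-task pattern; the project's actual class may do more."""

    def on_failure(self, exc, task_id, args, kwargs, einfo):
        # Centralize error handling here instead of try/except in each task body:
        # e.g. look up the associated AqlQuery and persist str(exc) as its error.
        query_id = kwargs.get('query_id') or (args[0] if args else None)
        ...
```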
LGTM, thanks!
Closes #74
This PR creates a Celery task that allows for asynchronous AQL query execution. It introduces new endpoints:
- `POST /api/workspaces/{workspace}/queries/`: Expects a `query` as part of the request. Creates an `AqlQuery` task and starts it. Returns the task object.
- `GET /api/workspaces/{workspace}/queries/{query_id}/`: Returns the `AqlQuery` object represented by the `query_id` in the URI. This can be used to view the status of the task, query results, and any error messages.
- `GET /api/workspaces/{workspace}/queries/`: Returns all AQL query objects for a given workspace.
- `GET /api/workspaces/{workspace}/queries/{query_id}/results/`: Returns the results of a given AQL query.

Reviewers, please give opinions on the following:

- For now, it may be best to leave the `/workspaces/{workspace}/aql` endpoint intact, but it should be removed in the future in favor of these new endpoints. The old endpoint is still used by our client applications. Once those are updated, there will be no need for the original endpoint.
- Should requests to the `GET` endpoint add some kind of pagination for the results?
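To make the async flow concrete, here is a hypothetical client session against these endpoints; the host, workspace name, status values, and payload shape are all assumptions:

```python
import time

import requests

API = 'https://multinet.example.com/api/workspaces/my-workspace'  # assumed host and workspace

# Kick off an asynchronous AQL query; the task object comes back immediately.
task = requests.post(f'{API}/queries/', json={'query': 'FOR doc IN members RETURN doc'}).json()

# Poll the task until it leaves its in-progress state (status names are assumptions).
while requests.get(f'{API}/queries/{task["id"]}/').json().get('status') not in ('FINISHED', 'ERROR'):
    time.sleep(1)

# Fetch the (potentially paginated) results from the separate endpoint.
results = requests.get(f'{API}/queries/{task["id"]}/results/').json()
print(results)
```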