Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include R errors in the log files. #18

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open

Include R errors in the log files. #18

wants to merge 21 commits into from

Conversation

plietar
Copy link
Member

@plietar plietar commented Feb 4, 2025

Currently, when an R error is thrown by a report, it is caught by the rrq worker and is stored in Redis, but it is not exposed over the runner API anywhere.

Rather than introduce yet another field in API, we print the error from the worker process, which will make it visible at the end of the tasks log file.

plietar and others added 12 commits January 14, 2025 18:22
The API tests used to setup individual endpoints to run the tests. That
is a bit verbose and duplicates a bunch of code from the `api.R` file.

This will be made much worse when we introduce `url` parameters to all
the endpoints, since all the tests will need to make a API call to fetch
the repository first, which will lead to more boilerplate.

Using the api object solves this issue.
In the previous design, outpack_server, the runner API and the workers
all shared a single outpack directory, and a Git repository (mostly, the
workers used a clone of the shared repository for the actual execution).

This creates a very tight and brittle coupling between all the
components. It makes it impossible to deploy the different components on
separate machines. It requires careful reasoning about data races and
conflicts between the different bits. It prevents us from sharing worker
processes across multiple Packit instances, and it prevents us from
using multiple Git repositories within a single instance.

The new design completely splits up the storage.
- The API server and each worker have their own local Git clones of the
  repositories, that are directly pulled from the upstream (eg. GitHub).
- The API servers and workers store bare Git clones of the repositories,
  without any worktree. When running a report, workers create a new
  worktree in a temporary directory, run the report and delete the
  worktree. This ensures a completely clean slate every time.
- The workers use their own outpack store, that is not shared with any
  other process.
- The workers can pull and push packets using any protocol supported by
  orderly2. In practice, we will be using HTTP to interact with the
  outpack_server used by Packit.

Currently, the workers create a new outpack store for each run, meaning
they do not cache any of the packet dependencies and need to download
them from the outpack_server from scratch every time. Given that, at
least for now, workers and outpack_server will be operating on the same
or nearby machines, this seems like a reasonable overhead.

Ideally we would keep a per-worker cache, however we need to be careful
not to mix packets between different instances. One possible approach
may be to re-use the file store, but start from an empty metadata store
everytime. This way large unnecessary file downloads are avoided, while
preserving some degree of isolation between runs and instances.
@plietar plietar changed the title Mrc 6152 Include R errors in the log files. Feb 10, 2025
@plietar plietar requested a review from M-Kusumgar February 10, 2025 16:35
@plietar plietar marked this pull request as ready for review February 10, 2025 16:58
Currently, when an R error is thrown by a report, it is caught by the
rrq worker and is stored in Redis, but it is not exposed over the runner
API anywhere.

Rather than introduce yet another field in API, we print the error from
the worker process, which will make it visible at the end of the tasks
log file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants