Batch: Process-parallel directory scan and initial file read #426
Open
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/426/index.html
Also adds small draft test for expression pickling.
Ok, this has now been rebased onto the latest main, and some of the performance worries observed before have disappeared. An updated breakdown of the timings can be found below. I'm aware this is only the first step towards some further consolidation of parallel / queuing utilities in the loki package, but it's a functional start that can already bring some real-world benefit. I would appreciate some feedback on the general design, so paging Dr. @reuterbal for full review.
Initial implementation of simple parallelism in the batch-processing scheduler. This PR refactors the two "trivially parallel" steps of the Scheduler initialisation (scanning the source directories and parsing the full `Sourcefile` into IR) using Python's builtin `ProcessPoolExecutor`. It also adds a dummy `SerialExecutor` implementation that exposes the same interface, but works serially on the original process, thus keeping existing functionality available.
In a little more detail:
- [Separate] `Sourcefile` creation from Item creation and remove `get_or_create_file_item_from_path`
- [Add an] `executor` object to the Scheduler that falls back to the provided `SerialExecutor` if `num_workers=0` is selected, and otherwise creates a `ProcessPoolExecutor(max_workers=num_workers)`.
- [Run] `Sourcefile.from_path` on the executor by passing assembled `frontend_args`
- [Run the full parse] (`Sourcefile.make_complete`) using the `executor.map` functionality over source objects and parser-args, before updating the returned copy of source on the corresponding `Item`
- [Pickling support for] `sym.Cast` objects and `Module` AST objects
Performance
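As context for the comparison, the executor selection described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `SerialExecutor` shown here is a hypothetical stand-in built on the standard `concurrent.futures.Executor` base class, and `make_executor`, `parse_one` and the `NUM_WORKERS` environment variable are made-up names used only for the example.

```python
import os
from concurrent.futures import Executor, Future, ProcessPoolExecutor


class SerialExecutor(Executor):
    """Hypothetical stand-in: runs each task immediately in the calling process."""

    def submit(self, fn, /, *args, **kwargs):
        # Execute right away and wrap the outcome in an already-completed Future,
        # so that the inherited Executor.map() works unchanged.
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def make_executor(num_workers=0):
    """Hypothetical helper: serial fallback for num_workers=0, process pool otherwise."""
    if num_workers == 0:
        return SerialExecutor()
    return ProcessPoolExecutor(max_workers=num_workers)


def parse_one(path, frontend_args):
    """Placeholder for the per-file work (not Loki's parser).

    It is a module-level function so that a process pool can pickle it.
    """
    return f"{path}:{frontend_args['frontend']}"


if __name__ == "__main__":
    # NUM_WORKERS is an assumed variable name for this demo; the default is serial.
    num_workers = int(os.environ.get("NUM_WORKERS", "0"))
    paths = ["a.F90", "b.F90", "c.F90"]
    args = [{"frontend": "FP"}] * len(paths)
    with make_executor(num_workers) as executor:
        # Same call for both executor types; results come back in input order.
        results = list(executor.map(parse_one, paths, args))
    print(num_workers, results)
```

Because both executors share the `map` interface, the calling code does not need to branch on the worker count. With a process pool, arguments and return values are pickled and sent between processes, which is where the serialisation overhead discussed below comes from.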
To test performance, I've mimicked the H24-dev plan generation (without explicitly provided header paths), but locally enabled full source parses in the plan step. When adding the new `ProcessPoolExecutor` but keeping the number of processes low, we can see a significant overhead from the process-pipe-and-serialisation mechanics; by increasing the number of processes somewhat, we can still get to a reasonable quality-of-life improvement.
Sequential, equivalent to previous:
Sequential, but through `ProcessPoolExecutor`:
And with 12 build processes: