-
Notifications
You must be signed in to change notification settings - Fork 132
UDF error handling improvements #1464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves UDF error handling by removing the intermediate process_safe wrapper and allowing raw exceptions from UDFs to propagate to the parent process, while adding proper stacktrace support for parallel execution.
Key Changes
- Introduced
UdfRunErrorexception class to wrap errors from parallel UDF execution with stacktrace information - Removed the
process_safemethod that was wrapping exceptions inDataChainError, allowing raw exceptions to propagate - Updated exception handling in parallel workers to capture and propagate stacktraces
- Updated tests to expect the actual exception types raised by UDFs instead of
DataChainError
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/datachain/lib/udf.py | Introduced UdfRunError class, removed process_safe wrapper method and unused imports, replaced all process_safe calls with direct process calls |
| src/datachain/query/dispatch.py | Added stacktrace capture in worker error handling and UdfRunError wrapping for parallel execution failures |
| tests/unit/lib/test_checkpoints.py | Updated tests to expect CustomMapperError instead of DataChainError, added custom exception class for testing |
| tests/func/test_udf.py | Updated test to expect RuntimeError instead of DataChainError for UDF execution errors |
| tests/func/test_to_database.py | Updated test to expect ValueError instead of DataChainError for processing failures |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
f166140 to
0f85aaa
Compare
6a51332 to
ba760ea
Compare
dreadatour
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few highlights for the changes in this PR.
| return self.__class__, (self.message,) | ||
|
|
||
|
|
||
| class UdfRunError(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new exception type that is caught by the parent process, capturing both the traceback and the UDF name for display in the UI.
| def process_safe(self, obj_rows): | ||
| try: | ||
| result_objs = self.process(*obj_rows) | ||
| except Exception as e: # noqa: BLE001 | ||
| msg = f"============== Error in user code: '{self.name}' ==============" | ||
| print(msg) | ||
| exc_type, exc_value, exc_traceback = sys.exc_info() | ||
| traceback.print_exception(exc_type, exc_value, exc_traceback.tb_next) | ||
| print("=" * len(msg)) | ||
| raise DataChainError( | ||
| f"Error in user code in class '{self.name}': {e!s}" | ||
| ) from None | ||
| return result_objs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to catch and log this here — the parent process now catches the exception and handles logging and user-facing output.
| { | ||
| "status": FAILED_STATUS, | ||
| "exception": e, | ||
| "stacktrace": traceback.format_exc(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Send also the traceback.
| raise exc | ||
| if isinstance(exc, KeyboardInterrupt): | ||
| raise exc | ||
| raise UdfRunError(exc, stacktrace=result.get("stacktrace")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UdfRunError wrapper for exc exception is needed to preserve the traceback.
|
|
||
| # The export should fail during processing, after table creation | ||
| with pytest.raises(dc.DataChainError, match="Processing failed on third item"): | ||
| with pytest.raises(ValueError, match="Processing failed on third item"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direct exception is now raised.
| class CustomMapperError(Exception): | ||
| pass | ||
|
|
||
|
|
||
| def mapper_fail(num: int) -> int: | ||
| raise CustomMapperError("Error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This to check if we are raising and catching any exceptions as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
| ref: ${{ env.STUDIO_BRANCH }} | ||
| token: ${{ secrets.ITERATIVE_STUDIO_READ_ACCESS_TOKEN }} | ||
|
|
||
| - uses: astral-sh/setup-uv@v7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using this trick to test datachain and studio related changes from the git branch in case both repos have branches with the same name.
- Here, in datachain, we are already checking out changes from the desired branch.
- Next, we do check if Studio have the branch with the same name — we will checkout the same branch from Studio.
- But Studio itself have datachain as a dependency, and we need to make sure it will use the same datachain branch as in this PR.
Before Studio tests in datachain repo was broken sometimes because of this — rarely, only if API changes, but still. Same happens in this PR.
This change is not related directly to this PR, let me separate it 👍
Improve UDF error handling: return raw error from UDF to handle it properly in the parent process.