Unable to send unhandled exceptions to Sentry from PySpark jobs #1228

Open
@fernanvarelamews

Description

Hi, I followed this documentation (https://docs.sentry.io/platforms/python/guides/pyspark/) and I was not able to send unhandled exceptions to Sentry.

I was only able to do it using the capture_event and capture_exception methods, but for that I have to catch the exception myself first.
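
For reference, the capture_exception route I mean is essentially this minimal sketch (my_function is a placeholder):

import sentry_sdk

try:
    my_function(*args, **kwargs)
except Exception:
    # capture_exception picks up the active exception from sys.exc_info()
    sentry_sdk.capture_exception()
    raise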

Also, implementing the worker daemon as described in the documentation seems to have no effect.
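
The worker-side daemon setup from that guide looks roughly like the following (SENTRY_DSN is a placeholder): a sentry_daemon.py module that initializes the SDK in each worker process and then defers to PySpark's original daemon:

import sentry_sdk
from sentry_sdk.integrations.spark import SparkWorkerIntegration
import pyspark.daemon as original_daemon

if __name__ == "__main__":
    # Initialize the SDK inside every Python worker, then hand control
    # back to PySpark's normal daemon loop.
    sentry_sdk.init("SENTRY_DSN", integrations=[SparkWorkerIntegration()])
    original_daemon.manager()

The job is then submitted with the daemon module swapped in, e.g. (my_job.py is a placeholder):

spark-submit --conf spark.python.use.daemon=true --conf spark.python.daemon.module=sentry_daemon my_job.py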

The Python code below works as expected: the exception "Sentry Poc Test" is unhandled and is sent to Sentry when I run it from a local Windows environment.

import sentry_sdk
from sentry_sdk.integrations.spark import SparkIntegration

if __name__ == "__main__":
    sentry_sdk.init("SENTRY_DSN",
                    integrations=[SparkIntegration()])

    raise Exception("Sentry Poc Test")

For PySpark, the only way I found is to handle the exception myself. This is deployed to the cluster using dbx, and the sentry-sdk dependencies are correctly deployed:

import sentry_sdk
from sentry_sdk.utils import exc_info_from_error, exceptions_from_error_tuple

try:
    my_function(*args, **kwargs)
except Exception as err:
    message = "An exception occurred"
    with sentry_sdk.push_scope() as scope:
        # Build the Sentry exception payload from the caught error.
        exc_info = exc_info_from_error(err)
        exceptions = exceptions_from_error_tuple(exc_info)
        scope.set_extra("exception", exceptions)

        sentry_sdk.capture_event({
            "message": message,
            "level": "error",
            "exception": {
                "values": exceptions
            },
        })
    raise
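
A variant of the above, in case the problem is the driver process exiting before the SDK's background worker drains its queue (just a guess on my side), blocks on sentry_sdk.flush before re-raising:

import sentry_sdk

try:
    my_function(*args, **kwargs)
except Exception:
    sentry_sdk.capture_exception()
    # Wait (up to 5 s) for queued events to be sent, so nothing is
    # lost if the driver shuts down right after the exception.
    sentry_sdk.flush(timeout=5.0)
    raise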

Am I missing something? Could you please advise me on how to implement this so that unhandled exceptions from PySpark jobs are sent to Sentry? Thanks
