Skip to content

Conversation

renechoi
Copy link
Contributor

What is this PR for?

This PR improves process resource management in the detectSparkScalaVersion method of SparkInterpreterLauncher.java to prevent process hangs and resource leaks.

Current Issues Fixed:

  1. Buffer Overflow Risk: Process stdout was not consumed, potentially causing hangs if spark-submit produces output
  2. Indefinite Blocking: No timeout on process.waitFor() could cause indefinite blocking
  3. Missing Process Cleanup: Process wasn't explicitly destroyed after use
  4. No Exit Value Validation: Process exit status wasn't checked

What type of PR is it?

Improvement

Todos

  • - Code review
  • - CI build verification

What is the Jira issue?

How should this be tested?

  • Unit test added: testDetectSparkScalaVersionProcessManagement - Verifies the method works correctly with new process management

  • Manual testing:

    • Start Zeppelin with Spark interpreter
    • Verify spark-submit processes are properly terminated
    • No zombie processes should remain after interpreter initialization
  • CI: All existing tests pass

Screenshots (if appropriate)

N/A

Questions:

  • Does the license files need to update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

Implementation Details

  • Added stdout consumption using IOUtils.copy() to NullOutputStream to prevent buffer overflow
  • Implemented 30-second timeout with process.waitFor(30, TimeUnit.SECONDS)
  • Added forceful process termination on timeout
  • Added exit value logging for better diagnostics
  • Ensured process cleanup in finally block with destroyForcibly()
  • Moved file reading into try-finally block for better resource management

Benefits

  1. Prevents Hangs: Stdout consumption prevents buffer-related process hangs
  2. Guaranteed Termination: 30-second timeout ensures process won't block indefinitely
  3. Better Diagnostics: Exit value logging helps identify spark-submit failures
  4. Resource Safety: Explicit process cleanup prevents zombie processes

…erLauncher

- Added stdout consumption to prevent buffer overflow
- Implemented process timeout (30 seconds) with forceful termination
- Added exit value validation and logging
- Ensured process cleanup in finally block
- Maintained backward compatibility

This prevents process hangs and ensures proper resource cleanup.
@ParkGyeongTae
Copy link
Contributor

Could you please rebase this branch onto master?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants