
Add performance sampler to record cpu and memory usage during execution #178


Open · wants to merge 12 commits into master
Conversation

@zhangt2333 zhangt2333 (Member) commented Aug 2, 2025

Currently, monitoring Tai-e's resource usage (e.g., CPU and memory) requires external tools (e.g., OS Task Manager), and no performance data is archived after execution. This makes debugging and optimization across multiple runs inconvenient.

This PR adds built-in performance sampling that automatically collects key metrics (e.g., CPU and memory usage) during execution and saves structured data for future analysis and comparison.

When Tai-e runs with the new --performance-sampling option, it outputs a JSON file tai-e-performance.json to the output directory. Example:

{
  "version" : "0.5.2-SNAPSHOT",
  "commit" : "cf083e99336a837f3a3457614c4163fe7ae16f77",
  "operatingSystem" : "Windows 11 (amd64)",
  "javaRuntime" : "Eclipse Adoptium OpenJDK Runtime Environment 17.0.13+11",
  "username" : "admin",
  "cpuCores" : 32,
  "memoryMB" : 130767,
  "startTime" : 1754200073671,
  "finishTime" : 1754200175126,
  "samples" : [ {
    "timestamp" : 1754200073671,
    "processCpuUsage" : 0.036322855861237346,
    "systemCpuUsage" : 0.09348383513281133,
    "processMemoryUsedMB" : 223,
    "systemMemoryUsedMB" : 49495
  },...]
}

This file can be reviewed manually or visualized with external tools; for example, the chart below was produced with Claude AI in just a few minutes.

(image: visualization of the sampled CPU and memory usage)
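The sampling loop itself is not shown in this thread. As a rough sketch (class, record, and method names here are illustrative, not the PR's actual code), periodic collection of the metrics in the JSON above could be built on the JDK's `com.sun.management.OperatingSystemMXBean` and a scheduled executor:

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SamplerSketch {

    /** One sample, mirroring the fields shown in the example JSON. */
    public record Sample(long timestamp, double processCpuUsage,
                         double systemCpuUsage, long processMemoryUsedMB) {}

    public static Sample takeSample() {
        OperatingSystemMXBean os = (OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        Runtime rt = Runtime.getRuntime();
        long usedMB = (rt.totalMemory() - rt.freeMemory()) >> 20; // bytes -> MB
        return new Sample(System.currentTimeMillis(),
                os.getProcessCpuLoad(),  // may be negative before the first window
                os.getCpuLoad(),         // JDK 14+; getSystemCpuLoad() on older JDKs
                usedMB);
    }

    public static void main(String[] args) throws InterruptedException {
        List<Sample> samples = new ArrayList<>();
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        // Sample every 100 ms here; the real option presumably uses a longer interval.
        scheduler.scheduleAtFixedRate(
                () -> { synchronized (samples) { samples.add(takeSample()); } },
                0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(350);
        scheduler.shutdownNow();
        synchronized (samples) {
            System.out.println("collected " + samples.size() + " samples");
        }
    }
}
```

Serializing the sample list to the JSON layout above (e.g. with Jackson) is left out of this sketch.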

@codecov-commenter codecov-commenter commented Aug 2, 2025


Codecov Report

❌ Patch coverage is 78.12500% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.32%. Comparing base (a9c37ee) to head (e875f5d).

Files with missing lines Patch % Lines
...main/java/pascal/taie/util/PerformanceSampler.java 78.48% 12 Missing and 5 partials ⚠️
.../main/java/pascal/taie/util/RuntimeInfoLogger.java 60.00% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #178      +/-   ##
============================================
+ Coverage     75.29%   75.32%   +0.02%     
- Complexity     4585     4601      +16     
============================================
  Files           480      481       +1     
  Lines         15913    16003      +90     
  Branches       2181     2185       +4     
============================================
+ Hits          11982    12054      +72     
- Misses         3060     3074      +14     
- Partials        871      875       +4     


@zhangt2333 zhangt2333 requested a review from Copilot August 3, 2025 05:44
@zhangt2333 zhangt2333 requested a review from Copilot August 3, 2025 05:58

@Copilot Copilot AI left a comment
Pull Request Overview

This PR adds built-in performance monitoring capabilities to Tai-e by introducing a new PerformanceSampler class that automatically collects CPU and memory usage metrics during execution and saves them to a JSON file for later analysis.

  • Introduces a PerformanceSampler class that collects system and JVM performance metrics at regular intervals
  • Adds a new --performance-sampling command-line option to enable performance monitoring
  • Extracts version and commit information utilities from RuntimeInfoLogger to be reusable by the performance sampler

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/main/java/pascal/taie/util/PerformanceSampler.java Core performance sampling implementation with metrics collection and JSON output
src/main/java/pascal/taie/Main.java Integration of performance sampler with main application lifecycle
src/main/java/pascal/taie/config/Options.java Addition of --performance-sampling command-line option
src/main/java/pascal/taie/util/RuntimeInfoLogger.java Extraction of version/commit utilities for reuse by performance sampler
src/test/java/pascal/taie/util/PerformanceSamplerTest.java Unit and integration tests for performance sampling functionality
src/test/java/pascal/taie/analysis/pta/PointerAnalysisResultTest.java Addition of copyright header (comment-only change)
Comments suppressed due to low confidence (1)

src/test/java/pascal/taie/util/PerformanceSamplerTest.java:37

  • The test doesn't verify the return value of outputFile.delete() or handle the case where deletion fails. This could lead to false test results if the file cannot be deleted before the test starts.
        outputFile.delete();
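One way to make that precondition explicit (a sketch only; the helper below is not the PR's code) is to switch from `File.delete()` to `java.nio.file.Files.deleteIfExists`, which throws `IOException` on failure instead of returning `false` silently:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteCheck {

    /**
     * Removes a stale output file before a test run. Unlike File.delete(),
     * Files.deleteIfExists throws IOException when the file exists but cannot
     * be removed, so the failure surfaces immediately instead of silently
     * skewing the test result.
     */
    public static boolean removeIfPresent(Path outputFile) throws IOException {
        return Files.deleteIfExists(outputFile);
    }

    public static void main(String[] args) throws IOException {
        Path outputFile = Path.of("tai-e-performance.json");
        System.out.println("removed stale file: " + removeIfPresent(outputFile));
    }
}
```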

- Change Tai-e to read build information from "META-INF/tai-e-build.properties" instead of "META-INF/MANIFEST.MF", which can be overwritten during repackaging
- Replace the dual-strategy reading (gradle.properties/.git for developers, MANIFEST.MF for users) with a unified approach that does not depend on the source-code structure
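A minimal loader in the spirit of that commit might look as follows. Only the resource name comes from the commit message; the class, method, and property keys ("version", "commit") are assumptions for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class BuildInfo {

    /**
     * Loads build metadata from the classpath. Returns an empty Properties
     * object when the resource is absent, e.g. when running from an IDE
     * rather than a packaged jar.
     */
    public static Properties loadBuildProperties() {
        Properties props = new Properties();
        try (InputStream in = BuildInfo.class.getResourceAsStream(
                "/META-INF/tai-e-build.properties")) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException ignored) {
            // fall through: version/commit simply stay unknown
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = loadBuildProperties();
        System.out.println("version: " + props.getProperty("version", "unknown"));
        System.out.println("commit:  " + props.getProperty("commit", "unknown"));
    }
}
```

Unlike `MANIFEST.MF`, a dedicated properties file survives fat-jar repackaging, which is the motivation stated in the commit message.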
@silverbullettt silverbullettt (Contributor)
This looks cool! However, I have some concerns regarding the usefulness of this sampling. Much of it can be handled by the operating system or with a simple shell script. In my view, a more effective performance sampler for Tai-e should record CPU, memory, and other resource usages during specific phases, such as the front-end and each analysis.

Regarding the sampler itself, I have a few additional points:

  • Why not use YAML as the output format?
  • Why isn’t elapsed time included in the output?
  • Is the cpuCores information truly useful? I believe including CPU model information would be more beneficial.

@zhangt2333 zhangt2333 (Member, Author)

Thank you @silverbullettt very much for your time and constructive feedback.

Please allow me to address your questions individually:

In my view, a more effective performance sampler for Tai-e should record CPU, memory, and other resource usages during specific phases, such as the front-end and each analysis.

This PR originally stemmed from the need to monitor resource usage (particularly memory), rather than to provide statistics on Tai-e's various phases (which are already output through logging).

Much of it can be handled by the operating system or with a simple shell script.

I completely agree that there are indeed multiple ways to perform performance sampling on Java programs (such as CLI tools like pidstat, GUI tools like IntelliJ Profiler, etc.).

However, as I emphasized in the PR description ("This PR adds built-in performance sampling"), a built-in sampler provides significant advantages for developers:

  • Out-of-the-box functionality: Simply enable the Tai-e option to achieve sampling without additional configuration
  • Zero dependencies: No need to install or rely on any external tools
  • Reduced learning curve: Developers don't need to master the usage of various performance analysis tools
  • Cross-platform consistency: Eliminates tool differences and script compatibility issues across different operating systems
  • Unified output format: Automatically handles the format and storage location of result files, so users don't need to be aware of or burdened with these details

Why not use YAML as the output format?

This is primarily due to JSON's broader compatibility, as modern web browsers and JavaScript environments have native JSON parsing support that allows direct parsing without additional libraries, whereas YAML requires third-party libraries for parsing in JavaScript. Therefore, directly outputting JSON facilitates seamless web-based visualization without additional processing steps.

Why isn’t elapsed time included in the output?

My initial consideration was that elapsed time could be calculated from the startTime and finishTime values. I believe your suggestion is excellent, and I would like to propose replacing finishTime with elapsedTime, while also converting the timestamps in samples to represent elapsed time since startTime. What are your thoughts on this approach?

Is the cpuCores information truly useful? I believe including CPU model information would be more beneficial.

You raise an excellent point about obtaining CPU model information, which would indeed provide richer details compared to cpuCores, including information about CPU frequency and cache specifications.

However, obtaining CPU model information requires native OS interaction capabilities, which the JVM does not provide directly through its APIs. An alternative approach would be to utilize cross-platform third-party libraries (e.g., https://github.com/oshi/oshi), but this would introduce an additional dependency that may not be essential for Tai-e's core functionality.

The use of cpuCores is based on referencing the Gradle Report format (e.g., https://scans.gradle.com/s/i3dazlpreguec), and it can be easily obtained through the JDK. By comparing cpuCores with CPU usage samples, we can also observe Tai-e's multi-core utilization efficiency over time.
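The point that cpuCores is cheap to obtain, and useful when compared against the CPU samples, can be seen in a few lines of plain JDK code (illustrative, not the PR's implementation):

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

public class CpuCores {

    /** Logical core count, straight from the JDK with no external dependency. */
    public static int cpuCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = (OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        double load = os.getProcessCpuLoad(); // in [0, 1]; negative if unavailable
        System.out.println("cpuCores = " + cpuCores());
        // load * cores approximates the number of saturated cores, which is how
        // cpuCores lets readers judge multi-core utilization from the samples.
        System.out.println("approx busy cores = " + Math.max(0, load) * cpuCores());
    }
}
```

CPU model information, by contrast, has no JDK API and would indeed need a library such as OSHI, as discussed above.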


I appreciate your thoughtful review and look forward to your feedback on these points.

@silverbullettt silverbullettt (Contributor) commented Aug 14, 2025

@zhangt2333 I agree with all your points, including the benefits of the built-in sampler. My only concern is that since we're creating an internal performance sampler, why not leverage its intrusive characteristics to provide more valuable information?

(P.S. The comment reads a bit AI-flavored, bro 😏)
