Add performance sampler to record cpu and memory usage during execution #178
Conversation
Codecov Report
Additional details and impacted files:

@@ Coverage Diff @@
## master #178 +/- ##
============================================
+ Coverage 75.29% 75.32% +0.02%
- Complexity 4585 4601 +16
============================================
Files 480 481 +1
Lines 15913 16003 +90
Branches 2181 2185 +4
============================================
+ Hits 11982 12054 +72
- Misses 3060 3074 +14
- Partials 871 875 +4

View full report in Codecov by Sentry.
Pull Request Overview
This PR adds built-in performance monitoring capabilities to Tai-e by introducing a new `PerformanceSampler` class that automatically collects CPU and memory usage metrics during execution and saves them to a JSON file for later analysis.

- Introduces a `PerformanceSampler` class that collects system and JVM performance metrics at regular intervals
- Adds a new `--performance-sampling` command-line option to enable performance monitoring
- Extracts version and commit information utilities from `RuntimeInfoLogger` to be reusable by the performance sampler
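To make the review concrete, here is a minimal sketch of what a periodic CPU/memory sampler built on the standard `java.lang.management` APIs can look like. All class, method, and field names here are illustrative assumptions; this is not the actual `PerformanceSampler` code from the PR.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of a periodic performance sampler; not Tai-e's
// actual PerformanceSampler implementation.
public class SamplerSketch {

    /** One data point: wall-clock time, heap usage, and process CPU load. */
    public record Sample(long timestampMillis, long heapUsedBytes, double processCpuLoad) {}

    private final List<Sample> samples = new ArrayList<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "perf-sampler");
                t.setDaemon(true); // daemon so sampling never blocks JVM shutdown
                return t;
            });

    /** Starts sampling at a fixed interval. */
    public void start(long intervalMillis) {
        scheduler.scheduleAtFixedRate(this::sampleOnce, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Collects a single sample from the JMX beans. */
    void sampleOnce() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        synchronized (samples) {
            samples.add(new Sample(System.currentTimeMillis(),
                    memory.getHeapMemoryUsage().getUsed(),
                    os.getProcessCpuLoad())); // may be negative until the JVM has a measurement
        }
    }

    /** Stops the scheduler and returns an immutable snapshot of all samples. */
    public List<Sample> stop() {
        scheduler.shutdown();
        synchronized (samples) {
            return List.copyOf(samples);
        }
    }
}
```

The `com.sun.management.OperatingSystemMXBean` cast is HotSpot-specific; a portable implementation would fall back to `getSystemLoadAverage()` when the extended bean is unavailable.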
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/main/java/pascal/taie/util/PerformanceSampler.java | Core performance sampling implementation with metrics collection and JSON output |
| src/main/java/pascal/taie/Main.java | Integration of the performance sampler with the main application lifecycle |
| src/main/java/pascal/taie/config/Options.java | Addition of the --performance-sampling command-line option |
| src/main/java/pascal/taie/util/RuntimeInfoLogger.java | Extraction of version/commit utilities for reuse by the performance sampler |
| src/test/java/pascal/taie/util/PerformanceSamplerTest.java | Unit and integration tests for performance sampling functionality |
| src/test/java/pascal/taie/analysis/pta/PointerAnalysisResultTest.java | Addition of copyright header (comment-only change) |
Comments suppressed due to low confidence (1)
src/test/java/pascal/taie/util/PerformanceSamplerTest.java:37
- The test doesn't verify the return value of `outputFile.delete()` or handle the case where deletion fails. This could lead to false test results if the file cannot be deleted before the test starts.
`outputFile.delete();`
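One way to address this comment is to switch from `File.delete()` to `java.nio.file.Files.deleteIfExists`, which throws an `IOException` on failure instead of silently returning `false`. The helper below is a hypothetical sketch, not code from the PR:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical test helper: unlike File.delete(), which silently
// returns false on failure, Files.deleteIfExists throws IOException,
// so a test aborts with a clear error rather than running against a
// stale output file.
public class TestFileCleanup {
    public static void deleteStaleOutput(Path outputFile) throws IOException {
        Files.deleteIfExists(outputFile); // no-op (returns false) if the file is absent
    }
}
```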
- Change Tai-e to read build information from `META-INF/tai-e-build.properties` instead of `META-INF/MANIFEST.MF`, which can be overwritten during repackaging
- Replace the dual-strategy reading (gradle.properties/.git for developers, MANIFEST.MF for users) with a unified approach that doesn't depend on source code structure
This looks cool! However, I have some concerns regarding the usefulness of this sampling. Much of it can be handled by the operating system or with a simple shell script. In my view, a more effective performance sampler for Tai-e should record CPU, memory, and other resource usage during specific phases, such as the front-end and each analysis. Regarding the sampler itself, I have a few additional points:
Thank you @silverbullettt very much for your time and constructive feedback. Please allow me to address your questions individually:
This PR originally stemmed from the need to monitor resource usage (particularly memory), rather than to provide statistics on Tai-e's various phases (which are already output through logging).
I completely agree that there are indeed multiple ways to perform performance sampling on Java programs (such as CLI tools like pidstat, GUI tools like IntelliJ Profiler, etc.). However, as I emphasized in the PR description "This PR adds built-in performance sampling", a built-in sampler provides significant advantages for developers:
This is primarily due to JSON's broader compatibility, as modern web browsers and JavaScript environments have native JSON parsing support that allows direct parsing without additional libraries, whereas YAML requires third-party libraries for parsing in JavaScript. Therefore, directly outputting JSON facilitates seamless web-based visualization without additional processing steps.
My initial consideration was that elapsed time could be calculated from the startTime and finishTime values. I believe your suggestion is excellent, and I would like to propose replacing finishTime with elapsedTime, while also converting the timestamps in samples to represent elapsed time since startTime. What are your thoughts on this approach?
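The proposed change could be sketched roughly as follows; the record and method names are hypothetical, not taken from the PR:

```java
import java.util.List;

// Hypothetical sketch of the proposal: store elapsedTime instead of
// finishTime, and convert each sample's absolute timestamp into an
// offset relative to startTime.
public class ElapsedTimes {
    public record Report(long elapsedTime, List<Long> sampleOffsets) {}

    public static Report toRelative(long startTime, long finishTime,
                                    List<Long> absoluteTimestamps) {
        List<Long> offsets = absoluteTimestamps.stream()
                .map(t -> t - startTime) // elapsed millis since startTime
                .toList();
        return new Report(finishTime - startTime, offsets);
    }
}
```

With this shape, `finishTime` remains recoverable as `startTime + elapsedTime`, so no information is lost.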
You raise an excellent point about obtaining CPU model information, which would indeed provide richer detail. However, obtaining the CPU model requires native OS interaction capabilities that the JVM does not expose directly through its APIs. An alternative approach would be to use a cross-platform third-party library (e.g., https://github.com/oshi/oshi), but this would introduce an additional dependency that may not be essential for Tai-e's core functionality. I appreciate your thoughtful review and look forward to your feedback on these points.
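For context, the standard `java.lang.management` API does expose the OS name, architecture, and logical core count without any extra dependency; it just does not expose the CPU model string, which is why a library such as OSHI would be needed for richer hardware details. A short illustrative sketch (not code from the PR):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// What the standard JVM APIs expose out of the box: OS name,
// architecture, logical core count -- but no CPU model string.
public class SystemInfoSketch {
    public static String describe() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getName() + " " + os.getArch()
                + ", " + os.getAvailableProcessors() + " logical cores";
    }
}
```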
@zhangt2333 I agree with all your points, including the benefits of the built-in sampler. My only concern is that since we're creating an internal performance sampler, why not leverage its intrusive characteristics to provide more valuable information? (P.S. the comment reads a bit AI-flavored, bro 😏)
Currently, monitoring Tai-e's resource usage (e.g., CPU and memory) requires external tools (e.g., OS Task Manager), and no performance data is archived after execution. This makes debugging and optimization across multiple runs inconvenient.
This PR adds built-in performance sampling that automatically collects key metrics (e.g., CPU and memory usage) during execution and saves structured data for future analysis and comparison.
When Tai-e runs with the new `--performance-sampling` option, it writes a JSON file, `tai-e-performance.json`, to the output directory. The file can be reviewed manually or visualized with external tools; for example, the visualization below was created with Claude AI in just a few minutes.
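The exact schema of `tai-e-performance.json` is not reproduced in this thread; a hypothetical shape, consistent with the fields discussed above (startTime, elapsed timestamps, CPU and heap usage), might look like:

```json
{
  "startTime": 1718000000000,
  "elapsedTime": 95000,
  "samples": [
    { "timestamp": 0,    "processCpuLoad": 0.42, "heapUsedBytes": 268435456 },
    { "timestamp": 1000, "processCpuLoad": 0.87, "heapUsedBytes": 402653184 }
  ]
}
```

All field names and values here are illustrative placeholders, not the PR's actual output format.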