Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Android Startup Perfetto trace support #4776

Merged
merged 7 commits into from
Mar 25, 2025

Conversation

LoopedBard3
Copy link
Member

Add support for getting perfetto traces for the android runs. This will start out as a primarily manual thing to help users of our scenario testing when doing manual runs, or manually setup pipeline runs.

@LoopedBard3 LoopedBard3 added the enhancement New feature or request label Mar 18, 2025
@LoopedBard3 LoopedBard3 self-assigned this Mar 18, 2025
LoopedBard3 and others added 2 commits March 18, 2025 14:41
…testing. Still need to add copy of the file to the upload dir if running in helix.
@LoopedBard3 LoopedBard3 force-pushed the startup_perfetto_support branch from d340e3b to fc53ef5 Compare March 18, 2025 21:41
@LoopedBard3 LoopedBard3 marked this pull request as ready for review March 18, 2025 23:22
@LoopedBard3 LoopedBard3 requested a review from matouskozak March 18, 2025 23:22
@LoopedBard3 LoopedBard3 marked this pull request as draft March 18, 2025 23:23
@LoopedBard3 LoopedBard3 removed the request for review from matouskozak March 18, 2025 23:23
@LoopedBard3 LoopedBard3 marked this pull request as ready for review March 18, 2025 23:46
@LoopedBard3
Copy link
Member Author

Tested successfully locally and initial use is just in manual runs.

# Wait until the total time taken to start the app is greater than the max startup time + 3
# This is to ensure that the trace has captured the entire startup process
getLogger().info("Ensuring the trace capture has been completed.")
while time.time() - perfetto_start_time_sec < max_startup_time_sec + 3:
Copy link
Contributor

@e-kharion e-kharion Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where the 3 seconds come from?
Would it make a sense to try verify the trace capturing finished somehow?
This way it feels potentially fragile

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely moved away from using the seconds and instead switched to killing the tracing after we finish the startup command. This background start and then kill seems to be the recommended approach after further digging in the docs: https://perfetto.dev/docs/concepts/config#android.

pull_trace_cmd = xharness_adb() + [
'pull',
perfetto_device_save_file,
os.path.join(os.getcwd(), const.TRACEDIR, f'perfetto_startup_trace_{self.packagename}_{time.time()}.trace')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would it make sense to use formatted time instead of a number here? Would somebody be checking the files manually?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would somebody be checking the files manually?

The idea is that users would upload the captured trace to https://ui.perfetto.dev and analyzed the results there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Does the filename format matter anyhow for the tool? If not (meaning the filename would not get processed by a machine) than some more human friendly time format make more sense to me. Still a nit, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the filename format matter anyhow for the tool?

Unless we do some very weird filename it shouldn't matter. Human friendly file name would be nice :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea, I have made the change to instead have a nicely formatted datetime instead. It will be in my next update.


perfetto_cmd = xharness_adb() + [
'shell',
f'perfetto --background --txt -o {perfetto_device_save_file} --time {max_startup_time_sec + 3}s -b 64mb sched freq idle am wm gfx view binder_driver hal dalvik camera input res memory'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where the +3 comes from?

I have no idea about the usual "magnitude" of the max_startup_time_sec - but if it's just some kind of reserve, it might be better to add some percentage to the original value (in case the value might vary significantly)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally want to run the tracing for the startup duration, that is what is calculated in:
max_startup_time_sec = int(max(int(re.search(r"TotalTime: (\d+)", str(result)).group(1)) for result in allResults) / 1000).

+3s is probably just to introduce some slack if the startup would take unexpectedly longer.

I would maybe add the max_startup_time_sec + 3 as a separate variable value with the comment from below why we use + 3 to make it a bit clearer.

I have no idea about the usual "magnitude" of the max_startup_time_sec

In scenarios we are measuring the startup ranges from hundreds of milliseconds to low seconds. Making it, for example, 1.5x of the max startup should be good enough as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely moved away from using the seconds and instead switched to killing the tracing after we finish the startup command. This background start and then kill seems to be the recommended approach after further digging in the docs: https://perfetto.dev/docs/concepts/config#android. Still kept a max time for the tracing at 2x the max found startup, up to 60 seconds. At 60 seconds it throws an error as an arbitrary but, I think reasonable, cut off.

e-kharion
e-kharion previously approved these changes Mar 19, 2025
Copy link
Contributor

@e-kharion e-kharion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general, left few comments/questions


try:
# Get the current value of persist.traced.enable
getLogger().info("Getting current persist.traced.enable value")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: I know you follow the pattern here, but I feel that decomposing the code a bit would improve the readability/maintainability significantly. For example separating all the commands into some helper commands library class and only call those here would make the code much nicer in my opinion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this assessment, so I made a task here: #4784.

matouskozak
matouskozak previously approved these changes Mar 19, 2025
Copy link
Member

@matouskozak matouskozak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot Parker!

…time human readable, and switching from catching the trace by taking the max startup and adding 3 seconds to instead just killing perfetto once startup testing is complete.
@LoopedBard3 LoopedBard3 dismissed stale reviews from matouskozak and e-kharion via d77604a March 20, 2025 22:03

# Stop perfetto now that the app has started. Sending a Terminate signal should be enough per the longer trace capturing guidance here: https://perfetto.dev/docs/concepts/config#android.
getLogger().info("Stopping perfetto trace capture")
stop_perfetto_cmd = xharness_adb() + [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the perfetto killall also be in the finally block?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, it might make sense to run a precautionary perfetto killall before running the kill command at the top. I know a couple times we have been bitten by processes not being killed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a kill before the execution is a great idea. I have updated the code per these comments. I included a perfetto kill in the finally block, though am not super worried about it as the time parameter in the call starting the trace should ensure it ends eventually.

…art and to kill as a finally command if we failed before the normal kill spot.
@LoopedBard3 LoopedBard3 merged commit 56bd5d9 into dotnet:main Mar 25, 2025
77 of 84 checks passed
@LoopedBard3 LoopedBard3 deleted the startup_perfetto_support branch March 25, 2025 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants