Add scripts/plot.py for comparing latencies, etc. #3327

Closed
wants to merge 4 commits into from

Conversation

jberryman
Collaborator

This can be used to visualize performance changes between two or more
branches, across several categories (e.g. requests-per-second load).

While not perfect, I think the output goes a long way toward helping the
user absorb a lot of information about the data while avoiding wrong
inferences, e.g. paying too much attention to the mean, or forgetting how
noisy the data is.

This will hopefully be integrated into some benchmarking scripts, and improved, probably in collaboration with Nizar's work in e.g.

tirumaraiselvan#70 , and...
#3310 (review)

Wanted to share this ASAP though.

Steps to test and verify

See README.md

Example

The graph offers:

  • easy comparison, across categories
  • a very granular view of the actual samples, while the violin plot (like a histogram) shows where points are densest, which is more expressive than just the mean or median; the raw points also help validate whether comparing percentiles is meaningful (although we should do this better)
  • useful percentiles
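
As a rough illustration of the "useful percentiles" point, here is a minimal sketch of summarizing each (category, variant) group using only the standard library. It is not taken from scripts/plot.py; the `summarize` helper, the choice of percentiles, and the sample data are all assumptions made for the example.

```python
import statistics


def summarize(samples):
    """Return min, median, p95, p99 and max for one group of latency samples."""
    ordered = sorted(samples)
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": cuts[94],
        "p99": cuts[98],
        "max": ordered[-1],
    }


# Hypothetical data: latencies in milliseconds keyed by (category, variant).
data = {
    ("100 req/s", "master"): [1.0, 1.2, 1.1, 1.3, 5.0, 1.2, 1.1, 1.4],
    ("100 req/s", "my-branch"): [0.9, 1.0, 1.1, 1.0, 4.2, 1.0, 0.9, 1.2],
}

summary = {key: summarize(values) for key, values in data.items()}
```

Note how a single outlier (5.0 ms) pulls the high percentiles far from the median with so few samples, which is one reason to look at the raw points before trusting a percentile comparison.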

The tentative plan is to use this to plot internal service timing, and combine it with some hdr histogram timing from wrk2 (the client), so we can see where throughput is actually making things fall over.

plot

@jberryman
Collaborator Author

@nizar-m let me know what you think of this.

@netlify

netlify bot commented Nov 10, 2019

Deploy preview for hasura-docs ready!

Built with commit 283dec9

https://deploy-preview-3327--hasura-docs.netlify.com

@nizar-m
Contributor

nizar-m commented Nov 11, 2019

Definitely very interesting plots. Why the Indian oil lamp like plots for latencies, I have no idea.

We should have this for our latency plots.

@jberryman
Collaborator Author

> Why the Indian oil lamp like plots for latencies, I have no idea.

These are "violin plots" (maybe they have other names too). Technically speaking they graph a kernel density estimate (https://en.wikipedia.org/wiki/Kernel_density_estimation), but you can think of them as a continuous histogram: places where the violin is very thick are areas that are very dense with samples. In the plot above these are, unsurprisingly, right around the median, which is represented by the thin black line.
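
To make the "continuous histogram" idea concrete, here is a minimal sketch of a Gaussian kernel density estimate in plain Python. It is not taken from scripts/plot.py; the function name, the sample data, and the fixed bandwidth are assumptions for illustration (plotting libraries usually pick a bandwidth automatically).

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    n = len(samples)
    # Normalizing constant so the estimated density integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump centered on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical latencies in milliseconds, clustered near 1.2 with one outlier.
latencies_ms = [1.0, 1.1, 1.1, 1.2, 1.2, 1.2, 1.3, 1.4, 5.0]
density = gaussian_kde(latencies_ms, bandwidth=0.1)

dense = density(1.2)   # near the median: many samples nearby, "thick" violin
sparse = density(3.0)  # between the cluster and the outlier: "thin" violin
```

The width of the violin at a given latency is proportional to this density, so `dense` being much larger than `sparse` corresponds to the violin being thick around the median and nearly a line elsewhere.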

- no need to add category and variant labels on command line
- produce an html table of median and min values, with highlighting
  based on pct difference
- Flag --baseline-variant affects above
- color sample dots to highlight drift
- put output files in input data directory
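
For readers wondering what a median/min table highlighted by percent difference might look like, here is a minimal sketch. It is an assumption about the general approach, not the code in this PR: `render_table`, the 5% threshold, the colors, and the sample data are all hypothetical, and the `baseline_variant` argument only mimics what the `--baseline-variant` flag is described as influencing.

```python
import statistics
from html import escape


def pct_diff(value, baseline):
    """Percent difference of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


def render_table(samples_by_variant, baseline_variant, threshold_pct=5.0):
    """Render an HTML table of median and min per variant, colored by pct diff."""
    base = samples_by_variant[baseline_variant]
    base_median, base_min = statistics.median(base), min(base)
    rows = []
    for variant, samples in samples_by_variant.items():
        cells = []
        stats = (
            (statistics.median(samples), base_median),
            (min(samples), base_min),
        )
        for value, ref in stats:
            diff = pct_diff(value, ref)
            if diff > threshold_pct:
                color = "#fdd"  # slower than baseline
            elif diff < -threshold_pct:
                color = "#dfd"  # faster than baseline
            else:
                color = "#fff"  # within the noise threshold
            cells.append(
                f'<td style="background:{color}">{value:.2f} ({diff:+.1f}%)</td>'
            )
        row_cells = "".join(cells)
        rows.append(f"<tr><th>{escape(variant)}</th>{row_cells}</tr>")
    header = "<tr><th>variant</th><th>median</th><th>min</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"


# Hypothetical latencies in milliseconds for two branches.
html_table = render_table(
    {"master": [10.0, 12.0, 11.0], "my-branch": [8.0, 9.0, 10.0]},
    baseline_variant="master",
)
```

Comparing min as well as median is useful because min is less sensitive to noise from the environment, while median reflects the typical request.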
@jberryman
Collaborator Author

Closing this for now, as I think it's going to live in its own repo.

@jberryman jberryman closed this Dec 3, 2019
@hasura-bot
Contributor

Review app https://hge-ci-pull-3327.herokuapp.com is deleted

@jberryman
Collaborator Author

Development continues here for this: https://github.com/hasura/latency-plotting

3 participants