add benchmarks of Graph using asv #797
base: main
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #797     +/-   ##
=========================================
  Coverage    85.4%    85.4%
=========================================
  Files         150      150
  Lines       15971    15951     -20
=========================================
- Hits        13632    13617     -15
+ Misses       2339     2334      -5
Would it be possible/beneficial to add memory profiling as well?
Possibly, though I am probably not the one who's going to write it.
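For reference, asv supports memory benchmarks alongside timing ones: methods prefixed with `peakmem_` report the peak resident memory of the call, and `mem_` methods report the size of the returned object. A minimal sketch, assuming a `Graph.from_arrays` constructor and a synthetic adjacency (both are assumptions for illustration, not part of this PR):

```python
import numpy as np
from libpysal import graph


class MemGraph:
    def setup(self):
        # hypothetical synthetic adjacency: each of n nodes linked to its next k ids
        n, k = 100_000, 8
        self.heads = np.repeat(np.arange(n), k)
        self.tails = (self.heads + np.tile(np.arange(1, k + 1), n)) % n
        self.weights = np.ones(n * k)

    def peakmem_graph_creation(self):
        # asv records the peak memory reached while this method runs
        graph.Graph.from_arrays(self.heads, self.tails, self.weights)

    def mem_graph(self):
        # asv records the size of the object returned here
        return graph.Graph.from_arrays(self.heads, self.tails, self.weights)
```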
I think we need to test lags and such. The graph ordering is critical to the sparse representation/alignment, and we need to maintain its integrity for all the linear algebra. That is, if we want to test indexing, we need to test not only the creation of sparse but also the math that goes through it.
Why? Everything beyond sparse creation is out of our reach.
I mean, correctness is covered by tests, if that's what you mean. And performance on top of sparse has nothing to do with Graph.
I guess what I'm getting at is that, in my gut, I don't think we're going to be able to avoid the performance overhead of sorting the graph (sorting for sparse will be unavoidable, but maybe we'll find a faster way than a pandas reindex). The question is when the hit occurs (when the graph is instantiated, when the sparse attribute is called, etc.), so it might be nice to compare different options.
Yeah, that's precisely the aim of this benchmark. And I think that to capture it, benchmarking lag or any other op on top of sparse is redundant.
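A sketch of how such a comparison could look in asv terms: time instantiation and the `.sparse` access separately, so that a change in when the reordering cost is paid shows up in one timer or the other. `Graph.from_arrays` and the synthetic adjacency are assumptions made for the sketch.

```python
import numpy as np
from libpysal import graph


class TimeGraphSparse:
    def setup(self):
        # synthetic adjacency: each of n nodes linked to its next k ids
        n, k = 100_000, 8
        self.heads = np.repeat(np.arange(n), k)
        self.tails = (self.heads + np.tile(np.arange(1, k + 1), n)) % n
        self.weights = np.ones(n * k)
        self.G = graph.Graph.from_arrays(self.heads, self.tails, self.weights)

    def time_graph_creation(self):
        # cost paid eagerly at instantiation (e.g. any sorting of the adjacency)
        graph.Graph.from_arrays(self.heads, self.tails, self.weights)

    def time_sparse(self):
        # cost paid when the sparse representation is materialised; as noted in
        # the PR description, caching would have to be disabled for repeated
        # calls to measure more than the first access
        self.G.sparse
```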
Cool, I trust your lead then. I was imagining a scenario where we consider things like reimplementing lag or transform using groupby/apply instead of a matrix operation, which would have implications for when you need to build sparse (i.e. try to skip re-ordering by avoiding sparse whenever possible), and then for which tasks/packages that burden falls hardest on.
Ah, yeah. If that happens we should certainly include those in the benchmarks.
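For illustration, the two variants being contrasted could look roughly like this. The adjacency is assumed here to be available as a DataFrame with focal/neighbor/weight columns and positional integer ids; that is an assumption made for the sketch, not the actual Graph internals.

```python
import numpy as np
import pandas as pd


def lag_via_sparse(g, y):
    # relies on the aligned sparse matrix, so it pays for its construction/ordering
    return g.sparse @ y


def lag_via_groupby(adjacency, y):
    # skips sparse entirely: weight each neighbour's value and sum per focal unit
    vals = y[adjacency["neighbor"].to_numpy()] * adjacency["weight"].to_numpy()
    return (
        pd.Series(vals, index=adjacency["focal"].to_numpy())
        .groupby(level=0)
        .sum()
        .to_numpy()
    )
```

Timing both on the same data would show where avoiding sparse actually saves anything and where the groupby cost dominates.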
I have updated this to use @jGaboardi in the same way,
So we can keep this PR open in the docket and simply trigger a re-run of it after other PRs are merged to compare vs.
Nope, those comments are now obsolete. This is completely mergeable now, following the same model geopandas uses for its benchmarks (which do not work at the moment, but that is not relevant here). ("not" was supposed to be "now" :)
Starting with some simple benchmarks of Graph. My interest is to compare performance on main vs. in a PR, not necessarily to compare it with W. Hence the assumption is that we will be comparing different JSON outputs.
Given that I need to disable caching in base.py, let's not merge this. Any suggestions on what to add? Something that could be sensitive to changes in the way we represent adjacency, since I am mostly interested in removing the reindexing overheads we have now.
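As a starting point, the kind of "simple benchmark of Graph" meant here could be an asv class like the one below; `build_contiguity` and the synthetic lattice are placeholders for whatever the suite actually covers, not a description of the benchmarks in this PR.

```python
import geopandas
from libpysal import graph
from shapely.geometry import box


class TimeContiguity:
    def setup(self):
        # synthetic 50x50 lattice of unit squares as a stand-in for real data
        n = 50
        polys = [box(i, j, i + 1, j + 1) for i in range(n) for j in range(n)]
        self.gdf = geopandas.GeoDataFrame(geometry=polys)

    def time_build_contiguity(self):
        graph.Graph.build_contiguity(self.gdf)
```

The main-vs-PR comparison then amounts to running the suite on both revisions and diffing the JSON results, e.g. with `asv continuous main HEAD`, or with `asv run` on each branch followed by `asv compare`.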