Skip to content

Performance improvement for topological sort #850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 4, 2015

Conversation

testn
Copy link
Contributor

@testn testn commented Nov 1, 2015

  • topologicalSort() method has a bug where if there are no dependent nodes, it will treat all the nodes as dependent nodes. By fixing that logic, it will significantly speed up the test case that has a lot of independent nodes. The code in Severe thread contention while running large test with parallel methods #772 with 6,000 tests now takes 78 seconds vs 98 seconds earlier.
  • also provide a shortcut for removeFromNodes where if the node does not have any predecessor, it will not try to perform HashMap.get() which is expensive due to the hashing it has to perform. This change will help the the test case where tests are heavily dependent on each other. It seems to shed off 2-3% off from the start up time.

@cbeust
Copy link
Collaborator

cbeust commented Nov 1, 2015

The change looks non-breaking to me but can you explain why treating independent nodes as dependent ones is slowing things down?

Also, it would be nice to have a test for this but since it's internal logic, it's probably not worth it.

@testn
Copy link
Contributor Author

testn commented Nov 1, 2015

the change was an attempt to fix a bug but it turns out to fix other problem too so the build is breaking now. Most of the broken tests are caused by the fact that previously the test code expects the tests to be executed in alphabetical order where my test fixes the bug and thus removes that constraint.

  • Before the change, if in a test class, there is no dependency among test methods. By initializing m_independentNodes to be empty, it will treat all the nodes as dependent nodes. The logic inside topologicalSort will call removeFromNodes which is O(N) and it has to remove one node at a time which is also O(N). This method will therefore be O(N^2) to the number of dependent nodes.
  • To workaround the problems where the unit test fails, I added the code to sort the independent nodes alphabetically as well to ensure that those methods will be executed in a predictable order to keep the existing behavior even though it's not strictly needed.

@juherr
Copy link
Member

juherr commented Nov 1, 2015

As I said on #772:

I never feel good perf improvments without something concrete to evaluate it.

BTW, you should fix failing tests first.

@testn
Copy link
Contributor Author

testn commented Nov 1, 2015

BTW this one shaves around 20 seconds off... that should be enough if the tests pass. 😄

@testn
Copy link
Contributor Author

testn commented Nov 4, 2015

@cbeust have you got a chance to take a peek?

cbeust added a commit that referenced this pull request Nov 4, 2015
Performance improvement for topological sort
@cbeust cbeust merged commit 2c12573 into testng-team:master Nov 4, 2015
@testn testn deleted the topological-perf branch November 4, 2015 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants