Speed up test execution for non-cached tests #181

Open
gladyshcodes opened this issue Dec 26, 2024 · 10 comments

Comments

@gladyshcodes
Contributor

What

Speed up test execution by addressing the issues outlined below.

Why

While working on #179, I found that taking screenshots probably takes the most time when a test runs in --no-cache mode. Sometimes screenshots are taken several times when there is no need for it. Also, the delay before taking a screenshot is about a second.

@m2rads
Contributor

m2rads commented Dec 27, 2024

That is a good observation. Here's an outline of my plan to speed it up:

  • Instead of saving screenshots to a folder, we can keep them in memory, making more efficient use of space.
  • Taking a screenshot automatically after executing each action. (Currently we wait for the AI to instruct us when to take a screenshot.)
  • Performing multiple actions per screenshot whenever possible. (Not sure if this is possible with the Computer use API yet.)

I think implementing these simple changes should speed up the AI execution by 2x at least.
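To make the first two ideas concrete, here is a minimal sketch. It assumes a hypothetical `capture` callback that returns the screenshot bytes; `InMemoryScreenshotStore`, `refresh`, and `get` are illustrative names, not the project's actual API.

```typescript
// Hypothetical sketch: keep the latest screenshot in memory instead of on disk.
// `capture` stands in for whatever actually grabs the screenshot bytes.
type Screenshot = Uint8Array;

class InMemoryScreenshotStore {
  private latest: Screenshot | null = null;
  private captures = 0;

  constructor(private readonly capture: () => Screenshot) {}

  /** Call right after every executed action so a fresh screenshot is ready. */
  refresh(): void {
    this.latest = this.capture();
    this.captures += 1;
  }

  /** Returns the cached screenshot; captures only if none exists yet. */
  get(): Screenshot {
    if (this.latest === null) {
      this.refresh();
    }
    return this.latest as Screenshot;
  }

  /** Number of real captures so far, useful for spotting redundant ones. */
  get captureCount(): number {
    return this.captures;
  }
}
```

With this shape, repeated screenshot requests between two actions would reuse the cached bytes, which is one way to avoid the redundant captures mentioned in the issue description.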

@slavingia, @gladyshcodes Wdyt?

@slavingia
Contributor

Makes sense. I pinged Anthropic to see if they'd support multiple actions in one step.

@gladyshcodes
Contributor Author

Makes sense. I pinged Anthropic to see if they'd support multiple actions in one step.

Have you heard back from Anthropic yet?

@slavingia
Contributor

Not yet, will bump

@slavingia changed the title from "[SUGGESTION] Performance bottleneck for non-cached tests" to "Performance bottleneck for non-cached tests" on Dec 30, 2024
@Shawns2759

The executions are already pretty expensive. Do we have ways to cut down on cost as well as speed up executions?

@slavingia
Contributor

We should probably tackle #187 first, to see that, and then evaluate. Anything that caches computer use should help.

@gladyshcodes
Contributor Author

With this, we can debug things faster and measure results

We recently introduced caching (#179), which made test execution about 6 times faster. I have several more ideas in mind:

  • Running tests in parallel, similar to how Jest does it. This could cut execution time dramatically.
  • Running a 'pre-validation' phase that makes an initial LLM request to evaluate the test suite and answer questions like "Do we need to launch Chromium?" and "Which tests can be run in parallel?", letting the LLM decide the most efficient execution order itself.
    • For example, most of the new Bash tool API tests (Bash tool #233) we're rolling out soon can be run in parallel.
    • The initial LLM request may increase the input token count, and therefore costs, but there may be a way around it. Maybe we could leverage
      LLM memory by sending all tests at once and then referencing them by name to start execution.
  • Batching multiple requests into a single LLM call could also help.
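As a rough illustration of the parallel idea, here is a small worker-pool sketch. It only borrows the general approach from Jest-style runners; `runWithConcurrency` and `TestTask` are hypothetical names, and each task stands in for one test run.

```typescript
// Hypothetical sketch: run async test tasks with a bounded number of workers.
type TestTask<T> = () => Promise<T>;

async function runWithConcurrency<T>(
  tasks: TestTask<T>[],
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next unclaimed task

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      // Claiming is safe without locks: JS runs single-threaded between awaits.
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Never start more workers than tasks, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // results keep the original task order
}
```

The `limit` parameter matters here because each running test would hold its own browser and make its own LLM calls, so unbounded parallelism is probably not what we want.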

Hopefully the cost of LLM providers will decrease over time (similar to how the price of GFLOPS or disk space has dropped), making this tool more affordable for everyone.

@slavingia
Contributor

Batching multiple requests into a single LLM call could also help.

This will be huge and eventually happen.

Running tests in parallel

This seems like relatively low-hanging fruit to explore. In theory, a server could run one browser for every test that needs to be run (just keeping chaining/caching in mind), and the entire test suite should take only as long as the slowest chain of tests.
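The "slowest chain" estimate can be worked through with a small sketch. The `TestCase` shape, the `dependsOn` field, and the durations are assumptions made up for illustration, not the project's data model.

```typescript
// Hypothetical sketch: compare sequential time with the slowest-chain bound.
interface TestCase {
  name: string;
  durationMs: number;
  dependsOn?: string; // name of the test this one is chained after, if any
}

function estimateWallTime(tests: TestCase[]): {
  sequentialMs: number;
  parallelMs: number;
} {
  const byName = new Map<string, TestCase>();
  for (const test of tests) byName.set(test.name, test);

  const finishCache = new Map<string, number>();

  // Finish time of a test = finish time of its predecessor + its own duration.
  const finishTime = (name: string, seen: Set<string>): number => {
    const cached = finishCache.get(name);
    if (cached !== undefined) return cached;
    const test = byName.get(name);
    if (!test) throw new Error(`Unknown test: ${name}`);
    if (seen.has(name)) throw new Error(`Dependency cycle at: ${name}`);
    seen.add(name);
    const start = test.dependsOn ? finishTime(test.dependsOn, seen) : 0;
    const end = start + test.durationMs;
    finishCache.set(name, end);
    return end;
  };

  let sequentialMs = 0;
  let parallelMs = 0;
  for (const test of tests) {
    sequentialMs += test.durationMs;
    parallelMs = Math.max(parallelMs, finishTime(test.name, new Set()));
  }
  return { sequentialMs, parallelMs };
}
```

For example, with `login` (30 s), `dashboard` chained after `login` (20 s), and an independent `signup` (40 s), the sequential total is 90 s while the slowest chain is `login` then `dashboard` at 50 s. This ignores browser startup and rate limits, so it is a best-case bound.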

@slavingia changed the title from "Performance bottleneck for non-cached tests" to "Speed up test execution for non-cached tests" on Jan 2, 2025
@PedroAVJ
Contributor

PedroAVJ commented Jan 4, 2025

Wouldn't running tests in parallel run into rate limit issues and, in turn, nullify the speed gains? I suppose it depends partly on the API key tier, but when I ran the original Claude computer use demo, I would constantly get rate limited.

@slavingia
Contributor

Wouldn't running tests in parallel run into rate limit issues and, in turn, nullify the speed gains? I suppose it depends partly on the API key tier, but when I ran the original Claude computer use demo, I would constantly get rate limited.

Things may have changed, but overall you're right that it'll be a bottleneck. I'll bring it up with them!
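A back-of-the-envelope way to see where the rate limit starts to bite is sketched below. The function name and the example numbers are made up for illustration; they are not Anthropic's actual limits for any tier.

```typescript
// Hypothetical sketch: how many tests can run at once before the API
// rate limit, rather than the test itself, becomes the bottleneck.
function maxUsefulParallelism(
  limitPerMinute: number, // assumed API request limit for the key's tier
  requestsPerTestPerMinute: number, // assumed request rate of one running test
): number {
  if (limitPerMinute <= 0 || requestsPerTestPerMinute <= 0) {
    throw new RangeError("Both rates must be positive");
  }
  // At least one test can always run, even if it gets throttled itself.
  return Math.max(1, Math.floor(limitPerMinute / requestsPerTestPerMinute));
}
```

For instance, if a key allowed 50 requests per minute and each running test issued about 12 per minute, roughly 4 parallel tests would fit before throttling starts. Beyond that point, extra browsers would mostly wait on the API.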

@rmarescu rmarescu moved this to For discussion in Shortest Jan 19, 2025

5 participants