feat: add Olostep toolkit for web scraping, crawling, search and AI #7151

Open
umerkay wants to merge 3 commits into agno-agi:main from umerkay:feat/olostep-toolkit

Conversation

@umerkay umerkay commented Mar 24, 2026

Summary

Closes #7048

Adds OlostepTools, an Agno toolkit that wraps the Olostep web data API.

Olostep is a web scraping, crawling, and AI-powered data extraction API. This toolkit exposes Olostep's core capabilities to Agno agents through a single, consistent interface.

Tools added:

  • scrape_url: scrape any URL to markdown/html/text/json, with support for parsers and LLM extraction
  • crawl_website: recursively crawl a site with URL glob filtering and relevance-based search
  • map_website: discover all URLs on a site (sitemaps + discovered links)
  • search_web: web search returning structured links with titles and descriptions
  • answer_question: AI-synthesized answers grounded in live web data, with optional JSON schema output
  • batch_scrape: concurrent scraping of up to 10,000 URLs in a single job
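
Because batch_scrape is described as taking up to 10,000 URLs per job, a caller with a longer list would presumably need to split it first. The following is a minimal sketch of that splitting, assuming the limit applies per job; `chunk_urls` and `MAX_BATCH_URLS` are hypothetical names used for illustration and are not part of this PR or of the Olostep API.

```python
from typing import Iterator, List, Sequence

# Assumption: the 10,000-URL limit from the description applies per batch job.
MAX_BATCH_URLS = 10_000


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_URLS) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])
```

Each yielded chunk could then be submitted as one batch job.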

Files changed:

  • libs/agno/agno/tools/olostep.py: toolkit implementation
  • libs/agno/pyproject.toml: added olostep optional dependency and mypy override
  • cookbook/91_tools/olostep_tools.py: three working example agents

Type of change

  • New feature

Checklist

  • Code complies with style guidelines
  • Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh)
  • Self-review completed
  • Documentation updated (comments, docstrings)
  • Examples and guides: Relevant cookbook examples have been included or updated (if applicable)
  • Tested in clean environment
  • Tests added/updated (if applicable)

Duplicate and AI-Generated PR Check

  • I have searched existing open pull requests and confirmed that no other PR already addresses this issue
  • If a similar PR exists, I have explained below why this PR is a better approach
  • Check if this PR was entirely AI-generated (by Copilot, Claude Code, Cursor, etc.)

Additional Notes

Tested end-to-end locally with live API keys. All three cookbook examples ran successfully:

  • Single URL scrape returning clean markdown
  • Parallel search_web + answer_question tool calls
  • map_website → batch_scrape pipeline (mapped 12 feature pages, batch scraped all in one job, returned structured summaries)
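
The third example chains URL discovery into a single batch job. The sketch below shows only the shape of that chain, not the toolkit's actual code: `map_then_batch`, `fake_map`, and `fake_batch` are hypothetical stand-ins, and glob filtering via `fnmatch` is an assumption about how feature pages might be selected.

```python
from fnmatch import fnmatch
from typing import Callable, Dict, List


def map_then_batch(
    site: str,
    pattern: str,
    map_fn: Callable[[str], List[str]],
    batch_fn: Callable[[List[str]], Dict[str, str]],
) -> Dict[str, str]:
    """Discover URLs on `site`, keep those matching `pattern`, scrape them in one job."""
    urls = [u for u in map_fn(site) if fnmatch(u, pattern)]
    if not urls:
        return {}  # nothing matched, so no batch job is submitted
    return batch_fn(urls)


# Stand-ins so the sketch runs without network access or an API key.
def fake_map(site: str) -> List[str]:
    return [f"{site}/features/a", f"{site}/features/b", f"{site}/pricing"]


def fake_batch(urls: List[str]) -> Dict[str, str]:
    return {u: f"# scraped {u}" for u in urls}


result = map_then_batch("https://example.com", "*/features/*", fake_map, fake_batch)
```

In a real agent the two callables would be replaced by the toolkit's own map and batch tools.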

Install: pip install agno[olostep]
Olostep API key required: https://www.olostep.com/dashboard/api-keys

@umerkay umerkay requested a review from a team as a code owner March 24, 2026 11:22
@kausmeows
Contributor

Hi @umerkay, thanks for this. Would you also be willing to add docs for this at https://github.com/agno-agi/docs ?

@umerkay
Author

umerkay commented Mar 24, 2026

Of course. Let me fix the test case and create the docs as well. @kausmeows

@umerkay umerkay force-pushed the feat/olostep-toolkit branch from 75e4e3c to 247d9f8 Compare March 24, 2026 11:46
@umerkay
Author

umerkay commented Mar 24, 2026

@kausmeows The CI is failing with exit code 143 (OOM/timeout) during the "Installing agno-infra in editable mode" step, before pytest even runs.
This appears to be a pre-existing infrastructure issue unrelated to this PR. Could you advise? Happy to make any code changes needed, but this seems outside my control. Thanks!

@umerkay
Author

umerkay commented Mar 24, 2026

agno-agi/docs#583

PR made for docs as well :)

@umerkay
Author

umerkay commented Mar 26, 2026

@kausmeows The CI is failing with exit code 143 (OOM/timeout) during the "Installing agno-infra in editable mode" step, before pytest even runs.
This appears to be a pre-existing infrastructure issue unrelated to this PR. Could you advise? Happy to make any code changes needed, but this seems outside my control. Thanks!

@kausmeows
Contributor

> @kausmeows The CI is failing with exit code 143 (OOM/timeout) during Installing agno-infra in editable mode before pytest even runs. This appears to be a pre-existing infrastructure issue unrelated to this PR. Could you advise? Happy to make any code changes needed but this seems outside my control. Thanks!

Merged in main; it's fixed there, so it should be good now.

@umerkay
Author

umerkay commented Mar 26, 2026

@kausmeows Thanks. Tests have passed. Whenever possible, it would be wonderful if the PR could be accepted and merged :D

@umerkay
Author

umerkay commented Mar 30, 2026

@kausmeows Whenever possible, please complete the PR and merge with official docs :)

I have also separately submitted a docs page for this at agno-agi/docs#583

@umerkay
Author

umerkay commented Apr 2, 2026

@kausmeows mentioning for your response :)

@umerkay
Author

umerkay commented Apr 13, 2026

@kausmeows response appreciated :)

Development

Successfully merging this pull request may close these issues.

[Feature Request] feat: add Olostep toolkit for web scraping, crawling, search and AI answers

2 participants