Quick Start Guide

Get set up in about 5 minutes. The first crawl itself takes a little longer.

1️⃣ One-Command Setup (Recommended)

cd ai-web-crawler-bootcamp
./setup.sh

This will:

✅ Check Python version
✅ Create virtual environment
✅ Install all dependencies
✅ Install Playwright browsers
✅ Create configuration files

2️⃣ Add Your API Key

Edit the .env file:

nano .env

Add your API key (choose one):

For OpenAI:

OPENAI_API_KEY=sk-your-actual-key-here
LLM_MODEL=gpt-4-turbo-preview

For Azure OpenAI:

AZURE_API_KEY=your-key-here
AZURE_API_BASE=https://your-resource.openai.azure.com
AZURE_API_VERSION=2024-02-15-preview
LLM_MODEL=azure/gpt-4

For Anthropic Claude:

ANTHROPIC_API_KEY=your-key-here
LLM_MODEL=claude-3-opus-20240229

Save and exit nano (Ctrl+X, then Y, then Enter).
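To double-check the file before starting the app, a small script along these lines can report which provider looks configured. This is a minimal sketch, not part of the project: the function names `parse_env_text` and `configured_providers` are hypothetical, the key names are taken from the examples above, and it only checks that the values are present, not that the key is valid.

```python
from pathlib import Path

# Key names are the ones shown in the .env examples above.
# Grouping them per provider is an assumption made for this sketch.
PROVIDER_KEYS = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Azure OpenAI": ["AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"],
    "Anthropic Claude": ["ANTHROPIC_API_KEY"],
}


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(values):
    """Return the providers whose required keys are all present and non-empty."""
    return [
        name
        for name, keys in PROVIDER_KEYS.items()
        if all(values.get(key) for key in keys)
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    print("Configured providers:", configured_providers(values) or "none")
    print("LLM_MODEL:", values.get("LLM_MODEL", "(not set)"))
```

If the script prints "none", the most likely cause is a missing or misspelled key name in .env.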

3️⃣ Run the Application

# Activate the virtual environment (skip if it is already active)
source venv/bin/activate

# Start the web interface
streamlit run app.py

Your browser should open automatically to http://localhost:8501

If it does not, open that URL manually.

4️⃣ Try the Example

In the web interface:

1. Go to the Search-Based tab.
2. Enter the search term: Software development consultancy finland
3. Set the number of results to 5.
4. Click Start Search-Based Crawl.
5. Wait 5-10 minutes while the crawl and AI analysis run.
6. View the results and download the reports.

Or run from command line:

python orchestrator.py

This runs the example automatically.

✅ Verify Installation

Test that everything works:

python -c "import litellm, playwright, streamlit; print('✅ All packages installed')"
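The one-liner above stops at the first failed import. If you would rather see every missing package at once, a sketch like the following may help. It checks the same three packages named in the command; the helper name `missing_packages` is hypothetical and not part of the project.

```python
import importlib.util

# The same packages the one-line check imports.
REQUIRED = ["litellm", "playwright", "streamlit"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
        print("Try: pip install -r requirements.txt")
    else:
        print("All packages installed")
```

Run it with the virtual environment activated, otherwise it inspects the system Python instead.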

🎯 What You Should See

Web Interface: Clean Streamlit UI with two tabs
Search Tab: Input field for search term and number selector
CSV Tab: File upload area with sample download
Sidebar: Configuration status and about info

📊 First Results

After your first crawl:

Results Table: Shows all companies with their values
Download Buttons: Get Excel or CSV
Company Details: Expand to see individual analysis
Files Created: Check ./reports/ directory
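To check the ./reports/ directory without opening a file browser, something like the sketch below lists the report files, newest first. The `.xlsx` and `.csv` suffixes are an assumption based on the Excel and CSV download options mentioned above; the actual file names the crawler writes are not documented here, and `list_reports` is a hypothetical helper.

```python
from pathlib import Path


def list_reports(directory="reports", suffixes=(".xlsx", ".csv")):
    """Return report files in `directory`, most recently modified first."""
    root = Path(directory)
    if not root.is_dir():
        return []
    files = [
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in suffixes
    ]
    return sorted(files, key=lambda path: path.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    reports = list_reports()
    if not reports:
        print("No report files found in ./reports/ yet")
    for path in reports:
        print(path.name, path.stat().st_size, "bytes")
```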

🐛 Quick Troubleshooting

Virtual environment not activated?

source venv/bin/activate

Port 8501 already in use?

streamlit run app.py --server.port 8502

Import errors?

pip install -r requirements.txt

Playwright browser missing?

playwright install chromium
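For the port question above, you can find out whether 8501 is really taken before picking another one. The following is a rough sketch using only the standard library; `port_in_use` and `first_free_port` are hypothetical helpers, and the check only covers the local loopback address.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start=8501, attempts=10):
    """Return the first port in [start, start + attempts) with no listener, or None."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    return None


if __name__ == "__main__":
    port = first_free_port()
    if port is None:
        print("No free port found in the range checked")
    else:
        print(f"streamlit run app.py --server.port {port}")
```

The printed command uses the same `--server.port` option shown in the troubleshooting entry.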

📚 Next Steps

✅ Read README.md for full documentation
✅ Check EXAMPLE_USAGE.md for detailed examples
✅ Review API_DOCS.md for programmatic usage
✅ Try the sample CSV: sample_companies.csv

🚀 You're Ready!

That's it! You now have a working AI-powered web crawler.

Pro tip: Start with 2-3 companies to test before running larger batches.
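One way to follow this tip with a CSV input is to cut a small test file from a larger one. The sketch below copies the header and the first few rows; it assumes the first row of your CSV is a header (the column layout of sample_companies.csv is not described here), and `write_test_subset` and the file names in the usage line are hypothetical.

```python
import csv


def write_test_subset(src, dst, n=3):
    """Copy the header row and the first n data rows from src to dst.

    Returns the number of data rows written.
    """
    with open(src, newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            return 0  # empty input file, nothing to copy
        writer.writerow(header)
        count = 0
        for row in reader:
            if count >= n:
                break
            writer.writerow(row)
            count += 1
    return count
```

For example, `write_test_subset("my_companies.csv", "my_companies_test.csv", n=3)` would produce a three-company file to upload in the CSV tab.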

Need help?

Check the logs in your terminal
Review the troubleshooting section in README.md
Make sure your API key is valid and has sufficient quota
