contentstack-launch-examples/launch-edge-ai-crawlers-block-example


Edge AI Block Example

A Next.js application demonstrating how to use Edge Functions to block AI crawlers and bots while allowing legitimate traffic to pass through.

🚀 Features

  • AI Crawler Detection: Automatically detects and blocks known AI training bots
  • Edge Function Protection: Runs at the edge, so blocked requests are rejected before they reach the application
  • Configurable Bot List: Easy to add or remove bots from the blocklist
  • Real-time Logging: Detailed logs for monitoring blocked requests
  • Modern UI: Beautiful demonstration page with testing interface

🛡️ Protected Against

The Edge Function blocks the following bots and crawlers:

  • AI Training Bots: claudebot, gptbot
  • Search Engines: googlebot, bingbot, yandexbot
  • SEO Tools: ahrefsbot, semrushbot, mj12bot
  • Social Media: facebookexternalhit, twitterbot

🏗️ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Client Request│───▶│  Edge Function   │───▶│  Your App/API   │
│                 │    │  (Bot Detection) │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                              │
                              ▼
                       ┌──────────────────┐
                       │  403 Forbidden   │
                       │  (If Bot Found)  │
                       └──────────────────┘
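The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `isKnownBot` and `handler` names, the case-insensitive substring match, and the injectable `forward` parameter are all assumptions made for the example.

```javascript
// Sketch of the request flow in the diagram. The function names, the
// substring matching, and the `forward` parameter are assumptions.
const KNOWN_BOTS = [
  'claudebot', 'gptbot', 'googlebot', 'bingbot', 'ahrefsbot',
  'yandexbot', 'semrushbot', 'mj12bot', 'facebookexternalhit', 'twitterbot',
];

// True when the lowercased User-Agent contains any blocklist entry.
function isKnownBot(userAgent, bots = KNOWN_BOTS) {
  const ua = (userAgent || '').toLowerCase();
  return bots.some((bot) => ua.includes(bot));
}

// `forward` stands in for however the platform passes a request on to the
// app (commonly `fetch(request)`); it is injectable so the sketch is testable.
async function handler(request, forward = fetch) {
  const userAgent = request.headers.get('user-agent') || '';
  if (isKnownBot(userAgent)) {
    console.warn(`:no_entry: Blocked bot: UA="${userAgent}"`);
    return new Response('Forbidden: AI crawlers are not allowed.', { status: 403 });
  }
  // Not on the blocklist: hand the request to the application unchanged.
  return forward(request);
}
```

In a real Edge Function file the handler would presumably be exported in whatever form the hosting platform expects; that detail is left out here.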

🚀 Getting Started

Prerequisites

  • Node.js 18+
  • npm, yarn, pnpm, or bun

Installation

  1. Clone the repository

    git clone <your-repo-url>
    cd edge-ai-block-example
  2. Install dependencies

    npm install
    # or
    yarn install
  3. Run the development server

    npm run dev
    # or
    yarn dev
  4. Open your browser and navigate to http://localhost:3000

🧪 Testing

Test Bot Blocking

Use curl to test different user agents:

# Test AI crawler (should be blocked)
curl -A "gptbot" http://localhost:3000/

# Test regular browser (should pass through)
curl -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" http://localhost:3000/

# Test other bots
curl -A "claudebot" http://localhost:3000/
curl -A "googlebot" http://localhost:3000/
curl -A "facebookexternalhit" http://localhost:3000/

Expected Results

  • Blocked Bots: Return 403 Forbidden with message "Forbidden: AI crawlers are not allowed."
  • Legitimate Traffic: Pass through to your application normally
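The same checks could be scripted instead of run by hand with curl. The sketch below is hypothetical: `checkUserAgents` and the `CASES` table are not part of the repository, and the expected status of 200 for regular traffic is an assumption (the README only says such requests pass through).

```javascript
// Hypothetical smoke test; `checkUserAgents` and CASES are illustrative.
const CASES = [
  { userAgent: 'gptbot', expectedStatus: 403 },
  { userAgent: 'claudebot', expectedStatus: 403 },
  // Assumption: a request that passes through returns 200 from the app.
  { userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)', expectedStatus: 200 },
];

// Sends one request per case and records whether the status matched.
// `fetchImpl` defaults to the global fetch available in Node.js 18+.
async function checkUserAgents(baseUrl, cases = CASES, fetchImpl = fetch) {
  const results = [];
  for (const { userAgent, expectedStatus } of cases) {
    const res = await fetchImpl(baseUrl, { headers: { 'User-Agent': userAgent } });
    results.push({
      userAgent,
      expectedStatus,
      actualStatus: res.status,
      pass: res.status === expectedStatus,
    });
  }
  return results;
}

// Usage, with the development server running:
// checkUserAgents('http://localhost:3000/').then(console.table);
```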

🔧 Configuration

Adding/Removing Bots

Edit the KNOWN_BOTS array in functions/[proxy].edge.js:

const KNOWN_BOTS = [
  'claudebot',
  'gptbot',
  'googlebot',
  'bingbot',
  'ahrefsbot',
  'yandexbot',
  'semrushbot',
  'mj12bot',
  'facebookexternalhit',
  'twitterbot',
  // Add more bots here
];
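All entries in the list are lowercase, which suggests (but does not confirm) that the function lowercases the User-Agent before comparing. Under that assumption, new entries should be normalized the same way. A small sketch, where `buildBlocklist` is a hypothetical helper and `ccbot` and `perplexitybot` are only example additions:

```javascript
// Hypothetical helper, assuming the Edge Function compares against a
// lowercased User-Agent string.
const DEFAULT_BOTS = ['claudebot', 'gptbot'];

// Trims and lowercases every entry, drops blanks, and removes duplicates
// while keeping first-seen order.
function buildBlocklist(defaults, extras = []) {
  const normalized = [...defaults, ...extras]
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);
  return [...new Set(normalized)];
}

const KNOWN_BOTS = buildBlocklist(DEFAULT_BOTS, ['CCBot', ' PerplexityBot ', 'gptbot']);
// KNOWN_BOTS is ['claudebot', 'gptbot', 'ccbot', 'perplexitybot']
```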

Customizing Response

Modify the response in the Edge Function:

if (isBot) {
  console.warn(`:no_entry: Blocked bot: UA="${userAgent}"`);
  return new Response('Custom message here', { status: 403 });
}
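As one possible customization, the block response could carry a JSON body and explicit headers rather than plain text. `buildBlockedResponse` is a hypothetical helper, and the field names and header choices below are illustrative assumptions only:

```javascript
// Hypothetical helper: returns a JSON 403 instead of a plain-text one.
function buildBlockedResponse(userAgent) {
  const body = JSON.stringify({
    error: 'forbidden',
    message: 'Automated crawlers are not allowed.',
    userAgent,
  });
  return new Response(body, {
    status: 403,
    headers: {
      'Content-Type': 'application/json; charset=utf-8',
      // Ask caches not to store the block response.
      'Cache-Control': 'no-store',
    },
  });
}
```

Inside the `if (isBot)` branch, this would replace the `new Response(...)` call with `buildBlockedResponse(userAgent)`.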

📁 Project Structure

edge-ai-block-example/
├── functions/
│   └── [proxy].edge.js    # Edge Function for bot detection
├── app/
│   ├── page.tsx           # Main demonstration page
│   └── forbidden/
│       └── page.tsx       # 403 error page
├── public/
│   └── robots.txt         # Sample robots.txt file
└── README.md

🚀 Deployment

Deploy to Vercel

  1. Push to GitHub

    git add .
    git commit -m "Initial commit"
    git push origin main
  2. Deploy to Vercel

    • Connect your GitHub repository to Vercel
    • Vercel will automatically detect the Edge Function
    • Deploy with zero configuration

Environment Variables

No environment variables required for basic functionality.

📊 Monitoring

View Logs

Check your deployment logs to see blocked requests:

:no_entry: Blocked bot: UA="gptbot"
:no_entry: Blocked bot: UA="claudebot"

Analytics

The Edge Function logs all blocked requests, making it easy to:

  • Monitor bot activity
  • Identify new bot patterns
  • Track protection effectiveness
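To turn those log lines into a quick per-bot summary, something like the following could work. It is a sketch that assumes the log format shown under View Logs; `countBlockedByUserAgent` is a hypothetical helper, and how the lines are fetched from the hosting platform is left out.

```javascript
// Matches lines in the format shown under "View Logs".
const BLOCKED_LINE = /Blocked bot: UA="([^"]*)"/;

// Counts blocked requests per User-Agent from an array of log lines.
function countBlockedByUserAgent(lines) {
  const counts = new Map();
  for (const line of lines) {
    const match = BLOCKED_LINE.exec(line);
    if (!match) continue; // ignore unrelated log lines
    counts.set(match[1], (counts.get(match[1]) || 0) + 1);
  }
  return counts;
}

// Example usage with lines copied from the deployment logs.
const summary = countBlockedByUserAgent([
  ':no_entry: Blocked bot: UA="gptbot"',
  ':no_entry: Blocked bot: UA="claudebot"',
  ':no_entry: Blocked bot: UA="gptbot"',
]);
console.log(Object.fromEntries(summary));
```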

🔒 Security Features

  • Edge-level Protection: Blocks bots before they reach your application
  • Performance Optimized: Minimal latency impact
  • Configurable: Easy to customize for your needs
  • Reliable: Fallback mechanisms ensure continuous protection

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support


Built with ❤️ using Next.js and Vercel Edge Functions
