
[Feature] Add Model-Level Rate Limiting #2029

@SiYue-ZO

Description

🎯 The Goal / Use Case

In real-world usage, different models and providers (e.g., OpenAI, Anthropic, or self-hosted models) enforce different rate limits.

Currently, PicoClaw does not provide a built-in way to control request rates at the model level, which may lead to:

  • Frequent rate limit exceeded errors
  • Unstable gateway behavior under high concurrency
  • Difficulty managing multi-model or multi-tenant workloads

A common use case is to limit a specific model to a fixed rate, such as 50 requests per minute (RPM).

💡 Proposed Solution

Introduce model-level rate limiting configuration, allowing users to define limits per model.

Suggested capabilities:

  • Configure RPM (requests per minute) per model (e.g., 50 RPM)
  • Optionally support TPM (tokens per minute)
  • Apply rate limiting automatically in the gateway before forwarding requests to providers

Example configuration:

{
  "model": "gpt-4",
  "rate_limit": {
    "rpm": 50,
    "tpm": 10000
  }
}

🛠 Potential Implementation (Optional)

  • Implement a token bucket or leaky bucket rate limiter per model
  • Maintain a rate limiter map keyed by model name in the gateway layer
  • Apply limiting before dispatching requests to the provider

Possible structure in Go:

  • RateLimiterManager (map[model]limiter)
  • Middleware in gateway request pipeline
  • Config-driven initialization

🚦 Impact & Roadmap Alignment

  • This is a Core Feature
  • This is a Nice-to-Have / Enhancement
  • This aligns with the current Roadmap

🔄 Alternatives Considered

  • External rate limiting (e.g., Nginx, API Gateway)
    • Not flexible for per-model control
  • Client-side throttling
    • Hard to maintain and not centralized

💬 Additional Context

  • Many providers enforce strict rate limits, making this feature essential for stable operation
  • This would greatly improve reliability in multi-model and production deployments
  • Could be extended in the future to support:
    • Per-user or per-API-key rate limiting
    • Burst control and priority queues
