Skip to content

Issue with concurrent requests on AWS Fargate #22

Open
@eliran89c

Description

@eliran89c

Describe the Bug
I am encountering an issue where concurrent requests are being processed sequentially rather than simultaneously when deployed on AWS Fargate.
I suspect the problem is that boto3 runs synchronously, and its calls are blocking.

API Details

  • API Used: /chat/completions
  • Model Used: all of them

To Reproduce
Steps to reproduce the behavior:

  1. Deploy the service on AWS Fargate following the standard setup procedures.
  2. Send multiple concurrent requests (e.g., 10 concurrent requests) to the API.
  3. Observe that the requests are processed sequentially instead of concurrently.

Expected Behavior
I expected that when sending multiple concurrent requests to the API, all requests would be handled simultaneously or at least as many as the server can handle

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions