Skip to content

feat(server): enhance server concurrency with task queue, worker threads, and graceful shutdown #3181

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

sachaarbonel
Copy link
Contributor

@sachaarbonel sachaarbonel commented May 22, 2025

Enhancements in this update focus on making the server more robust, efficient, and easier to manage:

  • Concurrency improvements:
    • Requests are now handled through a task queue and a pool of worker threads, significantly boosting throughput and responsiveness.
  • Graceful shutdown and draining:
    • The server now ensures all in-flight requests are completed before shutting down, supporting safer deployments and restarts.
  • Configurable server parameters:
    • Listen backlog size
    • Read and write timeouts
    • Maximum requests per connection
    • Idle keep-alive timeout
    • Maximum upload size
    • Temporary upload directory
    • Inference timeout
    • Maximum tasks in the inference queue
  • Performance and resource management:
    • Improved request queuing, processing, and cleanup for better resource utilization and operational safety.

These changes collectively make the server more resilient under load, easier to tune for different environments, and safer to operate in production.

@sachaarbonel sachaarbonel changed the title feat(server): Enhance Server Concurrency with Task Queue, Worker Threads, and Graceful Shutdown feat(server): enhance server concurrency with task queue, worker threads, and graceful shutdown May 22, 2025
@sachaarbonel sachaarbonel marked this pull request as draft May 22, 2025 12:33
@sachaarbonel sachaarbonel marked this pull request as ready for review May 23, 2025 07:25
Copy link
Collaborator

@danbev danbev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments about the some minor nits and I'll go thought and the code more thoroughly, but might not have time until after the weekend.


// RAII guard for HTTP request counting
struct HttpRequestCounter {
explicit HttpRequestCounter(std::atomic<int>& counter) : ctr(counter) { ctr.fetch_add(1, std::memory_order_relaxed); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could you put the constructor implementation on a new line here:

 explicit HttpRequestCounter(std::atomic<int> & counter) : ctr(counter) { 
    ctr.fetch_add(1, std::memory_order_relaxed); 
}
And also prefer the reference to not lean to the left. 

// RAII guard for HTTP request counting
struct HttpRequestCounter {
explicit HttpRequestCounter(std::atomic<int>& counter) : ctr(counter) { ctr.fetch_add(1, std::memory_order_relaxed); }
~HttpRequestCounter() { ctr.fetch_sub(1, std::memory_order_relaxed); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: same as above perhaps put the destructor body on a new line.

fprintf(stderr, " --keep-alive-timeout N, [%-7d] Idle keep-alive timeout in seconds\n", sparams.keep_alive_timeout);
fprintf(stderr, " --backlog N, [%-7d] Listen() backlog (max pending connections)\n", sparams.listen_backlog);
fprintf(stderr, " --max-upload-size N, [%-7zu] Max upload size in bytes (default 104857600)\n", sparams.max_upload_size);
fprintf(stderr, " --temp-upload-dir DIR, [%-7s] Directory for temporary uploads\n", sparams.temp_upload_dir.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: perhaps line up indentation.

case ErrorType::INVALID_REQUEST: return "invalid_request_error";
case ErrorType::SERVER_ERROR: return "server_error";
case ErrorType::NOT_FOUND: return "not_found_error";
default: return "unknown_error";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: perhaps line up the indentation here.

case ErrorType::INVALID_REQUEST: return 400;
case ErrorType::SERVER_ERROR: return 500;
case ErrorType::NOT_FOUND: return 404;
default: return 500;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: perhaps line up the indentation here.

ThreadGuard(const ThreadGuard&) = delete;
ThreadGuard& operator=(const ThreadGuard&) = delete;
ThreadGuard(ThreadGuard&& other) noexcept : t(std::move(other.t)) {}
ThreadGuard& operator=(ThreadGuard&& other) noexcept { if (t.joinable()) t.join(); t = std::move(other.t); return *this; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: implementation of function could be on a separate/new lines for readability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants