Commit d371408

Add configurable HTTP health-check server
1 parent bd6b377 commit d371408

6 files changed: +246 -0 lines changed

README.md

Lines changed: 27 additions & 0 deletions
@@ -28,6 +28,7 @@ Solid Queue can be used with SQL databases such as MySQL, PostgreSQL, or SQLite,
 - [Failed jobs and retries](#failed-jobs-and-retries)
 - [Error reporting on jobs](#error-reporting-on-jobs)
 - [Puma plugin](#puma-plugin)
+- [Health-check HTTP server](#health-check-http-server)
 - [Jobs and transactional integrity](#jobs-and-transactional-integrity)
 - [Recurring tasks](#recurring-tasks)
 - [Inspiration](#inspiration)
@@ -603,6 +604,32 @@ that you set in production only. This is what Rails 8's default Puma config look
 
 **Note**: phased restarts are not supported currently because the plugin requires [app preloading](https://github.com/puma/puma?tab=readme-ov-file#cluster-mode) to work.
 
+## Health-check HTTP server
+
+Solid Queue can start a tiny HTTP server to respond to basic health checks in the same process. This is useful for container orchestrators (e.g. Kubernetes) and external monitoring.
+
+- Endpoints:
+  - `/` and `/health`: returns `200 OK` with body `OK`
+  - Any other path: returns `404 Not Found`
+- Disabled by default. When enabled, defaults are:
+  - host: `ENV["SOLID_QUEUE_HTTP_HOST"]` or `"0.0.0.0"`
+  - port: `ENV["SOLID_QUEUE_HTTP_PORT"]` or `9393`
+
+Enable and configure via `config.solid_queue`:
+
+```ruby
+# config/initializers/solid_queue.rb or config/application.rb
+Rails.application.configure do
+  config.solid_queue.health_server_enabled = true
+  # Optional overrides (defaults already read the env vars above)
+  # config.solid_queue.health_server_host = "0.0.0.0"
+  # config.solid_queue.health_server_port = 9393
+end
+```
+
+Note:
+- When the Puma plugin is active (`plugin :solid_queue` in `puma.rb`), Solid Queue will skip starting the health server even if `health_server_enabled` is set. A warning is logged instead. This prevents running multiple embedded servers in the same process tree.
+
 ## Jobs and transactional integrity
 :warning: Having your jobs in the same ACID-compliant database as your application data enables a powerful yet sharp tool: taking advantage of transactional integrity to ensure some action in your app is not committed unless your job is also committed and vice versa, and ensuring that your job won't be enqueued until the transaction within which you're enqueuing it is committed. This can be very powerful and useful, but it can also backfire if you base some of your logic on this behaviour, and in the future, you move to another active job backend, or if you simply move Solid Queue to its own database, and suddenly the behaviour changes under you. Because this can be quite tricky and many people shouldn't need to worry about it, by default Solid Queue is configured in a different database as the main app.
 
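
The endpoints described in the README addition can be probed from any HTTP client. A minimal sketch of such a probe follows; `solid_queue_healthy?` is a hypothetical helper that is not part of this commit, and the default host and port are assumptions taken from the documented defaults.

```ruby
require "net/http"

# Hypothetical helper (not part of Solid Queue): returns true when the health
# endpoint answers 200 with body "OK" inside the timeout, false otherwise.
def solid_queue_healthy?(host: "127.0.0.1", port: 9393, path: "/health", timeout: 1)
  http = Net::HTTP.new(host, port, nil) # nil proxy address: connect directly
  http.open_timeout = timeout
  http.read_timeout = timeout
  response = http.start { |conn| conn.get(path) }
  response.code == "200" && response.body == "OK"
rescue SystemCallError, SocketError, Timeout::Error, IOError
  # Connection refused, DNS failure, timeout, or a dropped connection
  false
end
```

A script could call `solid_queue_healthy?(port: 9393)` and exit non-zero on `false`, which is roughly what an orchestrator's HTTP probe does on its own.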

lib/puma/plugin/solid_queue.rb

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,7 @@ def start(launcher)
 
     if Gem::Version.new(Puma::Const::VERSION) < Gem::Version.new("7")
       launcher.events.on_booted do
+        SolidQueue.puma_plugin = true
         @solid_queue_pid = fork do
           Thread.new { monitor_puma }
           SolidQueue::Supervisor.start
@@ -23,6 +24,7 @@ def start(launcher)
       launcher.events.on_restart { stop_solid_queue }
     else
       launcher.events.after_booted do
+        SolidQueue.puma_plugin = true
         @solid_queue_pid = fork do
           Thread.new { monitor_puma }
           SolidQueue::Supervisor.start
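
In both branches the flag is assigned before `fork`, so the forked supervisor process starts with a copy of it. The sketch below illustrates that ordering with a stand-in module; `Flags` and `flag_seen_in_child` are hypothetical names, and the example assumes a platform where `fork` is available.

```ruby
# Stand-in for a module-level setting such as SolidQueue.puma_plugin
module Flags
  class << self
    attr_accessor :puma_plugin
  end
  self.puma_plugin = false
end

# Forks a child and reports, through a pipe, the flag value the child observes.
def flag_seen_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Flags.puma_plugin.to_s)
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  value = reader.read
  reader.close
  Process.wait(pid)
  value
end

Flags.puma_plugin = true # set in the parent, before forking
puts flag_seen_in_child  # the child inherits the parent's memory image
```

Setting the flag inside the `fork` block would work for the child as well, but would leave the parent (Puma) process unaware of it.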

lib/solid_queue.rb

Lines changed: 27 additions & 0 deletions
@@ -41,6 +41,33 @@ module SolidQueue
   mattr_accessor :clear_finished_jobs_after, default: 1.day
   mattr_accessor :default_concurrency_control_period, default: 3.minutes
 
+  mattr_accessor :health_server_enabled, default: false
+  mattr_accessor :health_server_host, default: ENV.fetch("SOLID_QUEUE_HTTP_HOST", "0.0.0.0")
+  mattr_accessor :health_server_port, default: (ENV["SOLID_QUEUE_HTTP_PORT"] || "9393").to_i
+
+  mattr_accessor :puma_plugin, default: false
+
+  def start_health_server
+    return nil unless health_server_enabled
+
+    if puma_plugin
+      logger.warn("SolidQueue health server is enabled but Puma plugin is active; skipping starting health server to avoid duplicate servers")
+      return nil
+    end
+
+    server = SolidQueue::HealthServer.new(
+      host: health_server_host,
+      port: health_server_port,
+      logger: logger
+    )
+
+    on_start { server.start }
+    on_stop { server.stop }
+    on_exit { server.stop }
+
+    server
+  end
+
   delegate :on_start, :on_stop, :on_exit, to: Supervisor
 
   [ Dispatcher, Scheduler, Worker ].each do |process|
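
The port default relies on `String#to_i`, which never raises. A short sketch of how that expression behaves for a few inputs follows; `health_port_from` mirrors the expression from the diff but takes an env-like hash, and `strict_health_port_from` is a hypothetical stricter variant that is not part of the commit.

```ruby
# Mirrors the default expression from the diff, parameterised on an env-like
# hash so it can be exercised without touching the real environment.
def health_port_from(env)
  (env["SOLID_QUEUE_HTTP_PORT"] || "9393").to_i
end

# Hypothetical stricter variant (not in the commit): Integer() raises
# ArgumentError on non-numeric input instead of silently returning 0.
def strict_health_port_from(env)
  Integer(env.fetch("SOLID_QUEUE_HTTP_PORT", "9393"), 10)
end

health_port_from({})                                 # => 9393
health_port_from("SOLID_QUEUE_HTTP_PORT" => "8080")  # => 8080
health_port_from("SOLID_QUEUE_HTTP_PORT" => "http")  # => 0
health_port_from("SOLID_QUEUE_HTTP_PORT" => "")      # => 0
```

A port of `0` asks the operating system for an ephemeral port, so a mistyped value would not fail loudly. Because `default:` is evaluated when the module body is loaded, the environment variables need to be set before `solid_queue` is required.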

lib/solid_queue/engine.rb

Lines changed: 6 additions & 0 deletions
@@ -37,5 +37,11 @@ class Engine < ::Rails::Engine
         include ActiveJob::ConcurrencyControls
       end
     end
+
+    initializer "solid_queue.health_server" do
+      ActiveSupport.on_load(:solid_queue) do
+        SolidQueue.start_health_server
+      end
+    end
   end
 end

lib/solid_queue/health_server.rb

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+# frozen_string_literal: true
+
+require "socket"
+require "logger"
+
+module SolidQueue
+  class HealthServer
+    def initialize(host:, port:, logger: nil)
+      @host = host
+      @port = port
+      @logger = logger || default_logger
+      @server = nil
+      @thread = nil
+    end
+
+    def start
+      return if running?
+
+      @thread = Thread.new do
+        begin
+          @server = TCPServer.new(@host, @port)
+          log_info("listening on #{@host}:#{@port}")
+
+          loop do
+            socket = @server.accept
+            begin
+              request_line = socket.gets
+              path = request_line&.split(" ")&.at(1) || "/"
+
+              if path == "/" || path == "/health"
+                body = "OK"
+                status_line = "HTTP/1.1 200 OK"
+              else
+                body = "Not Found"
+                status_line = "HTTP/1.1 404 Not Found"
+              end
+
+              headers = [
+                "Content-Type: text/plain",
+                "Content-Length: #{body.bytesize}",
+                "Connection: close"
+              ].join("\r\n")
+
+              socket.write("#{status_line}\r\n#{headers}\r\n\r\n#{body}")
+            ensure
+              begin
+                socket.close
+              rescue StandardError
+              end
+            end
+          end
+        rescue => e
+          log_error("failed: #{e.class}: #{e.message}")
+        ensure
+          begin
+            @server&.close
+          rescue StandardError
+          end
+        end
+      end
+    end
+
+    def stop
+      return unless running?
+
+      begin
+        @server&.close
+      rescue StandardError
+      end
+
+      if @thread&.alive?
+        @thread.kill
+        @thread.join(1)
+      end
+
+      @server = nil
+      @thread = nil
+    end
+
+    def running?
+      @thread&.alive?
+    end
+
+    private
+      def default_logger
+        logger = Logger.new($stdout)
+        logger.level = Logger::INFO
+        logger.progname = "SolidQueueHTTP"
+        logger
+      end
+
+      def log_info(message)
+        @logger&.info(message)
+      end
+
+      def log_error(message)
+        @logger&.error(message)
+      end
+  end
+end
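
The routing inside the accept loop depends only on the request line, so it can be examined in isolation. The sketch below pulls that logic into a pure function; `health_response_for` is a hypothetical helper for illustration and does not exist in the commit.

```ruby
# Hypothetical pure helper mirroring the routing in the accept loop above:
# takes the raw request line (or nil) and returns the full HTTP response text.
def health_response_for(request_line)
  # Second whitespace-separated token is the request target; nil falls back to "/"
  path = request_line&.split(" ")&.at(1) || "/"

  status_line, body =
    if path == "/" || path == "/health"
      ["HTTP/1.1 200 OK", "OK"]
    else
      ["HTTP/1.1 404 Not Found", "Not Found"]
    end

  headers = [
    "Content-Type: text/plain",
    "Content-Length: #{body.bytesize}",
    "Connection: close"
  ].join("\r\n")

  "#{status_line}\r\n#{headers}\r\n\r\n#{body}"
end

puts health_response_for("GET /health HTTP/1.1\r\n").lines.first
puts health_response_for("GET /health?probe=1 HTTP/1.1\r\n").lines.first
```

Three properties follow from this logic: the path match is exact, so a query string such as `/health?probe=1` yields `404`; the HTTP method is never inspected, so `POST /health` also yields `200`; and a client that connects and closes without sending anything is treated as a request for `/`.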

test/unit/health_server_test.rb

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+# frozen_string_literal: true
+
+require "test_helper"
+require "net/http"
+require "socket"
+
+class HealthServerTest < ActiveSupport::TestCase
+  def setup
+    @host = "127.0.0.1"
+    @port = available_port(@host)
+    @server = SolidQueue::HealthServer.new(host: @host, port: @port, logger: Logger.new(IO::NULL))
+    @server.start
+    wait_for_server
+  end
+
+  def teardown
+    @server.stop if defined?(@server)
+  end
+
+  def test_health_endpoint_returns_ok
+    response = http_get("/health")
+    assert_equal "200", response.code
+    assert_equal "OK", response.body
+  end
+
+  def test_root_endpoint_returns_ok
+    response = http_get("/")
+    assert_equal "200", response.code
+    assert_equal "OK", response.body
+  end
+
+  def test_unknown_path_returns_not_found
+    response = http_get("/unknown")
+    assert_equal "404", response.code
+    assert_equal "Not Found", response.body
+  end
+
+  def test_stop_stops_server
+    assert @server.running?, "server should be running before stop"
+    @server.stop
+    assert_not @server.running?, "server should not be running after stop"
+  ensure
+    # Avoid double-stop in teardown if we stopped here
+    @server = SolidQueue::HealthServer.new(host: @host, port: @port, logger: Logger.new(IO::NULL))
+  end
+
+  def test_engine_skips_starting_health_server_when_puma_plugin_is_active
+    SolidQueue.health_server_enabled = true
+    SolidQueue.puma_plugin = true
+
+    server = SolidQueue.start_health_server
+    assert_nil server
+  ensure
+    SolidQueue.health_server_enabled = false
+    SolidQueue.puma_plugin = false
+  end
+
+  private
+    def http_get(path)
+      Net::HTTP.start(@host, @port) do |http|
+        http.get(path)
+      end
+    end
+
+    def wait_for_server
+      # Try to connect for up to 1 second
+      deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 1.0
+      begin
+        Net::HTTP.start(@host, @port) { |http| http.head("/") }
+      rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH
+        raise if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
+        sleep 0.05
+        retry
+      end
+    end
+
+    def available_port(host)
+      tcp = TCPServer.new(host, 0)
+      port = tcp.addr[1]
+      tcp.close
+      port
+    end
+end
