fix: prevent indefinite hang in long-running scans with periodic flush #942
Iceshen87 wants to merge 1 commit into projectdiscovery:main
Conversation
- Add mutex lock to fileWriter.Write for thread safety
- Flush buffer after each write to prevent deadlock
- Sync to disk periodically to prevent data loss
- Fixes issue where tlsx hangs after ~25k targets

Root Cause: The bufio.Writer buffer was not being flushed during long-running scans, causing the buffer to fill up and eventually deadlock when multiple goroutines attempt concurrent writes.

Solution:
- Flush and sync after each write operation
- Add mutex to prevent race conditions
- Ensures data is written to disk incrementally

Fixes: projectdiscovery#819
/claim projectdiscovery#819
Neo - PR Security Review: No security issues found.
Walkthrough

Introduces mutex-based concurrency control and periodic flushing to the fileWriter struct to ensure thread-safe concurrent writes and prevent data loss during buffer operations.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
🧹 Nitpick comments (1)
pkg/output/file_writer.go (1)
39-48: Consider batching Sync() cadence on the hot path.
Sync() per line is very expensive under high-volume streams. Keeping per-write Flush() is fine for buffer pressure, but disk sync is usually better amortized (e.g., every N writes / interval, plus Close()).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/output/file_writer.go` around lines 39 - 48, The per-write w.file.Sync() in file_writer.go is on the hot path and should be batched: add a sync cadence (e.g., a counter on the writer struct or a time-based ticker inside the writer's write/Flush path) so Sync() is only called every N writes or every T seconds, keep the existing w.writer.Flush() on each write to relieve buffer pressure, and ensure the writer.Close() (or CloseFile/Close method) performs a final w.file.Sync() and error handling; update references to w.file.Sync() in the write path (and remove the nolint if you begin checking the sync error) so Syncs are amortized while Flush remains per-write.
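The amortized cadence this nitpick describes could look roughly like the sketch below. This is illustrative only, not the PR's code: the `sinceSync` counter field and the `syncEvery` threshold are assumptions, and only the `Write`/`Close` shape mirrors the reviewed struct.

```go
package main

import (
	"bufio"
	"bytes"
	"os"
	"sync"
)

// fileWriter sketch with an amortized sync cadence: Flush still runs on
// every write to relieve buffer pressure, but file.Sync() only fires
// every syncEvery writes, plus once more in Close.
type fileWriter struct {
	file      *os.File
	writer    *bufio.Writer
	mu        sync.Mutex
	sinceSync int // writes since the last Sync (assumed field)
	syncEvery int // sync cadence, e.g. 100 (assumed field)
}

func (w *fileWriter) Write(data []byte) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if _, err := w.writer.Write(data); err != nil {
		return err
	}
	if _, err := w.writer.WriteRune('\n'); err != nil {
		return err
	}
	// Per-write flush keeps the bufio buffer from filling up.
	if err := w.writer.Flush(); err != nil {
		return err
	}
	// Amortized sync: hit the disk only every syncEvery writes.
	w.sinceSync++
	if w.sinceSync >= w.syncEvery {
		w.sinceSync = 0
		if err := w.file.Sync(); err != nil {
			return err
		}
	}
	return nil
}

func (w *fileWriter) Close() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if err := w.writer.Flush(); err != nil {
		return err
	}
	// Final sync in Close covers writes since the last cadence point.
	if err := w.file.Sync(); err != nil {
		return err
	}
	return w.file.Close()
}

// run writes n records through the sketch and returns the newline count
// read back from disk.
func run(n int) (int, error) {
	f, err := os.CreateTemp("", "sketch-*.jsonl")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	w := &fileWriter{file: f, writer: bufio.NewWriter(f), syncEvery: 100}
	for i := 0; i < n; i++ {
		if err := w.Write([]byte(`{"test":"data"}`)); err != nil {
			return 0, err
		}
	}
	if err := w.Close(); err != nil {
		return 0, err
	}
	content, err := os.ReadFile(f.Name())
	if err != nil {
		return 0, err
	}
	return bytes.Count(content, []byte("\n")), nil
}

func main() {
	if n, err := run(250); err != nil || n != 250 {
		panic("expected 250 lines")
	}
}
```

A time-based cadence (a `time.Since` check against the last sync) would work equally well; a counter avoids adding a timer field for this sketch.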
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/output/file_writer_test.go`:
- Around line 10-15: The temp file created in the tests (tmpfile from
os.CreateTemp) must be closed immediately after capturing tmpfile.Name() to
avoid leaving an open descriptor while newFileOutputWriter() (which opens the
same path with os.Create) opens the file; update the tests in
file_writer_test.go to call tmpfile.Close() right after storing the name
(keeping the defer os.Remove(tmpfile.Name()) for cleanup) so no duplicate open
file handles remain.
- Around line 23-50: The test TestFileWriterConcurrent can deadlock because
goroutines return on write error without signaling completion and only checks
file size rather than content; replace the done channel with a sync.WaitGroup in
TestFileWriterConcurrent, call wg.Add(1) per goroutine and defer wg.Done()
inside each goroutine so completion is always signaled even on error, collect
any write errors into an errors channel (or t-safe aggregator) instead of
returning early, wait on wg.Wait(), then read the tmpfile contents (instead of
just using tmpfile.Stat/Size) and verify the number of newline-terminated
records matches the expected count (10*100) to catch partial-write regressions;
reference writer.Write and tmpfile reading in the test when implementing these
changes.
In `@pkg/output/file_writer.go`:
- Around line 47-49: The Write and Close methods in file_writer.go currently
ignore errors returned by w.file.Sync(); modify both functions (Write and Close)
to capture the error from w.file.Sync() and return it to callers instead of
suppressing it—i.e., remove the nolint and errcheck suppression, call if err :=
w.file.Sync(); err != nil { return err } (or combine with preexisting errors
appropriately) so Sync failures (e.g., disk full/I/O) propagate; ensure both
Write() and Close() return the Sync error when present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c24a479c-b34e-4edd-8925-906564250fb0
📒 Files selected for processing (2)
pkg/output/file_writer.go, pkg/output/file_writer_test.go
```go
tmpfile, err := os.CreateTemp("", "test-*.jsonl")
if err != nil {
	t.Fatal(err)
}
defer os.Remove(tmpfile.Name())
```
Close temp-file descriptors explicitly after CreateTemp.
Both tests keep the tmpfile handle open while newFileOutputWriter() opens the same file with os.Create(), creating multiple simultaneous file handles on the same path. Close tmpfile immediately after capturing the name to avoid descriptor leakage and prevent potential cleanup issues on Windows.
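The recommended setup can be sketched as a small helper: close the handle from `os.CreateTemp` as soon as the name is captured, so the writer under test holds the only open descriptor on that path. The `tempPath` helper name is an illustration, not code from the PR.

```go
package main

import (
	"os"
)

// tempPath creates a temp file, closes its descriptor immediately, and
// returns the path plus a cleanup func — only the path is needed once
// the name is captured.
func tempPath() (string, func(), error) {
	tmpfile, err := os.CreateTemp("", "test-*.jsonl")
	if err != nil {
		return "", nil, err
	}
	name := tmpfile.Name()
	// Close right away so no duplicate handle stays open while the
	// code under test re-opens the same path.
	if err := tmpfile.Close(); err != nil {
		os.Remove(name)
		return "", nil, err
	}
	cleanup := func() { os.Remove(name) }
	return name, cleanup, nil
}

func main() {
	name, cleanup, err := tempPath()
	if err != nil {
		panic(err)
	}
	defer cleanup()
	// The writer under test would now open the path itself, e.g. via
	// os.Create, with no second descriptor held on it.
	f, err := os.Create(name)
	if err != nil {
		panic(err)
	}
	f.Close()
}
```

This matters most on Windows, where deleting a file that still has an open handle can fail.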
```go
done := make(chan bool, 10)
for i := 0; i < 10; i++ {
	go func(id int) {
		for j := 0; j < 100; j++ {
			data := []byte(`{"test": "data"}`)
			if err := writer.Write(data); err != nil {
				t.Errorf("Write failed: %v", err)
				return
			}
		}
		done <- true
	}(i)
}

// Wait for all goroutines to complete
for i := 0; i < 10; i++ {
	<-done
}

// Verify file was written
info, err := tmpfile.Stat()
if err != nil {
	t.Fatal(err)
}

if info.Size() == 0 {
	t.Error("Expected file to have content")
}
```
TestFileWriterConcurrent can deadlock if writes fail; assertions are too weak.
When writer.Write fails on line 29, the goroutine returns without sending the done signal, causing the main test to hang indefinitely waiting for all 10 receives. Additionally, checking Size() == 0 cannot catch partial-write regressions or data loss.
Use sync.WaitGroup to ensure all goroutines signal completion regardless of error paths, and verify the complete written content against expected line count (each write appends a newline per the implementation).
💡 Suggested fix

```diff
 import (
+	"bytes"
 	"os"
+	"sync"
 	"testing"
 )
@@
-	// Test concurrent writes
-	done := make(chan bool, 10)
-	for i := 0; i < 10; i++ {
-		go func(id int) {
-			for j := 0; j < 100; j++ {
-				data := []byte(`{"test": "data"}`)
-				if err := writer.Write(data); err != nil {
-					t.Errorf("Write failed: %v", err)
-					return
-				}
-			}
-			done <- true
-		}(i)
-	}
-
-	// Wait for all goroutines to complete
-	for i := 0; i < 10; i++ {
-		<-done
-	}
-
-	// Verify file was written
-	info, err := tmpfile.Stat()
+	const goroutines = 10
+	const writesPerGoroutine = 100
+	var wg sync.WaitGroup
+	errCh := make(chan error, goroutines*writesPerGoroutine)
+
+	for i := 0; i < goroutines; i++ {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			for j := 0; j < writesPerGoroutine; j++ {
+				if err := writer.Write([]byte(`{"test": "data"}`)); err != nil {
+					errCh <- err
+					return
+				}
+			}
+		}()
+	}
+
+	wg.Wait()
+	close(errCh)
+	for err := range errCh {
+		t.Fatalf("Write failed: %v", err)
+	}
+
+	if err := writer.Close(); err != nil {
+		t.Fatal(err)
+	}
+
+	content, err := os.ReadFile(tmpfile.Name())
 	if err != nil {
 		t.Fatal(err)
 	}
-
-	if info.Size() == 0 {
-		t.Error("Expected file to have content")
+
+	gotLines := bytes.Count(content, []byte("\n"))
+	wantLines := goroutines * writesPerGoroutine
+	if gotLines != wantLines {
+		t.Fatalf("expected %d lines, got %d", wantLines, gotLines)
 	}
 }
```
```go
//nolint:errcheck // we don't care whether sync failed or succeeded.
w.file.Sync()
```
Propagate Sync() failures instead of suppressing them.
Sync errors are currently ignored on lines 47–49 and 62, so callers can receive nil even when data persistence failed (e.g., disk full / I/O error). Update both the Write() and Close() methods to check and return Sync() errors.
Suggested fix

```diff
 func (w *fileWriter) Write(data []byte) error {
 	w.mu.Lock()
 	defer w.mu.Unlock()
 	_, err := w.writer.Write(data)
 	if err != nil {
 		return err
 	}
 	_, err = w.writer.WriteRune('\n')
 	if err != nil {
 		return err
 	}
 	// Flush periodically to prevent buffer deadlock in long-running scans
 	// This fixes issue #819 where tlsx hangs after ~25k targets
 	err = w.writer.Flush()
 	if err != nil {
 		return err
 	}
 	// Sync to disk to prevent data loss on crash
-	//nolint:errcheck // we don't care whether sync failed or succeeded.
-	w.file.Sync()
+	if err := w.file.Sync(); err != nil {
+		return err
+	}
 	return nil
 }

 // Close closes the underlying writer flushing everything to disk
 func (w *fileWriter) Close() error {
 	w.mu.Lock()
 	defer w.mu.Unlock()
 	if err := w.writer.Flush(); err != nil {
 		return err
 	}
-	//nolint:errcheck // we don't care whether sync failed or succeeded.
-	w.file.Sync()
-	return w.file.Close()
+	if err := w.file.Sync(); err != nil {
+		return err
+	}
+	return w.file.Close()
 }
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func (w *fileWriter) Write(data []byte) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	_, err := w.writer.Write(data)
	if err != nil {
		return err
	}
	_, err = w.writer.WriteRune('\n')
	if err != nil {
		return err
	}
	// Flush periodically to prevent buffer deadlock in long-running scans
	// This fixes issue #819 where tlsx hangs after ~25k targets
	err = w.writer.Flush()
	if err != nil {
		return err
	}
	// Sync to disk to prevent data loss on crash
	if err := w.file.Sync(); err != nil {
		return err
	}
	return nil
}

// Close closes the underlying writer flushing everything to disk
func (w *fileWriter) Close() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if err := w.writer.Flush(); err != nil {
		return err
	}
	if err := w.file.Sync(); err != nil {
		return err
	}
	return w.file.Close()
}
```
Summary
This PR fixes issue #819 where tlsx hangs indefinitely after processing ~25k targets during long-running scans.
Root Cause
The bufio.Writer buffer in fileWriter was not being flushed during long-running scans, causing the buffer to fill up and eventually deadlock when multiple goroutines attempt concurrent writes.

The issue is exacerbated by:
- -cipher-concurrency 10

Solution
1. Periodic Flush
Flush the buffer after each write to prevent accumulation:
2. Disk Sync
Sync to disk after each write to prevent data loss:
3. Thread Safety
Add mutex to prevent race conditions in concurrent writes:
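The three changes above can be condensed into a sketch of the patched write path, consistent with the suggested fix shown earlier in the review; field and method names are assumed to match the PR's fileWriter.

```go
package main

import (
	"bufio"
	"os"
	"sync"
)

// Minimal sketch combining the three fixes: a mutex for thread safety,
// a per-write Flush, and a per-write Sync.
type fileWriter struct {
	file   *os.File
	writer *bufio.Writer
	mu     sync.Mutex
}

func (w *fileWriter) Write(data []byte) error {
	w.mu.Lock() // 3. thread safety across goroutines
	defer w.mu.Unlock()
	if _, err := w.writer.Write(data); err != nil {
		return err
	}
	if _, err := w.writer.WriteRune('\n'); err != nil {
		return err
	}
	if err := w.writer.Flush(); err != nil { // 1. periodic flush
		return err
	}
	return w.file.Sync() // 2. disk sync
}

// demo exercises the sketch once against a temp file.
func demo() error {
	f, err := os.CreateTemp("", "demo-*.jsonl")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	w := &fileWriter{file: f, writer: bufio.NewWriter(f)}
	return w.Write([]byte(`{"host":"example.com"}`))
}

func main() {
	if err := demo(); err != nil {
		panic(err)
	}
}
```

With the mutex held for the whole flush-and-sync sequence, each record is durable before the next goroutine's write begins, at the cost of serializing writers.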
Changes

Modified: pkg/output/file_writer.go
- sync.Mutex added to the fileWriter struct
- Write() now flushes and syncs after each write
- Close() now acquires the lock before flushing

Added: pkg/output/file_writer_test.go
- TestFileWriterConcurrent: tests concurrent writes from 10 goroutines
- TestFileWriterFlush: verifies flush behavior

Testing
Before Fix

```shell
tlsx -list 30k-hosts.txt -json -output results.jsonl
# Hangs after ~25k targets, output truncated mid-JSON
```

After Fix

```shell
tlsx -list 30k-hosts.txt -json -output results.jsonl
# Completes successfully, all 30k lines written
```

Concurrent Write Test
Impact
Related Issue
Fixes: #819
/claim #819
Summary by CodeRabbit
Bug Fixes
Tests