debug: capture taiko:intercept logs on Windows to diagnose navigation hang#2775

Closed
zabil wants to merge 3 commits into master from debug/intercept-windows-hang

@zabil zabil commented Apr 25, 2026

Member

Problem

The Windows CI job "FTs - NodeJS 22 & windows-latest" fails consistently at intercept.spec:38:

Failed Step: Navigate to "https://localhost/employees/2/address"
Error Message: Navigation took more than 60000ms. Please increase the navigationTimeout.

Both retry attempts fail, so this is not random flakiness. The root cause is unknown; there are two candidate explanations:

  • Scenario A: Fetch.requestPaused fires but fulfillRequest fails silently (request stays paused → navigation hangs)
  • Scenario B: Fetch.requestPaused never fires at all (Chrome bypasses the intercept for HTTPS on Windows)

Changes

packages/taiko/lib/logger.js

Adds a taiko:intercept debug namespace (uses the existing debug package — zero overhead when DEBUG is unset).
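As a rough illustration of why a namespaced logger costs almost nothing when DEBUG is unset, here is a dependency-free stand-in for the gating behaviour. `createDebug` and its `sink` parameter are hypothetical names for this sketch, not the PR's code; with the real debug package the equivalent is presumably a single call such as `require("debug")("taiko:intercept")`.

```javascript
// Stand-in for the `debug` package's namespace gating (illustrative only;
// the real package also supports wildcards and exclusions).
function createDebug(namespace, env = process.env.DEBUG || "", sink = console.error) {
  const enabled = env
    .split(/[\s,]+/)
    .filter(Boolean)
    .includes(namespace);
  // A disabled namespace gets a no-op, so call sites stay cheap.
  const log = enabled ? (...args) => sink(namespace, ...args) : () => {};
  log.enabled = enabled;
  return log;
}

// Demo: collect output in an array instead of writing to stderr.
const lines = [];
const collect = (...parts) => lines.push(parts.join(" "));
const logIntercept = createDebug("taiko:intercept", "taiko:intercept", collect);
const logSilent = createDebug("taiko:intercept", "", collect);

logIntercept("requestPaused url=https://localhost/employees/2/address");
logSilent("dropped because the namespace is not enabled");
```

Only the first call reaches the sink; the second resolves to the no-op function.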

packages/taiko/lib/handlers/fetchHandler.js

Adds logIntercept calls at every key point in the intercept lifecycle:

  • When an interceptor is registered (addInterceptor)
  • When requestPaused fires (with the request URL)
  • When no matching interceptor is found
  • When a matching interceptor is found (with type)
  • When fulfillRequest/continueRequest is called (with response details)
  • When those CDP calls fail
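The logging points listed above might look roughly like this in a simplified handler. This is a sketch under assumptions, not the contents of fetchHandler.js: `onRequestPaused`, the `interceptors` array, the substring URL matching, and the injected `fetchDomain` and `log` arguments are all hypothetical. Only the CDP method names and their `requestId`, `responseCode`, and `body` parameters come from the Fetch domain.

```javascript
// Hypothetical, simplified intercept handler. `fetchDomain` and `log` are
// injected so the sketch runs without Chrome.
const interceptors = [];

function addInterceptor(urlPattern, action, log) {
  interceptors.push({ urlPattern, action });
  log(`registered interceptor url=${urlPattern}`);
}

async function onRequestPaused(params, fetchDomain, log) {
  const url = params.request.url;
  log(`requestPaused url=${url}`);
  const match = interceptors.find((i) => url.includes(i.urlPattern));
  try {
    if (!match) {
      log(`no matching interceptor for url=${url}`);
      await fetchDomain.continueRequest({ requestId: params.requestId });
      return;
    }
    log(`matched interceptor url=${url} action=${typeof match.action}`);
    log(`calling fulfillRequest (${typeof match.action}) for url=${url}`);
    await fetchDomain.fulfillRequest({
      requestId: params.requestId,
      responseCode: 200,
      // Fetch.fulfillRequest expects a base64-encoded body.
      body: Buffer.from(JSON.stringify(match.action)).toString("base64"),
    });
  } catch (err) {
    // Without this log, a rejected CDP call would leave the request paused silently.
    log(`fulfillRequest/continueRequest failed for url=${url}: ${err.message}`);
  }
}
```

Injecting the logger keeps the handler testable: a test can pass an array-backed `log` and a fake `fetchDomain` that rejects, then assert on the captured lines.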

.github/workflows/taiko.yml

  • Enables DEBUG=taiko:intercept on the Windows functional-test step so the logs appear inline in the CI job output on failure
  • Removes the nick-fields/retry workaround from both unit-tests and functional-tests on Windows — retries were hiding the failure signal

How to read the output

After this PR, the failing CI run will show one of:

Scenario A (requestPaused fires, fulfillRequest fails):

taiko:intercept requestPaused url=https://localhost/employees/2/address
taiko:intercept matched interceptor url=... action=object
taiko:intercept calling fulfillRequest (object) for url=...
taiko:intercept fulfillRequest/continueRequest failed for url=...

Scenario B (requestPaused never fires):

taiko:intercept requestPaused url=https://localhost/employees/1/address   ← appears
taiko:intercept requestPaused url=https://localhost/employees/1/address   ← appears (second nav)
                                                                          ← nothing for /2/address
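If the job log is long, the two scenarios can also be told apart mechanically. The helper below is a hypothetical sketch (the function name and return strings are made up for illustration). It assumes each log line contains the full request URL, whereas the samples above abbreviate it as `url=...`.

```javascript
// Classify a captured CI log for one target URL. Scenario names follow the PR text.
function classifyInterceptLog(logText, targetUrl) {
  const lines = logText
    .split(/\r?\n/)
    .filter((l) => l.includes("taiko:intercept") && l.includes(targetUrl));
  const paused = lines.some((l) => l.includes("requestPaused"));
  const failed = lines.some((l) => l.includes("fulfillRequest/continueRequest failed"));
  if (!paused) return "B: requestPaused never fired";
  if (failed) return "A: fulfillRequest failed";
  // Paused and no logged CDP failure: neither hypothesis fits.
  return "neither: request was paused and no CDP failure was logged";
}
```

The third return value matters: a log in which the request is paused and no CDP call fails matches neither hypothesis and points at something downstream of the interceptor.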

This PR is diagnostic only — no behaviour change.

zabil and others added 3 commits April 25, 2026 16:54
Add DEBUG=taiko:intercept logging throughout fetchHandler.js to capture:
- Every requestPaused event with the request URL
- Whether a matching interceptor was found
- Which CDP call (fulfillRequest/continueRequest) was made, with response details
- When fulfillRequest/continueRequest fails

Enable DEBUG=taiko:intercept on Windows functional-test runs in CI so
the output is captured in the job log on failure.

Remove the nick-fields/retry workarounds from unit-tests and
functional-tests on windows-latest so failures surface immediately
without masking the real error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
zabil commented Apr 28, 2026

Member Author

Fixed via #2776

@zabil zabil closed this Apr 28, 2026
zabil pushed a commit that referenced this pull request Apr 29, 2026
…ect mock (#2776)

Fixes the Windows navigation hang diagnosed in #2775.

## What the debug logs showed

The `taiko:intercept` logging from #2775 captured this pattern for the
second navigation:

```
requestPaused url=https://localhost/employees/2/address    ✓ CDP event fired
matched interceptor ...                                    ✓ interceptor found
calling fulfillRequest (object) responseCode=200           ✓ called, no error
--- silence for 60 seconds ---
Navigation took more than 60000ms                          ✗ timeout
```

`Fetch.fulfillRequest` completes without error, so the hang is not a
network or interceptor issue.

## Root cause

`handleNavigation()` in `pageHandler.js` awaits a `responsePromise` that
resolves only when `Network.responseReceived` fires with a matching
`requestId`. On Windows, after `Fetch.fulfillRequest`, Chrome sometimes:

1. Skips `Network.responseReceived` entirely — `responsePromise` never
resolves
2. Skips `frameStoppedLoading` — a pending `frameEvent` promise in
`waitForNavigation`'s `Promise.all()` never resolves

Either way, navigation times out after 60 seconds.

The first navigation (`/employees/1/address`) avoids the race; the
second hits it because Chrome's internal state is slightly different
after the first completed navigation.
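
The failure mode in this section can be reproduced in miniature without Chrome. The sketch below is an assumption-laden model, not the code in `pageHandler.js`: `waitForResponse` and the plain `EventEmitter` standing in for the CDP `Network` domain are hypothetical. It shows that a promise keyed to one specific event can only time out if that event is never emitted.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical miniature of the wait in handleNavigation: resolve when a
// "responseReceived" event with the expected requestId arrives, else time out.
function waitForResponse(emitter, requestId, timeoutMs) {
  return new Promise((resolve, reject) => {
    const onResponse = (event) => {
      if (event.requestId !== requestId) return;
      cleanup();
      resolve(event);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`Navigation took more than ${timeoutMs}ms.`));
    }, timeoutMs);
    const cleanup = () => {
      clearTimeout(timer);
      emitter.off("responseReceived", onResponse);
    };
    emitter.on("responseReceived", onResponse);
  });
}

// Demo: the first wait sees its event; the second never does and times out.
const network = new EventEmitter();
const first = waitForResponse(network, "req-1", 50);
network.emit("responseReceived", { requestId: "req-1", status: 200 });
const second = waitForResponse(network, "req-2", 50); // event is never emitted
const outcomes = Promise.allSettled([first, second]);
```

Here `outcomes` settles with one fulfilled and one rejected entry, mirroring the first and second navigations.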

## Fix

**`fetchHandler.js`** — after `fulfillRequest` resolves for a `Document`
resource, emit two synthetic events:

- `interceptedNavigationResponse` with the URL and response details —
unblocks `responsePromise` in `handleNavigation` via URL matching
(avoids relying on `p.networkId` which is optional and may be undefined)
- `navigationFulfilledByIntercept` with `p.frameId` — signals
`pageHandler` to resolve any pending frame promises
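
A minimal sketch of the emitting side, assuming an injected event bus and CDP client. `fulfillAndNotify`, `eventBus`, and `fetchDomain` are hypothetical names for illustration; the two event names and the use of `p.frameId` and the request URL come from the description above.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical sketch: `eventBus` stands in for taiko's internal event emitter
// and `fetchDomain` for the CDP Fetch domain; both are injected.
async function fulfillAndNotify(p, response, fetchDomain, eventBus) {
  await fetchDomain.fulfillRequest({ requestId: p.requestId, ...response });
  // Only top-level document loads drive navigation, so other resources are skipped.
  if (p.resourceType !== "Document") return;
  eventBus.emit("interceptedNavigationResponse", {
    url: p.request.url, // matched by URL because p.networkId may be undefined
    status: response.responseCode,
  });
  eventBus.emit("navigationFulfilledByIntercept", { frameId: p.frameId });
}
```

Emitting only after `fulfillRequest` resolves means a rejected CDP call never produces a synthetic "success" signal.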

**`pageHandler.js`** — two additions:
- `handleNavigation` now also listens for
`interceptedNavigationResponse` and resolves `responsePromise` when the
URL matches the navigation target
- Module-level listener for `navigationFulfilledByIntercept` that calls
`resolveFrameEvent` and `resolveFrameNavigationEvent`, allowing
`waitForNavigation` to proceed to the `document.readyState` check
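
The listening side could be sketched as a promise that settles on whichever signal arrives first. This is a hypothetical model rather than the code in `pageHandler.js`: `createResponsePromise`, the `target` shape, and the `source` field are illustrative assumptions. The real `Network.responseReceived` path is matched by `requestId` and the synthetic path by URL, as described above.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical sketch: resolve on the real network event (matched by requestId)
// or on the synthetic intercept event (matched by URL), whichever comes first.
function createResponsePromise(eventBus, target) {
  return new Promise((resolve) => {
    const settle = (source, payload) => {
      // Remove both listeners so a late event cannot fire a stale handler.
      eventBus.off("responseReceived", onReal);
      eventBus.off("interceptedNavigationResponse", onSynthetic);
      resolve({ source, ...payload });
    };
    const onReal = (e) => {
      if (e.requestId === target.requestId) settle("network", e);
    };
    const onSynthetic = (e) => {
      if (e.url === target.url) settle("intercept", e);
    };
    eventBus.on("responseReceived", onReal);
    eventBus.on("interceptedNavigationResponse", onSynthetic);
  });
}
```

Because both listeners are removed on the first match, a late `responseReceived` for the same navigation is ignored rather than resolving twice.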

## Also included

- Windows CI: retry wrappers restored (removed temporarily during
diagnosis to expose the failure signal, now restored for infrastructure
flakiness unrelated to this fix); added `on_retry_command: taskkill /F
/IM chrome.exe /T` for unit tests to kill lingering Chrome processes
between attempts and prevent cascade failures
- Unit test for `handleNavigation` updated to reflect the new
`interceptedNavigationResponse` listener

## Checks

Actions Run: [View GitHub Actions
Run](https://github.com/winst0niuss/taiko/actions/runs/24962353608/)

---------

Signed-off-by: winst0niuss <chumachenko.vadym@gmail.com>