You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Even though commit d915473 introduced a
socket-level TCP keepalive support into the server's implementation,
this was observed multiple times to not be enough to
**deterministically** fix the issues with the `CodeChecker store` client
hanging indefinitely when the server takes a long time processing the
to-be-stored data.
The underlying reasons are not entirely clear and the issue only pops
up sporadically, but we did observe a few similar scenarios (such as
multi-million report storage from analysing LLVM and then storing
between datacentres) where it almost reliably reproduces.
The symptoms (even with a configure `kepalive`) generally include the
server not becoming notified about the client's disconnect, while the
client process is hung on a low-level system call `read(4, ...)`, trying
to get the Thrift response of `massStoreRun()` from the HTTP socket.
Even if the server finishes the storage processing "in time" and sent
the Thrift reply, it never reaches the client, which means it never
exits from the waiting, which means it keeps either the terminal or,
worse, a CI script occupied, blocking execution.
This is the "more proper solution" foreshadowed in
commit 15af7d8.
Implemented the server-side logic to spawn a `MassStoreRun` task and
return its token, giving the `massStoreRunAsynchronous()` API call full
force.
Implemented the client-side logic to use the new `task_client` module
and the same logic as
`CodeChecker cmd serverside-tasks --await --token TOKEN...`
to poll the server for the task's completion and status.
0 commit comments