⚠️ NOTE: this document serves as a starting point for debugging and does not provide an exhaustive/definitive answer
The relay exports metrics and chain-specific errors. This document identifies common metrics/logs and potential reasons for behavior.
failed to enqeue tx for simulation
- indicates slow RPCs that are not responding quickly enough
original signature does not match retry signature
- this could indicate a race condition within the relayer code (please alert developers for investigation)
failed to find transaction within confirm timeout
- indicates network congestion or poor RPC performance (tx dropped)
- There is usually an additional output within the result parameter of the error:
  - InsufficientFundsForRent: sender balance too low
  - AccountNotFound: sender or used account does not exist (if it previously existed, it could have been garbage collected)
  - Additional errors + reasons can be found here: https://github.com/solana-labs/solana/blob/master/sdk/src/transaction/error.rs
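
When triaging many log lines, it can help to pull the error name out of the result value and look up its likely cause. The sketch below assumes the result has the same shape as the `err` field in Solana JSON-RPC responses (either a bare string such as `"AccountNotFound"` or a single-key object such as `{"InstructionError": [...]}`); the relay's actual log format may differ. `KNOWN_REASONS`, `error_variant`, and `explain` are hypothetical names used only for illustration.

```python
from typing import Any, Optional

# Reasons copied from the list above; extend as other variants are diagnosed.
KNOWN_REASONS = {
    "InsufficientFundsForRent": "sender balance too low",
    "AccountNotFound": "sender or used account does not exist",
}


def error_variant(err: Any) -> Optional[str]:
    """Return the error variant name from an RPC-style err value, if any."""
    if isinstance(err, str):
        return err
    if isinstance(err, dict) and len(err) == 1:
        # Variants that carry data are encoded as {"VariantName": payload}.
        return next(iter(err))
    return None


def explain(err: Any) -> str:
    """Return a one-line hint for a logged error result."""
    name = error_variant(err)
    if name is None:
        return "no recognizable error variant"
    reason = KNOWN_REASONS.get(name, "see the transaction error definitions")
    return f"{name}: {reason}"
```

For example, `explain("InsufficientFundsForRent")` yields the hint for a low sender balance, while an unlisted variant falls back to a pointer to the error definitions.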
- indicates slow RPC which does not respond quickly enough to keep up with the incoming stream of transactions
error in ReadAnswer: stale answer data, polling is likely experiencing errors
- indicates RPC issues (most likely down)
error in ReadState: stale state data, polling is likely experiencing errors
- indicates RPC issues (most likely down)
- provides the SOL balance for keys in the keystore
- a low SOL balance will cause the CL node to stop transmitting
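
A balance check like this can be turned into a simple alert. The sketch below assumes balances are available in lamports (the unit returned by the Solana `getBalance` RPC method, where 1 SOL = 1,000,000,000 lamports). The function name and the `min_sol` threshold are hypothetical and should be tuned to the node's actual transmission costs.

```python
from typing import Dict, List

LAMPORTS_PER_SOL = 1_000_000_000


def low_balance_keys(balances_lamports: Dict[str, int], min_sol: float) -> List[str]:
    """Return the keys whose balance is below min_sol, sorted by key."""
    threshold = int(min_sol * LAMPORTS_PER_SOL)
    return sorted(
        key for key, lamports in balances_lamports.items() if lamports < threshold
    )
```

Any key returned here is a candidate for a top-up before the node stops transmitting.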
- tracks last update to cached data (unix timestamp)
- updates should occur at the configured rate (default: 1s); slower updates can indicate RPC latency issues
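
A staleness check on this timestamp could look like the sketch below. It assumes the metric value is a unix timestamp in seconds and uses the 1s default rate mentioned above. The `tolerance` multiplier is a hypothetical knob that allows a few missed polls before flagging the cache as stale.

```python
import time
from typing import Optional


def is_stale(
    last_update_unix: float,
    poll_interval_s: float = 1.0,
    tolerance: float = 3.0,
    now: Optional[float] = None,
) -> bool:
    """Return True when the cache is older than tolerance * poll interval."""
    current = time.time() if now is None else now
    return (current - last_update_unix) > poll_interval_s * tolerance
```

Passing `now` explicitly keeps the check deterministic when replaying recorded metric samples.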
- tracks duration of each RPC request, separated via label + URLs
- spikes in latency can indicate RPC issues
- total of TXs that are confirmed and successfully executed on chain
- this value should consistently increase. If it does not, this could indicate RPC latency or funding issues.
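
One way to check that this counter keeps increasing is to compare consecutive samples over a window. The sketch below is a minimal illustration with hypothetical names. It assumes samples are taken at regular intervals and treats a drop in value as a process restart (counter reset) rather than as negative progress.

```python
from typing import Sequence


def is_stalled(samples: Sequence[float], min_increase: float = 1.0) -> bool:
    """Return True when the counter rose by less than min_increase over the window."""
    if len(samples) < 2:
        return False  # not enough data to judge
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so cur itself is the gain.
        increase += (cur - prev) if cur >= prev else cur
    return increase < min_increase
```

A stalled result is only a prompt to look at RPC latency and key funding, not a diagnosis on its own.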
- current TXs that are inflight (not confirmed success or error)
- this value should stay mostly constant - spikes could indicate lagging performance due to slow RPCs.
- sum of TXs that have errored for any reason
- depending on the network configuration, this value should either be constant or increase
- total of TXs that have been confirmed but errored with a revert
- depending on the network configuration, this value should either be constant or increase
- total of TXs that have been immediately rejected by the RPC
- value should be near zero; TXs should not be immediately rejected by the RPC. This could indicate a faulty RPC or
- total of TXs that have been broadcast to the network but were not confirmed within the configured timeout
- an increasing value can indicate RPC latency issues or network congestion
solana_txm_tx_error_sim_revert
- total of TXs that reverted during simulation
- value should be low and should not increase rapidly; if it does, it may indicate misconfiguration on the CL node or onchain
- total of TXs that failed during simulation with an unrecognized error
- value should be low and should not increase rapidly; if it does, look through the logs for the unrecognized error and diagnose further from there
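
To make "should not increase rapidly" concrete for counters such as solana_txm_tx_error_sim_revert, two samples of the counter can be converted into a per-minute rate and compared against a limit. The sketch below is one possible approach. The function names and the `max_per_minute` limit are hypothetical, and a suitable limit depends on the network and feed configuration.

```python
def rate_per_minute(prev_value: float, cur_value: float, elapsed_s: float) -> float:
    """Return the counter's increase per minute between two samples."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    # A lower current value means the counter was reset by a restart.
    delta = cur_value - prev_value if cur_value >= prev_value else cur_value
    return delta * 60.0 / elapsed_s


def rising_too_fast(
    prev_value: float, cur_value: float, elapsed_s: float, max_per_minute: float
) -> bool:
    """Return True when the error counter grows faster than the allowed rate."""
    return rate_per_minute(prev_value, cur_value, elapsed_s) > max_per_minute
```

For instance, a counter that moves from 100 to 130 over 60 seconds gives a rate of 30 per minute, which would exceed a limit of 10.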