[bug] GetBestBlockHeader panics with nil-pointer dereference on malformed blocks row

**Severity:** HIGH — entire blockchain service crash-loops, taking the whole Teranode stack down via docker-compose `depends_on` cascade
**Affected version:** Confirmed on v0.13.1 (commit `247dfd5fe`). Re-verified on v0.15.0 — the unsafe pattern is unchanged at source line 177 of `stores/blockchain/sql/GetBestBlockHeader.go`:
```go
bits, _ := model.NewNBitFromSlice(nBits)
blockHeader.Bits = *bits
```
The panic does not reproduce on v0.15.0 when the Postgres data is clean, but it would reproduce identically for any operator who hits the same malformed-bytea condition, or any other case where `NewNBitFromSlice` returns a nil pointer together with a non-nil error. The defensive nil-check fix therefore still applies.

**Component:** `stores/blockchain/sql/GetBestBlockHeader.go`

## Summary

`GetBestBlockHeader` queries the `blocks` table for the chain tip and scans the result into Go structs. The function discards the error from `model.NewNBitFromSlice(nBits)` and immediately dereferences the returned pointer (`blockHeader.Bits = *bits`). If `nBits` from the database is malformed (wrong length, ASCII-encoded hex stored as bytes, etc.), `NewNBitFromSlice` returns `(nil, error)` and `*bits` panics with a nil-pointer dereference.
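The failure mode above can be reproduced in isolation. The following is a minimal sketch, not Teranode code: `NBit`, `newNBitFromSlice`, and `unsafeBits` are hypothetical stand-ins that only mimic the contract described in this issue (a nil pointer plus an error for input that is not 4 bytes long). The `recover` exists only so the demo can report the panic instead of crashing.

```go
package main

import "fmt"

// NBit is a hypothetical stand-in for Teranode's model.NBit (4-byte compact target).
type NBit [4]byte

// newNBitFromSlice mimics the contract described in this issue: it returns
// (nil, error) when the input is not exactly 4 bytes long. It is NOT the
// real model.NewNBitFromSlice implementation.
func newNBitFromSlice(b []byte) (*NBit, error) {
	if len(b) != 4 {
		return nil, fmt.Errorf("nBits must be 4 bytes, got %d", len(b))
	}
	var n NBit
	copy(n[:], b)
	return &n, nil
}

// unsafeBits reproduces the reported pattern: the error is discarded and the
// pointer is dereferenced unconditionally. The deferred recover turns the
// runtime panic into a boolean so the demo can keep running.
func unsafeBits(raw []byte) (out NBit, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	bits, _ := newNBitFromSlice(raw)
	out = *bits // nil-pointer dereference when raw is malformed
	return out, false
}

func main() {
	_, p := unsafeBits([]byte{0xff, 0xff, 0x00, 0x1d}) // 4 binary bytes
	fmt.Println("well-formed input panicked:", p)

	_, p = unsafeBits([]byte("1d00ffff")) // 8 ASCII bytes, as after the bad restore
	fmt.Println("malformed input panicked:", p)
}
```

In the real service there is no `recover` around this call, so the goroutine panic terminates the whole process.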

Because the panic happens in `startSubscriptions.func2` early in service startup, the blockchain service crashes immediately and is restarted by Docker. The restart loop ran for 104+ cycles in the affected scenario before the underlying data was repaired.

## Stack trace (from running v0.13.1 binary)

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2006d26]

goroutine 226 [running]:
github.com/bsv-blockchain/teranode/stores/blockchain/sql.(*SQL).GetBestBlockHeader(0xc000511600, {0x5a924b0, 0x82727e0})
    github.com/bsv-blockchain/teranode/stores/blockchain/sql/GetBestBlockHeader.go:176 +0x786
github.com/bsv-blockchain/teranode/services/blockchain.(*Blockchain).startSubscriptions.func2({{0x5aa4e80, 0xc0019840f0}, {0xc00198e000, 0x3}, 0xc001a92000})
    github.com/bsv-blockchain/teranode/services/blockchain/Server.go:638 +0x67
created by github.com/bsv-blockchain/teranode/services/blockchain.(*Blockchain).startSubscriptions in goroutine 95
    github.com/bsv-blockchain/teranode/services/blockchain/Server.go:637 +0x386
```

## Trigger scenario (in the affected node)

An operator performed a `DELETE` + `INSERT` round-trip on a subset of `blocks` rows. The restoration SQL used `quote_literal(encode(bytea_col, 'hex'))` for bytea columns — which produced INSERT statements containing the ASCII representation of the hex string, NOT the binary value:

```sql
-- Wrong (what the operator did):
INSERT INTO blocks VALUES (..., '1b45fad0a6f21e9b2db0528e0e443cafb37dff8c536f1d000000000000000000', ...)
-- Postgres stored the 64 ASCII characters '1','b','4','5',... as the 64-byte content of the bytea column.

-- Right:
INSERT INTO blocks VALUES (..., decode('1b45fad0a6f21e9b2db0528e0e443cafb37dff8c536f1d000000000000000000', 'hex'), ...)
-- Postgres decodes the hex string to its 32-byte binary value before storing.
```

After the bad restore, `octet_length(hash)` returned 64 for restored rows vs the expected 32. `octet_length(n_bits)` returned 8 vs the expected 4. When `GetBestBlockHeader` read these rows and called `NewNBitFromSlice(<8-byte-buffer>)`, the function returned `(nil, error)`. The next line `blockHeader.Bits = *bits` then panicked.
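The doubling of `octet_length` follows directly from storing hex text instead of decoded bytes. A small sketch of that arithmetic, using only Go's standard `encoding/hex`: the helper names `storedAsText` and `storedAsBinary` are hypothetical, and the `1d00ffff` value is illustrative rather than taken from the incident.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// storedAsText models what the bytea column held after the bad restore:
// the characters of the hex string themselves, one byte per character.
func storedAsText(hexLiteral string) []byte {
	return []byte(hexLiteral)
}

// storedAsBinary models decode(hexLiteral, 'hex'): two hex digits become one byte.
func storedAsBinary(hexLiteral string) ([]byte, error) {
	return hex.DecodeString(hexLiteral)
}

func main() {
	// The hash literal is copied from the INSERT example above;
	// the n_bits literal is an illustrative 4-byte value.
	for _, lit := range []string{
		"1b45fad0a6f21e9b2db0528e0e443cafb37dff8c536f1d000000000000000000",
		"1d00ffff",
	} {
		bin, err := storedAsBinary(lit)
		if err != nil {
			fmt.Println("not valid hex:", err)
			continue
		}
		fmt.Printf("as text: %d bytes, as binary: %d bytes\n",
			len(storedAsText(lit)), len(bin))
	}
}
```

The text form is always exactly twice as long as the binary form, which is why a length of exactly double the expected width is a strong hint of this specific mistake.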

This is an operator error (the SQL was wrong), but Teranode crash-looped instead of either returning a clean error to higher layers or refusing to start with an explanatory log message.

## Suggested fix

In `stores/blockchain/sql/GetBestBlockHeader.go`, replace the unsafe pattern:

```go
// Before — current code
bits, _ := model.NewNBitFromSlice(nBits)
blockHeader.Bits = *bits   // ← panics if bits is nil
```

```go
// After — handle the error
bits, err := model.NewNBitFromSlice(nBits)
if err != nil || bits == nil {
    return nil, nil, errors.NewStorageError(
        "GetBestBlockHeader: malformed n_bits (length=%d, expected 4) for block id=%d height=%d: %w",
        len(nBits), blockHeaderMeta.ID, blockHeaderMeta.Height, err)
}
blockHeader.Bits = *bits
```

Apply the same defensive pattern to the subsequent calls in the function, such as `chainhash.NewHash` and `bt.NewTxFromBytes`. Their errors are already checked, but the surrounding flow still assumes valid binary data.

Optionally: add a sanity check at startup that scans `blocks` for any row with non-32-byte hash / non-4-byte n_bits / non-32-byte chain_work, and refuses to start with a clear error message pointing the operator at the corrupted rows.
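One possible shape for the per-row part of such a startup check is sketched below. The `blockRow` type and `validate` method are hypothetical names for illustration and do not exist in Teranode; only the column names and expected widths come from this issue. How the rows are loaded from the database is left out.

```go
package main

import "fmt"

// Expected column widths in bytes, as stated in this issue.
const (
	hashLen      = 32
	nBitsLen     = 4
	chainWorkLen = 32
)

// blockRow is a hypothetical, minimal view of one row of the blocks table.
type blockRow struct {
	ID        uint64
	Hash      []byte
	NBits     []byte
	ChainWork []byte
}

// validate returns a descriptive error for the first column whose width
// does not match the expected size, or nil if all widths are correct.
func (r blockRow) validate() error {
	checks := []struct {
		column    string
		got, want int
	}{
		{"hash", len(r.Hash), hashLen},
		{"n_bits", len(r.NBits), nBitsLen},
		{"chain_work", len(r.ChainWork), chainWorkLen},
	}
	for _, c := range checks {
		if c.got != c.want {
			return fmt.Errorf("blocks row id=%d: column %s is %d bytes, expected %d",
				r.ID, c.column, c.got, c.want)
		}
	}
	return nil
}

func main() {
	good := blockRow{ID: 1, Hash: make([]byte, 32), NBits: make([]byte, 4), ChainWork: make([]byte, 32)}
	bad := blockRow{ID: 2, Hash: make([]byte, 64), NBits: make([]byte, 8), ChainWork: make([]byte, 64)}

	fmt.Println("good row:", good.validate())
	fmt.Println("bad row: ", bad.validate())
}
```

Returning the row id and the offending column in the error gives the operator something to query directly, which is the point of the suggestion.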

## Why severity is HIGH

The panic in the blockchain service triggers a docker-compose `depends_on` cascade of restarts: blockassembly, legacy, and kafka-shared cannot complete startup because their dependency (blockchain) is unavailable. In the observed incident, blockchain restarted 104 times in ~15 minutes, and the entire Teranode stack made no progress until the underlying data was repaired by hand.

A defensive nil check + descriptive error would convert this from "stack crash-loop" to "service exits with operator-actionable log message."

## Workaround
- Detect via `octet_length` queries on bytea columns
- Repair via `DELETE` of affected rows + `COPY FROM` of the same data in proper CSV bytea format (`\xHEX` prefix)
- Restart services in dependency order (blockchain → blockassembly → legacy)
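
The detect-and-repair steps can also be scripted. The sketch below assumes the corruption is exactly the one described in this issue (hex text stored in place of binary, so each value is twice its expected width). `repairHexText` and `copyLiteral` are hypothetical helper names, and `copyLiteral` assumes the CSV form of `COPY`, where the bytea value is written as `\x` followed by hex digits.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// repairHexText recovers the binary value from a column that was stored as
// hex text. It succeeds only if raw is exactly twice the expected width and
// consists entirely of hex digits; anything else is left for manual review.
func repairHexText(raw []byte, want int) ([]byte, bool) {
	if len(raw) != 2*want {
		return nil, false
	}
	out := make([]byte, want)
	if _, err := hex.Decode(out, raw); err != nil {
		return nil, false
	}
	return out, true
}

// copyLiteral formats b in Postgres bytea hex format: `\x` followed by hex digits.
func copyLiteral(b []byte) string {
	return `\x` + hex.EncodeToString(b)
}

func main() {
	corrupted := []byte("1d00ffff") // 8 ASCII bytes where 4 binary bytes were expected

	fixed, ok := repairHexText(corrupted, 4)
	if !ok {
		fmt.Println("value does not match the hex-as-text pattern")
		return
	}
	fmt.Println("repaired length:", len(fixed))
	fmt.Println("COPY value:", copyLiteral(fixed))
}
```

Values that do not match the pattern are deliberately rejected rather than guessed at, since other kinds of corruption need a different repair.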

## Note on operator culpability vs Teranode robustness

The trigger was operator error in the restoration SQL. However:
- Teranode's CLI does not provide a `restore-blocks-from-csv` or similar admin path; operators are forced to write raw SQL
- The panic was indistinguishable from a real bug for the duration of the incident (operator had no signal that the corruption was self-inflicted vs an internal Teranode issue)
- A defensive nil check costs O(1) and turns a multi-cycle crash-loop into an immediate operator-actionable error
