fix: retry parental control rule writes through firewall settle window#1264
Open
paul43210 wants to merge 1 commit into
Open
fix: retry parental control rule writes through firewall settle window#1264paul43210 wants to merge 1 commit into
paul43210 wants to merge 1 commit into
Conversation
After any PC rule write, the router returns restart_needed_time (~53s on BT8) indicating how long the firewall daemon needs to apply the change. Writes that arrive inside that window are accepted at the NVRAM layer but rejected at the firewall enforcement layer — the underlying asusrouter library's async_set_state(rule) returns False for them. Today the integration's response is to log a WARNING and move on, exposing the rejection to downstream consumers (the device_internet_access service, the ClientInternetSwitch entity, and any automation built on top) as a permanent failure when the write would have succeeded if retried after the settle window. This change adds async_set_pc_rule_with_retry on ARDevice that: - waits the settle window on rejection (default 60s) - refreshes update_pc_rules() so the cache reflects BT8's actual state - returns True if the cache now matches intent (BT8 applied late) - otherwise retries the write once before declaring genuine failure Routes both the device_internet_access service path (via bridge.async_pc_rule with new optional router= kwarg) and ClientInternetSwitch._set_state through the helper. Backward-compat preserved on bridge.async_pc_rule for callers that don't pass a router.
Author
|
Heads-up before review: same disclosure pattern as #1255 — this PR was largely AI-drafted (Claude, Anthropic). The code transform, prose, and reproducer text are AI-generated. The diagnosis (capturing the BT8 Happy to elaborate on the diagnosis, rework the change, or close if the timing / approach isn't a fit. 😅 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
After any parental control rule write, BT8 returns:
{ "modify": "1", "run_service": "restart_firewall", "restart_needed_time": 53 }The
restart_needed_timevalue (typically 53 seconds on BT8) is the window the router needs to settle a firewall restart before it'll accept another PC rule change. Writes inside that window are silently rejected at the firewall enforcement layer: the underlyingasusrouterlibrary'sasync_set_state(rule)returnsFalse, even though the previous write went through correctly.Today the integration's response on rejection is to log a WARNING and move on:
ClientInternetSwitch._set_statehas the same shape (silent failure on the switch entity's ownasync_turn_off).Consumers (the
device_internet_accessservice, the switch entity, downstream automations) see this as a permanent failure when the write would have succeeded if retried after the settle window.In practice this surfaces as:
switch.X_block_internetentities (on after a remove that "didn't take")Fix
Add a helper
async_set_pc_rule_with_retry()onARDevice(router.py) that:bridge.api.async_set_state(rule). On success, returns True.settle_seconds(default 60s — the observed 53s + margin).self.update_pc_rules()(the library's 5s cache is long expired after the sleep).Wired in three places:
bridge.async_pc_rule()— accepts an optionalrouter=kwarg. When provided, routes each rule's write through the helper. Without it, falls back to the current directapi.async_set_statecall (backward-compat for any external caller).router.async_service_device_internet_access()— passesrouter=selfwhen callingbridge.async_pc_rule().switch.ClientInternetSwitch._set_state()— callsself._router.async_set_pc_rule_with_retry(state)instead of going direct to the API.Why a helper on the router
The retry needs access to BOTH
bridge.api(to do the write) AND_pc_rules(to verify post-settle state). The router owns both. Placing the helper on the bridge would require a backref the bridge currently doesn't have, and breaks the bridge's "pure adapter to the library" role.Settle window source
The
restart_needed_timevalue is currently logged at DEBUG by the underlyingasusrouterlibrary (asusrouter.modules.service) but not surfaced to the integration. Reading it through the public library API would require a separate upstream change.This PR uses a hardcoded 60s default (observed 53s on BT8 + ~7s margin). A follow-up could surface
restart_needed_timefrom the library and feed it dynamically — out of scope here to keep the PR small.Empirical evidence
Captured during a live debug session:
Same rule, same payload, same MULTIFILTER format. Only difference: time-since-previous-write. Confirms the rejection is a transient settle-window issue, not a malformed-payload or auth issue.
Testing
Manual reproducer
Pick a throwaway MAC (e.g., an ESP32 you don't mind blocking briefly).
Fire a BLOCK from HA developer tools:
Within 60 seconds, fire a REMOVE for the same MAC.
Before this PR: WARNING "Cannot set parental control rule" in logs; switch entity stays
on; HA state stuck.After this PR: INFO log "PC rule write rejected by router (likely firewall settle window); waiting 60s ...". ~60s later, INFO log "PC rule removal already applied during settle window" OR "PC rule set on retry". Switch entity transitions correctly. HA state correct.
Switch entity reproducer
switch.<mac>_block_internetoff within 60s of the initial block.ClientInternetSwitch._set_state.Regression checks
Notes / open questions
switch.pybecomes a WARNING "State was not set after retry" when the retry also fails. Intentional — silent debug-level failure was part of the original bug.