multiprocess dendrite #160
base: testnet
Conversation
Walkthrough

Adds a Rust PyO3 extension exposing LightningDendrite, a Python wrapper that replaces previous dendrite usage, updates miner_response proof-size computation to be proof-system aware, and adds packaging/build files wiring the new lightning package into the project.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant V as Validator
    participant LPy as LightningDendrite (Python)
    participant LR as lightning (Rust/PyO3)
    participant A as Target Axon (HTTP)
    Note over V: Single-target call
    V->>LPy: call(target_axon, synapse, timeout)
    LPy->>LPy: preprocess synapse (timeout, terminals)
    LPy->>LPy: sign synapse (wallet hotkey)
    LPy->>LR: call(axon_info, headers, body, signature, timeout)
    LR->>A: HTTP POST (JSON, headers)
    A-->>LR: HTTP response (status, body)
    LR-->>LPy: {status_code, status_message, response_data, process_time}
    LPy->>LPy: process result (status, optional deserialize)
    LPy-->>V: synapse (updated/decoded)
```
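The single-target flow above can be sketched in Python. This is a minimal illustration, not the PR's implementation: `call_single`, `rust_call`, and `sign_fn` are hypothetical stand-ins for the wrapper method, the PyO3 extension, and the wallet signer, and the synapse is modeled as a plain dict.

```python
import json
import time


def call_single(rust_call, axon_info, synapse, sign_fn, timeout=12.0):
    """Preprocess, sign, dispatch to the Rust layer, and fold the result back."""
    # 1. preprocess: stamp the timeout onto the synapse payload
    synapse["timeout"] = timeout
    # 2. sign: the wallet hotkey signs a canonical serialization
    signature = sign_fn(json.dumps(synapse, sort_keys=True))
    # 3. dispatch: the Rust layer performs the HTTP POST and returns a dict
    start = time.time()
    result = rust_call(axon_info, headers={}, body=synapse,
                       signature=signature, timeout=timeout)
    # 4. fold status and timing back into the synapse
    synapse["status_code"] = result["status_code"]
    synapse["status_message"] = result.get("status_message", "")
    synapse["process_time"] = result.get("process_time", time.time() - start)
    return synapse
```

The real wrapper works on `bt.Synapse` objects and rebuilds the synapse from `response_data`; the dict here only mirrors the data flow of the diagram.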
```mermaid
sequenceDiagram
    autonumber
    participant V as Validator
    participant LPy as LightningDendrite (Python)
    participant LR as lightning (Rust/PyO3)
    participant A1 as Axon 1
    participant A2 as Axon 2
    participant AN as Axon N
    Note over V: Parallel forward to many axons
    V->>LPy: forward([axons], synapse, timeout)
    LPy->>LPy: clone + sign per-axon synapse
    LPy->>LR: forward(list of requests)
    par concurrent POSTs
        LR->>A1: HTTP POST
        LR->>A2: HTTP POST
        LR->>AN: HTTP POST
    and
        A1-->>LR: response
        A2-->>LR: response
        AN-->>LR: response
    end
    LR-->>LPy: [result dicts]
    LPy->>LPy: per-result processing + optional deserialize
    LPy-->>V: list of synapses (ordered by input)
```
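The fan-out above, and the chunked multiprocess variant discussed in the review, follow a common pattern: split the axon list into chunks, hand each chunk to a worker, and concatenate results in input order. A minimal sketch, with `run_chunk` as a stand-in for the real per-worker coroutine, and a thread pool in place of the PR's `ProcessPoolExecutor` so the sketch stays self-contained:

```python
import concurrent.futures
from itertools import repeat


def chunkify(items, n):
    """Split items into at most n contiguous, nearly equal chunks."""
    k, m = divmod(len(items), n)
    chunks, i = [], 0
    for j in range(n):
        size = k + (1 if j < m else 0)
        if size:
            chunks.append(items[i:i + size])
            i += size
    return chunks


def run_chunk(body, chunk):
    # Stand-in for the worker that POSTs the body to each axon in the chunk.
    return [f"{body}->{axon}" for axon in chunk]


def forward(axons, body, nprocs=4):
    chunks = chunkify(axons, nprocs)
    with concurrent.futures.ThreadPoolExecutor(nprocs) as ex:
        parts = ex.map(run_chunk, repeat(body), chunks)
    out = []
    for part in parts:
        out.extend(part)  # executor.map preserves input order
    return out
```

Because `executor.map` yields results in submission order and chunks are contiguous, the flattened output lines up with the input axon list, matching the "ordered by input" guarantee in the diagram.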
```mermaid
sequenceDiagram
    autonumber
    participant MR as MinerResponse
    participant C as Circuit
    participant PC as proof_content
    Note over MR: Compute proof_size
    MR->>MR: if not proof_content => DEFAULT_PROOF_SIZE
    MR->>C: ps = circuit.proof_system
    alt ps == CIRCOM
        MR->>PC: iterate pi_a, pi_b, pi_c
        MR->>MR: sum lengths of all elements
    else ps == EZKL
        MR->>PC: len(proof_content["proof"])
    else
        MR->>MR: DEFAULT_PROOF_SIZE
    end
```
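The proof-size branch above can be sketched as a standalone function. Field names (`pi_a`, `pi_b`, `pi_c`, `proof`) follow the diagram; the string enum values and the `DEFAULT_PROOF_SIZE` placeholder are assumptions, not verified against the source.

```python
DEFAULT_PROOF_SIZE = 0  # assumed placeholder value


def compute_proof_size(proof_content, proof_system):
    """Proof-system-aware size computation, per the walkthrough diagram."""
    if not proof_content:
        return DEFAULT_PROOF_SIZE
    if proof_system == "CIRCOM":
        # Groth16-style proofs: sum element lengths across pi_a, pi_b, pi_c
        total = 0
        for key in ("pi_a", "pi_b", "pi_c"):
            for element in proof_content.get(key, []):
                if isinstance(element, list):  # pi_b holds pairs; flatten once
                    total += sum(len(e) for e in element)
                else:
                    total += len(element)
        return total
    if proof_system == "EZKL":
        return len(proof_content["proof"])
    return DEFAULT_PROOF_SIZE
```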
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 4
♻️ Duplicate comments (3)
neurons/utils/mp_dendrite.py (3)
15-40: Global monkey-patching of threading/logging at import time is risky; gate or scope it.

Overriding `threading.excepthook` and `logging.handlers.QueueListener._monitor` globally can produce hard-to-debug side effects across the app. Also, `safe_monitor` swallows exceptions silently. Consider gating via an env flag and logging suppressed errors:
```diff
+import os
@@
-threading.excepthook = silent_thread_hook
+if os.getenv("OMRON_MP_DENDRITE_PATCH", "1") == "1":
+    threading.excepthook = silent_thread_hook
@@
 def safe_monitor(self):
     try:
         while True:
             try:
                 record = self.dequeue(True)
             except EOFError:
                 break
             except Exception:
                 continue
             self.handle(record)
-    except Exception:
-        pass
+    except Exception as e:
+        logging.getLogger(__name__).debug("QueueListener monitor exited: %r", e)
@@
-if hasattr(logging.handlers, "QueueListener"):
-    logging.handlers.QueueListener._monitor = safe_monitor
+if os.getenv("OMRON_MP_DENDRITE_PATCH", "1") == "1" and hasattr(logging.handlers, "QueueListener"):
+    logging.handlers.QueueListener._monitor = safe_monitor
```

If you need these patches, consider applying them at process start-up (not import time) to keep scope explicit.
240-269: Ensure a consistent return type on EOF and don’t mask it by returning None.

Returning None from `run_chunk` on EOF leads to mixed types in aggregation. Return an empty list instead.

```diff
     try:
         return asyncio.run(
             worker(
                 ss58_address,
                 nonce,
                 uuid,
                 external_ip,
                 synapse_headers,
                 synapse_body,
                 axon_sig_pairs,
                 timeout,
                 synapse_class,
                 request_name,
             )
         )
     except EOFError:
-        pass
+        return []
```
196-203: Don’t return inside finally: it can swallow interrupts/cancellation.

Returning in a finally block may suppress BaseException (KeyboardInterrupt, CancelledError in some versions), and hides unexpected errors. Refactor to return after the finally block:

```diff
     except Exception as e:
         synapse = process_error_message(synapse, request_name, e)
     finally:  # flake8: noqa
         bt.logging.trace(
             f"dendrite | <-- | {synapse.get_total_size()} B | {synapse.name} | {synapse.axon.hotkey} | {synapse.axon.ip}:{str(synapse.axon.port)} | {synapse.dendrite.status_code} | {synapse.dendrite.status_message}"
         )
-        return synapse
+    return synapse
```
🧹 Nitpick comments (6)
neurons/_validator/config/__init__.py (2)
7-7: Import path stability: ensure the module path is correct and robust.

You're importing from `utils.mp_dendrite`, while the file lives at `neurons/utils/mp_dendrite.py`. If `utils` isn’t a top-level package on PYTHONPATH at runtime, this will fail. Prefer an absolute package import or an explicit relative import within the `neurons` package. Example diff (adjust if your packaging differs):

```diff
-from utils.mp_dendrite import MultiprocessDendrite
+from neurons.utils.mp_dendrite import MultiprocessDendrite
```

Would you like me to scan call sites and setup.cfg/pyproject.toml to verify the package import roots?
42-42: Passing an explicit external_ip is recommended to avoid misrouting (defaults to 127.0.0.1).

`MultiprocessDendrite` defaults `external_ip` to `"127.0.0.1"`. In mp_dendrite.get_endpoint_url, same-host routing uses a special case that currently resolves to `0.0.0.0` (see separate comment), which is unroutable. To avoid surprises, provide a concrete external IP (or loopback) from config if available. For example:

```diff
-        self.dendrite = MultiprocessDendrite(wallet=self.wallet)
+        self.dendrite = MultiprocessDendrite(
+            wallet=self.wallet,
+            external_ip=getattr(self.bt_config, "external_ip", None),
+            nprocs=getattr(self.bt_config, "mp_dendrite_nprocs", 8),
+        )
```

Also verify downstream code assumes an async `.forward(...)` and does not rely on `bt.dendrite` context-manager semantics.

neurons/utils/mp_dendrite.py (4)
165-166: Nit: call the classmethod directly instead of instantiating.

No need to instantiate `synapse_class()` to call `from_headers`.

```diff
-    synapse = synapse_class().from_headers(synapse_headers)
+    synapse = synapse_class.from_headers(synapse_headers)
```
191-191: Nit: process_time should likely be numeric, not a string.

Unless downstream code expects a string, store the float for easier metrics/aggregation.

```diff
-    synapse.dendrite.process_time = str(time.time() - start_time)
+    synapse.dendrite.process_time = time.time() - start_time
```
114-141: Broaden error messaging for JSON/content-type issues.

If the server returns non-JSON, `response.json()` raises `aiohttp.ContentTypeError`. Include a tailored message.

```diff
     elif isinstance(exception, asyncio.TimeoutError):
         message = f"{status_message} after {synapse.timeout} seconds"
+    elif isinstance(exception, aiohttp.ContentTypeError):
+        message = f"{status_message}: non-JSON response from axon at {synapse.axon.ip}:{synapse.axon.port}/{request_name}"
```
350-355: External IP discovery defaults may misroute same-host traffic.

Defaulting `external_ip` to `"127.0.0.1"` is safe for local, but for multi-host networks you likely want the real external IP. Consider deriving it from config or via a helper (e.g., socket or bt utils) and allow overriding via env/config. I can provide a small helper to resolve and cache the external IP if you want this to auto-detect.
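One common shape for such a helper, sketched here as an assumption (the function name and fallback behavior are illustrative, not from the PR): probe the routing table with an unconnected UDP socket and fall back to loopback on failure.

```python
import socket


def resolve_external_ip(fallback="127.0.0.1"):
    """Best-effort local IP discovery; falls back to loopback on error."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            # connect() on a UDP socket only selects a route; no packet is sent
            s.connect(("8.8.8.8", 80))
            return s.getsockname()[0]
    except OSError:
        return fallback
```

Callers could cache the result at startup and allow an env/config override to take precedence.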
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- neurons/_validator/config/__init__.py (2 hunks)
- neurons/utils/mp_dendrite.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
neurons/_validator/config/__init__.py (1)
neurons/utils/mp_dendrite.py (1)
MultiprocessDendrite(349-386)
🪛 GitHub Check: CodeQL
neurons/utils/mp_dendrite.py
[notice] 34-34: Empty except
'except' clause does nothing but pass and there is no explanatory comment.
[warning] 202-202: 'break' or 'return' statement in finally
'return' in a finally block will swallow any exceptions raised.
[notice] 240-251: Explicit returns mixed with implicit (fall through) returns
Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.
🔇 Additional comments (1)
neurons/utils/mp_dendrite.py (1)
60-67: Confirm `version_as_int` usage (callable vs. constant).

If `version_as_int` is a function, it should be invoked; if it’s a constant, the current code is fine. Optional defensive tweak:

```diff
     synapse.dendrite = bt.TerminalInfo(
         ip=external_ip,
-        version=version_as_int,
+        version=version_as_int() if callable(version_as_int) else version_as_int,
         nonce=nonce,
         uuid=uuid,
         hotkey=ss58_address,
     )
```
Actionable comments posted: 2
♻️ Duplicate comments (1)
neurons/utils/mp_dendrite.py (1)
273-276: Bug: signing a str; encode to bytes before calling keypair.sign.

Most crypto sign APIs expect bytes. Passing a str risks type errors or wrong signatures. Apply:

```diff
 def sign(synapse: bt.Synapse, keypair: bt.Keypair):
     message = f"{synapse.dendrite.nonce}.{synapse.dendrite.hotkey}.{synapse.axon.hotkey}.{synapse.dendrite.uuid}.{synapse.body_hash}"
-    signature = f"0x{keypair.sign(message).hex()}"
+    signature = f"0x{keypair.sign(message.encode('utf-8')).hex()}"
     return signature
```
🧹 Nitpick comments (14)
neurons/utils/mp_dendrite.py (14)
188-190: More robust response parsing: fall back to text when the server returns non-JSON.

Non-200 or error paths often send text/HTML. `response.json()` will raise `ContentTypeError` and skip setting status/message. Handle gracefully:

```diff
-    json_response = await response.json()
-    process_server_response(response, json_response, synapse)
+    try:
+        json_response = await response.json()
+    except aiohttp.ContentTypeError:
+        json_response = {"message": await response.text()}
+    process_server_response(response, json_response, synapse)
```
61-67: Guard: version_as_int may be a function; avoid passing a function object into TerminalInfo.

Depending on bittensor version, `version_as_int` can be a callable. Normalize to an int:

```diff
-    synapse.dendrite = bt.TerminalInfo(
-        ip=external_ip,
-        version=version_as_int,
+    version_value = version_as_int() if callable(version_as_int) else version_as_int
+    synapse.dendrite = bt.TerminalInfo(
+        ip=external_ip,
+        version=version_value,
         nonce=nonce,
         uuid=uuid,
         hotkey=ss58_address,
     )
```

If you confirm `version_as_int` is always an int in your pinned bittensor version, you can skip this change.
294-306: Avoid accumulating mutations on the same Synapse while generating per-axon signatures.

Reusing and mutating the same `synapse` instance across axons can cause subtle bleed-over. Copy per axon:

```diff
-    synapse = synapse.model_copy()
-    for axon in axons:
-        synapse = preprocess_synapse_for_request(
-            keypair.ss58_address,
-            nonce,
-            uuid,
-            external_ip,
-            axon,
-            synapse,
-            timeout,
-        )
-        yield sign(synapse, keypair)
+    base_synapse = synapse.model_copy(deep=True)
+    for axon in axons:
+        local_synapse = preprocess_synapse_for_request(
+            keypair.ss58_address,
+            nonce,
+            uuid,
+            external_ip,
+            axon,
+            base_synapse.model_copy(deep=True),
+            timeout,
+        )
+        yield sign(local_synapse, keypair)
```
318-343: Compute wire headers/body from an unmodified base before signing; pass them to workers.

Currently, headers/body are derived after signing, when `synapse` has been mutated. Build them first from a clean copy to avoid leaking per-axon mutations into the shared payload:

```diff
-    synapse = synapse.model_copy()
+    synapse = synapse.model_copy(deep=True)
     nonce = time.time_ns()
     request_name = synapse.__class__.__name__
-    axon_dicts = [ax.to_parameter_dict() for ax in axons]
+    # Build wire payload from a clean base
+    synapse_headers = synapse.to_headers()
+    synapse_body = synapse.model_dump()
+
+    axon_dicts = [ax.to_parameter_dict() for ax in axons]
     signatures = list(
         sign_axons(keypair, nonce, uuid, external_ip, axons, synapse, timeout)
     )
     axon_sig_pairs = list(zip(axon_dicts, signatures))
     chunks = list(chunkify(axon_sig_pairs, nprocs))
     results = []
     with concurrent.futures.ProcessPoolExecutor(nprocs) as executor:
         intermediate_results = executor.map(
             run_chunk,
             repeat(ss58_address),
             repeat(nonce),
             repeat(uuid),
             repeat(external_ip),
-            repeat(synapse.to_headers()),
-            repeat(synapse.model_dump()),
+            repeat(synapse_headers),
+            repeat(synapse_body),
             chunks,
             repeat(timeout),
             repeat(synapse.__class__),
             repeat(request_name),
         )
```
45-49: Nice fix on 0.0.0.0; also include IPv6 loopback (::1).

Your self-routing to 127.0.0.1 resolves the earlier bug. Consider IPv6 loopback too:

```diff
-    is_self = target_axon.ip in {str(external_ip), "127.0.0.1", "0.0.0.0", "localhost"}
+    is_self = target_axon.ip in {str(external_ip), "127.0.0.1", "0.0.0.0", "localhost", "::1"}
```

Optional: add IPv6 URL formatting when needed (e.g., bracketed host), if your stack supports IPv6.
220-229: Naming nit: singularize `sessions` to `session` for clarity.

It's a single ClientSession. Tiny readability win:

```diff
-    async with aiohttp.ClientSession(connector=conn) as sessions:
+    async with aiohttp.ClientSession(connector=conn) as session:
         return await asyncio.gather(
             *(
                 call(
@@
-                    session=sessions,
+                    session=session,
```
41-43: Risk: monkeypatching logging.handlers.QueueListener._monitor affects global behavior.

Overriding a private stdlib method globally can cause unexpected behavior across the process pool and other libs.
- Gate this behavior behind an env/config flag.
- Scope the patch to your process entrypoint rather than a utility module imported widely.
- At minimum, document why this is necessary and what exceptions are intentionally swallowed.
Do you want a patch that makes this opt-in via an env var?
27-39: Silent exception handling in `safe_monitor` may hide actionable errors.

`except Exception: continue` discards errors without logging context. You log only outer exceptions. Consider logging at debug/trace for inner exceptions, or narrowing the exception types you ignore to known benign cases.
165-166: API usage: is from_headers a classmethod?

`synapse_class().from_headers(...)` suggests an instance method returning a new instance. In many APIs it’s a classmethod (`Synapse.from_headers(headers)`). If it is a classmethod, switch to:

```diff
-    synapse = synapse_class().from_headers(synapse_headers)
+    synapse = synapse_class.from_headers(synapse_headers)
```

If not, ignore this.
352-356: Prefer UUIDs over time-based strings for `uuid`.

`str(time.time_ns())` can collide across processes on fast machines. `uuid.uuid4()` is safer:

```diff
+import uuid
@@
-        self.uuid = str(time.time_ns())
+        self.uuid = str(uuid.uuid4())
```
78-112: Minor: avoid direct dict updates on pydantic-like models.

`__dict__.update(...)` bypasses validators. If these are pydantic models, prefer `.model_copy(update=...)` or set attributes explicitly. No change required if these are plain dataclasses or you intentionally skip validation.
144-157: Timeout handling is good; consider `raise_for_status()` if the server uses HTTP codes consistently.

You’re mapping errors via headers and body; if the server respects HTTP semantics, calling `response.raise_for_status()` can shorten error paths. Not required if your protocol expects 200 with error payloads, or if current behavior is intentional.
1-16: Config: good use of an env var for max concurrency; document the expected range (>=1).

After the clamping fix, consider documenting MP_DENDRITE_MAX_CONNECTIONS in the README/env sample so operators set sane values. I can draft a short README section if helpful.
114-142: Unit tests suggestion: cover error mapping branches.

Recommend tests for:
- aiohttp.ClientResponseError path sets int status,
- Timeout path formats message with elapsed seconds,
- ClientConnectorError path includes host:port.
I can scaffold pytest tests with mocked aiohttp responses and exceptions. Want me to open a follow-up PR?
📒 Files selected for processing (1)
- neurons/utils/mp_dendrite.py (1 hunks)
🧰 Additional context used
🪛 GitHub Check: Bandit
neurons/utils/mp_dendrite.py
[warning] 46-46:
Possible binding to all interfaces.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CodeQL Analysis (python)
🔇 Additional comments (4)
neurons/utils/mp_dendrite.py (4)
373-382: Good: offloading mp_forward to a thread keeps the event loop responsive.

Using `await asyncio.to_thread(...)` addresses the blocking concern cleanly.
205-240: Concurrency model looks solid with chunking + bounded connector.

The worker pattern and shared session per chunk is efficient and will reuse connections. After clamping the limit (see prior comment), this is in good shape.
308-349: Windows spawn caveat with ProcessPool in library code.On Windows (spawn), importing this module can re-run top-level code in child processes. Given you globally monkeypatch logging hooks at import time, verify there are no unintended side-effects during pool startup.
If this targets Linux-only infra, ignore. Otherwise, consider guarding global monkeypatches behind `if os.name != "nt"` or an env flag.
381-389: Deserialize guard is pragmatic.

Conditionally calling `deserialize()` preserves backward compatibility with synapse types that don’t implement it.
Actionable comments posted: 2
♻️ Duplicate comments (3)
neurons/utils/mp_dendrite.py (3)
137-142: Use aiohttp.ClientResponseError.status (int), not .code; keep status_code numeric.

Aligns types and avoids attribute mismatch.

```diff
-    if status_code:
-        synapse.dendrite.status_code = status_code
-    elif isinstance(exception, aiohttp.ClientResponseError):
-        synapse.dendrite.status_code = str(exception.code)
+    if status_code:
+        synapse.dendrite.status_code = int(status_code)
+    elif isinstance(exception, aiohttp.ClientResponseError):
+        status = getattr(exception, "status", None)
+        if status is not None:
+            synapse.dendrite.status_code = int(status)
```
234-237: Clamp the TCPConnector limit to >=1 even if the env var is 0.

If MP_DENDRITE_MAX_CONNECTIONS=0, the limit becomes 0 (unbounded), risking FD exhaustion. Clamp to at least 1.

```diff
-    chunk_size = len(axon_sig_pairs)
-    connection_limit = min(MAX_CONCURRENT_CONNECTIONS, max(1, chunk_size))
-    conn = aiohttp.TCPConnector(limit=connection_limit, limit_per_host=10)
+    chunk_size = len(axon_sig_pairs)
+    max_limit = MAX_CONCURRENT_CONNECTIONS if MAX_CONCURRENT_CONNECTIONS > 0 else 64
+    connection_limit = max(1, min(max_limit, chunk_size))
+    conn = aiohttp.TCPConnector(limit=connection_limit, limit_per_host=10)
```
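The clamping rule from the diff above, isolated into a small function for clarity (the `default=64` fallback mirrors the suggestion and is an assumption, not project policy):

```python
def connection_limit(max_connections, chunk_size, default=64):
    """Bounded connector limit: never 0 (aiohttp treats 0 as unbounded)."""
    cap = max_connections if max_connections > 0 else default
    return max(1, min(cap, chunk_size))
```

Note the subtlety the original code missed: `min(MAX, max(1, chunk_size))` still yields 0 when MAX itself is 0, whereas clamping with `max(1, ...)` on the outside guarantees a floor of 1.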
374-377: Sign bytes, not str.

Most crypto APIs expect bytes. Encode the message:

```diff
 def sign(synapse: bt.Synapse, keypair: bt.Keypair):
     message = f"{synapse.dendrite.nonce}.{synapse.dendrite.hotkey}.{synapse.axon.hotkey}.{synapse.dendrite.uuid}.{synapse.body_hash}"
-    signature = f"0x{keypair.sign(message).hex()}"
+    signature = f"0x{keypair.sign(message.encode('utf-8')).hex()}"
     return signature
```
🧹 Nitpick comments (5)
neurons/_validator/models/miner_response.py (1)
98-99: Confirm: correct source of proof_system on Circuit.

This code reads response.circuit.proof_system, while elsewhere you use circuit.metadata.proof_system. Ensure Circuit exposes proof_system directly; otherwise, read from metadata to avoid AttributeError. Proposed defensive read:

```diff
-    ps = response.circuit.proof_system
+    ps = getattr(response.circuit, "proof_system", None) or getattr(response.circuit.metadata, "proof_system", None)
```

neurons/utils/mp_dendrite.py (4)
262-266: Normalize status_code to integers across error paths.

You set "408" and "500" as strings, but other paths use ints. Keep types consistent for downstream consumers and logging.

```diff
-    error_synapse.dendrite = bt.TerminalInfo(
-        status_code="408", status_message="Worker timeout"
-    )
+    error_synapse.dendrite = bt.TerminalInfo(
+        status_code=408, status_message="Worker timeout"
+    )
@@
-    error_synapse.dendrite = bt.TerminalInfo(
-        status_code="500", status_message=f"Process error: {str(e)}"
-    )
+    error_synapse.dendrite = bt.TerminalInfo(
+        status_code=500, status_message=f"Process error: {str(e)}"
+    )
     error_synapse.axon = bt.TerminalInfo(
         ip=axon_dict.get("ip", "unknown"),
         port=axon_dict.get("port", 0),
         hotkey=axon_dict.get("hotkey", "unknown"),
-        status_code="500",
+        status_code=500,
         status_message=f"Process error: {str(e)}",
     )
```

Also applies to: 355-363
320-328: SIGALRM is Unix-only; guard for cross-platform compatibility.

Using SIGALRM breaks on Windows (no SIGALRM), and the ProcessPool may use spawn. Guard with hasattr and skip the alarm if unavailable. Consider relying solely on the asyncio + future.result timeouts already present.

```diff
-    # Set up timeout signal
-    signal.signal(signal.SIGALRM, timeout_handler)
-    signal.alarm(int(chunk_timeout))
+    # Set up timeout signal (Unix-only); guard for cross-platform
+    if hasattr(signal, "SIGALRM"):
+        signal.signal(signal.SIGALRM, timeout_handler)
+        signal.alarm(int(chunk_timeout))
@@
-    signal.alarm(0)
+    if hasattr(signal, "SIGALRM"):
+        signal.alarm(0)
```
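The guarded pattern can be wrapped in a helper, sketched here as a hypothetical `with_alarm` (not a function from the PR): arm the alarm only where the platform provides SIGALRM, and always disarm it in a finally block.

```python
import signal


def with_alarm(seconds, fn, *args):
    """Run fn(*args) under a SIGALRM deadline where supported (Unix only)."""
    armed = hasattr(signal, "SIGALRM")
    if armed:
        def _handler(signum, frame):
            raise TimeoutError("chunk timed out")
        signal.signal(signal.SIGALRM, _handler)
        signal.alarm(int(seconds))
    try:
        return fn(*args)
    finally:
        if armed:
            signal.alarm(0)  # disarm even if fn raised
```

On Windows the helper simply runs `fn` without a deadline, deferring to whatever asyncio/future timeouts already bound the work.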
479-487: Adjust the call() return annotation to allow None.

When results is empty you return None, but the signature advertises bt.Synapse. Make it Optional for correctness.

```diff
-    ) -> bt.Synapse:
+    ) -> bt.Synapse | None:
@@
         return results[0] if results else None
```

Also applies to: 493-494
17-24: Global monkey-patching of threading.excepthook and QueueListener._monitor: confirm scope/impact.

These changes affect process-wide behavior and third-party libs. If intended, document the rationale and provide a way to disable (env flag). Otherwise, scope locally.

Also applies to: 40-42
📒 Files selected for processing (2)
- neurons/_validator/models/miner_response.py (1 hunks)
- neurons/utils/mp_dendrite.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
neurons/_validator/models/miner_response.py (1)
neurons/execution_layer/circuit.py (1)
ProofSystem(36-61)
🪛 GitHub Check: CodeQL
neurons/utils/mp_dendrite.py
[notice] 88-88: Empty except
'except' clause does nothing but pass and there is no explanatory comment.
[notice] 97-97: Empty except
'except' clause does nothing but pass and there is no explanatory comment.
🔇 Additional comments (3)
neurons/utils/mp_dendrite.py (3)
44-47: LGTM: loopback routing fix for self-targets.

Using 127.0.0.1 for same-host calls avoids invalid 0.0.0.0 destinations. Good hardening.
510-519: LGTM: offloading mp_forward to a thread avoids blocking the event loop.

Using asyncio.to_thread here is appropriate and keeps the loop responsive while processes execute.
60-66: Constant `version_as_int` is correct here.

`version_as_int` in `bittensor.core.settings` is defined as a constant integer (computed at import time), not a function. Passing it directly, without parentheses, is the intended usage. No changes are needed. Likely an incorrect or invalid review comment.
```python
elif ps == ProofSystem.EZKL:
    proof_size = len(proof_content["proof"])
else:
```
Fix: EZKL proof_size indexing assumes dict but proof may be a str.
Upstream parsing sets proof_content to a raw string when it's hex-like; for EZKL this is likely and will raise a TypeError. Handle both dict and str.
```diff
-            elif ps == ProofSystem.EZKL:
-                proof_size = len(proof_content["proof"])
+            elif ps == ProofSystem.EZKL:
+                # Handle both dict payloads and raw string proofs
+                if isinstance(proof_content, dict):
+                    proof_size = len(str(proof_content.get("proof", "")))
+                else:
+                    proof_size = len(str(proof_content))
```
+ proof_size = len(str(proof_content))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| elif ps == ProofSystem.EZKL: | |
| proof_size = len(proof_content["proof"]) | |
| else: | |
| elif ps == ProofSystem.EZKL: | |
| # Handle both dict payloads and raw string proofs | |
| if isinstance(proof_content, dict): | |
| proof_size = len(str(proof_content.get("proof", ""))) | |
| else: | |
| proof_size = len(str(proof_content)) | |
| else: |
🤖 Prompt for AI Agents
In neurons/_validator/models/miner_response.py around lines 106-108, the EZKL
branch assumes proof_content is a dict and does proof_size =
len(proof_content["proof"]), but upstream may supply proof_content as a raw
hex-like string causing a TypeError. Fix by branching: if proof_content is a
dict (and has "proof") compute len(proof_content["proof"]); elif it's a str or
bytes, detect hex-like strings (optional 0x prefix) and compute the byte length
via bytes.fromhex(...) when hex-like, otherwise use len(proof_content) (or
len(bytes(proof_content, "utf-8")) for strings) — ensure you guard bytes.fromhex
with try/except and fall back to plain length on failure.
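The strategy described in the prompt above can be sketched as a standalone helper (the name `ezkl_proof_size` is illustrative, not from the codebase): prefer the `"proof"` field of a dict, decode hex-like strings to their byte length, and fall back to plain string length when hex decoding fails.

```python
def ezkl_proof_size(proof_content):
    """Proof size for EZKL payloads that may be a dict, hex string, or text."""
    if isinstance(proof_content, dict):
        return len(str(proof_content.get("proof", "")))
    s = str(proof_content)
    hex_body = s[2:] if s.startswith("0x") else s
    try:
        # Hex-like input: two characters per byte
        return len(bytes.fromhex(hex_body))
    except ValueError:
        # Not hex after all; fall back to character length
        return len(s)
```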
```toml
[[package]]
name = "pyo3"
version = "0.22.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f402062616ab18202ae8319da13fa4279883a2b8a9d9f83f20dbade813ce1884"
dependencies = [
 "cfg-if",
 "indoc",
 "libc",
 "memoffset",
 "once_cell",
 "portable-atomic",
 "pyo3-build-config",
 "pyo3-ffi",
 "pyo3-macros",
 "unindent",
]
```
Check notice: Code scanning / Trivy

PyO3: Risk of buffer overflow in `PyString::from_object` (Low)
- Installed Version: 0.22.6
- Vulnerability: GHSA-pph8-gcv7-4qj5
- Severity: LOW
- Fixed Version: 0.24.1
```python
for key in synapse.model_dump().keys():
    try:
        setattr(synapse, key, getattr(server_synapse, key))
    except Exception:
```
Check notice: Code scanning / CodeQL (Empty except)

Copilot Autofix (AI, 3 months ago)
To fix the problem, we should ensure that exceptions in the except Exception: block at line 184 are handled appropriately. The best way to do this without changing existing functionality is to log the exception, so that any errors encountered during attribute copying are recorded. This can be done by importing Python's built-in logging module and using logging.warning() or logging.error() to log the exception details. The log message should include the attribute key and the exception message for clarity. The same approach should be applied to the outer except Exception: block at line 187, which currently also swallows exceptions silently.
Required changes:

- Import the `logging` module at the top of the file.
- Replace `pass` in both `except Exception:` blocks (lines 185 and 188) with appropriate logging statements.
```diff
@@ -3,8 +3,8 @@
 import bittensor as bt
 from bittensor.core.settings import version_as_int
 from utils.lightning import lightning
+import logging


 class LightningDendrite:
     """
     Drop-in replacement for bittensor dendrite using Rust for true parallelism.
@@ -181,11 +180,11 @@
         for key in synapse.model_dump().keys():
             try:
                 setattr(synapse, key, getattr(server_synapse, key))
-            except Exception:
-                pass
+            except Exception as e:
+                logging.warning(f"Failed to set attribute '{key}' on synapse: {e}")

-        except Exception:
-            pass
+        except Exception as e:
+            logging.error(f"Failed to process rust response: {e}")

     def __str__(self) -> str:
         return f"lightning_dendrite({self.wallet.hotkey.ss58_address})"
```
```python
            except Exception:
                pass

        except Exception:
```
Check notice: Code scanning / CodeQL (Empty except)

Copilot Autofix (AI, 3 months ago)
To fix the problem, we should ensure that exceptions are not silently ignored. The best way to do this is to log the exception when it occurs, so that any issues can be traced and debugged. This can be done by importing the logging module and using logging.exception() to log the stack trace and error message. The change should be made in the except Exception: block on line 187, replacing pass with a logging statement. If the project already uses a logger, use it; otherwise, use the standard Python logging module. Additionally, if the logging module is not already imported in this file, add the import at the top.
```diff
@@ -3,6 +3,7 @@
 import bittensor as bt
 from bittensor.core.settings import version_as_int
 from utils.lightning import lightning
+import logging


 class LightningDendrite:
@@ -185,7 +186,7 @@
                 pass

         except Exception:
-            pass
+            logging.exception("Exception occurred while processing Rust response in LightningDendrite._process_rust_response")

     def __str__(self) -> str:
         return f"lightning_dendrite({self.wallet.hotkey.ss58_address})"
```
Actionable comments posted: 10
♻️ Duplicate comments (1)
neurons/utils/lightning_dendrite.py (1)
182-188: Replace empty excepts with logging and context.

Swallowing exceptions loses crucial debugging information and was flagged by CodeQL. Log with context at least:

```diff
     try:
         setattr(synapse, key, getattr(server_synapse, key))
-    except Exception:
-        pass
+    except Exception as e:
+        logging.getLogger(__name__).debug(
+            "Failed to copy field '%s' from server synapse: %s", key, e
+        )
@@
-    except Exception:
-        pass
+    except Exception as e:
+        logging.getLogger(__name__).warning(
+            "Failed to parse/construct server synapse from response_data: %s", e
+        )
```
🧹 Nitpick comments (12)
neurons/utils/lightning/Cargo.toml (3)
12-12: Trim Tokio features to reduce compile time and binary size. You're only using the runtime and `tokio::time::timeout`; `features = ["full"]` pulls in a lot you don't need. Apply this diff:

```diff
-tokio = { version = "1.0", features = ["full"] }
+tokio = { version = "1", features = ["rt-multi-thread", "time"] }
```
13-13: Consider preferring rustls for TLS to avoid OpenSSL headaches. If/when you talk to HTTPS endpoints, `reqwest` defaults to native-tls (OpenSSL on many platforms). Using `rustls` tends to be simpler and more portable. For example:

```diff
-reqwest = { version = "0.11", features = ["json", "stream"] }
+reqwest = { version = "0.11", default-features = false, features = ["json", "stream", "rustls-tls"] }
```

If you truly only ever use plain HTTP, feel free to ignore.
19-19: Dead dependency: `tracing` is not used. I don't see any `tracing` spans/logs in `src/lib.rs`. Recommend removing until you actually emit traces. Apply this diff:

```diff
-tracing = "0.1"
```

neurons/utils/lightning/src/lib.rs (3)
197-223: Use `Instant` for elapsed time to avoid system clock jumps. `SystemTime` can go backwards/forwards; `Instant` is monotonic and intended for measuring durations. Apply this diff:

```diff
-    let start_time = SystemTime::now();
+    let start_time = std::time::Instant::now();
@@
-    let process_time = start_time.elapsed().unwrap_or_default().as_secs_f64().to_string();
+    let process_time = start_time.elapsed().as_secs_f64().to_string();
```
317-318: Header lookup is case-sensitive; normalize to avoid subtle bugs. HTTP header names are case-insensitive. If Python-land ever changes case, this `get("synapse-name")` will miss. Normalize keys to lowercase on insert or search case-insensitively. Apply this diff to normalize on insertion:

```diff
-    for (key, value) in headers.iter() {
-        let key_str: String = key.extract()?;
+    for (key, value) in headers.iter() {
+        let mut key_str: String = key.extract()?;
+        key_str.make_ascii_lowercase();
         let value_str: String = value.extract()?;
         header_map.insert(key_str, value_str);
     }
```
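The fold-to-lowercase-on-insert invariant above can be sketched in Python (function name hypothetical, not from the PR):

```python
def normalize_headers(headers: dict) -> dict:
    # HTTP header names are case-insensitive (RFC 9110), so fold them to
    # lowercase once on insertion and always look them up in lowercase.
    return {name.lower(): value for name, value in headers.items()}


raw = {"Synapse-Name": "proof_of_weights", "Content-Type": "application/json"}
headers = normalize_headers(raw)
assert headers.get("synapse-name") == "proof_of_weights"
assert headers.get("Synapse-Name") is None  # only the folded key exists
```

The same idea applies on the Rust side: whichever casing the producer uses, the consumer sees one canonical key.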
31-38: Unused field: `uuid` is never read. `uuid` is set in `new` but isn't used elsewhere. Remove it or plumb it into headers/logs if it's intended for request correlation. Apply this diff to remove until needed:

```diff
 pub struct LightningDendrite {
     client: reqwest::Client,
     runtime: Arc<tokio::runtime::Runtime>,
     wallet_hotkey: String,
     external_ip: String,
-    uuid: String,
 }
@@
-    let uuid = uuid::Uuid::new_v4().to_string();
     let external_ip = external_ip.unwrap_or_else(|| "127.0.0.1".to_string());
@@
     Self {
         client,
         runtime,
         wallet_hotkey,
         external_ip,
-        uuid,
     }
```

neurons/utils/lightning/__init__.py (1)
3-5: Add a clearer failure mode if the extension isn't built/installed. Right now, import errors will be cryptic. Provide a helpful message with a build/install hint. Apply this diff:

```diff
-import lightning
+try:
+    import lightning
+except ImportError as e:
+    raise ImportError(
+        "Failed to import the Rust extension module 'lightning'. "
+        "Ensure the 'lightning' wheel is built and installed (e.g., `uv pip install -e neurons/utils/lightning` "
+        "or let uv resolve it via pyproject tool.uv.sources)."
+    ) from e
```

neurons/utils/lightning/pyproject.toml (2)
15-17: Explicitly set the module name for the wheel if you change the Rust lib name. If you adopt a new module name, set `module-name` to keep wheel contents aligned. For example:

```toml
[tool.maturin]
features = ["pyo3/extension-module"]
module-name = "omron_lightning_dendrite"
```
18-21: Dev-dependency duplication. You already require maturin in `[build-system]`. Keeping another pin in a dev group is fine, but consider aligning versions to avoid confusion. No code changes required; just a heads-up to avoid drift.
neurons/utils/lightning_dendrite.py (3)
17-17: Prefer UUID4 over time_ns for `dendrite.uuid`. `time_ns()` is monotonic per-process but not universally unique. Use a random UUIDv4.

```diff
-        self.uuid = str(time.time_ns())
+        self.uuid = str(uuid.uuid4())
```
60-63: Don't overwrite process_time after processing the Rust response. You set `process_time` from Rust in `_process_rust_response`, then immediately overwrite it with a locally computed total here. This loses server timing. Keep the Rust `process_time`, or store your total under a different field if available.

```diff
         self._process_rust_response(result, synapse)
-        synapse.dendrite.process_time = str(time.time() - start_time)
         return synapse.deserialize() if deserialize else synapse
```
136-149: Optional: include version on axon TerminalInfo if available. If `bt.TerminalInfo` supports `version`, consider setting it on `synapse.axon` as well to mirror dendrite metadata. If not available, ignore. If `AxonInfo` includes the version, you can propagate it like:

```python
synapse.axon = bt.TerminalInfo(
    ip=target_axon_info.ip,
    port=target_axon_info.port,
    hotkey=target_axon_info.hotkey,
    # version=getattr(target_axon_info, "version", None),
)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (3)
- `neurons/utils/lightning/Cargo.lock` is excluded by `!**/*.lock`
- `neurons/utils/lightning/uv.lock` is excluded by `!**/*.lock`
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (7)
- `neurons/_validator/config/__init__.py` (2 hunks)
- `neurons/utils/lightning/Cargo.toml` (1 hunks)
- `neurons/utils/lightning/__init__.py` (1 hunks)
- `neurons/utils/lightning/pyproject.toml` (1 hunks)
- `neurons/utils/lightning/src/lib.rs` (1 hunks)
- `neurons/utils/lightning_dendrite.py` (1 hunks)
- `pyproject.toml` (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- `neurons/_validator/config/__init__.py`
🧰 Additional context used
🧬 Code Graph Analysis (3)
neurons/utils/lightning_dendrite.py (1)
neurons/utils/lightning/src/lib.rs (3)
- `lightning` (338-341)
- `call` (69-115)
- `forward` (117-181)
neurons/utils/lightning/__init__.py (1)
neurons/utils/lightning/src/lib.rs (1)
- `lightning` (338-341)
neurons/utils/lightning/src/lib.rs (1)
neurons/utils/lightning_dendrite.py (3)
- `LightningDendrite` (8-194)
- `call` (23-63)
- `forward` (65-126)
🪛 GitHub Check: CodeQL
neurons/utils/lightning_dendrite.py
[notice] 184-184: Empty except
'except' clause does nothing but pass and there is no explanatory comment.
[notice] 187-187: Empty except
'except' clause does nothing but pass and there is no explanatory comment.
🔇 Additional comments (1)
neurons/utils/lightning/src/lib.rs (1)
327-335: IPv6 and loopback heuristics might need expansion. If axons can be IPv6, this URL builder will produce invalid URLs (missing brackets). Also, matching on `external_ip` equality is brittle. Consider:

- Detect IPv6 and format `http://[addr]:port/…`
- Treat private/loopback ranges explicitly

I can propose an IPv6-aware `build_url` if IPv6 is in scope. Is IPv6 expected for axons in your deployment?
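The bracket-handling logic is small either way; here is a minimal sketch in Python (the Rust version would mirror it), with all names hypothetical:

```python
import ipaddress


def build_url(ip: str, port: int, endpoint: str) -> str:
    # RFC 3986 requires IPv6 literals in brackets inside a URL authority;
    # IPv4 literals and hostnames pass through unchanged.
    try:
        host = f"[{ip}]" if ipaddress.ip_address(ip).version == 6 else ip
    except ValueError:  # not an IP literal, treat as a hostname
        host = ip
    return f"http://{host}:{port}/{endpoint}"


assert build_url("2001:db8::1", 8091, "QueryZkProof") == "http://[2001:db8::1]:8091/QueryZkProof"
assert build_url("127.0.0.1", 8091, "QueryZkProof") == "http://127.0.0.1:8091/QueryZkProof"
```

The same `ipaddress` parse can also classify loopback/private ranges explicitly (`.is_loopback`, `.is_private`) instead of comparing against `external_ip`.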
```python
from typing import Union, List
import bittensor as bt
from bittensor.core.settings import version_as_int
from utils.lightning import lightning
```
Fix incorrect import path to the Rust extension module
from utils.lightning import lightning will fail outside a src-root that happens to expose utils at top-level. The extension is under neurons.utils.lightning, and you also need asyncio and logging imports for the async/threading and error handling below.
Apply this diff:
+import asyncio
+import logging
+import uuid
-from utils.lightning import lightning
+from neurons.utils.lightning import lightning

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
import asyncio
import logging
import uuid
from neurons.utils.lightning import lightning
```
🤖 Prompt for AI Agents
In neurons/utils/lightning_dendrite.py around line 5, the import path and
missing imports are incorrect: replace the top-level import from utils.lightning
with the correct package path neurons.utils.lightning and add imports for
asyncio and logging; specifically, change the import to import the lightning
extension from neurons.utils.lightning and add "import asyncio" and "import
logging" at the top so the async/threading logic and error logging work
correctly.
```python
        result = self.rust_dendrite.call(
            target_axon=axon_dict,
            synapse_headers=synapse_headers,
            synapse_body=synapse_body,
            signature=signature,
            timeout=timeout,
        )
```
🛠️ Refactor suggestion
Avoid blocking the event loop: run the Rust call in a thread
self.rust_dendrite.call(...) is synchronous and uses block_on internally. Calling it directly inside async def call() blocks the event loop, defeating concurrency. Use asyncio.to_thread.
- result = self.rust_dendrite.call(
- target_axon=axon_dict,
- synapse_headers=synapse_headers,
- synapse_body=synapse_body,
- signature=signature,
- timeout=timeout,
- )
+ result = await asyncio.to_thread(
+ self.rust_dendrite.call,
+ target_axon=axon_dict,
+ synapse_headers=synapse_headers,
+ synapse_body=synapse_body,
+ signature=signature,
+ timeout=timeout,
+    )

📝 Committable suggestion
```python
        result = await asyncio.to_thread(
            self.rust_dendrite.call,
            target_axon=axon_dict,
            synapse_headers=synapse_headers,
            synapse_body=synapse_body,
            signature=signature,
            timeout=timeout,
        )
```
🤖 Prompt for AI Agents
In neurons/utils/lightning_dendrite.py around lines 52 to 58, the synchronous
call self.rust_dendrite.call(...) is being invoked inside an async function
which blocks the event loop; change it to run in a thread and await it using
asyncio.to_thread (e.g. await asyncio.to_thread(self.rust_dendrite.call,
target_axon=axon_dict, synapse_headers=synapse_headers,
synapse_body=synapse_body, signature=signature, timeout=timeout)), propagating
return value and exceptions unchanged so the async function stays non-blocking.
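The effect of `asyncio.to_thread` on loop responsiveness can be demonstrated standalone; the names below are stand-ins, not code from this PR:

```python
import asyncio
import time


def blocking_ffi_call() -> str:
    # Stand-in for the synchronous Rust extension call: holds a thread
    # for 200 ms without yielding to the event loop.
    time.sleep(0.2)
    return "ok"


async def main():
    ticks = 0

    async def heartbeat() -> None:
        # A background task that can only make progress if the loop is free.
        nonlocal ticks
        for _ in range(5):
            await asyncio.sleep(0.02)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    # to_thread moves the blocking call onto a worker thread, so the
    # heartbeat keeps ticking while the FFI call sleeps.
    result = await asyncio.to_thread(blocking_ffi_call)
    ticks_during_call = ticks
    await hb
    return result, ticks_during_call


result, ticks_during_call = asyncio.run(main())
assert result == "ok"
assert ticks_during_call > 0  # the loop stayed responsive
```

Calling `blocking_ffi_call()` directly inside `main` instead would leave `ticks_during_call` at 0, which is exactly the starvation described above.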
```python
        processed_synapses = []
        signatures = []
        axon_dicts = []

        for axon_info in axon_infos:
            synapse_copy = synapse.model_copy()
            synapse_copy = self._preprocess_synapse_for_request(
                axon_info, synapse_copy, timeout
            )
            signature = self._sign_synapse(synapse_copy)

            processed_synapses.append(synapse_copy)
            signatures.append(signature)
            axon_dicts.append(
                {
                    "ip": axon_info.ip,
                    "port": axon_info.port,
                    "hotkey": axon_info.hotkey,
                }
            )

        synapse_headers = processed_synapses[0].to_headers()
        synapse_body = processed_synapses[0].model_dump()

        results = self.rust_dendrite.forward(
            axons=axon_dicts,
            synapse_headers=synapse_headers,
            synapse_body=synapse_body,
            signatures=signatures,
            timeout=timeout,
        )

        final_results = []
        for i, (result, synapse_copy) in enumerate(zip(results, processed_synapses)):
            self._process_rust_response(result, synapse_copy)

            if deserialize:
                final_results.append(synapse_copy.deserialize())
            else:
                final_results.append(synapse_copy)

        return final_results
```
🛠️ Refactor suggestion
Critical: per-axon signature/header mismatch in forward()
You sign each synapse with the axon-specific hotkey (good), but then send a single synapse_headers/synapse_body (from the first synapse) to all axons. The Rust forward implementation reuses that single synapse for each axon. If the signature message includes axon.hotkey (it does), signature verification will fail for all but the first axon.
Two viable paths:
- Option A (preferred): change the Rust `forward` API to accept per-axon synapse headers/bodies (lists aligned with axons/signatures) and construct requests accordingly.
- Option B (Python-only fallback): issue per-axon calls concurrently from Python, each with its own headers/body/signature, using `asyncio.to_thread`. This preserves correctness now without touching Rust, at the cost of N separate FFI calls.
Below is Option B for immediate correctness.
processed_synapses = []
signatures = []
axon_dicts = []
+ headers_list = []
+ bodies_list = []
for axon_info in axon_infos:
synapse_copy = synapse.model_copy()
synapse_copy = self._preprocess_synapse_for_request(
axon_info, synapse_copy, timeout
)
signature = self._sign_synapse(synapse_copy)
processed_synapses.append(synapse_copy)
signatures.append(signature)
axon_dicts.append(
{
"ip": axon_info.ip,
"port": axon_info.port,
"hotkey": axon_info.hotkey,
}
)
+ headers_list.append(synapse_copy.to_headers())
+ bodies_list.append(synapse_copy.model_dump())
- synapse_headers = processed_synapses[0].to_headers()
- synapse_body = processed_synapses[0].model_dump()
-
- results = self.rust_dendrite.forward(
- axons=axon_dicts,
- synapse_headers=synapse_headers,
- synapse_body=synapse_body,
- signatures=signatures,
- timeout=timeout,
- )
+ tasks = [
+ asyncio.to_thread(
+ self.rust_dendrite.call,
+ target_axon=axon_dicts[i],
+ synapse_headers=headers_list[i],
+ synapse_body=bodies_list[i],
+ signature=signatures[i],
+ timeout=timeout,
+ )
+ for i in range(len(axon_dicts))
+ ]
+ results = await asyncio.gather(*tasks, return_exceptions=True)
final_results = []
- for i, (result, synapse_copy) in enumerate(zip(results, processed_synapses)):
- self._process_rust_response(result, synapse_copy)
+ for i, synapse_copy in enumerate(processed_synapses):
+ result = results[i]
+ if isinstance(result, Exception):
+ result = {
+ "status_code": "500",
+ "status_message": f"Client error: {result}",
+ "response_data": None,
+ "process_time": "0.0",
+ }
+ self._process_rust_response(result, synapse_copy)
if deserialize:
final_results.append(synapse_copy.deserialize())
else:
final_results.append(synapse_copy)
    return final_results

Follow-up (Option A): I can propose coordinated changes to the Rust `forward` signature to accept `synapse_headers_list` and `synapse_body_list` and loop over them; want me to draft that?
Committable suggestion skipped: line range outside the PR's diff.
```python
    def _sign_synapse(self, synapse: bt.Synapse) -> str:
        """Sign the synapse request."""
        # flake8: noqa
        message = f"{synapse.dendrite.nonce}.{synapse.dendrite.hotkey}.{synapse.axon.hotkey}.{synapse.dendrite.uuid}.{synapse.body_hash}"
        signature = f"0x{self.wallet.hotkey.sign(message).hex()}"
        synapse.dendrite.signature = signature
        return signature
```
Sign with bytes, not str
wallet.hotkey.sign() typically expects bytes. Passing a Python str risks a runtime error or unintended encoding. Encode explicitly before signing.
- message = f"{synapse.dendrite.nonce}.{synapse.dendrite.hotkey}.{synapse.axon.hotkey}.{synapse.dendrite.uuid}.{synapse.body_hash}"
- signature = f"0x{self.wallet.hotkey.sign(message).hex()}"
+ message = f"{synapse.dendrite.nonce}.{synapse.dendrite.hotkey}.{synapse.axon.hotkey}.{synapse.dendrite.uuid}.{synapse.body_hash}"
+ message_bytes = message.encode("utf-8")
+ signature = f"0x{self.wallet.hotkey.sign(message_bytes).hex()}"
synapse.dendrite.signature = signature
     return signature

📝 Committable suggestion
```python
    def _sign_synapse(self, synapse: bt.Synapse) -> str:
        """Sign the synapse request."""
        # flake8: noqa
        message = f"{synapse.dendrite.nonce}.{synapse.dendrite.hotkey}.{synapse.axon.hotkey}.{synapse.dendrite.uuid}.{synapse.body_hash}"
        message_bytes = message.encode("utf-8")
        signature = f"0x{self.wallet.hotkey.sign(message_bytes).hex()}"
        synapse.dendrite.signature = signature
        return signature
```
🤖 Prompt for AI Agents
In neurons/utils/lightning_dendrite.py around lines 152 to 158, the code builds
message as a Python str and calls self.wallet.hotkey.sign(message) but sign()
expects bytes; change the call to sign an explicit bytes object (e.g.,
message.encode('utf-8')) so the signer receives bytes, keep the signature
formatting the same, assign synapse.dendrite.signature from the hex-prefixed
result, and return the signature.
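Why the explicit encode matters can be shown with any byte-oriented signer; HMAC-SHA256 below is only a stand-in for `wallet.hotkey.sign`, and the helper name is hypothetical:

```python
import hashlib
import hmac


def sign_synapse_message(key: bytes, nonce: str, hotkey: str,
                         axon_hotkey: str, uuid_: str, body_hash: str) -> str:
    # Build the dot-joined message, then encode it explicitly: byte-oriented
    # signers (HMAC here, substrate keypairs in production) reject str input.
    message = f"{nonce}.{hotkey}.{axon_hotkey}.{uuid_}.{body_hash}"
    digest = hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()
    return f"0x{digest.hex()}"


sig = sign_synapse_message(b"secret", "1", "hk", "axon-hk", "uid", "hash")
assert sig.startswith("0x") and len(sig) == 66  # "0x" + 32-byte digest in hex

raised = False
try:
    hmac.new(b"secret", "not bytes", hashlib.sha256)  # str payload
except TypeError:
    raised = True
assert raised  # str is rejected, hence the explicit encode above
```

Keeping the encoding explicit (`utf-8`) also pins down the exact bytes the verifier must reconstruct.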
```python
        synapse.dendrite.status_code = rust_response.get("status_code", "500")
        synapse.dendrite.status_message = rust_response.get(
            "status_message", "Unknown error"
        )
        synapse.dendrite.process_time = rust_response.get("process_time", "0.0")

        synapse.axon.status_code = synapse.dendrite.status_code
        synapse.axon.status_message = synapse.dendrite.status_message

        response_data = rust_response.get("response_data")
        if response_data and synapse.dendrite.status_code == "200":
            try:
```
Normalize status_code type and fix success-path gating
Rust returns an int on success and a string "500" on failure. Comparing to the string "200" will skip deserialization on successful responses. Coerce to int first, store as string (to keep current schema), and compare numerically.
- synapse.dendrite.status_code = rust_response.get("status_code", "500")
- synapse.dendrite.status_message = rust_response.get(
- "status_message", "Unknown error"
- )
- synapse.dendrite.process_time = rust_response.get("process_time", "0.0")
+ status_code_raw = rust_response.get("status_code", 500)
+ try:
+ status_code = int(status_code_raw)
+ except (ValueError, TypeError):
+ status_code = 500
+ synapse.dendrite.status_code = str(status_code)
+ synapse.dendrite.status_message = rust_response.get("status_message", "Unknown error")
+ synapse.dendrite.process_time = str(rust_response.get("process_time", "0.0"))
@@
- response_data = rust_response.get("response_data")
- if response_data and synapse.dendrite.status_code == "200":
+ response_data = rust_response.get("response_data")
+        if response_data and status_code == 200:

📝 Committable suggestion
```python
        status_code_raw = rust_response.get("status_code", 500)
        try:
            status_code = int(status_code_raw)
        except (ValueError, TypeError):
            status_code = 500
        synapse.dendrite.status_code = str(status_code)
        synapse.dendrite.status_message = rust_response.get("status_message", "Unknown error")
        synapse.dendrite.process_time = str(rust_response.get("process_time", "0.0"))

        synapse.axon.status_code = synapse.dendrite.status_code
        synapse.axon.status_message = synapse.dendrite.status_message

        response_data = rust_response.get("response_data")
        if response_data and status_code == 200:
            try:
```
🤖 Prompt for AI Agents
In neurons/utils/lightning_dendrite.py around lines 163 to 174, the code treats
status_code inconsistently (Rust returns int on success and string on failure)
causing the success-path gate to fail; coerce the incoming rust_response
status_code to int safely (defaulting to 500 on parse errors), then store
synapse.dendrite.status_code as a string (str(int_code)) and set
synapse.axon.status_code to the same string; use the numeric int_code (== 200)
to decide whether to attempt deserialization of response_data so successful int
200 is recognized correctly.
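The coercion logic suggested above is small enough to isolate; a minimal sketch (function name hypothetical):

```python
def coerce_status_code(raw) -> int:
    # Rust returns an int on success and the string "500" on client errors;
    # fold both into an int, defaulting to 500 for anything unparseable.
    try:
        return int(raw)
    except (TypeError, ValueError):
        return 500


assert coerce_status_code(200) == 200
assert coerce_status_code("200") == 200   # string success codes still pass
assert coerce_status_code(None) == 500
assert coerce_status_code("oops") == 500
```

Comparing the coerced int (`== 200`) instead of the string `"200"` is what unblocks deserialization on successful responses.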
```toml
[project]
name = "lightning"
requires-python = ">=3.8"
```
Rename the distribution to avoid collision with Lightning.ai’s lightning.
Publishing or installing a package named lightning will collide with the existing project on PyPI. Use a unique name (e.g., omron-lightning-dendrite) and a distinct module name (e.g., omron_lightning_dendrite).
Apply this diff:
-[project]
-name = "lightning"
+[project]
+name = "omron-lightning-dendrite"
```

Follow-ups (in other files; happy to provide full diffs):

- Change `[lib].name` in Cargo.toml and the `#[pymodule]` to `omron_lightning_dendrite`.
- Update Python imports (`from utils.lightning import lightning` → `import omron_lightning_dendrite as lightning`) or adjust the wrapper accordingly.
🤖 Prompt for AI Agents
In neurons/utils/lightning/pyproject.toml around lines 5-7, the project
distribution name is currently "lightning" which will collide with Lightning.ai
on PyPI; change the name to a unique distribution (e.g.,
"omron-lightning-dendrite") and also update the internal module/package name to
a distinct identifier (e.g., "omron_lightning_dendrite"). Ensure corresponding
changes are made in related files: update [lib].name in Cargo.toml and the
#[pymodule] name to "omron_lightning_dendrite", and then update Python
imports/usages to import the new module name (e.g., import
omron_lightning_dendrite as lightning) or adjust wrappers to preserve
backward-compatible aliases if needed.
```rust
let result = self.runtime.block_on(future);

match result {
```
🛠️ Refactor suggestion
Release the GIL while blocking on the Tokio runtime (prevents Python event-loop starvation).
Both call and forward hold the GIL while performing network I/O via runtime.block_on(...). This will block the Python event loop and other Python threads. Wrap the blocking section in py.allow_threads(...).
Apply this diff:
- let result = self.runtime.block_on(future);
+ let result = py.allow_threads(|| self.runtime.block_on(future));and
- let results = self.runtime.block_on(future);
+ let results = py.allow_threads(|| self.runtime.block_on(future));Also applies to: 156-156
🤖 Prompt for AI Agents
In neurons/utils/lightning/src/lib.rs around lines 95-97 (and also at line
~156), the code calls self.runtime.block_on(...) while holding the Python GIL;
change it to release the GIL by wrapping the blocking call in
py.allow_threads(|| { ... }). Concretely, replace let result =
self.runtime.block_on(future); with let result = py.allow_threads(||
self.runtime.block_on(future)); (do the same for the other occurrence), ensuring
you capture the returned result and keep the surrounding match logic unchanged.
```rust
// Convert PyDict to serde_json::Value manually
let mut body_map = serde_json::Map::new();
for (key, value) in body.iter() {
    let key_str: String = key.extract()?;
    // For simplicity, convert all values to strings - could be enhanced for type preservation
    let value_str: String = if value.is_none() {
        String::new()
    } else {
        format!("{}", value)
    };
    body_map.insert(key_str, Value::String(value_str));
}
let body_value = Value::Object(body_map);
```
Preserve request body types; current conversion coerces everything to strings.
Serializing every value to String will break servers expecting proper JSON types and nested structures. Use Python's json.dumps to stringify the dict and parse into serde_json::Value.
Apply this diff:
- // Convert PyDict to serde_json::Value manually
- let mut body_map = serde_json::Map::new();
- for (key, value) in body.iter() {
- let key_str: String = key.extract()?;
- // For simplicity, convert all values to strings - could be enhanced for type preservation
- let value_str: String = if value.is_none() {
- String::new()
- } else {
- format!("{}", value)
- };
- body_map.insert(key_str, Value::String(value_str));
- }
- let body_value = Value::Object(body_map);
+ // Convert PyDict to serde_json::Value while preserving types via Python's json module
+ let json_mod = pyo3::types::PyModule::import_bound(body.py(), "json")?;
+ let dumps = json_mod.getattr("dumps")?;
+ let json_str: String = dumps.call1((&body,))?.extract()?;
+ let body_value: Value = serde_json::from_str(&json_str).map_err(|e| {
+ PyErr::new::<pyo3::exceptions::PyValueError, _>(format!("Invalid JSON body: {}", e))
+ })?;Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In neurons/utils/lightning/src/lib.rs around lines 300 to 313, the current loop
coerces every PyDict value into a Rust String, losing JSON types and nesting;
replace it by calling Python's json.dumps on the PyDict to produce a JSON
string, then parse that string into serde_json::Value with serde_json::from_str,
handling and propagating any errors appropriately (acquire the GIL, import or
use the json module/function, call dumps(body), then parse the resulting String
into Value) so the resulting body_value preserves proper JSON types and
structure.
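The difference between str() coercion and a JSON round trip is easy to see from the Python side (sample body fields are illustrative, not from the PR):

```python
import json

body = {"timeout": 12.0, "public_inputs": [1, 2, 3], "verified": False, "note": None}

# Coercing every value through str() (what the flagged Rust loop does)
# collapses floats, lists, bools, and None into opaque strings.
coerced = {k: "" if v is None else str(v) for k, v in body.items()}
assert coerced["public_inputs"] == "[1, 2, 3]"  # a string, not a JSON array
assert coerced["verified"] == "False"           # not a JSON boolean

# A json.dumps -> parse round trip preserves JSON types and nesting,
# which is what the suggested fix delegates to Python's json module.
round_tripped = json.loads(json.dumps(body))
assert round_tripped == body
```

On the Rust side this corresponds to calling `json.dumps` on the `PyDict` and parsing the result with `serde_json::from_str`, so the wire body matches what a `bt.Synapse` server expects.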
```toml
    "fastapi==0.110.3",
    "gitpython>=3.1.44",
    "jsonrpcserver>=5.0.9",
    "maturin>=1.0.0",
```
🛠️ Refactor suggestion
Do not install maturin as a runtime dependency.
maturin is a build tool for the Rust extension, not needed at runtime for omron. Keeping it in [project.dependencies] bloats install and can confuse non-uv installers.
Apply this diff:
```diff
-    "maturin>=1.0.0",
```

Optionally, add it to a dev group (you already have one in the subproject).
📝 Committable suggestion
```toml
    "maturin>=1.0.0",
```
🤖 Prompt for AI Agents
In pyproject.toml around line 22, remove "maturin>=1.0.0" from
[project.dependencies] (it is a build tool, not a runtime dependency) and
instead add it to your development dependencies or an optional dev group (e.g.,
your project's dev/ci/dev-dependencies section or the subproject dev group) so
it is available for building/testing but not installed at runtime; ensure any
build-system or packaging config still declares the appropriate build
requirements if needed.
```toml
    "lightning",
]
```
Package name collision and resolver portability risk for lightning.
- `lightning` is already a popular PyPI package (Lightning.ai). Depending on a bare `lightning` name will likely pull the wrong distribution under pip/poetry (outside uv), breaking imports.
- `[tool.uv.sources]` is uv-specific and won't help pip; installs from source will accidentally fetch PyPI's `lightning`.
Strongly recommend renaming your extension distribution to a unique name and updating references.
Proposed plan:
- Rename the Rust/Python project to `omron-lightning-dendrite` (distribution) and the module to `omron_lightning_dendrite`.
- Update imports in Python code accordingly.
Diff for this file:
- "lightning",
+ "omron-lightning-dendrite",

And:
-[tool.uv.sources]
- torch = { index = "pytorch" }
-lightning = { path = "neurons/utils/lightning" }
+[tool.uv.sources]
+torch = { index = "pytorch" }
+omron-lightning-dendrite = { path = "neurons/utils/lightning" }

I can provide coordinated diffs for neurons/utils/lightning/pyproject.toml, neurons/utils/lightning/Cargo.toml, and imports once you confirm the intended new name.
Also applies to: 75-78
🤖 Prompt for AI Agents
In pyproject.toml around lines 44-45 (and also 75-78) the project declares the
distribution name "lightning", which collides with the existing PyPI package;
rename the distribution to "omron-lightning-dendrite" and the Python
package/module to "omron_lightning_dendrite" in this file, update any
[tool.uv.sources] entries to reference the new name, and then update the Rust
Cargo.toml (package.name) and all Python import paths to use
omron_lightning_dendrite so installs via pip/poetry pull your package rather
than PyPI's lightning.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
neurons/_validator/utils/axon.py (1)
33-37: Ensure response_time is numeric (LightningDendrite sets process_time as string). LightningDendrite sets `synapse.dendrite.process_time` to a string; your `Request.response_time` is typed as `float | None`. Without casting, you'll store a string here and risk downstream type errors.

```diff
-        request.response_time = (
-            result.dendrite.process_time
-            if result.dendrite.process_time is not None
-            else request.circuit.timeout
-        )
+        try:
+            request.response_time = float(result.dendrite.process_time)
+        except (TypeError, ValueError):
+            request.response_time = request.circuit.timeout
```

If you prefer to keep the existing shape, alternatively normalize `process_time` at the LightningDendrite level.
🧹 Nitpick comments (3)
neurons/_validator/utils/axon.py (3)
5-5: Prefer package-absolute or explicit relative import to avoid fragile sys.path assumptions

Current import assumes `utils` is importable from the working directory. To make this robust across packaging, tests, and tools, use an absolute package import or an explicit relative import from within `neurons._validator.utils`. Apply one of the following diffs (choose one based on your packaging preference):

Option A: Absolute import

```diff
-from utils.lightning_dendrite import LightningDendrite
+from neurons.utils.lightning_dendrite import LightningDendrite
```

Option B: Explicit relative import (from neurons._validator.utils.axon -> neurons.utils.lightning_dendrite)

```diff
-from utils.lightning_dendrite import LightningDendrite
+from ...utils.lightning_dendrite import LightningDendrite
```
15-16: Update docstring to reflect the new accepted dendrite types

Docstring still documents `bt.dendrite` only.

```diff
-        dendrite (bt.dendrite): The dendrite to use for querying.
+        dendrite (LightningDendrite | bt.dendrite): The dendrite to use for querying.
```
30-31: Use explicit None check instead of truthiness

Some Synapse objects can define truthiness in surprising ways. Be explicit.

```diff
-    if not result:
+    if result is None:
         return None
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
neurons/_validator/utils/axon.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
neurons/_validator/utils/axon.py (2)
neurons/utils/lightning_dendrite.py (1)
- `LightningDendrite` (8-194)

neurons/_validator/core/request.py (1)
- `Request` (11-26)
🔇 Additional comments (1)
neurons/_validator/utils/axon.py (1)
39-40: Verify type alignment for `result.deserialize()`

In `neurons/_validator/utils/axon.py` (lines 39–40), you assign `request.deserialized = result.deserialize()` but `Request.deserialized` is annotated as `dict[str, object] | None` (see `neurons/_validator/core/request.py:24`), while `result` is of type `bt.Synapse | None`. We didn't locate an in-repo `deserialize()` on any `Synapse` subclass, so:

- What does `bt.Synapse.deserialize()` actually return?
- If it returns a `Synapse` (or another object), consider:
  - Moving that object into `request.result` and keeping `deserialized` strictly for dict payloads
  - Or updating the `deserialized` annotation to match the real return type

Please confirm and adjust accordingly.
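Until the return type is confirmed, one defensive option (a sketch; `coerce_deserialized` is a hypothetical helper, assuming you want `deserialized` to stay dict-only) is to gate the assignment:

```python
def coerce_deserialized(value):
    # Keep Request.deserialized strictly dict[str, object] | None;
    # non-dict returns from deserialize() (e.g. a Synapse) collapse to None.
    return value if isinstance(value, dict) else None
```

Anything filtered out here could instead be routed into `request.result` so no information is silently dropped.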
```python
async def query_single_axon(
    dendrite: LightningDendrite | bt.dendrite, request: Request
) -> Request | None:
```
💡 Verification agent
🧩 Analysis chain
Type-hint expansion looks good—ensure both dendrite types share the same call(...) signature
Allowing LightningDendrite | bt.dendrite is sensible. Please confirm the standard bittensor dendrite’s call(...) supports the target_axon=..., synapse=..., timeout=..., deserialize=... signature, as used below. If not, consider an adapter or a small compatibility shim for the classic dendrite.
Run this script to inspect in-repo usage patterns of dendrite.call(...) and confirm argument shapes are consistent:
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
echo "Call sites of dendrite.call(...):"
rg -nP -C2 '\bdendrite\.call\s*\('
echo
echo "Occurrences of target_axon= named argument at call sites:"
rg -nP -C2 'target_axon\s*='
```

Length of output: 1683
Ensure uniform call(...) signature across all dendrite implementations
The query_single_axon helper unconditionally invokes:
```python
await dendrite.call(
    target_axon=request.axon,
    synapse=request.synapse,
    timeout=request.circuit.timeout,
    deserialize=True,
)
```

but the LightningDendrite.call implementation only declares and forwards:
```python
def call(self, *, target_axon, synapse_headers, synapse_body):
    return self.rust_dendrite.call(
        target_axon=axon_dict,
        synapse_headers=synapse_headers,
        synapse_body=synapse_body,
    )
```

– it neither accepts nor ignores timeout or deserialize.
Action items:
- neurons/_validator/utils/axon.py (around line 23): currently assumes every `dendrite.call` supports `timeout` and `deserialize`.
- neurons/utils/lightning_dendrite.py (definition of `def call`): extend the signature to accept `timeout` and `deserialize` (even if they're no-ops) or provide a wrapper that normalizes both classic and lightning dendrites to a common API.
- (Optionally) verify the external `bt.dendrite.call` signature to ensure it aligns with your unified interface.
Without this shim, calls against LightningDendrite will raise on unexpected keyword arguments.
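One way to build such a wrapper (a sketch; `call_dendrite` is a hypothetical helper, and it assumes the kwargs share names across implementations) is to filter keyword arguments against the concrete `call`'s declared signature:

```python
import inspect


async def call_dendrite(dendrite, **kwargs):
    # Drop keyword arguments the concrete dendrite.call does not declare,
    # unless it accepts **kwargs, in which case pass everything through.
    params = inspect.signature(dendrite.call).parameters
    if not any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return await dendrite.call(**kwargs)
```

With this, `await call_dendrite(dendrite, target_axon=..., synapse=..., timeout=..., deserialize=True)` works against either implementation, though argument-name mismatches (e.g. `synapse` vs `synapse_headers`/`synapse_body`) would still need explicit mapping.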
Summary by CodeRabbit
New Features
Refactor
Bug Fixes
Chores