Commit fff9c5c

Technologicat and claude committed
briefs: design brief for #129 and its prequisite
Documents the two-PR plan:
- PR 1 (prequisite): generalise the analyzer's notion of "defined Node" to cover module-level name bindings. Renames Flavor.NAMESPACE to Flavor.SCOPE for clarity.
- PR 2 (#129): NAMESPACE_OBJECT overlay with constructor registry, literal/name-bound/imported setattr resolution, and CLI extensibility.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f6bd828 commit fff9c5c

1 file changed

Lines changed: 133 additions & 0 deletions

briefs/namespace-objects-brief.md

@@ -0,0 +1,133 @@
1+
# Namespace Objects — Design Brief
2+
3+
This brief documents the design behind the work tracked in **#129** (resolve attribute access on namespace-constructor bindings to the specific target) and its prequisite — generalising the analyzer's notion of "defined Node" to cover module-level name bindings.
4+
5+
Audience: future archaeology. If you're trying to figure out *why* `Flavor.SCOPE` and `Flavor.NAMESPACE_OBJECT` exist as separate flavors, or why module-level `x = 5` produces a defined Node where it previously did not, this is the document.
6+
7+
## Context
8+
9+
### The asymmetry
10+
11+
Until this work, the analyzer treated only a subset of Python's named entities as graph Nodes:
12+
13+
- `class Foo: ...` → defined `CLASS` Node at `mymod.Foo`, with a populated scope.
14+
- `def foo(): ...` → defined `FUNCTION` / `METHOD` Node at `mymod.foo`, with a scope.
15+
- `import foo` / `from x import foo` → `IMPORTEDITEM` Node, remapped during postprocessing.
16+
- `mymod.x = 5` (plain module-level assignment) → **no Node**. Just a `set_value("x", ...)` into the module's scope's `defs` dict. Invisible from outside the module.
17+
18+
The asymmetry has two consequences:
19+
20+
1. **`from mymod import x` for a plain assignment can't resolve.** The IMPORTEDITEM at `mymod.x` has nothing to remap to — there is no defined Node at that path. Postprocessing contracts it to a wildcard. The edge to the actual binding is lost.
21+
2. **Attribute resolution on namespace-like values fails.** `config = env(thingy=baa)` followed by external `config.thingy` cannot resolve, because `config` is not a Node and has no scope. #127 patched this with a module-level fallback edge, which is honest but coarse — the edge lands on the *module* containing `config`, not on `config` itself or its kwarg target.
22+
23+
#129 set out to fix the second consequence for namespace-constructor bindings specifically. In design, we found that the cleaner shape is to fix the underlying asymmetry first, and let #129 fall out as a small overlay.
24+
25+
### The principle
26+
27+
> Every named entity reachable from outside its definition site is a Node.
28+
29+
"Reachable from outside" is the cut. Module-level and class-level bindings are reachable (via import, via attribute access). Function locals are not — promoting every loop variable to a Node would make the graph unusable. Closures are the existing exception, already handled by the nested-def machinery.
30+
31+
Under this principle, the analyzer's existing handling of classes / functions / modules is the rule, and "plain module-level assignment is invisible" was the deviation. We bring assignments into line.
32+
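The cut described above can be illustrated with a minimal sketch built on the standard `ast` module. It is not pyan's implementation: the function name `externally_reachable_bindings` and the flat set of dotted names are assumptions made for illustration. It descends into class bodies but never into function bodies, so function locals stay out of the result.

```python
import ast


def externally_reachable_bindings(source: str, module: str) -> set[str]:
    """Collect dotted names bound at module level or class level.

    Hypothetical helper for illustration only. Function bodies are
    deliberately not descended into, so function locals never appear.
    """
    found: set[str] = set()

    def walk(body, prefix):
        for stmt in body:
            if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # The def itself is reachable; its locals are not.
                found.add(f"{prefix}.{stmt.name}")
            elif isinstance(stmt, ast.ClassDef):
                found.add(f"{prefix}.{stmt.name}")
                walk(stmt.body, f"{prefix}.{stmt.name}")
            elif isinstance(stmt, ast.Assign):
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        found.add(f"{prefix}.{target.id}")
            elif (isinstance(stmt, ast.AnnAssign)
                  and stmt.value is not None  # a bare `x: int` binds nothing
                  and isinstance(stmt.target, ast.Name)):
                found.add(f"{prefix}.{stmt.target.id}")

    walk(ast.parse(source).body, module)
    return found


SOURCE = '''
x = 5

class Foo:
    limit = 10

    def method(self):
        local = 1

def foo():
    for i in range(3):
        pass
'''

names = externally_reachable_bindings(SOURCE, "mymod")
print(sorted(names))
```

In this sketch `mymod.x` and `mymod.Foo.limit` are collected, while `local` and the loop variable `i` are not.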
33+
## Two-PR plan
34+
35+
The work splits into a **[prequisite](https://github.com/Technologicat/substrate-independent/blob/main/glossary.md#prequisite)** (architectural cleanup, no new feature) and **#129** (the targeted overlay). Done in this order so each diff is reviewable on its own and the test churn from the prequisite doesn't co-mingle with the feature's logic.
36+
37+
### PR 1 — Prequisite: module-level NAME Node-ification
38+
39+
**What changes**:
40+
41+
- Every module-level binding produces a defined `NAME`-flavored Node, addressable via the LHS dotted path. (`Flavor.NAME` was previously used only for PEP 695 type aliases at `analyzer.py:1337`. We extend its remit.)
42+
- Class-level bindings already produce attribute Nodes via the existing class-scope machinery; no change there.
43+
- Function-locals still go through `set_value`-only — no Node, no graph clutter. The cut is module-level + class-level.
44+
- A flavor rename for clarity: **`Flavor.NAMESPACE` → `Flavor.SCOPE`**. The existing `NAMESPACE` flavor is the synthetic "this dotted prefix represents a namespace" marker used for structural bookkeeping (module / class / function scope objects). Calling it `SCOPE` keeps it distinguishable from #129's runtime-namespace flavor (`NAMESPACE_OBJECT`). The Node represents the scope; the `Scope` class implements one: the same concept at two layers.
45+
- Default-suppress NAME Nodes that have no incoming or outgoing edges. Module constants like `__version__ = "..."` would otherwise add visual noise to the default graph output. Suppression happens in the visgraph layer (filtering before render), not the analyzer (the Node still exists for cross-module resolution). A future CLI flag can opt back in if anyone wants the noise.
46+
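The default suppression in the last bullet could look roughly like the following. This is a sketch under stated assumptions, not the visgraph code: `SketchNode`, the string flavors, and `visible_nodes` are hypothetical stand-ins, and edges are modelled as plain `(source, target)` name pairs. The filter runs on a copy for rendering, so the full node list stays available for cross-module resolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchNode:
    """Hypothetical stand-in for a graph Node; not pyan's Node class."""
    name: str
    flavor: str


def visible_nodes(nodes, edges):
    """Return the nodes to render, dropping NAME nodes that have no edges.

    `edges` is an iterable of (source_name, target_name) pairs.
    """
    connected = {end for edge in edges for end in edge}
    return [n for n in nodes if n.flavor != "NAME" or n.name in connected]


nodes = [
    SketchNode("mymod", "MODULE"),
    SketchNode("mymod.__version__", "NAME"),  # isolated: hidden by default
    SketchNode("mymod.CONSTANT", "NAME"),     # imported elsewhere: kept
    SketchNode("app.main", "FUNCTION"),
]
edges = [("app.main", "mymod.CONSTANT")]

shown = [n.name for n in visible_nodes(nodes, edges)]
print(shown)
```

The input list is left untouched; only the rendered view loses `mymod.__version__`.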
47+
**Behavioral consequences**:
48+
49+
- `from mymod import CONSTANT` now creates an edge to the actual `mymod.CONSTANT` Node instead of contracting to a wildcard. Latent precision win across the codebase.
50+
- Test fixture churn: any fixture with module-level constants will gain new defines edges. Tedious but bounded: each test's regenerated graph is mechanically derivable from its source.
51+
- Public API: no changes to `create_callgraph` / `create_modulegraph` signatures.
52+
53+
### PR 2 — #129: NAMESPACE_OBJECT overlay
54+
55+
Layered onto PR 1's foundation. The framing changes from "create a new kind of Node" to "upgrade a NAME Node's flavor and populate its scope."
56+
57+
**What changes**:
58+
59+
1. **New flavor `Flavor.NAMESPACE_OBJECT`** in `node.py`, with a comment cross-linking to `Flavor.SCOPE` (named `Flavor.NAMESPACE` up to pyan3 2.5.0) to disambiguate. Semantically: a runtime namespace value (an `env` instance, a `SimpleNamespace` instance), populated with statically-visible kwargs.
60+
61+
2. **Constructor registry** — module-level frozenset in `anutils.py`:
62+
```python
63+
NAMESPACE_CONSTRUCTORS = frozenset({
64+
"unpythonic.env.env",
65+
"unpythonic.env", # top-level re-export (yes, the module shadows the class)
66+
"types.SimpleNamespace",
67+
"argparse.Namespace",
68+
})
69+
```
70+
Merged at analyzer-construction time with user additions from CLI (`--namespace-constructor FQN`, `action="append"`, accepts comma-split). One-shot stderr nudge fires the first time the option is supplied, inviting the user to file an issue if their constructor is reasonably common — we want to grow the built-in registry from observed real-world use.
71+
72+
3. **Single recognition helper**, called from the four binding sites (`visit_Assign`, `visit_AnnAssign`, `visit_NamedExpr`, `_visit_with`). Detects `Call(func=...)` rhs whose resolved func has a fully-qualified import origin (`namespace + "." + name`) in the merged registry. On hit:
73+
- Upgrades the LHS Node's flavor from `NAME` to `NAMESPACE_OBJECT`.
74+
- Ensures `self.scopes[lhs.get_name()]` exists.
75+
- Registers `{kwarg.arg: visit(kwarg.value)}` into that scope's `defs`.
76+
77+
4. **Attribute writes (`e.k = v`) need no new code.** `_bind_target`'s Attribute branch already calls `set_attribute`, which writes into the obj's scope's `defs` if the scope exists (`analyzer.py:2326-2331`). PR 1's Node creation + PR 2's scope creation is sufficient; later writes are picked up automatically. This covers the staged form `config = env(); config.a = baa` for free.
78+
79+
5. **`setattr(target, name, value)` recognition.** Symmetric counterpart to point 4 for the dynamic form. A helper `_try_register_setattr_call` in `visit_Call` checks two structural preconditions (`func` resolves to FQN `"builtins.setattr"` — handles aliased imports for free via scope-chain resolution; `target` resolves to a `NAMESPACE_OBJECT`-flavored Node) and then resolves `name` through three concentric levels of static knowability:
80+
81+
- **Level 1 — literal string.** `name` is `Constant(value=str)`. Use the value directly.
82+
- **Level 2 — name-bound literal.** `name` is `Name(id=k)` where the scope chain has `k` bound to a string literal. Requires a parallel tracking state: a `name_literals` dict (or extension of `Scope`) populated in `visit_Assign` / `visit_AnnAssign` whenever a `Name` target gets a string-`Constant` rhs. Flow-insensitive (latest-seen wins) — same posture pyan takes elsewhere.
83+
- **Level 3 — cross-module name-bound literal.** `name` is `Name(id=k)` where `k` resolves through an import to a string literal in another module. The `name_literals` machinery is per-module (keyed by namespace); cross-module lookup follows the same path import resolution does. PR 1's module-level Node-ification is what makes this addressable.
84+
85+
On match (any level): write `{resolved_name: visit(value)}` into target's scope's `defs`. On miss: no-op — same floor as today.
86+
87+
The symmetric `delattr(target, name)` is intentionally not handled: pyan is flow-insensitive and already chooses not to clear bindings on `del obj.attr` (see comment in `visit_Delete` — clearing in a branch that doesn't always execute would be wrong as often as right). Same reasoning applies.
88+
89+
6. **Pass timing.** Recognition runs idempotently in both passes — no `current_pass` attribute, no gating. Pass 1 catches the common case (imports typically precede bindings in source order); pass 2 corrects any forward-import edge cases. `get_node` is idempotent, scope `defs[k] = v` is overwrite-with-same-value, defines edges deduplicate. The simplicity wins over a more controlled split.
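The registry merge from point 2 can be sketched with `argparse` as follows. Only the option name, `action="append"`, and the comma-split behaviour come from this brief; the parser setup, the `dest` name, and the helper `merged_registry` are assumptions for illustration, and the one-shot stderr nudge is omitted.

```python
import argparse

# Built-in registry as listed in point 2.
NAMESPACE_CONSTRUCTORS = frozenset({
    "unpythonic.env.env",
    "unpythonic.env",
    "types.SimpleNamespace",
    "argparse.Namespace",
})


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser carrying only the option relevant here."""
    parser = argparse.ArgumentParser(prog="pyan-sketch")
    parser.add_argument(
        "--namespace-constructor",
        action="append",
        default=[],
        metavar="FQN",
        dest="namespace_constructors",
        help="fully qualified name of an extra namespace constructor; "
             "repeatable, and each value may be comma-separated",
    )
    return parser


def merged_registry(cli_values) -> frozenset:
    """Union the built-in registry with comma-split user additions."""
    extra = {
        fqn.strip()
        for value in cli_values
        for fqn in value.split(",")
        if fqn.strip()
    }
    return NAMESPACE_CONSTRUCTORS | extra


args = build_parser().parse_args([
    "--namespace-constructor", "my.custom.NS,other.Bag",
    "--namespace-constructor", "third.Box",
])
registry = merged_registry(args.namespace_constructors)
print(sorted(registry))
```

Keeping the built-in set a `frozenset` means the merged result is a new immutable set per analyzer instance, so user additions never leak into the module-level default.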
90+
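The three levels of name resolution in point 5 can be sketched on top of the standard `ast` module. This is an illustration under simplifying assumptions, not the planned `_try_register_setattr_call`: it only tracks module-level literals and `from ... import ...` statements, matches `setattr` by bare name instead of resolving it to `builtins.setattr`, and skips the check that the target is a `NAMESPACE_OBJECT` Node. All helper names are hypothetical.

```python
import ast


def collect_name_literals(tree):
    """Map module-level names to string literals (latest binding wins)."""
    literals = {}
    for stmt in tree.body:
        if (isinstance(stmt, ast.Assign)
                and isinstance(stmt.value, ast.Constant)
                and isinstance(stmt.value.value, str)):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    literals[target.id] = stmt.value.value
    return literals


def collect_from_imports(tree):
    """Map local names to (module, original name) for absolute from-imports."""
    imports = {}
    for stmt in tree.body:
        if isinstance(stmt, ast.ImportFrom) and stmt.module and stmt.level == 0:
            for alias in stmt.names:
                imports[alias.asname or alias.name] = (stmt.module, alias.name)
    return imports


def resolve_attr_name(node, module, literals, imports):
    """Return the attribute name as a str, or None when it is not knowable."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value                                     # level 1
    if isinstance(node, ast.Name):
        if node.id in literals.get(module, {}):
            return literals[module][node.id]                  # level 2
        origin = imports.get(module, {}).get(node.id)
        if origin is not None:
            src_module, src_name = origin
            return literals.get(src_module, {}).get(src_name)  # level 3
    return None                                               # miss: no-op


SOURCES = {
    "constants": 'KEY = "a"\n',
    "app": (
        "from constants import KEY\n"
        'k = "b"\n'
        'setattr(config, "lit", baa)\n'
        "setattr(config, k, baa)\n"
        "setattr(config, KEY, baa)\n"
        "for key in keys:\n"
        "    setattr(config, key, baa)\n"
    ),
}

trees = {name: ast.parse(src) for name, src in SOURCES.items()}
literals = {name: collect_name_literals(tree) for name, tree in trees.items()}
imports = {name: collect_from_imports(tree) for name, tree in trees.items()}

calls = sorted(
    (n for n in ast.walk(trees["app"])
     if isinstance(n, ast.Call)
     and isinstance(n.func, ast.Name) and n.func.id == "setattr"
     and len(n.args) == 3),
    key=lambda call: call.lineno,
)
resolved = [resolve_attr_name(c.args[1], "app", literals, imports) for c in calls]
print(resolved)
```

The loop-bound `key` falls through every level and yields `None`, which corresponds to the no-op floor described above.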
91+
## Scope and non-goals
92+
93+
Out of scope for this work — degrades to #127's module-level fallback, which is the right floor:
94+
95+
- **Factory-returned namespaces** (`config = make_config()` where `make_config` returns an env). Requires return-value type tracking. Big.
96+
- **Splat construction** (`env(**kwargs)`). Kwargs not statically visible.
97+
- **Genuinely dynamic writes** (`for k, v in source.items(): setattr(config, k, v)`, or `config.__dict__[k] = v`). The `setattr` form is covered when `k` is a literal string, a name bound to a string literal, or an imported name resolving to one — see point 5 in PR 2. Anything beyond that (loop variables, function-returned strings, computed strings) requires data-flow analysis — out of scope by design for a static analyzer.
98+
- **Aliasing through function returns / parameter passing / container indexing.** Pyan already shares Nodes across direct name aliasing (`f = e`); deeper data-flow tracking is its own large project.
99+
100+
For each of these, #127's fallback continues to emit the module-level edge, which is honest about what static analysis can know.
101+
102+
## Walrus and other corners
103+
104+
- `(obj.attr := v)` is a `SyntaxError` per PEP 572 — walrus targets are restricted to `Name`. No attribute-walrus path to handle.
105+
- `(config := env(thingy=baa))` is legal walrus (`Name := Call`) and routes through `visit_NamedExpr`.
106+
- Context-manager construction (`with env(thingy=baa) as e:`) registers identically to assignment. The runtime lifecycle differences between constructors (`unpythonic.env.env` clears bindings on scope exit; `SimpleNamespace` and `argparse.Namespace` don't even implement the CM protocol) are flow-sensitive concerns out of scope for a static analyzer.
107+
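The corners above can be checked directly against the parser. The sketch below uses the standard `ast` module to confirm that an attribute walrus fails to parse, and to show that the four binding forms all expose a `Name` target paired with a `Call` value. Matching the constructor by the bare name `env` is a simplification for illustration; the design resolves the callee to a fully-qualified name instead. The helper names are hypothetical.

```python
import ast


def is_syntax_error(source: str) -> bool:
    """Return True if `source` does not parse."""
    try:
        ast.parse(source)
    except SyntaxError:
        return True
    return False


def kwarg_names(value, constructors):
    """Return the explicit kwarg names if `value` calls a known constructor."""
    if (isinstance(value, ast.Call)
            and isinstance(value.func, ast.Name)
            and value.func.id in constructors):
        return tuple(kw.arg for kw in value.keywords if kw.arg is not None)
    return None


def find_bindings(source, constructors=frozenset({"env"})):
    """Collect (statement kind, bound name, kwarg names) for each hit."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            pairs = [(target, node.value) for target in node.targets]
        elif isinstance(node, (ast.AnnAssign, ast.NamedExpr)):
            pairs = [(node.target, node.value)]
        elif isinstance(node, (ast.With, ast.AsyncWith)):
            pairs = [(item.optional_vars, item.context_expr) for item in node.items]
        else:
            continue
        for target, value in pairs:
            kwargs = kwarg_names(value, constructors)
            if isinstance(target, ast.Name) and kwargs is not None:
                hits.append((type(node).__name__, target.id, kwargs))
    return hits


SOURCE = '''
a = env(thingy=baa)
b: object = env(thingy=baa)
if (c := env(thingy=baa)):
    pass
with env(thingy=baa) as d:
    pass
'''

hits = find_bindings(SOURCE)
for kind, name, kwargs in sorted(hits, key=lambda hit: hit[1]):
    print(kind, name, kwargs)

print(is_syntax_error("(obj.attr := v)"))
```

Because every form reduces to the same `(target, value)` pair, a single recognition helper can serve all four binding sites.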
108+
## Aliasing
109+
110+
Pyan already shares Nodes across name aliases — `f = e` makes both names resolve to the same Node. So `NAMESPACE_OBJECT` flavor and its populated scope are inherited by aliases for free, without any explicit alias-tracking code in this work.
111+
112+
## Test plan
113+
114+
- **PR 1**: regenerate expected graphs in `tests/test_features.py`, `tests/test_writers.py`, etc. for fixtures with module-level constants. Add a coverage test for `from mymod import CONSTANT` resolution. Verify that function-local bindings still don't create Nodes.
115+
- **PR 2**:
116+
- Direct construction: `config = env(thingy=baa); use config.thingy` resolves to `baa`'s Node, no module-level fallback.
117+
- Cross-module: separate-module fixture with `config = env(...)`, consumer doing `from app_state import config; config.dataset` — verify the edge bypasses #127's fallback.
118+
- Staged form: `config = env(); config.a = baa` — point 4 picks up the later write.
119+
- `setattr` form, level 1: `config = env(); setattr(config, "a", baa)` — literal-string write.
120+
- `setattr` form, level 2: `k = "a"; setattr(config, k, baa)` — name-bound literal in same scope.
121+
- `setattr` form, level 3: cross-module fixture with `KEY = "a"` in one module, `setattr(config, KEY, baa)` in another after `from constants import KEY`.
122+
- `setattr` form, negative: `for k in keys: setattr(config, k, baa)` — loop-bound key, confirming graceful no-op.
123+
- Walrus: `(config := env(thingy=baa))`.
124+
- Context manager: `with env(thingy=baa) as e:` body uses `e.thingy`.
125+
- Negative: factory-returned `config = make_config()` should still hit #127's fallback (regression guard against over-firing).
126+
- CLI: `--namespace-constructor my.custom.NS` with corresponding fixture, plus the stderr-nudge test.
127+
- All four constructors in the built-in registry (env, SimpleNamespace, argparse.Namespace, top-level `unpythonic.env`) get at least one fixture each.
128+
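As a rough illustration of what the PR 2 fixtures exercise, the forms listed above can be combined into one small runnable module. This is a sketch, not one of the planned fixtures: it uses `types.SimpleNamespace` so that it runs without third-party packages, and the set `STATICALLY_RESOLVABLE` simply restates which writes the brief expects the analyzer to see.

```python
from types import SimpleNamespace


def baa():
    return "baa"


config = SimpleNamespace(thingy=baa)  # direct construction with a visible kwarg
config.a = baa                        # staged form (point 4)
setattr(config, "b", baa)             # setattr level 1: literal string
k = "c"
setattr(config, k, baa)               # setattr level 2: name-bound literal
for key in ("d", "e"):
    setattr(config, key, baa)         # negative case: loop-bound key

# Attribute names the brief expects static analysis to resolve.
STATICALLY_RESOLVABLE = {"thingy", "a", "b", "c"}

runtime_attrs = set(vars(config))
runtime_only = runtime_attrs - STATICALLY_RESOLVABLE
print(sorted(runtime_only))
```

The attributes left in `runtime_only` are the ones for which #127's module-level fallback remains the expected result.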
129+
## Cross-references
130+
131+
- Predecessor: #127 (module-level fallback for unresolvable attribute access).
132+
- Tracking issue: #129.
133+
- Touches: `pyan/analyzer.py`, `pyan/anutils.py`, `pyan/node.py`, `pyan/visgraph.py` (NAME-Node suppression), `pyan/main.py` (CLI option), tests under `tests/`.
