diff --git a/python/src/hhat_lang/README.md b/python/src/hhat_lang/README.md new file mode 100644 index 0000000..d6ae013 --- /dev/null +++ b/python/src/hhat_lang/README.md @@ -0,0 +1,26 @@ +# H‑hat Python Package +High-level container for the Python implementation of the H‑hat language stack. Provides: +* Core language substrate (IR, type system, memory / scope, import + linking, execution contracts, error model, backend abstraction). +* Dialect hosting surface (example dialect(s), parsing + lowering entry points, dialect-specific assets). +* Low-level target integration (quantum assembly language representations, backend adapters / emitters). +* User tooling (CLI, project scaffolding, notebook integration, auxiliary developer utilities). + +## 1. Architectural Overview +End-to-end flow (conceptual pipeline): +Dialect Frontend (parse / build) → Core IR (modules, blocks, instructions) → Linking / Imports → Execution (interpreter / evaluator) → Low-Level Backend Adapter → Target Runtime (simulator / hardware / serialization) + +## 2. Directory Topology +``` +hhat_lang/ +├── __init__.py # Package marker / version surface (keep lightweight) +├── core/ # Stable substrate: IR, types, memory, execution, imports, error model, low-level abstraction +├── dialects/ # Dialect implementations + their parsers / lowerers / dialect-specific artifacts +├── low_level/ # Target-facing quantum language abstractions + backend adapter layering +└── toolchain/ # User & developer tooling (CLI, project scaffolds, notebook helpers) +``` + +Subdirectory scope (conceptual — refer to local READMEs for expansion): +* `core/`: Owns fundamental invariants. Defines data & control abstractions consumed by all other layers. Treated as the most stable boundary. +* `dialects/`: Houses one or more domain-specific syntactic/semantic layers built atop `core`. Each dialect lowers into the same IR to ensure uniform backend interoperability. 
+* `low_level/`: Encapsulates translation to concrete quantum (or hybrid) target languages and hardware/software backends. Keeps vendor / platform specifics out of `core` & dialect logic. +* `toolchain/`: Provides operational entry points: command-line interface, project creation, optional notebook integration, and any workflow utilities. diff --git a/python/src/hhat_lang/core/README.md b/python/src/hhat_lang/core/README.md new file mode 100644 index 0000000..47e5c49 --- /dev/null +++ b/python/src/hhat_lang/core/README.md @@ -0,0 +1,46 @@ +# Core Layer Overview + +Foundational abstractions and interfaces that define the program model, its intermediate representation (IR), the runtime memory model, evaluator contracts, cross-module linking, backend integration points, and error semantics. This layer is dialect-agnostic and backend-agnostic; it defines the stable contracts used by compilers and tools within the project. + +## 1. Purpose +Provide a coherent kernel for building, linking, and evaluating programs across classical and quantum paradigms. Emphasis is on: +* Deterministic structure: ordered maps and explicit equality/hash semantics facilitate separate compilation. Python’s `hash()` is process‑local; cross‑run stability requires explicit identifiers (e.g., UUID v5 from canonical strings). +* Clear separation of concerns: syntax lives in dialects; execution/backends implement pluggable interfaces; core holds types, IR shape, and runtime contracts. +* Typed result propagation: `Ok`/`Error` values carry explicit error codes along runtime paths; validation errors and precondition violations may raise exceptions. + +## 2. Subsystem Layout +High‑level roles of the immediate subdirectories (detailed specifications appear in each subdirectory’s README): +* `code/`: Structural IR substrate (modules, blocks, instructions, symbol and reference tables, hashing helpers). +* `compiler/`: Lowering contracts from dialect-specific parses/builders into Core IR. 
+* `data/`: Canonical symbolic entities (symbols, literals), function signatures/definitions, and value containers. +* `types/`: Type-system primitives, built-ins, and size/compatibility utilities. +* `memory/`: Runtime memory model (stack/heap/scopes) and allocation/index management. +* `execution/`: Evaluator traits and program orchestration interfaces. +* `imports/`: Cross-IR linking and reference resolution protocols. +* `lowlevel/`: Backend adapter interfaces for emitting device/runtime instructions. +* `error_handlers/`: Centralized error codes and typed error handlers. + +This README intentionally omits per‑file details for these directories; refer to each subdirectory’s README for specifications. + +## 3. Processing Flow +Dialect Source → (Dialect Parser) → Compiler Lowering → Core IR Module(s) → Imports/Linking (external symbol refs) → Execution (evaluators + memory) → Low-level Emission (backend adapters). + +Types and data entities propagate along this path: the compiler populates symbol/reference tables; the imports layer binds external entries; the execution layer materializes runtime values via the memory model; low‑level back ends consume resolved operations. + +## 4. File Inventory +Technical description of the files in this directory (excluding subdirectories): + +* `__init__.py`: Defines `DataParadigm` (`StrEnum`) with members `classical` and `quantum`. This enum provides an explicit, comparable tag used across core subsystems (types, data containers, evaluators) to select paradigm‑specific behavior. Invariants: the set of paradigms is fixed; clients must not rely on implicit truthiness or ad hoc strings. + +* `namespace.py`: Namespacing utilities for stable, fully-qualified identifiers. + - `Namespace`: Tuple‑backed namespace; supports membership tests and compact `repr` via dot‑separated segments. Serves as the canonical container for hierarchical scopes (e.g., module, package, dialect qualifiers). 
+ - `FullName`: Couples a `Namespace` with a terminal name; supports membership checks against the enclosing namespace and renders as `namespace.name`. Used wherever stable, human-readable, and hashable identifiers are required without embedding type information. + +* `utils.py`: Core utilities used across IR construction and evaluation. + - `gen_uuid(obj)`: UUID version 5 (OID namespace) converted to an integer; determinism assumes the input representation (`str(obj)`) is stable across runs (avoid ephemeral object representations) to ensure reproducible layout and indexing. + - `SymbolOrdered`: Ordered mapping specialized for symbol-like keys. Accepts `str`, `Symbol`, `CompositeSymbol`, `WorkingData`, or `int` and normalizes to canonical keys, preserving insertion order. Contract: key normalization is lossless for symbol types; iteration preserves deterministic ordering; suitable for building symbol tables and composite data structures. + The `keys()` method yields normalized values (e.g., `Symbol.value`) rather than typed key objects; use `items()` to retrieve typed keys. + - `Result`/`Ok`/`Error`: Minimal typed result wrapper used by evaluators and instruction executions. `Ok` yields the successful payload; `Error` carries an `ErrorHandler`. Encourages explicit, inspectable handling of success and failure without raising exceptions through core layers. + +## 5. Status +The core package provides stable scaffolding and directory-level READMEs. File-by-file documentation lives in each subdirectory. This document covers only the files defined directly in `core/` and the architectural role of its subdirectories. diff --git a/python/src/hhat_lang/core/code/README.md b/python/src/hhat_lang/core/code/README.md new file mode 100644 index 0000000..dccdcd4 --- /dev/null +++ b/python/src/hhat_lang/core/code/README.md @@ -0,0 +1,94 @@ +# Code Layer Overview + +Structural substrate for the core intermediate representation. 
This layer defines the unit of compilation, the shape of executable blocks and instructions, the symbol tables for types and functions, cross module reference tables, an explicit program graph over intermediate representation units, and supporting utilities for perfect hashing and instruction lifecycle. The layer is paradigm aware across classical and quantum computation while remaining dialect neutral and backend neutral. + +## 1. Purpose +Provide a precise model for representing, linking, and interrogating programs after lowering from dialects and before evaluation or backend emission. Emphasis is on explicit identity, deterministic ordering where required such as symbol and function tables, and symmetry between type entities and function entities. Cross module dependencies are recorded and validated in a graph that supports constant time lookup via a generated perfect hash function. + +## 2. Design Overview +* Unit of compilation is identified by a filesystem path and exposes a main executable block together with a symbol table. Hash and equality for the unit are defined in terms of path identity and table contents. +* Symbol tables are split by kind. Type entries map a symbol or composite symbol to a concrete type structure. Function entries map a function name to a set of overloads keyed by signature. Signatures are defined by the function name and an ordered tuple of argument types. Equality and hashing for signatures are derived from these fields. +* Reference tables encode imports across units. One table maps type names to the unit that defines them. Another maps function signatures to the unit that defines them. Both support membership queries by name based keys and by signature based keys. +* The program graph stores units as nodes. Nodes carry both the unit and a key object derived from the unit path. 
A build procedure locks the graph by generating a perfect hash function over the nodes and by validating that all recorded references resolve to present nodes. +* Instructions are modeled with an abstract callable interface, an explicit lifecycle status, and an annotation of the data paradigm. Quantum specific instructions may carry a flag that indicates argument generation should be skipped during preparation. +* Utilities provide a perfect hashing scheme with tunable parameters for collision free indexing of finite tuples, and a validator that enforces containment constraints between quantum and classical data. + +## 3. Structural Units +The compilation unit encapsulates three concerns: identity, declarations, and entry block. +* **Identity**: a path value yields a numeric identifier via the implementation defined hash primitive. This identifier is process local and intended for in memory indexing. It is not a persistent identifier across runs. +* **Declarations**: a table of types and a table of functions. Queries by name return either a concrete definition for types or the mapping from signatures to definitions for functions. Queries by full signature return the single definition for that signature when present. +* **Entry block**: a block level container whose tag classifies the block kind. Blocks are iterable over their arguments which can include values, composite values, nested blocks, or instructions. + +## 4. Symbol and Reference Tables +**Type table** +* Backed by an ordered mapping to preserve insertion order, which allows deterministic iteration and reproducible printing. +* Keys are symbols or composite symbols. Values are concrete type structures that are expected to be hashable and comparable. +* Addition is idempotent for the same key and does not overwrite an existing entry. Retrieval by key returns either a definition or a default value when specified. Removal is out of scope for this layer. 
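The type table contract above (insertion order preserved, idempotent addition without overwrite, retrieval with an optional default) can be illustrated with a minimal sketch. The class name `TypeTable`, its method names, and the use of plain strings and tuples as keys and definitions are assumptions made for illustration; they are not taken from `symbol_table.py`.

```python
from __future__ import annotations

from typing import Any, Hashable, Iterator


class TypeTable:
    """Hypothetical ordered type table; not the project's actual class."""

    def __init__(self) -> None:
        # A plain dict preserves insertion order (guaranteed since Python 3.7).
        self._entries: dict[Hashable, Any] = {}

    def add(self, key: Hashable, definition: Any) -> bool:
        """Insert only when the key is absent; never overwrite.

        Returns True when the entry was inserted, False when it already existed.
        """
        if key in self._entries:
            return False
        self._entries[key] = definition
        return True

    def get(self, key: Hashable, default: Any = None) -> Any:
        """Return the stored definition, or the default when the key is absent."""
        return self._entries.get(key, default)

    def __contains__(self, key: object) -> bool:
        return key in self._entries

    def __iter__(self) -> Iterator[Hashable]:
        # Iteration follows insertion order, which keeps printing reproducible.
        return iter(self._entries)


# Usage: keys stand in for symbols, tuples stand in for type structures.
table = TypeTable()
table.add("point", ("x", "y"))
inserted_again = table.add("point", ("a", "b"))  # ignored: key already present
table.add("angle", ("theta",))
```

Returning a boolean from `add` is one possible way to expose idempotence to callers; the actual table may signal a duplicate differently.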
+ +**Terminology** +* Symbol denotes an atomic program identifier. +* Composite symbol denotes a qualified identifier formed from multiple segments. +* Function signature denotes a function name paired with an ordered tuple of argument types. + +**Function table** +* A two level mapping. The outer mapping keys by function name. The inner mapping keys by signature and stores the definition. This supports multiple overloads for the same name. +* A signature is constructed from the function name and the ordered tuple of argument types. A companion key built from the same fields supports query by name and argument types without requiring argument names. +* Signature construction enforces that names are symbols and argument types are symbols or composite symbols. Violations raise errors during signature creation. +* Queries support three modes. By name to return all overloads for that name. By signature to return a single definition when present or no result when absent. By membership to test whether a name or a signature is present. Membership by signature presumes that the function name exists in the table. + +**Reference tables** +* Two independent mappings record imports. One for type names to defining unit. One for function signatures to defining unit. +* Each entry maps a name like key to a key object that encapsulates the defining unit path and derives a numeric identifier from it. Membership supports checking by names or signatures. Equality and hashing for the key object are defined to enable direct comparison with node keys in the program graph. + +## 5. Program Graph +**Nodes and keys** +* Each node wraps a unit. The node stores the unit path, a numeric identifier derived from the unit path, and a key object that exposes both the path and the numeric identifier. +* Membership over a node supports queries by symbol, composite symbol, function signature, or by path value. This allows testing whether a declaration belongs to a particular unit. 
+* Membership over the node set also permits queries that pair a unit path with a declaration name or signature. +* Signature based membership relies on function table semantics and presumes the name is present. + +**Graph construction** +* Nodes are accumulated in a staging collection. The last added node can be designated as the main unit of the program. +* A build procedure constructs an immutable set of nodes ordered by a perfect hash function over the node keys. During the build, every reference from every node must resolve to an existing node. If a reference is missing, construction fails with an error. +* After build, nodes can be addressed in constant time by applying the generated perfect hash function to the key object. +* Constant time addressing assumes the provided key belongs to the built set of nodes. + +**Complexity** +* Addressing a node by key runs in constant time after the graph is built. +* Graph construction enumerates parameter values for the perfect hash function and is proportional to the parameter search bounds times the number of nodes. + +**Lookup helpers** +* A utility constructs a reference table from two mappings provided by the compiler. One mapping associates type names with defining unit paths. The other associates function signatures with defining unit paths. +* Two procedures implement import semantics over the graph. Given a node key for the importing unit and a type name the type definition is returned if present. Given a node key and a function signature the single definition is returned if present. Listing the available overloads by name is performed through the function table rather than this helper. +* If the function name is not present the signature lookup fails with an error rather than returning no result. + +**Error semantics** +* A missing function name renders signature based lookup ill formed and raises an error. +* A missing signature under a present name yields no result. + +## 6. 
Instruction Model +**Abstract instruction** +* Instructions are callable and carry a lifecycle status. The status admits the following values in increasing order of progress: not started, running, timeout, interrupted, done, error. Status begins at not started. +* Each instruction exposes its data paradigm. A query determines whether the instruction is quantum or classical. + +**Paradigm specific behavior** +* Quantum instructions can carry a flag that indicates argument generation should be skipped during preparation. Classical instructions never set this flag. +* A validator enforces that classical data cannot contain quantum attributes while quantum data can contain classical attributes. + +## 7. Utilities and Perfect Hashing +**Perfect hash function** +* The generator searches for parameters a and r that yield a collision free arrangement of a given finite tuple under a chosen prime and tuple size. The search space for a is bounded by a large constant. The search space for r depends on the machine word size and is compatible with word sizes of sixty four or one hundred twenty eight bits. +* The resulting parameters are returned together with the ordered tuple. A companion evaluator computes the position for any value under the returned parameters in constant time. + +**General hashing** +* A key object encapsulates a path and exposes both the path and a numeric identifier derived from it. Equality supports comparison with another such key as well as with a unit by comparing the unit path. +* Key domain objects define hashing and equality through their observable fields. The design avoids relying on object identity for program semantics. + +## 8. File Inventory + +* `abstract.py`: abstract definitions for units, reference tables for types and functions, and the small key object used as the node key in the program graph. 
Hashing and equality are specified for the key object and for the module abstraction, and are provided where semantic identity matters elsewhere. +* `base.py`: base structures for function signature keys and function queries by signature, together with abstract blocks, instruction flags, and instruction containers. Hashing and equality are derived from semantic fields. +* `instructions.py`: abstract instruction interface with lifecycle status and paradigm attribute, plus classical and quantum specializations and the quantum only argument generation flag. +* `new_ir.py`: concrete node wrapper, node set with perfect hash indexing, program graph construction with validation of cross unit references, helpers to build reference tables, and import helpers for types and functions. +* `symbol_table.py`: ordered tables for types and for functions, including overload support under a two level mapping and membership semantics for names and signatures. +* `utils.py`: lifecycle status enumeration, validator for quantum and classical attribute composition, and primitives to generate and evaluate perfect hash functions with parameters a and r under a chosen prime. diff --git a/python/src/hhat_lang/core/compiler/README.md b/python/src/hhat_lang/core/compiler/README.md new file mode 100644 index 0000000..3e3e59c --- /dev/null +++ b/python/src/hhat_lang/core/compiler/README.md @@ -0,0 +1,46 @@ +# Compiler Layer Overview + +Abstract compiler interface and coordination context for lowering into core representations and for driving evaluation across classical and quantum programs. This directory defines contracts for parsing and evaluation. Implementations carry references to cooperating compilers, evaluators, quantum device specifications, backends, and language descriptors. + +## 1. Purpose +1. Define a stable abstract interface for compile and run phases. +2. Centralize references to cooperating compilers across dialects and paradigms. +3. 
Provide consistent entry points for parsing and for evaluation. + +## 2. Architectural Role +This layer sits between dialect front ends and runtime evaluators. The compiler consumes source artifacts and produces an intermediate representation or an abstract syntax tree. The compiler invokes evaluation through registered evaluators. The layer does not prescribe the IR shape or the evaluator API. It fixes method names and minimal contracts that implementations must satisfy. Implementations may delegate evaluation to interpreter and evaluator contracts in `hhat_lang.core.execution`. + +## 3. Processing Flow +Dialect source → `BaseCompiler.parse` → IR or AST → `BaseCompiler.evaluate` → result value or runtime effect. + +## 4. File Inventory +Technical description of files in this directory. Subdirectory contents are documented in their own locations. No subdirectories are present in this path. + +### `core.py` +Module path: `hhat_lang.core.compiler.core` + +Public API and contracts: + +1. Class `BaseCompiler` + Abstract base class for compilers. Holds compilation information including cooperating compilers for classical and quantum paradigms, evaluators that execute intermediate code for those paradigms, quantum device specifications, backends, and quantum language descriptors. Inherits from `abc.ABC`. + +2. Method `parse(self)` + Abstract method. Calling this method on the abstract class raises `NotImplementedError`. A concrete implementation ingests program source or a builder context captured at construction time and produces an intermediate artifact such as a module graph or an abstract syntax tree. Implementations should document the concrete return type. Implementations may populate symbol tables or analysis caches as a side effect. Implementations should avoid global state. + +3. Method `evaluate(self)` + Abstract method. Calling this method on the abstract class raises `NotImplementedError`. 
A concrete implementation consumes the artifact produced by `parse` and drives evaluation through available evaluators. Implementations should document the concrete return type and the meaning of the result. Implementations should define how runtime errors are surfaced. + +### `__init__.py` +Module path: `hhat_lang.core.compiler` + +Package marker with no runtime behavior. Establishes the package boundary. Re-exports can be added if required. + +## 5. Integration Points +Concrete compilers reference the following concepts from other core layers. This document does not specify those layers. The list clarifies roles and boundaries. +1. Cooperating compilers: classical compilers, compilers for other dialects, and quantum compilers. +2. Evaluators: components that evaluate intermediate artifacts for classical and quantum programs. +3. Quantum device specifications: descriptions of available devices and their capabilities. +4. Backends: adapters that translate intermediate artifacts into device or vendor instruction streams. +5. Quantum languages: descriptors that define available instruction sets or syntactic forms for quantum programs. + +Interpreter and evaluator contracts are defined in `hhat_lang.core.execution.abstract_base`. Backend adapters are defined in `hhat_lang.core.lowlevel`. \ No newline at end of file diff --git a/python/src/hhat_lang/core/data/README.md b/python/src/hhat_lang/core/data/README.md new file mode 100644 index 0000000..954aeee --- /dev/null +++ b/python/src/hhat_lang/core/data/README.md @@ -0,0 +1,38 @@ +# Data Layer Overview + +Canonical data abstractions that model symbols, literals, composite values, function definitions, and variable containers across classical and quantum paradigms. This layer provides deterministic identity and comparison, paradigm tagging, structured assignment semantics, and well specified interfaces with the Core IR and the type system. The layer is dialect agnostic and back end agnostic. 
It defines data contracts consumed by compilers, linkers, and evaluators in the project. + +## 1. Purpose +Provide precise and inspectable data entities that can be created by compilers, linked across modules, and consumed by evaluators. Design objectives are: +1. Deterministic equality and hashing for symbols, composite symbols, composite values, and signature descriptors within a process. +2. Explicit classical or quantum tagging with validation that prevents accidental mixing of paradigms. +3. Structured containers for variables that enforce mutability policies and data structure shape during assignment and retrieval. +4. A function definition object that normalizes argument representation and exposes a value based signature for storage and lookup. +5. Efficient string and binary representations for literals to support debugging and analysis. + +## 2. Scope +1. Symbolic identifiers and composite symbolic forms with qualified names and attribute chains. +2. Literals with defined ordering and inequality when the type supports them and cached binary encoding. +3. Variable containers for constant, immutable, mutable, and appendable policies with quantum aware behavior. +4. A factory for variable containers that selects the correct policy from requested properties and paradigm. +5. A function definition container with argument and body blocks and a signature descriptor used by symbol tables. +6. Utilities for paradigm detection and comparison. + +## 3. Core Concepts +1. **Symbolic identity**: A symbol is a value that names a variable, function, type, argument, or parameter. It carries a type tag and a boolean tag that marks the quantum paradigm. Composite symbols represent qualified names and attribute or method chains as an ordered tuple of segments. Equality and hashing are value based and do not depend on object identity. +2. **Literals**: A literal couples a textual value with a type tag that can be a single tag or an ordered tuple of tags for composite cases. 
Quantum and classical markers must agree between the textual value and the type tag. Ordering and inequality comparisons are defined when the literal type supports them and they respect the represented value category and encoding. The literal exposes a cached binary encoding that yields bits for integers, double precision for floating point values in network byte order, and code points for strings. +3. **Composite data**: Composite value objects support qualified attribute chains. Array composition is mediated by the type system and container assignment. They expose the group of segments, the group kind for classification, and a type tag. Iteration yields the underlying segments. String presentation uses dot separation and an optional colon followed by the type tag. +4. **Paradigm discipline**: A small set of helpers detects the quantum paradigm from the at sign convention and checks that two entities share the same paradigm. These helpers are used by type system validation and by container construction. Variable assignment enforces paradigm consistency through type checks. +5. **Variable containers**: The base container associates a name, a declared type, a description of the data structure layout, and a store of runtime values. The container tracks mutability policy, quantum status, assignment state, transfer and borrow state, and an instruction counter that orders side effects important for quantum and appendable containers. Assignment accepts positional values that align with the declared layout or keyword values addressed by member names. For array like or quantum members the container appends to per member lists and increases the instruction counter. Retrieval returns a member by name with a default to the first declared member. Implementations return typed error values for invalid operations. Unsupported paths may raise exceptions during development. +6. 
**Function definitions**: A function definition object stores a name, an optional declared return type, an argument block, and a body block. The argument block is normalized to a single block that holds paired sequences of names and types. A signature descriptor is derived from the function name and the ordered list of argument types and it is suitable for table based storage and exact lookup. + +## 4. File Inventory +1. `__init__.py`: Package marker without runtime behavior. +2. `core.py`: Defines the data building blocks used across the core layer. Provides symbols and composite symbols with qualified names and attribute chains, literal values with value based equality and cached binary encoding, and composite value objects for attribute chains and mixed compositions. Also defines a classification of composite groups and enforces compatibility between quantum markers in textual values and type tags. +3. `fn_def.py`: Provides a container for function definitions that binds a function name, an optional declared return type, a normalized argument block, and a body block. Exposes cached accessors for argument names and argument types and a signature descriptor used for storage and retrieval in symbol tables. +4. `utils.py`: Collects utilities for this layer. Provides an enumeration of variable policies that covers constant, immutable, mutable, and appendable forms. Supplies a predicate that detects the quantum paradigm for strings and for objects that advertise a paradigm tag. Supplies a predicate that checks if two data entities share the same paradigm. Declares an abstract base for data containers used to avoid circular imports. +5. `variable.py`: Implements the variable container hierarchy and a factory that selects a container based on requested policy and paradigm. The base container ties a name and a declared type to a data structure description and a store of values and it provides assignment, retrieval, iteration, and freeing. 
Concrete containers implement constant behavior with no assignment, immutable behavior with single assignment, mutable behavior with repeated assignment, and appendable behavior that appends values per member and advances the instruction counter. The factory ensures that quantum variables are always appendable. Assignment checks declared layout and type for each member. When using keyword style assignment the mutable and appendable containers map names to members and accept a conventional prefix that denotes quantum arguments. Operational errors return typed error handlers. Unsupported paths may raise exceptions during development. + +## 5. Processing Flow + +Compilers or dialect builders create symbols and literals for names and values. Variable containers are constructed from a name, a declared type, a layout description from the type system, and a requested policy. Assignment is performed either positionally in the order of the declared layout or by member names. Quantum or array members are appended and the instruction counter is increased. Function definitions are built from a name, a normalized argument block, and a body block. The compiler inserts the signature descriptor derived from the name and argument types into the function table for exact lookup. Evaluators can query containers for values and may consult the instruction counter to preserve ordering when needed. diff --git a/python/src/hhat_lang/core/error_handlers/README.md b/python/src/hhat_lang/core/error_handlers/README.md new file mode 100644 index 0000000..9a036c7 --- /dev/null +++ b/python/src/hhat_lang/core/error_handlers/README.md @@ -0,0 +1,81 @@ +# Error Handlers Overview + +Centralized error semantics for the core layer. This directory defines a catalog of error conditions with stable identifiers and typed handlers that carry structured context. The design favors explicit and inspectable failure values over implicit exception flows. 
It supports precise pattern matching in evaluators, memory management, typing, and instruction orchestration across classical and quantum paradigms. + +## 1. Purpose +Provide a coherent error model that: +1. Assigns each failure mode a unique code and a human readable rendering. +2. Encodes contextual data inside the handler for diagnostic precision. +3. Enables propagation through typed results or direct handler values in normal control flow. +4. Reserves escalation to exceptions or process termination for contract violations and boundary unwinding. + +## 2. Design Overview +* A finite enumeration defines the error code space. Codes are grouped conceptually by subsystem such as indexing, typing, variables and containers, casting, calling, runtime stacks and heaps, symbol tables, instruction lifecycle, and quantum computation results. The set admits extension by appending new codes while preserving existing meanings. +* A typed handler binds one code to payload fields such as names, values, limits, kinds, or signatures. The handler is callable to render a message that embeds these payloads in a canonical sentence suitable for logging and user display. A compact representation string provides a stable process local summary for tracing and metrics. +* Handlers inherit from a common base that is also compatible with exception semantics. Normal usage returns either a handler value or a typed result wrapper rather than raising. Internal subsystems and adapters may raise or terminate for contract violations or when termination is desired. +* The interface is value based. Routing decisions are performed by inspecting the handler type or the code and optional payloads rather than matching on free form strings. Pattern matching on the handler type is standard and equivalent to matching on the code. + +## 3. Error Taxonomy +**Index management** +* Unknown index state when no specific condition applies. 
+* Allocation failure with both requested count and maximum available count recorded. +* Variable already associated with indexes when attempting to assign new ones. +* Invalid variable reference that is not registered with the index manager. + +**Type discipline** +* Quantum data nested inside classical data is rejected. The converse is permitted. +* Paradigm mismatch between a declared data kind and a member or value. +* Failure while adding a new member to a composite type declaration. +* Cardinality violation for single member forms. +* Incompatible member assignments for structured records, unions, or enumerations. + +**Containers and variables** +* Assignment failure for a container in general. +* Mutation attempted on an immutable variable. +* Access to a wrong member for a variable container. +* Failure to create a variable given a name and an intended type. +* Freeing a variable that currently borrows data is refused. + +**Casting** +* Casting a negative value to an unsigned target is refused. +* Casting an integer that exceeds the representable limit is refused. +* General cast failure for incompatible source and target kinds. + +**Function invocation** +* Argument types do not match the expected signature derived from declaration. + +**Runtime stacks** +* Retrieval failure for requested data from a frame. +* Misuse of a stack frame that is not defined for functions. +* Empty stack underflow. +* Stack overflow. + +**Heaps** +* Invalid key for heap access. +* Empty heap when access is attempted. + +**Symbol tables** +* Invalid key in the context of type lookup or function lookup. The rendering clarifies the context. + +**Quantum computation** +* A quantum data value produced an invalid computed result. + +**Instruction resolution** +* An instruction with a given name was not found. +* An instruction is in an invalid lifecycle status for the requested operation. + +## 4. 
Interaction with Core Subsystems +* **Types and data**: Variable containers use these handlers for assignment validation, retrieval, and freeing. Paradigm checks and composite layout checks report typed failures rather than raising. +* **Memory**: The stack and heap models expose key errors through this catalog and provide codes for underflow and overflow. Evaluators can branch on codes to decide recovery or termination. +* **Code and execution**: Instruction lookup and lifecycle enforcement return typed failures on missing instructions or invalid statuses. Symbol table operations surface invalid key conditions and can raise standard exceptions for internal invariants. +* **Results**: Evaluators and helpers propagate success and failure with typed results or handler values. Callers render messages on demand or match on handler type or code for programmatic handling. + +## 5. Usage Pattern +* Return a typed result or a handler value on failure in normal control flow. Avoid raising in normal control flow. +* Generate handlers as close as possible to the detection site and include the minimal payload needed to diagnose the issue. Examples are requested versus maximum counts, offending names, expected versus observed kinds, and architectural limits. +* For logging, call the handler to obtain the full message. For compact traces use the representation string which includes the code and a process local numeric identifier. +* At process boundaries where unwinding is required a handler may be raised. The same handler remains inspectable by upstream code. + +## 6. File Inventory +1. `__init__.py`: Package marker without runtime behavior. +2. `errors.py`: Defines the error code enumeration and the family of typed handlers. Handlers encode contextual payloads for indexing, typing, containers and variables, casting, function invocation, stack and heap operations, symbol table validation, quantum result validation, and instruction lookup or status. 
Each handler exposes a callable renderer for human readable messages and a compact representation for tracing. All handlers share a common base compatible with exception semantics while remaining designed for explicit result based propagation. \ No newline at end of file diff --git a/python/src/hhat_lang/core/execution/README.md b/python/src/hhat_lang/core/execution/README.md new file mode 100644 index 0000000..6f88ff3 --- /dev/null +++ b/python/src/hhat_lang/core/execution/README.md @@ -0,0 +1,57 @@ +# Execution Layer Overview + +Contracts and orchestration for evaluation over the Core IR. This layer specifies an IR management model based on a program graph of compilation units with dependency relations recorded in module reference tables, an interpreter interface that owns execution configuration and parsing, and an evaluator interface that realizes execution over an intermediate program representation. It also defines a program level assembly that binds working data, indexing and stack based memory control, symbol resolution, and integration with low level quantum targets. The design is backend neutral and dialect aware. + +## 1. Purpose +Provide precise execution contracts and runtime orchestration for programs represented in the Core IR. Objectives are: +1. A graph based organization of intermediate units with explicit linking by named references. +2. An interpreter interface that converts source text to an intermediate form and drives evaluation under a consistent configuration. +3. An evaluator interface that defines a single entry procedure for single invocation execution and a recursive traversal routine for structural walking. +4. A program assembly that couples evaluator, memory control, a symbol table, and low level quantum language integration in a single executable object. +5. Deterministic scope and depth accounting to support recursion and well defined resource lifetimes. 
+ +The layer specifies interfaces and orchestration boundaries rather than concrete evaluator or parser algorithms. + +## 2. Scope +1. Management of IR units in a program graph of compilation units with dependency relations recorded in module reference tables, with addition and linking by reference. Replacement is part of the interface and realized by concrete managers. +2. Parsing of source text into an intermediate form suitable for evaluation. +3. Evaluation over the intermediate form with an entry procedure that coordinates memory and a recursive routine that walks the structure. +4. Program time wiring of working data for quantum operations, index based addressing, a stack for quantum frames, and a symbol table for name resolution. +5. Integration points for device specifications, quantum target backend selection, and target language specifics used during evaluation and emission. + +## 3. Core Concepts +1. **IR management**: Intermediate units are modeled as nodes keyed by a path like identity. References that originate in one unit and name types or functions in another unit induce a dependency relation recorded in the importing module reference table. The manager accepts addition of units, linking that registers reference table entries based on simple names or qualified names, and a replacement operation defined at the interface level that concrete managers can realize while preserving graph invariants. +2. **Interpreter role**: The interpreter owns execution configuration. It tracks quantum device characteristics, target backend selection, target language description, and dialect rules for parsing and evaluation. It exposes a parsing capability that converts source text to an intermediate form. It maintains a non negative counter for call depth to coordinate scope creation and destruction during recursion. Violations that produce a negative value are reported as errors. +3. **Evaluator role**: The evaluator realizes execution over the intermediate form. 
It provides a single entry procedure that prepares memory context and invokes evaluation, and a recursive traversal routine that performs structured walking over blocks and expressions. The evaluator object is callable to compose with higher level drivers. +4. **Program assembly**: A program instance binds the prepared intermediate block, working quantum data, an index based addressing facility, a quantum frame stack, an evaluator instance, a low level quantum language interface, and a symbol table. Running a program returns either a computed result or a typed error value. The assembly is minimal and aims to keep evaluation state explicit and inspectable. + +## 4. Execution Model +Typical execution proceeds as follows. +1. Source text is parsed under the interpreter into an intermediate representation. Dialect rules and target language specifics guide the parse. +2. Intermediate units are added to the graph. For each import a reference is recorded in the importing unit reference table that points to the defining unit. Graph build finalizes the node set and validates that all recorded references resolve prior to evaluation. +3. A program instance is assembled from the main intermediate block, the evaluator, working quantum data, index based addressing, a quantum stack, and a symbol table. +4. The evaluator entry procedure receives the intermediate form and a memory control object. It sets up context for the current depth and delegates to the recursive routine which walks blocks and expressions. Name resolution is mediated by the program assembly and its symbol table. The recursive routine may interact with the quantum stack and the low level language interface when quantum instructions are encountered. +5. Upon function calls the interpreter increments the depth counter. After return it decrements the counter. The counter must not drop below zero. +6. Results and domain errors are surfaced as values where possible. Internal contract violations raise exceptions. 
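
The depth accounting in step 5 can be sketched as follows. This is a minimal illustration, not the package's implementation: the names `DepthCounter`, `DepthError`, and `call` are hypothetical, and the real interpreter interface may report a negative depth through a typed handler rather than an exception.

```python
class DepthError(Exception):
    """Hypothetical error for a depth counter that would drop below zero."""


class DepthCounter:
    """Non-negative call-depth counter (illustrative, not the hhat_lang class)."""

    def __init__(self) -> None:
        self._depth = 0

    @property
    def depth(self) -> int:
        return self._depth

    def enter(self) -> int:
        # Called on entry to a function call.
        self._depth += 1
        return self._depth

    def leave(self) -> int:
        # Called after return; the counter must never become negative.
        if self._depth == 0:
            raise DepthError("call depth would become negative")
        self._depth -= 1
        return self._depth


def call(counter: DepthCounter, body):
    """Run `body` one level deeper and restore the depth afterwards."""
    counter.enter()
    try:
        return body()
    finally:
        counter.leave()
```

Pairing `enter` and `leave` inside a `try`/`finally` keeps the counter balanced even when the callee fails, which is one way to make scope creation and destruction line up during recursion.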
+ +## 5. IR Graph Management +**Node identity and keys** +* Each unit carries a path like identity. Nodes refer to units and comparisons use value based equality of the identity. + +**Dependencies and linking** +* A reference to a type or a function in another unit is represented by entries in module reference tables. References can be names or qualified names. Linking records these references and validation of dependencies occurs during graph build. Linking can be carried out by dialect specific managers or at IR construction time. + +**Update operations** +* A replacement operation is part of the management contract. Concrete managers may swap the underlying unit and preserve incident dependency relations so subsequent queries observe the new unit. + +## 6. Memory and Scope Discipline +**Depth management** +* A counter records the current dynamic depth of calls. The interpreter increases the counter upon entry to a call and decreases it upon return. The counter must remain non negative and an error is raised if it would become negative. + +**Scope coordination** +* The evaluator creates or selects memory context based on the current depth. This interacts with frame management for quantum operations and with the index based addressing facility. + +## 7. File Inventory +1. `__init__.py`: Package marker without runtime behavior. +2. `abstract_base.py`: Defines the IR management interface over a program graph of intermediate units, the interpreter interface with parsing and depth accounting, and the evaluator interface with an entry procedure, a recursive traversal routine, and a callable protocol. +3. `abstract_program.py`: Defines the program assembly that joins the evaluator, intermediate block, quantum working data, index based addressing, quantum stack, low level quantum language integration, and a symbol table, and that exposes a single run capability which returns a result or a typed error value. 
diff --git a/python/src/hhat_lang/core/imports/README.md b/python/src/hhat_lang/core/imports/README.md new file mode 100644 index 0000000..0eda3bb --- /dev/null +++ b/python/src/hhat_lang/core/imports/README.md @@ -0,0 +1,58 @@ +# Imports Layer Overview + +Facilities for discovery, parsing, and linkage of external type and function artifacts across intermediate representation units. The layer provides name to path mapping, on demand materialization of modules into the program graph, and population of reference structures used during validation and linking. It is dialect agnostic and delegates syntax to an injected grammar and start rule. + +## 1. Purpose +Provide deterministic and validated resolution of cross unit references. The layer maps qualified names to module files under the project source tree, parses modules when needed, and returns pairs that associate logical keys with defining module paths. These pairs are consumed by the program graph to build reference tables for types and for functions. + +## 2. Project Layout Assumptions +* Project root contains a source tree named `src`. +* Type definitions live under `src/hat_types`. +* Function and program files are addressed under `src`. +* Modules are plain text files with extension `.hat`. +* Importers compute module paths relative to these source subtrees. +* Import inputs must avoid traversal segments (such as `..`) and remain confined to the project root. + +## 3. Name to Path Mapping +**Terminology**: Artifact denotes a declared type or function inside a module. Function descriptor denotes a pair of a function name and an ordered tuple of argument types. Staging set denotes the collection of nodes accumulated before graph finalization. +Qualified names are treated as ordered sequences of segments. +* **Single segment**: the file name equals the segment and the artifact name equals that same segment.
+* **Two or more segments**: directory path is the prefix that excludes the last two segments, the file name equals the penultimate segment, and the artifact name equals the final segment. +* The physical path is computed by joining the base directory, the derived directory path, and the file name plus the `.hat` extension. + +This mapping is uniform for both type references and function references. The final segment denotes the artifact name within the addressed module file. + +## 4. Loading and Graph Interaction +* For each computed module path, if the module is not already present in the program graph, the importer reads the file and invokes the provided parser pipeline that consists of a grammar provider and a start rule. The parser pipeline materializes the resulting intermediate representation unit into the graph as a new node. +* Addition is idempotent with respect to a file path during discovery before graph finalization. Repeated requests for the same path do not duplicate nodes within this phase. +* Import operations must run before graph finalization. Finalization moves accumulated nodes into an immutable set and computes a perfect hash for constant time addressing. +* Finalization is a required step before the graph is used for constant time addressing or validation. +* Function discovery operates over the prebuild staging set of nodes. Post finalization discovery is out of scope for this layer. +* Parse failures in the injected pipeline propagate to the caller. Higher layers may surface them as typed errors. +* Existence checks during discovery consult only the prebuild staging set and do not consult the finalized node set. +* Typical workflow: request names for types or functions, load any missing modules, aggregate pairs, populate reference tables with the program graph routine, then finalize the graph. + +## 5. 
Type Resolution +* For each requested type name, the importer returns a pair that associates the logical type key with the module file path that is addressed by the name to path mapping. +* Discovery triggers loading of any not yet materialized module files that are required to resolve the request. +* At this stage the importer does not verify that the addressed module declares the requested type. Name level validation remains the responsibility of consumers until a dedicated check is introduced in the graph. +* Returned keys for types are artifact names rather than fully qualified names. Consumers that require uniqueness across modules must avoid collisions or enforce qualification at a higher layer. +* When multiple requested type names are identical across modules, later pairs overwrite earlier ones due to mapping semantics. +* Cycles during discovery are tolerated. Completeness is enforced when the program graph is finalized. Missing modules yield file not found or validation errors. +* Overwrite behavior for duplicate type names follows the request sequence and is a contract. + +## 6. Function Resolution +* For each requested function name, the importer queries the accumulated nodes for definitions with the requested name in the addressed module. The search scope is the addressed module only. If multiple overloads exist in that module, one pair is returned per overload. +* Each pair associates a function descriptor formed from the name and ordered argument types with the module file path that defines it. +* If no definition with the requested name is present in the addressed module, an error is raised by the program graph utility used to query functions. +* When multiple modules define the same function descriptor, later entries overwrite earlier ones in the aggregated mapping. + +## 7. Reference Table Construction +* A routine in the program graph consumes the returned pairs to populate the reference structures for types and for functions. 
+* References are stored as logical keys that map to node keys derived from module paths and stable within a process. +* During graph finalization, validation verifies that referenced modules are present in the graph. It does not recheck that the target name exists inside the module. If any referenced module is missing, finalization fails. + +## 8. File Inventory +* `__init__.py`: Package marker that exposes the type import facility for convenience. +* `importer.py`: Common importer infrastructure that derives module paths from qualified names, performs file loading and parser invocation, inserts resulting units into the program graph, and provides two concrete mechanisms for resolving types and functions. The function mechanism returns one entry per overload present in the addressed module. +* `utils.py`: Base protocol for aggregated import results expressed as two mappings, one for types and one for functions. diff --git a/python/src/hhat_lang/core/lowlevel/README.md b/python/src/hhat_lang/core/lowlevel/README.md new file mode 100644 index 0000000..bb22312 --- /dev/null +++ b/python/src/hhat_lang/core/lowlevel/README.md @@ -0,0 +1,61 @@ +# Low Level Backends Overview + +Adapter interfaces that translate resolved program operations into target device or runtime instructions for quantum execution. This layer connects Core IR and evaluated state to concrete backend languages and drivers while preserving determinism, correctness, and clear error propagation. It is dialect agnostic and backend agnostic and it exposes a precise contract that concrete adapters implement. + +## 1. Purpose +Provide a single abstraction that receives quantum data, program structure, allocation indices, runtime evaluator context, a quantum oriented stack, and symbol information, then emits target specific instructions and an assembled program artifact. 
The design emphasizes explicit ownership of resources, stable ordering of effects, and typed error reporting without relying on exceptions for normal control flow. + +## 2. Context and Scope +**Position in the pipeline** +* Input comes from linked Core IR and from evaluated state that determines data values and allocation state. +* Output is a backend program string or equivalent instruction stream suitable for a target language or driver. +* The layer neither defines type rules nor performs linking. It consumes already validated structures and focuses on emission. + +**Responsibilities** +* Initialize target language environment by emitting headers, pragmas, or capability declarations as required by the target. +* Map abstract operations to concrete target instructions while honoring quantum and classical boundaries. +* Assemble a complete program artifact with a deterministic layout derived from the program structure and allocation indices. +* Integrate with the evaluator to support optional execution or submission to a runtime after emission. + +## 3. Abstraction Model +The base adapter aggregates the following runtime and structural inputs +* Quantum data reference that identifies the working quantum entity or aggregate under emission. +* Program block from the intermediate representation that supplies the ordered sequence of operations and nested structure. +* Index manager that exposes which indices are currently bound to the quantum data reference and that yields a deterministic ordering of these indices. +* Evaluator handle used to interact with the execution subsystem when emission must coordinate with runtime state or when optional execution follows emission. +* Quantum oriented stack used for scoped resources or for staging values across emission steps where stack discipline is required. +* Symbol table view that allows resolution of names during emission in cases where target languages require explicit bindings or declarations. 
+ +The adapter validates at construction that the quantum data reference has an assigned index set. On failure the adapter aborts emission by raising a typed handler or a runtime error according to policy, and no implicit recovery is performed. On success the adapter caches the number of indices and retains references to all inputs. No implicit global state is used for these concerns. + +## 4. Interface Contract +Concrete backends implement the following routines while preserving the semantics described here. Names of routines are abstracted in this document. The contract specifies behavior rather than identifiers. +* Initialization routine returns an ordered tuple of target specific prologue elements such as headers or capability declarations. The tuple order and content are stable and deterministic for a given program and configuration. +* Epilogue routine can be present in concrete backends and emits target specific footers such as measurements or synchronization directives. When present epilogue content is deterministic. +* Instruction generation routine accepts operation descriptors and optional adapter parameters and returns either a typed success result or a typed error. Success yields a collection of target instructions that are suitable for later assembly. Errors include structured information for diagnostics and are not delivered via exceptions during normal operation. +* Program assembly routine produces a textual program artifact or equivalent final representation required by the target driver. The routine may consume previously generated instructions or may traverse the program block directly. The output is pure with respect to the adapter state other than deterministic counters or caches that do not affect semantics. Assembly composes the prologue, the body, and an optional epilogue in a deterministic order for a fixed input.
+* Invocation routine provides a callable entry point that orchestrates initialization, instruction generation, and program assembly. The return value is defined by the adapter and can be the final artifact or a driver facing result. + +## 5. Processing Flow +Emission proceeds in the following canonical order +1. Query the index manager for the set of indices that the quantum data reference currently occupies and cache their count. Ordering is provided by the indexing subsystem and is read as needed. +2. Produce target prologue elements through the initialization routine. These elements can include pragmas, version markers, or allocation statements required before body emission. +3. Traverse the program block in a deterministic order. For each operation produce target instructions using the instruction generation routine. Instruction emission consults the symbol table when a mapping from abstract names to concrete identifiers is necessary. +4. Assemble the final program using the program assembly routine. Assembly composes the prologue, the body, and an optional epilogue in a deterministic order. Program assembly can return an empty artifact when the program has no target instructions. +5. Optionally pass the program to the evaluator for execution or submission. Side effects on the evaluator are explicit and occur only through the adapter handle. + +## 6. Resource and Index Discipline +* Indices originate from the memory subsystem. The adapter does not allocate or free indices and only queries and reads them. +* Emission reads the current assignment supplied by the indexing subsystem and uses it consistently. Mutation of the assignment during a single emission pass is outside the contract. Adapters prevent it by snapshot and validation or abort emission. +* Ordering of indices follows the order provided by the indexing subsystem. Instruction templates must not reorder bits or wires implicitly. Any reordering required by a target must be explicit. 
+* The quantum oriented stack is used only for structured scopes within emission. The adapter does not leak stack frames across calls. + +## 7. Error Model and Results +* The instruction generation routine returns a typed result on success and a typed handler on failure. Initialization returns a prologue tuple and program assembly returns a program artifact. Adapters avoid exceptions during normal flow and reserve raising for boundary integration scenarios. Unsupported instruction fallback is backend specific and may not be present. When no fallback exists, adapters raise a not implemented error. +* Construction may escalate a typed handler when the quantum data reference has no assigned index set. This protects against emitting instructions that reference unmanaged resources. +* Instruction generation reports invalid operations such as unsupported gates, arity mismatches, paradigm violations, or references to missing symbols. Errors carry structured context for diagnostics. +* Program assembly reports format violations such as conflicting declarations or missing prologue requirements for the chosen target. + +## 8. File Inventory +1. `__init__.py`: Package marker with no runtime behavior. +2. `abstract_qlang.py`: Defines the base adapter for quantum low level emission. The adapter aggregates a quantum data reference, a program block from the intermediate representation, an index manager, an evaluator handle, a quantum oriented stack, and a symbol table view. Construction queries the index manager to validate that the quantum data reference is in use and caches the number of indices. It specifies the four routines described in the interface contract section that concrete backends must implement. Instruction generation relies on a typed result container to return success and failure in a uniform way. 
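
The adapter contract described for `abstract_qlang.py` can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual base class: the names `BaseAdapter`, `ToyQasmAdapter`, `Ok`, `Error`, `init_program`, `gen_instr`, and `gen_program` are hypothetical, the index manager is reduced to a plain dictionary, the evaluator, stack, and symbol table inputs are omitted, and the error codes are placeholder strings. The emitted text follows OpenQASM 2.0 syntax only as a familiar example target.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Iterable, Sequence, Union


@dataclass(frozen=True)
class Ok:
    value: Any


@dataclass(frozen=True)
class Error:
    code: str    # placeholder code, not a real hhat_lang error code
    detail: str


Result = Union[Ok, Error]


class BaseAdapter(ABC):
    """Hypothetical base adapter; only the index-related inputs are modelled."""

    def __init__(self, qdata: str, in_use: dict[str, tuple[int, ...]]) -> None:
        # Construction fails when the quantum data has no assigned index set.
        if qdata not in in_use:
            raise RuntimeError(f"{qdata!r} has no assigned index set")
        self.qdata = qdata
        self.num_idxs = len(in_use[qdata])  # cached index count

    @abstractmethod
    def init_program(self) -> tuple[str, ...]: ...

    @abstractmethod
    def gen_instr(self, name: str, targets: Sequence[int]) -> Result: ...

    @abstractmethod
    def gen_program(self, prologue: Sequence[str], body: Sequence[str]) -> str: ...

    def __call__(self, ops: Iterable[tuple[str, Sequence[int]]]) -> Result:
        # Orchestrates initialization, instruction generation, and assembly.
        prologue = self.init_program()
        body: list[str] = []
        for name, targets in ops:
            result = self.gen_instr(name, targets)
            if isinstance(result, Error):
                return result  # failures travel as values, not exceptions
            body.extend(result.value)
        return Ok(self.gen_program(prologue, body))


class ToyQasmAdapter(BaseAdapter):
    GATES = {"h": 1, "x": 1, "cx": 2}  # gate name -> arity

    def init_program(self) -> tuple[str, ...]:
        return ("OPENQASM 2.0;", 'include "qelib1.inc";', f"qreg q[{self.num_idxs}];")

    def gen_instr(self, name: str, targets: Sequence[int]) -> Result:
        arity = self.GATES.get(name)
        if arity is None:
            return Error("unsupported-gate", name)
        if len(targets) != arity:
            return Error("arity-mismatch", f"{name} expects {arity}, got {len(targets)}")
        if any(t < 0 or t >= self.num_idxs for t in targets):
            return Error("invalid-target", f"{name} {tuple(targets)}")
        args = ", ".join(f"q[{t}]" for t in targets)
        return Ok((f"{name} {args};",))

    def gen_program(self, prologue: Sequence[str], body: Sequence[str]) -> str:
        # Prologue first, then the body, in the order the operations arrived.
        return "\n".join((*prologue, *body))
```

A call such as `ToyQasmAdapter("@q", {"@q": (0, 1)})([("h", (0,)), ("cx", (0, 1))])` yields an `Ok` wrapping the program text, while an unknown gate yields an `Error` value that callers can match on by code.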
diff --git a/python/src/hhat_lang/core/memory/README.md b/python/src/hhat_lang/core/memory/README.md new file mode 100644 index 0000000..863bc49 --- /dev/null +++ b/python/src/hhat_lang/core/memory/README.md @@ -0,0 +1,75 @@ +# Memory Layer Overview + +Runtime memory model for evaluation across classical and quantum computation. This directory specifies stack based frames for lexical and call scopes, a heap for dynamic storage, an index accounting facility for quantum resources, scope tracking keyed by stable identifiers, and orchestration that binds these mechanisms into a coherent manager for program execution. The design is value oriented and integrates with Core IR, the type system, and error semantics. + +## 1. Purpose +Provide precise memory control with deterministic behavior: +1. Track lifetimes through explicit scopes keyed by stable values derived from intermediate blocks and dynamic depth. +2. Model function frames with ordered parameters, argument type validation, return channels, and last in first out discipline. +3. Offer a heap for dynamic entries addressed by symbols with strict key validation. +4. Manage a fixed budget of indices for quantum operations with reservation by owner, allocation, and release. +5. Present a classical manager and a quantum aware extension that adds index accounting without changing core stack and heap semantics. + +## 2. Scope +1. Stable scope identifiers with process local determinism and equality comparability against integers for convenience in tables and traces. +2. Stack frames for general blocks and for function calls with two entry modes by position and by name, ordered insertion, and a return slot. +3. Heap entries indexed by symbols with typed retrieval and explicit freeing on scope exit. +4. Index accounting for a finite pool with tracking of available positions, allocated positions, resource declarations per owner, and an in use mapping from owner to positions. +5. 
Orchestration that creates and frees scopes, advances current scope, and exposes stack and heap to evaluators. + + +## 3. Core Concepts +**Scope value** +* A numeric value created from a stable function of a block identity and a depth counter. The value is deterministic within a process for the same inputs. It serves as the key for scope tables. Equality supports comparison with the same numeric form. A textual representation shortens the value for debugging. + +**Stack frame** +* An ordered mapping that holds declarations and values within a scope. Keys are symbols and qualified names and a header descriptor when the frame represents a function call. Entries may be declared without assignment and later filled. Retrieval yields either a stored value or a typed failure value that carries the missing key. +* A function frame validates argument types against a header descriptor. Two entry modes are supported. Position only mode consumes values in the declared order. Named mode consumes pairs of argument name and value. A dedicated channel stores the return and allows the caller to retrieve it before frame teardown. + +**Stack** +* A last in first out collection of frames. Frames are created by the evaluator at scope and call boundaries. The active frame is always the last one. Pushing a value associates a symbol with a container or associates a literal with itself. Membership tests query the active frame. Freeing removes the last frame. + +**Heap** +* A dictionary of dynamic entries addressed by symbols. Setting requires a symbol key and a container value. Getting returns the stored value or a typed failure value for an invalid key. Freeing removes an entry by key. + +**Index accounting** +* A manager for a fixed pool of indices used by quantum operations. 
Internal state consists of a double ended queue of available positions, a double ended queue of allocated positions, a resources map from owners to requested counts, and a mapping from owners to the positions currently in use. Owners are variable members or composite working entities. +* Reservation declares the future need of an owner by count. Allocation reads the declared count, assigns that many positions if sufficient capacity exists, and records ownership. Freeing returns positions to the available pool in deterministic order and updates counters. +* Operations return typed results. Allocation failure encodes both requested and available counts. Request for an unknown owner yields a typed failure value. Duplicate reservation for the same owner yields a typed failure value. Unknown conditions produce a generic typed failure value. No exceptions are used for normal flow in index operations. + +## 5. Processing Flow +* Create a manager from a block identity and a depth counter and establish the initial scope. The scope table is an ordered mapping keyed by scope values and each entry owns a distinct heap. +* On scope or call entry the evaluator creates a frame and may stage function arguments. +* For quantum programs declare index requirements per owner, request indices when execution reaches the owning operation, and free them upon completion. +* On scope or call exit retrieve any staged return then remove the frame and free the scope heap as required. + +## 6. Resource and Scope Discipline +**Lifetimes** +* The evaluator creates a frame when entering a scope and removes it on scope exit. Heaps are created per scope and are removed as a unit at scope exit. A return slot is consumed on retrieval and does not persist across frames. + +**Depth counter** +* The interpreter maintains a non negative depth counter that increases on call entry and decreases on return. 
Scope values capture the counter at creation time to distinguish nested and recursive scopes that share the same block identity. + +**Determinism** +* Insertion order in frames is preserved for iteration and printing. Heap iteration follows the host dictionary order and is not used for program meaning. Index allocation preserves the order in which positions are pulled from the available pool and returned upon freeing. Scope selection uses the last created scope as the current one. + +## 7. Function Entry and Return +**Entry preparation** +* A function frame receives a header descriptor. Arguments are provided either as a sequence of values in the declared order or as name value pairs. The frame validates types against the header and argument staging materializes the bindings. In development configurations a type mismatch terminates evaluation with a typed error, and other misuse may also terminate control. In production configurations control continues after successful staging. + +**Return handling** +* The callee writes the return value into the frame return slot. The caller retrieves it and clears the slot before the frame is freed. The return slot holds a single value. + +## 8. Quantum Index Lifecycle +**Reservation** +* The owner declares the count of required positions. The declaration succeeds only if the requested count does not exceed the remaining capacity after current allocations. Reservation does not reduce capacity. Ownership is established on allocation. + +**Allocation** +* A request reads the declared count for the owner. If the owner is known and sufficient capacity exists the manager assigns the positions and records the assignment under the owner. If capacity is insufficient a typed failure value reports both requested and maximum available counts. If the owner is unknown a typed failure value is returned. + +**Release** +* Freeing by owner returns positions to the available pool and updates counters.
The order of returned positions is preserved within a request. + +## 9. File Inventory +* `__init__.py`: Package marker without runtime behavior. +* `core.py`: Implements stack frames and stack for scoped storage with function aware behavior, heap for dynamic entries per scope, scope values and scope tables for lifetime control, index accounting for quantum resources with reservation and request semantics, and memory orchestration that binds these mechanisms into classical and quantum aware managers. diff --git a/python/src/hhat_lang/core/types/README.md b/python/src/hhat_lang/core/types/README.md new file mode 100644 index 0000000..b98c65e --- /dev/null +++ b/python/src/hhat_lang/core/types/README.md @@ -0,0 +1,87 @@ +# Types Layer Overview + +Type system foundation for classical and quantum data. This directory defines the kinds of type structures, the representation of size in bits and in qubit counts, the discipline for member composition across paradigms, a catalog of built in types, conversion rules among compatible built ins, and the interface by which types produce variable templates for the data layer. The design is value oriented and interoperates with the Core IR and symbol tables for cross module resolution. + +## 1. Purpose +Provide precise and inspectable typing constructs that +1. Classify types by structural kind and encode their semantics for membership and construction. +2. Represent size with explicit units for bits and qubits, including exact and bounded forms. +3. Enforce paradigm discipline so that classical declarations cannot contain quantum members while quantum declarations may contain classical members when allowed by structure. +4. Offer a variable creation protocol that yields templates used by the data layer to instantiate containers under a chosen mutability policy. +5. Supply a stable catalog of built in classical and quantum types with well defined sizes and conversion rules. + +## 2. Scope +1. 
Structural kinds for single member, record, enumeration, union and a reserved remote union. +2. Size descriptors for bits and for qubits with lower bound and optional upper bound. +3. Rules for member addition, temporary staging of forward referenced members, and validation of paradigm compatibility. +4. A callable interface on types that returns a variable template keyed by a name and a mutability policy. +5. A catalog of built in types for integers, booleans, and floating point values together with quantum counterparts measured in qubits. +6. Conversion relations among compatible built ins and a concrete integer to unsigned casting routine with overflow and negativity checks. +7. Utilities that expose structural kind classification and a minimal abstract base to avoid circular dependencies. + +## 3. Core Concepts +**Type identity** +* A type is named by a symbol or a composite symbol that carries a quantum marker when applicable. Name equality is value based within a process. The name is the stable handle used by symbol tables, imports, and variable templates. + +**Structural kinds** +* The directory defines distinct structural categories. A single member kind models an alias like structure that refers to exactly one member of the same structural kind. A record kind maps member names to member types. An enumeration kind maps names to alternatives. A union kind is reserved for disjoint alternatives with a shared storage model. A remote union kind is reserved for future quantum data across process boundaries. + +**Size semantics** +* Bit size is represented by an explicit integer value in bits. Quantum size is represented by a pair consisting of a lower bound and an optional upper bound measured in qubits. Quantum size supports deferred completion: if only a lower bound is known initially the upper bound can be computed later from members and cached on the descriptor. 
A constant for pointer size in bits provides a portable default when a structure does not specify a more precise size. Classical types carry a quantum size descriptor with minimum zero and a computed maximum of zero after resolution to support uniform handling by the quantum size resolver. + +**Membership discipline** +* Member addition observes two invariants. First the member must match the structural kind required by the container. Second a classical container must not accept a quantum member. A quantum container may accept classical members where the structure allows it. Violations produce typed error values for reporting through evaluators and tools. + +**Invocation and variables** +* Single member, record, enumeration, and built in single types implement a call protocol that accepts a variable name and a mutability policy and yields a variable template. Union, remote union, and array are reserved and do not implement variable construction. The template carries the declared type name, the structural description for members, and the requested policy. The data layer consumes this template to construct a concrete container and to enforce assignment and retrieval rules. + +**Temporary staging** +* During IR construction some member types may be declared in other files or in later positions. A staging area records these members as pairs of names and unresolved type references. Resolution is performed by compiler or linking logic using the type table and import resolution rather than by this directory. Only record structures stage unresolved members. Built in types and enumeration and union and array do not stage members. + +## 4. Structural Families +**Single member** +* Models an alias like structure with exactly one member of the same structural kind as the container. The internal mapping associates the container name to the referenced member name. Variable templates produced by this kind describe a single entry layout. 
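The membership discipline from Core Concepts applies to every family in this section. A minimal sketch of the paradigm check follows, assuming a typed error value is returned rather than raised. `TypeSketch`, `MemberError`, and the type names are hypothetical stand-ins, not the package's classes, and the structural kind check is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class MemberError:
    """Hypothetical typed error value returned instead of raising."""
    container: str
    member: str
    reason: str


@dataclass
class TypeSketch:
    """Hypothetical stand-in for a type structure (assumption, not the real class)."""
    name: str
    is_quantum: bool
    members: dict = field(default_factory=dict)  # dict preserves insertion order

    def add_member(
        self, member_name: str, member_type: "TypeSketch"
    ) -> Union["TypeSketch", MemberError]:
        # A classical container must not accept a quantum member.
        if member_type.is_quantum and not self.is_quantum:
            return MemberError(
                self.name, member_name, "quantum member in classical container"
            )
        # A quantum container may hold classical or quantum members.
        self.members[member_name] = member_type
        return self


u32 = TypeSketch("u32", is_quantum=False)
q_bool = TypeSketch("q_bool", is_quantum=True)

classical_record = TypeSketch("point", is_quantum=False)
quantum_record = TypeSketch("q_state", is_quantum=True)

rejected = classical_record.add_member("flag", q_bool)  # typed error value
accepted = quantum_record.add_member("count", u32)      # the container itself
```

Returning the error as a value keeps the check usable from evaluators and tools that propagate typed results instead of catching exceptions.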
+ +**Record** +* Associates member names to member types in an ordered mapping. Addition checks paradigm compatibility and preserves insertion order for deterministic iteration and printing. A staging operation accepts unresolved member pairs that are later validated and committed into the mapping. + +**Enumeration** +* Associates alternatives keyed by names. Alternatives can be provided by name or by a reference to another type object. Addition checks paradigm compatibility at the level of names. Variable templates describe the set of alternatives for downstream consumers. + +**Union and remote union** +* Reserved kinds for disjoint alternatives and for remote quantum composition. These kinds are not specified and define no member operations. + +**Array** +* Reserved for repeated element structures. The current implementation records an array flag and size defaults and does not define member operations, length metadata, or a structural kind enumerant. + +## 5. Built in Catalog and Conversion +**Built in catalog** +* Classical types include signed and unsigned integers with sizes 16, 32, and 64 bits, a boolean of 8 bits, and floating point values of 32 and 64 bits. Quantum types include a boolean measured as one qubit, fixed width quantum integers that require two, three, or four qubits, and a generic quantum integer whose quantum size is bounded between the smallest and largest fixed width quantum integers. Quantum built ins use the pointer size constant as their bit size. + +**Compatibility and casting** +* A relation maps three generic built ins to their compatible targets. The generic integer maps to all signed and unsigned integer widths in this directory. The generic floating point maps to both floating point widths. The generic quantum integer maps to all fixed width quantum integers available here. +* A casting routine implements integer to unsigned conversion for both literal values and variable containers. 
It rejects negative values and values that exceed the representable range computed as two to the power of the bit size. Errors are reported as typed values that distinguish negativity and overflow from general cast incompatibility. +* Only integer to unsigned conversion is implemented at present. Other relations are compatibility specifications and not casting implementations. + +## 6. Size Resolution +**Bit size** +* The bit size descriptor is a simple wrapper around an integer in bits. It is set on construction for built in types and defaults to the pointer size constant for user declared structures when a specific size is not known. + +**Quantum size** +* The quantum size descriptor stores a minimum and an optional maximum. A resolver derives a maximum after members are known. For record like composition the intended maximum equals the sum of member maxima. For enumeration the intended maximum equals the maximum across alternatives. The present implementation computes the sum across members for all composite kinds which is a conservative upper bound for enumeration. The first resolution fixes the maximum on the descriptor and subsequent calls return the cached value. Quantum structures must carry a descriptor and missing descriptors are errors. + +**Complexity** +* Resolution runs in time linear in the number of referenced members visited in the intermediate representation graph. + +**Resolution phases** +* Compile time functions are reserved for computing bit and quantum sizes from declarations and from the type table when full information is present. Runtime functions are reserved for interpreting dynamic sizes if the language gains features that depend on runtime values. Placeholders exist for these routines and are intended to be completed when the language specifies such features. + +## 7. File Inventory +1. `__init__.py`: Defines a constant with the size of a pointer in bits. 
This value is used as the default bit size for built in quantum types and for user declared structures when a precise size is not known. +2. `abstract_base.py`: Declares the abstract base for type structures together with size descriptors for bits and qubits. Provides storage for the type name, the structural kind, quantum status, built in status, array flag, the ordered member mapping, and a staging area used by records for unresolved members. Exposes iteration over ordered members and membership queries over the internal mapping. The abstract call protocol returns a variable template and defers container construction to the data layer. +3. `builtin_base.py`: Defines the concrete built in structural form for a single member type. The form fixes the structural kind, sets bit size and quantum size defaults, and exposes a casting entry point that delegates to conversion routines. Built in types are fully specified at declaration time and reject staging of unresolved members. The call protocol returns a variable template with a single entry layout. This file also introduces classical and quantum integer families grouped for convenient checks. +4. `builtin_types.py`: Constructs the catalog of built in classical and quantum types with explicit bit sizes and quantum sizes. The generic quantum integer receives a quantum size interval whose lower bound equals the smallest fixed width quantum integer and whose upper bound equals the largest fixed width quantum integer. A mapping from symbolic names to the corresponding built in instances enables table driven lookup. +5. `builtin_conversion.py`: Specifies compatibility relations among generic built ins and implements the integer to unsigned casting routine. The routine supports literals and variable containers, computes the maximum representable value from the target bit size, and returns typed errors for negativity, overflow, or general incompatibility. +6. 
`core.py`: Implements concrete structural families for single member, record, enumeration, union, remote union, and array. Member addition enforces kind matching and paradigm discipline. Record types preserve insertion order, support staging of unresolved members, and produce variable templates carrying ordered layouts. Enumeration types accept alternatives by name or by reference to other types while enforcing paradigm compatibility at the name level. Union, remote union, and array are reserved for future completion. A helper tests membership validity by checking classical versus quantum constraints. +7. `utils.py`: Defines the finite classification of structural kinds and a minimal abstract base used to avoid circular imports. A reserved kind is included for remote union to enable future expansion without breaking existing code. +8. `resolve_sizes.py`: Provides a resolver for quantum size that walks the intermediate representation graph from a declaring node, resolves member types through the symbol table, computes the sum of member maxima, and fixes the maximum on the descriptor. Placeholders exist for compile time and runtime size resolution for both bits and qubits.
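
The quantum size resolution described for `resolve_sizes.py` (sum of member maxima, fixed on the descriptor at first resolution) can be sketched as follows. This is an illustration under stated assumptions, not the module's code: `QSize`, `TypeNode`, `resolve_qsize_max`, the plain dictionary standing in for the symbol table, and the type names are all hypothetical. The sketch also assumes that a type without members resolves its maximum to its minimum, and it does not guard against cyclic type references.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QSize:
    """Hypothetical quantum size descriptor: minimum and optional maximum, in qubits."""
    minimum: int
    maximum: Optional[int] = None


@dataclass
class TypeNode:
    """Hypothetical type entry; members map member names to type names."""
    name: str
    qsize: QSize
    members: dict = field(default_factory=dict)


def resolve_qsize_max(type_name: str, table: dict) -> int:
    """Return the maximum qubit count for a type, caching it on the descriptor."""
    node = table[type_name]
    if node.qsize.maximum is not None:
        return node.qsize.maximum  # already fixed by an earlier resolution
    if node.members:
        # Composite kinds: sum of member maxima.
        total = sum(resolve_qsize_max(m, table) for m in node.members.values())
    else:
        # Assumption: a member-less type resolves its maximum to its minimum.
        total = node.qsize.minimum
    node.qsize.maximum = total  # fix the maximum on the descriptor
    return total


table = {
    "u32": TypeNode("u32", QSize(0)),            # classical: zero qubits
    "q_bool": TypeNode("q_bool", QSize(1, 1)),
    "q_int3": TypeNode("q_int3", QSize(3, 3)),
    "q_pair": TypeNode(
        "q_pair", QSize(0), {"flag": "q_bool", "value": "q_int3", "count": "u32"}
    ),
}

total = resolve_qsize_max("q_pair", table)  # 1 + 3 + 0
```

Because the result is stored on the descriptor, a second call for the same type returns the cached maximum without walking the members again.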