Commit 431d0d8

feat: add C backend and improve compiler validation
1 parent d595118 commit 431d0d8

33 files changed

Lines changed: 4065 additions & 221 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -39,6 +39,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - Populated by preprocessor, read by backends

 ### Changed
+- **Language Documentation Completion (Site)**
+  - Expanded `site/src/pages/Language.tsx` into a full reference covering lexical rules, literals, declarations, expressions, control flow, functions, memory, generics, modules, intrinsics, operators, and grammar quick-reference.
+  - Added explicit implementation-state callouts linking language docs to current semantic gaps tracked in `MISSING_FEATURES.md` / site status.
+
 - **Current Test Baseline**
   - `PYTHONPATH=. uv run pytest` now reports `1039 passed, 7 failed, 0 skipped`.
   - Remaining failures are documented in `MISSING_FEATURES.md` and mirrored in docs/site status pages.

MISSING_FEATURES.md

Lines changed: 38 additions & 54 deletions
@@ -1,63 +1,47 @@
-# A7 Compiler — Missing Features Snapshot
+# A7 Compiler — Language-Core Gap Snapshot

-**Compiler Status**: Full pipeline runs (Tokenizer → Parser → Semantic → Preprocessor → Zig Codegen).
-**Current Test Status** (`PYTHONPATH=. uv run pytest`, 2026-02-21): **1039 passed, 7 failed, 0 skipped**.
+**Compiler Status**: Full pipeline runs (Tokenizer -> Parser -> Semantic -> Preprocessor -> Codegen).
+**Current Test Status** (`PYTHONPATH=. uv run pytest -q`, 2026-02-24): **1067 passed, 0 failed, 0 skipped**.
 **Examples**: 36/36 pass end-to-end compile + build + run + output verification.

 ---

-## Newly Unskipped Semantic Gaps
-
-The 9 previously skipped semantic tests were unskipped. Two now pass; seven fail and define the current missing work. Those failures collapse into six implementation areas below.
-
-1. **Match expressions are parsed but not type-checked**
-   - Failing test: `test/test_semantic_control_flow.py:434`
-   - Current error: `Unknown expression kind: NodeKind.MATCH_EXPR`
-   - Needed:
-     - Add `MATCH_EXPR` handling in `TypeCheckingPass._visit_expression_impl`.
-     - Type-check each case arm expression and `else` expression.
-     - Enforce arm type compatibility and return unified expression type.
-
-2. **`@type_set(...)` in value context does not parse correctly**
-   - Failing test: `test/test_semantic_generics.py:159`
-   - Current error: `Expected expression` at first type argument token (`i8`).
-   - Needed:
-     - Treat `@type_set` as a type-taking builtin when parsed as expression, or
-     - Parse `@type_set(...)` declarations through a dedicated type-set declaration path.
-
-3. **Generic arithmetic in function bodies is too strict without constraint flow**
-   - Failing test: `test/test_semantic_generics.py:176`
-   - Current error: `Requires numeric type` for `$T * 2`.
-   - Needed:
-     - Propagate generic constraints (or inferred concrete type bindings) into expression checking.
-     - Allow arithmetic when `$T` is known numeric by constraint or call-site inference.
-
-4. **Generic struct literal field checks do not substitute type arguments**
-   - Failing test: `test/test_semantic_generics.py:248`
-   - Current error: `Type mismatch: expected '$T', got 'i32'`.
-   - Needed:
-     - Instantiate/substitute struct field types for `Pair(i32, string){...}` before field validation.
-
-5. **Field access on generic struct instances is unresolved**
-   - Failing tests:
-     - `test/test_semantic_generics.py:263`
-     - `test/test_semantic_generics.py:434`
-   - Current error: `Cannot access field on non-struct type: got 'Box(i32)'` / `Node(i32)`.
-   - Needed:
-     - Resolve `GenericInstanceType` to concrete `StructType` during field access.
-     - Support recursive generic instantiation safely (cycle-aware resolution for recursive types).
-
-6. **Literal initialization for generic locals lacks inference/coercion**
-   - Failing test: `test/test_semantic_generics.py:311`
-   - Current error: `Type mismatch: expected '$T', got 'i32'` for `total: $T = 0`.
-   - Needed:
-     - Permit numeric literal initialization for generic numeric variables, or
-     - Specialize generic function body typing from call-site type mapping before local checks.
+## Recently Completed (Language Core)
+
+1. `match` expressions are type-checked and participate in expression typing.
+2. `@type_set(...)` parses in value context.
+3. Generic arithmetic and generic local literal initialization are relaxed where valid.
+4. Generic struct literal field checks substitute concrete type arguments.
+5. Field access resolves concrete struct layout for generic instances.
+6. Match semantics now enforce:
+   - pattern type compatibility with the scrutinee,
+   - bool/enum exhaustiveness (or explicit `else` / wildcard),
+   - wildcard pattern parsing (`case _:`),
+   - return-path correctness for exhaustive enum/bool `match` without `else`.
+
+---
+
+## Remaining Language-First Gaps
+
+1. **`fall` statement semantics**
+   - `fall` is parsed (`NodeKind.FALL`) but not yet validated or lowered in semantic/codegen passes.
+
+2. **Advanced match diagnostics**
+   - No overlap/redundancy diagnostics for case patterns.
+   - No unreachable-branch detection for wildcard-first or fully-covered prior patterns.
+
+3. **Memory/lifetime model**
+   - Current validation covers basic `del` reference checks.
+   - Ownership/borrow-style lifetime guarantees are not implemented.
+
+4. **Generic constraint internals**
+   - Inline type-set constraint resolution in `src/generics.py` is still placeholder-level (`resolve_generic_constraint`).
+
+5. **Backend semantic parity hardening**
+   - Core conformance is green, but differential/backend-equivalence checks should be expanded and kept mandatory for new language features.

 ---

-## Deferred (Still Planned)
+## Out of Scope for This Snapshot

-1. Labeled loops (`outer: for ...`) with syntax disambiguation.
-2. Array-programming stdlib (tensor/broadcast/linear algebra features).
-3. Alternative backends (C/native) using `src/backends/base.py`.
+- Package ecosystem, registry/distribution workflows, and broader tooling are intentionally secondary to language-core correctness.
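
Reviewer note: the bool/enum exhaustiveness rule listed under "Recently Completed" in `MISSING_FEATURES.md` (a `match` is accepted when every variant is covered, or when an `else` / wildcard arm is present) can be illustrated with a minimal sketch. This is not the compiler's code: `check_exhaustive`, `MatchCheck`, and the string-based pattern representation are hypothetical names chosen for illustration, and the real pass presumably works on AST nodes and resolved types.

```python
from dataclasses import dataclass

WILDCARD = "_"  # hypothetical stand-in for a `case _:` pattern


@dataclass(frozen=True)
class MatchCheck:
    exhaustive: bool
    missing: tuple[str, ...]  # variants no arm covers, in declaration order


def check_exhaustive(variants, patterns, has_else=False):
    """Return whether `patterns` cover every value in `variants`.

    `variants` lists all values of the scrutinee type: the enum's members,
    or ("true", "false") for bool. An `else` arm or a wildcard pattern
    covers whatever the explicit patterns leave out.
    """
    if has_else or WILDCARD in patterns:
        return MatchCheck(True, ())
    covered = set(patterns)
    missing = tuple(v for v in variants if v not in covered)
    return MatchCheck(not missing, missing)


color = ("Color.Red", "Color.Green", "Color.Blue")
result = check_exhaustive(color, ["Color.Red", "Color.Green"])
print(result.exhaustive, result.missing)
```

With this shape, the "return-path correctness" item follows naturally: a `match` without `else` can count as returning on all paths only when the check reports `exhaustive` and every arm returns.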

README.md

Lines changed: 15 additions & 7 deletions
@@ -4,7 +4,7 @@ This whole application was vibecoded using Codex and Claude Code.

 A Python-based compiler for A7, a statically-typed systems programming language. A7 combines the simplicity of C-style syntax with modern features like generics, type inference, and property-based pointer operations.

-The compiler features a complete pipeline: tokenizer, parser, 3-pass semantic analysis, AST preprocessing, and Zig code generation.
+The compiler features a complete pipeline: tokenizer, parser, 3-pass semantic analysis, AST preprocessing, and pluggable code generation backends (Zig and C).

 ## Inspired By

@@ -30,12 +30,18 @@ uv sync

 ## Usage

-Compile an A7 program to Zig:
+Compile an A7 program to Zig (default backend):
 ```bash
 uv run python main.py examples/001_hello.a7
 # Output: examples/001_hello.zig
 ```

+Compile an A7 program to C:
+```bash
+uv run python main.py --backend c examples/001_hello.a7
+# Output: examples/001_hello.c
+```
+
 Modes and output formats:
 ```bash
 uv run python main.py --mode tokens examples/006_if.a7 # Tokens only
@@ -60,20 +66,21 @@ PYTHONPATH=. uv run pytest # All tests
 PYTHONPATH=. uv run pytest test/test_tokenizer.py # Specific test file
 PYTHONPATH=. uv run pytest -k "generic" -v # Targeted tests
 uv run python scripts/verify_examples_e2e.py # Compile/build/run + output checks for all examples
+uv run python scripts/verify_examples_e2e_c.py # Same flow via C backend + zig cc
 uv run python scripts/verify_error_stages.py # Error-stage audit across modes and formats
 ```

 ## Compilation Pipeline

 ```
-Source (.a7) → Tokenizer → Parser → Semantic Analysis (3-pass) → AST Preprocessing → Zig Codegen → Output (.zig)
+Source (.a7) → Tokenizer → Parser → Semantic Analysis (3-pass) → AST Preprocessing → Backend Codegen (Zig/C) → Output (.zig/.c)
 ```

 1. **Tokenizer**. Lexes source into tokens. Handles single-token generics (`$T`), nested comments, and number formats.
 2. **Parser**. Uses recursive descent with precedence climbing. Parses all A7 constructs.
 3. **Semantic Analysis**. Runs name resolution, type checking with inference, and control flow validation.
 4. **AST Preprocessing**. Runs 9 sub-passes: sugar lowering, stdlib resolution, mutation and usage analysis, type inference, shadowing resolution, function hoisting, and constant folding.
-5. **Zig Code Generation**. Translates AST to valid Zig source code.
+5. **Backend Code Generation**. Translates AST to valid Zig or C source code.

 All AST traversals are iterative with no recursion. The pipeline works with Python's recursion limit set to 100.

@@ -86,16 +93,17 @@ All AST traversals are iterative with no recursion. The pipeline works with Pyth
 - **Memory**: Property-based pointer syntax (`.adr`, `.val`), new/delete, defer cleanup
 - **Imports**: Module system with named imports, using imports, aliased imports
 - **Generics**: Type parameters (`$T`), constraints, type sets, generic structs
-- **Code Generation**: Full A7 → Zig translation with smart var/const inference, function hoisting, shadowing prevention
+- **Code Generation**: A7 → Zig and A7 → C backends
 - **Standard Library**: Registry with io and math modules, backend-specific mappings
 - **Error Messages**: Rich formatting with source context and structured error types

 ## Project Status

-- **983 tests passing**, 9 skipped
-- **36/36 examples** pass end-to-end compile + build + run + golden-output verification
+- Test status depends on current branch state. Check with `PYTHONPATH=. uv run pytest --tb=no -q`.
+- Example end-to-end verification is available via `uv run python scripts/verify_examples_e2e.py`.
 - Parser is 100% complete for the A7 specification
 - Zig backend handles all AST node types
+- C backend targets C11 and is validated with `zig cc`

 ## Learn More

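
Reviewer note: the README's claim that all AST traversals are iterative (so the pipeline tolerates a very low recursion limit) usually comes down to replacing the call stack with an explicit stack. The sketch below shows that general technique only. `Node` and `walk_preorder` are hypothetical names and are not taken from the a7-py source, whose node classes and visitor structure are not shown in this commit.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list["Node"] = field(default_factory=list)


def walk_preorder(root):
    """Yield nodes in pre-order using an explicit stack instead of recursion."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(node.children))


# Build a chain 5000 nodes deep; a naive recursive walk would exceed
# Python's default recursion limit on this input.
root = Node("n0")
current = root
for i in range(1, 5000):
    child = Node(f"n{i}")
    current.children.append(child)
    current = child

visited = sum(1 for _ in walk_preorder(root))
print(visited)
```

The cost of this design is that traversal state (for example, "children already processed") has to be carried explicitly on the stack when post-order work is needed, which a recursive visitor gets for free.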
File renamed without changes.

docs/SPEC.md

Lines changed: 17 additions & 23 deletions
@@ -1,12 +1,12 @@
 # A7 Programming Language Specification

-> **Implementation Status (2026-02-21)**: This specification describes the complete A7 language design. The current Python implementation (`a7-py`) provides:
+> **Implementation Status (2026-02-24)**: This specification describes the complete A7 language design. The current Python implementation (`a7-py`) provides:
 > -**Tokenizer/Lexer**: implemented
 > -**Parser**: implemented
 > -**AST generation**: implemented
-> -**Semantic pipeline**: implemented (name resolution, type checking, semantic validation), with known generic/match-expression gaps
+> -**Semantic pipeline**: implemented (name resolution, type checking, semantic validation), including match expressions, pattern type checks, and bool/enum exhaustiveness checks
 > -**Zig code generation**: implemented
-> - 📊 **Current tests**: `1039 passed, 7 failed, 0 skipped` after unskipping previously deferred semantic tests
+> - 📊 **Current tests**: `1067 passed, 0 failed, 0 skipped`
 >
 > See `MISSING_FEATURES.md` for detailed feature status and `CLAUDE.md` for development guide.

@@ -534,7 +534,7 @@ match color {
     }
     case Color.Green: {
         print("Green")
-        fall // Explicit fallthrough
+        fall // Parsed; full semantic/codegen fallthrough behavior is still being finalized
     }
     case Color.Blue: {
         print("Green or Blue")
@@ -1787,7 +1787,7 @@ match color {
     }
     case Color.Green: {
         print("Green")
-        fall // Explicit fallthrough
+        fall // Parsed; full semantic/codegen fallthrough behavior is still being finalized
     }
     case Color.Blue: {
         print("Green or Blue")
@@ -2831,34 +2831,28 @@ A7 supports the full ASCII character set (0-127) only. Characters outside this r

 ## Appendix E: Implementation Status (a7-py)

-Status snapshot (2026-02-21), based on running `PYTHONPATH=. uv run pytest` after unskipping deferred semantic tests:
+Status snapshot (2026-02-24), based on running `PYTHONPATH=. uv run pytest -q`:

 - ✅ Full compiler pipeline exists (tokenizer, parser, semantic passes, AST preprocessing, Zig backend).
 - ✅ Examples continue to pass end-to-end verification.
-- 📊 Test status: **1039 passed, 7 failed, 0 skipped**.
+- 📊 Test status: **1067 passed, 0 failed, 0 skipped**.

 ### E.1 Current Open Gaps

-1. **Match expressions in semantic type checking**
-   - `MATCH_EXPR` nodes are parsed but not yet handled by `TypeCheckingPass`.
+1. **`fall` semantic/codegen behavior**
+   - `fall` parses, but full semantic validation and backend lowering are still pending.

-2. **Type-set builtin in declaration expression context**
-   - `@type_set(...)` currently fails in top-level constant declarations due to expression parsing path for type arguments.
+2. **Advanced match diagnostics**
+   - Exhaustiveness for bool/enum is implemented, but overlap/redundancy diagnostics are still incomplete.

-3. **Constraint-aware generic arithmetic**
-   - Generic functions using arithmetic on `$T` fail without constraint propagation/type knowledge.
+3. **Memory/lifetime model depth**
+   - Basic `new`/`del` validation exists; full ownership/lifetime analysis is not complete.

-4. **Generic struct field substitution**
-   - Struct literal/type checks still compare against unresolved `$T/$U` fields in some paths.
+4. **Generic constraint internals**
+   - Some generic-constraint helper internals remain placeholder-level and need completion.

-5. **Field access on instantiated generic structs**
-   - `Box(i32)`/`Node(i32)` remain `GenericInstanceType` in field-access checks instead of being resolved to concrete struct field layouts.
-
-6. **Literal initialization for generic locals**
-   - Initializers like `total: $T = 0` need inference/coercion strategy for generic numeric contexts.
-
-7. **Recursive generic instantiation**
-   - Recursive generic structs need cycle-safe type resolution and substitution.
+5. **Backend semantic parity hardening**
+   - Differential parity checks across backends should continue expanding for new language features.

 ### E.2 Source Of Truth


docs/archive/README.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Archive
+
+This folder stores old or non-essential project artifacts that were moved out of the repository root to keep the main workspace focused.
+
+Moved on 2026-02-21:
+
+- `ERROR_ANALYSIS.md` -> `docs/ERROR_ANALYSIS.md`
+- `run.sh` -> `docs/archive/tools/run.sh`
+- `fix_types.py` -> `docs/archive/tools/fix_types.py`
+- `__pycache__/main.cpython-313.pyc` -> `docs/archive/pycache/__pycache__/main.cpython-313.pyc`
+- `site/` -> `docs/archive/site/`
+
+Notes:
+
+- `docs/archive/site/node_modules` was intentionally removed because it is generated dependency output.
File renamed without changes.
