Skip to content

Conversation

@xmnlab
Copy link
Contributor

@xmnlab xmnlab commented Nov 14, 2025

Pull Request description

How to test these changes

  • ...

Pull Request checklists

This PR is a:

  • bug-fix
  • new feature
  • maintenance

About this PR:

  • it includes tests.
  • the tests are executed on CI.
  • the tests generate log file(s) (path).
  • pre-commit hooks were executed locally.
  • this PR requires a project documentation update.

Author's checklist:

  • I have reviewed the changes and it contains no misspelling.
  • The code is well commented, especially in the parts that contain more
    complexity.
  • New and old tests passed locally.

Additional information

Reviewer's checklist

Copy and paste this template for your review's note:

## Reviewer's Checklist

- [ ] I managed to reproduce the problem locally from the `main` branch
- [ ] I managed to test the new changes locally
- [ ] I confirm that the issues mentioned were fixed/resolved .

@yuvimittal yuvimittal closed this Dec 19, 2025
@yuvimittal yuvimittal reopened this Dec 19, 2025
@yuvimittal yuvimittal marked this pull request as ready for review December 19, 2025 04:26
@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/base.py

  • Potential AttributeError in run(): if self.output_file is unset, accessing it raises before your RuntimeError. Use getattr to provide a clear error. (L.96)
    def run(self) -> str:
    """Run the generated executable."""
    output: str | None = getattr(self, "output_file", None)
    if not output:
    raise RuntimeError("No built output to run.")
    return run_command([output])

  • Breaking API risk: renaming translate parameter from expr to node can break callers using keyword args (translate(expr=...)). Keep the original name or accept both and normalize. E.g. simplest is to keep expr. (L.78)
    def translate(self, expr: astx.AST) -> str:
    """Transpile ASTx to LLVM-IR."""
    return self.translator.translate(expr)

  • Minor but correctness-semantic: prefer NotImplementedError over a generic Exception in the abstract translate default to better signal intent. (L.49)
    def translate(self, expr: astx.AST) -> str:
    """Translate a single AST node."""
    skip_unused(expr)
    raise NotImplementedError("Not implemented yet.")


src/irx/builders/llvmlite_arrow.py

  • Correctness: ArrowArray lifetime is stack-bound and release is NULL. If the ArrowArray escapes the function, all pointers (array, buffers, validity, value) become dangling and release=NULL violates the Arrow C Data Interface (release must be non-NULL while valid). At minimum, either ensure it never escapes or move these to heap and provide a release function. Suggest adding helpers and wiring a no-op (or freeing) release first, then consider heap allocation.

    • Replace NULL release with a defined function (L.177-L.183):
      def _get_release_nop(self) -> ir.Function:
      "Create or fetch a no-op ArrowArray release function."
      mod = self._llvm.module
      fn_ty = ir.FunctionType(ir.VoidType(), [self._arrow_array_ty.as_pointer()])
      fn = mod.globals.get("__arrow_release_nop")
      if fn is None:
      fn = ir.Function(mod, fn_ty, name="__arrow_release_nop")
      b = ir.IRBuilder(fn.append_basic_block(name="entry"))
      b.ret_void()
      return fn

      Use:

      rel_fn = self._get_release_nop()
      ib.store(rel_fn, fld(8)) # (L.181)
    • If the array can escape, allocate on heap (L.114-L.129):
      def _heap_alloc(self, ty: ir.Type, name: str) -> ir.Value:
      "malloc + bitcast helper."
      mod = self._llvm.module
      ib = self._llvm.ir_builder
      i8p = ir.IntType(8).as_pointer()
      i64 = ir.IntType(64)
      malloc_ty = ir.FunctionType(i8p, [i64])
      malloc = mod.globals.get("malloc") or ir.Function(mod, malloc_ty, name="malloc")
      size = ir.Constant(i64, self._llvm.target_data.abi_sizeof(ty))
      raw = ib.call(malloc, [size], name=f"{name}.raw")
      return ib.bitcast(raw, ty.as_pointer(), name=name)

      Then use _heap_alloc for arr_ptr, buffers_slot, valid_slot, value_slot. Also provide a matching release that frees them.

  • Correctness: Identified struct type redefinition. ir.global_context.get_identified_type("struct.ArrowArray") persists across visitors; calling set_body multiple times can raise if already defined. Guard the set_body (L.63-L.74):
    def _init_arrow_types(self) -> None:
    "Initialize Arrow C Data Interface types once."
    ctx = ir.global_context
    self._arrow_array_ty = ctx.get_identified_type("struct.ArrowArray")
    if not self._arrow_array_ty.is_opaque:
    return
    # ... then set_body as you do now

  • Behavior regression: Overriding the generic visit to raise breaks base visitor fallback for unhandled nodes. Delegate to super to preserve behavior (L.85-L.89):
    def visit(self, node: astx.AST) -> None:
    "Fallback to base visitor."
    return super().visit(node)

  • Resource leak: NamedTemporaryFile is created and never used/removed; a stray tmp file remains. Write the object directly to a NamedTemporaryFile with suffix ".o" and use its name (L.213-L.218):
    with tempfile.NamedTemporaryFile(suffix=".o", delete=False) as objf:
    objf.write(obj)
    obj_path = objf.name


@yuvimittal yuvimittal marked this pull request as draft December 20, 2025 03:37
@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/base.py

  • Potential AttributeError in run(): accessing self.output_file when it’s not set will raise AttributeError instead of your intended RuntimeError. Use getattr to guard. (L.96)
    def run(self) -> str:
    """Run the generated executable."""
    output: str | None = getattr(self, "output_file", None)
    if not output:
    raise RuntimeError("No built output to run.")
    return run_command([output])

  • Breaking API: renaming translate’s parameter from expr to node can break callers using keyword arguments (expr=...). Recommend keeping expr. (L.80, L.82)
    def translate(self, expr: astx.AST) -> str:
    """Transpile ASTx to LLVM-IR."""
    return self.translator.translate(expr)

  • Raise NotImplementedError instead of a generic Exception in the abstract translate stub for correct semantics and tooling. (L.49)
    def translate(self, expr: astx.AST) -> str:
    """Translate ASTx to IR.
    """
    skip_unused(expr)
    raise NotImplementedError("Not implemented yet.")


src/irx/builders/llvmlite_arrow.py

  • Correctness: set_body on an IdentifiedStructType will raise if called more than once in the same global context. _init_arrow_types() is invoked per visitor (and you re-instantiate the visitor per build), so this will blow up on the second build in the same process. Add a guard before set_body. (L.67)
    Suggested change:
    def _init_arrow_types(self) -> None:
    """Initialize Arrow C Data Interface types once per global context."""
    ctx = ir.global_context
    self._arrow_array_ty = ctx.get_identified_type("struct.ArrowArray")
    # Already defined? No-op to avoid redefinition.
    if getattr(self._arrow_array_ty, "elements", ()):
    return
    ...

  • Correctness (Arrow C Data Interface semantics): release == NULL denotes an already-released array; consumers must treat it as invalid. You’re constructing a “valid” array but populating release with NULL. Provide a non-NULL release and manage lifetime accordingly. (L.181-L.185)
    Suggested change:
    def _declare_arrow_release(self) -> ir.Function:
    """Declare/define a release(ArrowArray*) callback and return its symbol."""
    ...

    Then store its pointer instead of NULL at fld(8).

  • Memory lifetime/UB: All ArrowArray storage and its buffers (validity byte and i32 value) are stack-allocated. If the result escapes the function (e.g., via result_stack and a return), this becomes a dangling pointer. Either:

    • Keep it strictly local and do not let arr_ptr escape (avoid appending to result_stack), or
    • Heap-allocate ArrowArray and buffers and provide a release that frees them. (L.118, L.123, L.128, L.132, L.190)
      Suggested change:
      def _heap_alloc(self, ty: ir.Type, name: str) -> ir.Instruction:
      """Return heap-allocated pointer for Arrow lifetime (via malloc)."""
      ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants