Author: Bevan Hunt, BBA
Date: February 21, 2025
Python’s simplicity and widespread adoption make it a cornerstone of software development, yet its dynamic nature and limited concurrency support introduce runtime errors, performance bottlenecks, and concurrency-related bugs that hinder its use in critical and high-performance applications. We introduce PySafe, a superset of Python that augments the language with advanced safety, verification, and concurrency features while preserving its usability. PySafe integrates a borrow checker for memory safety, contracts for behavior specification, strict typing for error prevention, enhanced tuples for structured data, a prover for formal verification, safe concurrency mechanisms to prevent data races and deadlocks, and compilation to LLVM IR via SSA for performance. This paper details PySafe’s syntax and semantics, outlines its implementation, and demonstrates its benefits through examples, positioning PySafe as a robust extension of Python for modern, reliable systems.
Python’s success stems from its intuitive syntax, rich ecosystem, and dynamic flexibility, making it a preferred choice for rapid development. However, these strengths come with trade-offs: runtime type errors, memory misuse, suboptimal performance in compute-intensive tasks, and limited support for safe parallelism due to the Global Interpreter Lock (GIL). These limitations restrict Python’s applicability in safety-critical systems, high-performance computing, and concurrent applications.
To address these challenges, we propose PySafe, a Python superset that introduces:
- Borrow Checker: Ensures memory safety by tracking resource ownership and borrowing.
- Contracts: Specify preconditions, postconditions, and invariants for verification.
- Strict Typing: Enforces type safety with mandatory annotations and inference.
- Enhanced Tuples: Add named fields and pattern matching for structured data.
- Prover: Enables formal verification of program properties.
- Safe Concurrency: Prevents data races and deadlocks in multi-threaded execution.
- LLVM IR Compilation via SSA: Optimizes performance through native code generation.
PySafe maintains compatibility with existing Python code, allowing developers to adopt its features incrementally.
This paper contributes:
- A Python-compatible language with comprehensive safety, concurrency, and performance enhancements.
- A detailed design of its syntax and semantics.
- An implementation strategy for its compiler.
- Use cases illustrating its advantages.
The paper is structured as follows: Section 2 reviews related work, Section 3 describes the language design, Section 4 explains the implementation, Section 5 evaluates PySafe through examples, and Section 6 discusses future directions.
PySafe builds on concepts from several languages and tools:
- Rust: Pioneered borrow checking for memory safety and safe concurrency without garbage collection [1].
- Eiffel: Introduced design by contract with formal specifications [2].
- mypy: Offers static type checking for Python via optional annotations [3].
- Dafny: Integrates contracts and formal verification with a prover [4].
- Numba: Uses LLVM to JIT-compile Python for performance [5].
- Cython: Adds static typing to Python for speed [6].
- PyPy: A JIT-compiled Python implementation [7].
- Mojo: A Python superset by Modular with borrow checking and optimizations [8].
Rust’s ownership model ensures thread safety, while Go’s goroutines and channels (though not cited here) inspire concurrency alternatives. Python’s GIL limits multi-threading, relying on multiprocessing for parallelism, which lacks fine-grained safety. PySafe uniquely combines these features into a Python-compatible framework.
[1] Matsakis, N. D., & Klock, F. S. (2014). The Rust language. ACM SIGAda Ada Letters, 34(2), 103-104.
[2] Meyer, B. (1992). Applying "design by contract". Computer, 25(10), 40-51.
[3] Lehtosalo, J. (2016). Static types for Python. ACM SIGPLAN Python Symposium.
[4] Leino, K. R. M. (2010). Dafny: An automatic program verifier. LPAR, 6138, 348-370.
[5] Lam, S. K., et al. (2015). Numba: A LLVM-based Python JIT compiler. LLVM-HPC, 1-6.
[6] Behnel, S., et al. (2011). Cython: The best of both worlds. Computing in Science & Engineering, 13(2), 31-39.
[7] Rigo, A., & Pedroni, S. (2006). PyPy's approach to virtual machine construction. OOPSLA, 944-953.
[8] Modular Mojo Programming Language. (n.d.). Retrieved from https://modular.com/
PySafe extends Python with syntax and semantics for safety, verification, and concurrency.
PySafe's borrow checker, inspired by Rust, ensures memory safety using own
and borrow
annotations:
def transfer(a: borrow int, b: own int) -> int:
return a + b # 'a' is immutable; 'b' is owned
It prevents invalid memory access and extends to concurrency by enforcing thread-safe data access.
Contracts specify behavior with decorators:
@pre(lambda self: self.balance >= 0)
@post(lambda self, result: self.balance == old(self).balance - result)
def withdraw(self, amount: int) -> int:
self.balance -= amount
return amount
@pre
and @post
define conditions, with old
referencing pre-execution state, applicable to both single-threaded and concurrent contexts.
PySafe mandates type annotations with inference:
def add(a: int, b: int) -> int:
return a + b
This ensures type safety across threads and compile-time error detection.
Tuples gain named fields and pattern matching:
point = (x=10, y=20)
print(point.x) # Outputs 10
This supports structured data in concurrent programs.
The @prove
decorator enables formal verification:
@prove("forall x in range(10): x >= 0")
def positive_numbers() -> List[int]:
return [i for i in range(10)]
It verifies properties, including concurrency invariants.
PySafe introduces safe concurrency using the borrow checker and new primitives like shared
:
shared_data: shared List[int] = [1, 2, 3]
def worker():
with lock(shared_data):
shared_data.append(4) # Thread-safe access
The borrow checker ensures exclusive mutable access or multiple immutable reads, preventing data races. This builds on Python's threading model, bypassing the GIL where possible.
PySafe compiles to LLVM IR via SSA, optimizing performance for concurrent and sequential code.
The PySafe compiler pipeline includes:
- Parsing: Ratpack builds an AST from PySafe code.
- Type Checking: Validates types using a Hindley-Milner-inspired system.
- Borrow Checking: Tracks ownership and borrowing across threads.
- Contract Checking: Validates contracts dynamically or statically.
- Concurrency Analysis: Ensures thread safety, integrating with the borrow checker.
- SSA Transformation: Converts the AST to SSA for optimization.
- LLVM IR Generation: Produces optimized machine code via LLVM.
The prover operates optionally for @prove
annotations. The compiler will be first implemented in Rust and then bootstrapped using PySafe to compile subsequent versions of the compiler.
As a conceptual design, PySafe's benefits are illustrated through examples:
The borrow checker prevents unsafe operations:
lst: own List[int] = [1, 2, 3]
for x: borrow int in lst:
# lst.append(4) # Blocked by borrow checker
print(x)
A banking system ensures correctness:
class Account:
balance: int = 0
@invariant(lambda self: self.balance >= 0)
@post(lambda self, result: self.balance == old(self).balance + result)
def deposit(self, amount: int) -> int:
self.balance += amount
return amount
The prover confirms invariants hold.
A concurrent counter avoids data races:
counter: shared int = 0
def increment():
with lock(counter):
counter += 1 # Safe, exclusive access
Unlike Python, where race conditions could occur without locks, PySafe ensures safety at compile time.
Compiled matrix multiplication with concurrent threads could outperform Python's interpreted execution, leveraging LLVM optimizations.
These examples highlight PySafe's potential, awaiting implementation.
PySafe enhances Python with memory safety, formal verification, safe concurrency, and performance features, maintaining compatibility and usability. By integrating a borrow checker, contracts, strict typing, enhanced tuples, a prover, concurrency primitives, and LLVM compilation, PySafe offers a foundation for reliable, efficient, and concurrent Python programming. Future work includes refining syntax, implementing full concurrency support beyond initial single-threaded features, building the compiler, and conducting empirical evaluations to validate its user benefits.