Trying to make a C compiler using C

Based on the instruction from: https://github.com/lotabout/write-a-C-interpreter/blob/master/README.md

But modify some by myself:

Change from int to long to fit my 64-bit Mac Machine
Use .h file and CMake ( ✌︎( ᐛ )✌︎)

Final Structure

Compiler
├──no mempool version/
│   ├── CMakeLists.txt
│   ├── global.h
│   ├── main.c
│   ├── lexer.c
│   ├── parser.c
│   ├── test0.c
│   └── vm.c
├──reorganized version/
│   ├── CMakeLists.txt
│   ├── global.h
│   ├── main.c
│   ├── lexer.c
│   ├── parser.c
│   ├── test.c
│   ├── test_mem.c
│   ├── concurrency_mempool.cpp
│   ├── concurrency_mempool.hpp
│   └── vm.c
├──comparison/
    ├── bench.py
    ├── test_perf.c

No Mempool Version(Baseline)

global.h (The Registry): Stores all shared definitions (Token IDs, instruction codes, register variable declarations). Any file that needs to communicate with others must include this.
main.c (The Assembly Plant): Responsible for allocating memory and reading source files. It first calls the parser to work, and then triggers the VM to run.
lexer.c (The Cutter): Solely manages next(). It chops the long source string into individual Tokens (e.g., int, main, 123).
parser.c (The Translator): Solely manages program(), expression(), etc. It observes the Tokens and writes instructions (assembly) into the text array.
vm.c (The CPU): Solely manages eval(). It reads the instructions from the text array and executes them one by one.

How to use this?

How to compile + test running:

cd "no mempool version"
mkdir build
cd build
cmake ..
make
./compiler_no_mem ../test0.c

Then you can get:

xxx build % make                
[100%] Built target compiler
xxx build % ./compiler ../test.c
fibonacci(0) = 1
fibonacci(1) = 1
fibonacci(2) = 2
fibonacci(3) = 3
fibonacci(4) = 5
fibonacci(5) = 8
fibonacci(6) = 13
fibonacci(7) = 21
fibonacci(8) = 34
fibonacci(9) = 55
fibonacci(10) = 89
exit(0)

Reorganized Version(Mempool)

The biggest architectural leap in the reorganized version is the replacement of the native OS malloc/free with a custom-built, highly optimized Concurrent Memory Pool (concurrency_mempool.cpp).

Inspired by Google's TCMalloc, this C++ memory allocator handles all memory requests originating from the VM's MALC and FREE instructions. It drastically reduces user-to-kernel context switches and provides lock-free allocation for multithreaded environments.

Three-Tier Architecture

The Mempool is built upon a strict three-tier caching system to balance performance and memory fragmentation:

ThreadCache (The Front-line): * A thread-local storage (thread_local) cache that handles all small memory requests (≤ 4096 bytes).
- Completely Lock-Free: Threads can allocate and deallocate memory here without any lock contention, making it incredibly fast.
CentralCache (The Distributor): * Acts as the middleman. When a ThreadCache runs out of memory (or has too much), it communicates with the CentralCache.
- Manages memory in blocks called Spans. Uses lightweight SpinLock to handle concurrent requests from multiple ThreadCache instances efficiently.
PageCache (The Wholesaler): * Directly interfaces with the Operating System (mmap on Linux/macOS, VirtualAlloc on Windows) to allocate large chunks of memory in page units.
- Uses a 2-Level Radix Tree (PageMap2) to map raw memory addresses to Span structures in $O(1)$ time complexity, ensuring lightning-fast memory recovery and merging during deallocation.

Key Technical Highlights

Cross-Language Integration: The memory pool is implemented in modern C++ but exports a clean extern "C" API (PoolAllocate, PoolDeallocate), allowing the C-based Virtual Machine to seamlessly call it.
Metadata Optimization (SpanPool): Instead of using the native new operator to create Span metadata objects (which would defeat the purpose of a custom allocator), it features an internal SpanPool that pre-allocates and manages its own metadata structures.
Space-Time Tradeoff: By pre-allocating large pages from the OS and managing fragmentation internally, this architecture sacrifices a slight amount of peak physical memory (RSS) to buy pure execution speed and predictability.

Performance Benchmark (Space-Time Tradeoff)

To prove the effectiveness of the Mempool, we ran a heavy memory benchmark (bench.py running test_perf.c) simulating 100,000 continuous malloc and free operations. We compared the Baseline (Native Allocator) against our Custom Mempool.

Benchmark Results:

Metric	Baseline (Native)	Mempool (Custom)	Difference
Avg Time (ms)	28.76	28.34	+ 1.45% (Faster)
Peak RSS (MB)	2.33	2.98	- 28.18% (More RAM)

Conclusion

The results perfectly demonstrate the classic Space-Time Tradeoff in computer science:

Time Improvement: The custom Mempool successfully outperforms the highly-optimized modern OS allocator in execution speed because it eliminates frequent user-to-kernel mode context switches during allocation.
Space Cost: The Peak RSS (Resident Set Size) is naturally higher. The Mempool pre-allocates large memory blocks (pages) from the OS to serve future requests and maintains internal metadata (Spans, Radix Trees), trading base memory usage for pure execution speed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Trying to make a C compiler using C

Final Structure

No Mempool Version(Baseline)

How to use this?

Reorganized Version(Mempool)

Three-Tier Architecture

Key Technical Highlights

Performance Benchmark (Space-Time Tradeoff)

Benchmark Results:

Conclusion

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
comparison		comparison
no mempool version		no mempool version
reorganized version		reorganized version
.gitignore		.gitignore
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Trying to make a C compiler using C

Final Structure

No Mempool Version(Baseline)

How to use this?

Reorganized Version(Mempool)

Three-Tier Architecture

Key Technical Highlights

Performance Benchmark (Space-Time Tradeoff)

Benchmark Results:

Conclusion

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages