Tech Column

In-Depth System Architecture and Hardware Design



📖 About This Column

Tech Column is a technical writing project focused on system architecture, hardware design, and performance optimization. The goal is to explain complex technical concepts clearly using vivid analogies and real-world cases, helping readers understand not just "what" but "why."

About the Cases: All case scenarios in this column are mock scenarios, written based on industry best practices with all sensitive information removed. All content complies with professional ethics and NDA requirements.

Features

  • Vivid Analogies: Understand Cache through libraries, Associativity through parking lots, NoC through city traffic
  • Real-World Cases: Practical problems and solutions from 20+ years of industry experience
  • Progressive Learning: From beginner to advanced, systematically building knowledge
  • Practice-Oriented: Not just theory, but actionable optimization advice and design principles

📊 Project Statistics

| Series | Articles | Word Count |
|---|---|---|
| Computer Architecture | 4 | ~39,700 |
| Cache Architecture | 6 | ~20,800 |
| Network-on-Chip | 6 | ~14,100 |
| Storage Architecture | 12 | ~52,000 |
| Embedded RTOS | 8 | ~24,000 |
| Bluetooth & IoT | 21 | ~70,000 |
| Building danieRTOS | 40 | ~170,000 |
| Tech Events | 2 | ~45,000 |
| Tech Reads | 1 | ~21,000 |

Total: 100 articles, ~457,600 words


📚 Article Series

1. Computer Architecture Series (4 articles) 🆕

Understanding CPU performance design and heterogeneous computing from the architect's perspective.

Article 01 - All Roads Lead to IPC: IPC (Instructions Per Cycle), Latency vs Occupation, Superscalar, Out-of-Order execution, Branch prediction, Cache effects, ROB sizing

Article 02 - Heterogeneous System Architecture: Six performance laws (Amdahl, Gustafson, USL, Roofline, Little's Law, Queuing Theory), Four processor types (CPU/GPU/NPU/DPU), Memory architectures (UMA/CXL/NVLink), Coherence protocols, MLIR, Data-oriented design

Article 03 - Workload-Driven CPU Selection: TMAM (Top-Down Microarchitecture Analysis), CPU taxonomy (ARM Cortex-M/R/A/Neoverse vs RISC-V SiFive/XiangShan/Ventana), Five design scenarios (Ultra-Low Power, Real-Time Embedded, Rich Embedded, Mobile Computing, Cloud & AI Infrastructure), Performance laws application (Little's Law, Roofline Model, ILP/MLP analysis), PPA trade-offs

Article 04 - LLM-Driven RISC-V Vector Code Generation and Verification Methodology: IntrinTrans framework, Multi-Agent FSM (Translator/Compilation/Test/Optimizer), VLA (Vector Length Agnosticism), Strip-mining, LMUL register pressure, Liveness Analysis, Architecture-Aware guardrails, Post-silicon verification (Trace Encoder/Funnel), Cache-aware optimization limitations

Note: This series is available in both Traditional Chinese and English (independently written, not translated).


2. Cache Architecture Series (6 articles)

Deep dive into CPU Cache design and optimization, from basics to practice.

Topics: Cache basics, Associativity, Modern cache design (L1-L3), MESI protocol, Performance optimization, False sharing


4. Network-on-Chip Series (6 articles)

Exploring on-chip communication architecture, from Bus to Network evolution.

Topics: NoC introduction, Topology with graph theory, Routing and deadlock, Router microarchitecture, Cache coherency integration, Advanced packaging


5. Storage Architecture Series (12 articles)

Complete perspective from hardware to software on modern storage systems.

Topics: HDD to SSD evolution, SATA/AHCI, PCIe architecture, NVMe protocol, CXL technology, FTL, GC and wear leveling, Error correction, ZNS, Database optimization, AI/ML workloads, Cloud storage


6. Embedded RTOS Series (8 articles)

Practice-oriented embedded RTOS development with FreeRTOS + RISC-V.

Topics: RTOS introduction, Scheduler deep dive, Interrupt handling, Memory management, GDB+QEMU debugging, SMP challenges, Context switch assembly, RISC-V privilege modes


7. Bluetooth & IoT Series (21 articles)

BLE protocol stack, wireless communication, IoT system integration.

Topics: BLE protocol stack (HCI, L2CAP, ATT/GATT, SMP), PHY/RF, WiFi/BT coexistence, Hardware interfaces (SPI, MIPI, I2C/UART/GPIO), Power optimization, Debugging, Certification, Zigbee comparison, Thread/Matter, AIoT, Security


8. Building danieRTOS Series (40 articles)

Building a RISC-V RTOS from scratch in 40 complete, narrative-style tutorials.

danieRTOS is an educational minimal RTOS running on RISC-V architecture.

| Version | Alias | Chapters | Core Features |
|---|---|---|---|
| v0.x | Nano | 01-12 | Basic RTOS: Task, Scheduler, Semaphore, Mutex, Queue |
| v1.x | Secure | 13-19 | User Mode: PMP, Syscall, Fault Handling |
| v2.x | MSMP | 20-30 | SMP: Spinlock, IPI, Multi-core Scheduler |
| v3.x | SMP | 31-40 | Integration: SMP + User Mode + Fault Isolation |

9. Tech Reads Series (1 article) 🆕

In-depth reviews of foundational textbooks and research papers, bridging theory with system design practice.

Article 01 - A First Course in Information Theory: Bridging Shannon and System Architecture: Connects information theory fundamentals (entropy, mutual information, channel capacity) with real-world system design. Topics include the Roofline Model as entropy bounds, Fano's Inequality in branch prediction, typicality in benchmarking methodology, rate-distortion theory in quantization, and information diagrams for understanding memory consistency models.

Note: This series is available in both Traditional Chinese and English (independently written, not translated).


10. Tech Events Series (2 articles)

Architecture-aware deep dives on major industry events and product launches, focusing on how system architecture, hardware, and infrastructure evolve.

Article 01 - GTC 2026 Technical Review: How AI Factories Are Reshaping System Architecture: From NVIDIA Vera CPU and NVFP4 numerical formats to NVLink/NVL72 clusters and AI Factory infrastructure, this article looks at GTC 2026 through the lens of performance laws, disaggregated inference, and large-scale system design.

Article 02 - Breaking Compute Anxiety: How Arm AGI CPU Reshapes Agentic AI Infrastructure: From the DGX Spark paradox to Meta's heterogeneous clusters, OpenAI's MCTS reasoning trees, and Cloudflare's edge defense, this article explores how Arm's 136-core AGI CPU tackles Agentic AI's three core challenges: memory bandwidth walls, deterministic latency, and rack-scale economics. Topics: Information entropy in workload characterization, CXL 3.0 zero-copy orchestration, SMT abandonment rationale, P-E-C Triangle optimization, and why ASICs cannot replace CPUs in high-entropy scenarios.

Note: This series is available in both Traditional Chinese and English (independently written, not translated).


🎯 Target Audience

This column is suitable for:

  • System Software Engineers: Understanding how hardware affects software performance
  • Embedded Engineers: RTOS, drivers, firmware development
  • Hardware Engineers: CPU, SoC design and verification
  • IoT Developers: Bluetooth, wireless communication, IoT development
  • Computer Architecture Students: Learning system architecture in real-world contexts

Prerequisites:

  • Basic computer organization concepts
  • Understanding of CPU, memory, and bus components
  • C programming experience (required for some series)

📄 License

Copyright © 2025 Danny Jiang

All articles are licensed under Creative Commons Attribution 4.0 International License (CC BY 4.0).

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, including commercial

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made

License: https://creativecommons.org/licenses/by/4.0/


📖 How to Use This Column

Online Reading

Browse Markdown files directly on GitHub, starting from the first article of each series.

Offline Reading

Clone this repository:

git clone https://github.com/djiangtw/tech-column-public.git
cd tech-column-public

Recommended Reading Order

Hardware Architecture Beginners: Cache Architecture → Network-on-Chip → Storage Architecture

Embedded Systems: Embedded RTOS → Building danieRTOS

Wireless Communication: Bluetooth & IoT Series


🤝 Contributing

This is a read-only public repository. The column is developed in a private repository.

Feedback Welcome:

  • Open issues for typos, errors, or suggestions
  • Discussion and questions are encouraged

Note: Pull requests cannot be accepted because this repository is synced one-way from the private development repository.


👨‍💻 About the Author

Danny Jiang

System software engineer focused on RISC-V architecture, embedded systems, and performance optimization. 20+ years of industry experience, passionate about explaining complex technical concepts through vivid analogies.

Other Works:


🔗 Links


📝 Citation

If you cite this column in research, teaching, or articles:

Danny Jiang. (2025). Tech Column: In-Depth System Architecture and Hardware Design.
Licensed under CC BY 4.0. https://github.com/djiangtw/tech-column-public

Happy Reading! 📖

For any questions or suggestions, feel free to contact me through GitHub Issues.
