diff --git a/de/modules/ROOT/images/fifo-read.svg b/de/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..373f8d9
--- /dev/null
+++ b/de/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,304 @@
+
+
+
diff --git a/de/modules/ROOT/images/fifo-write.svg b/de/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..2475083
--- /dev/null
+++ b/de/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,233 @@
+
+
+
diff --git a/de/modules/ROOT/nav.adoc b/de/modules/ROOT/nav.adoc
index 1d0f328..7a802df 100644
--- a/de/modules/ROOT/nav.adoc
+++ b/de/modules/ROOT/nav.adoc
@@ -1,4 +1,7 @@
* xref:index.adoc[README]
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
-** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
\ No newline at end of file
+** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
\ No newline at end of file
diff --git a/de/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/de/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..cd10efa
--- /dev/null
+++ b/de/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,94 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+The AXI protocol supports **narrow transfers** (8/16/32-bit), in which a transfer may use only part of the data bus widthfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes transferred per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This mismatch necessitates manual reordering and buffering to restore the correct byte order before data is transferred to SDRAM or back to the master.
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted, but SDRAM cannot suspend a burst operation: once a burst transaction is initiated, it runs to completion without pause. This behavior must therefore be handled carefully.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction whenever the master suspends a transfer. Restarting a burst transfer requires re-sending the `BURST WRITE` command with its address and data, making it equivalent to a single write operation. Therefore, this section uses the single write operation instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address:: The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+Unaligned Address:: The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+. 32-bit
++
+For 32-bit writes, 2 transfers are normally required. In special casesfootnote:unaligned[], a single transfer is sufficient.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we choose to separate the command and data channels. The command channel is driven by the state machine, while the RFIFO connects only to `DQi` and the WFIFO only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Every time two 16-bit data are available in RFIFO2, they are packed and cached into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- it will still send 3 cycles of read data, all of which RFIFO2 caches. When RFIFO1 is empty, the burst operation is restarted.
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. However, we need to manually control the back-pressure on the `B` channel, so we connect two FIFOs to the `B` channel (to emulate an `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
\ No newline at end of file
diff --git a/de/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/de/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..c1e4116
--- /dev/null
+++ b/de/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test case and logic to the simulators, making migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that interfaces with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an intranet caching or package-management solution for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts remain compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to using SystemVerilog directly for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test case and logic from simulators, as frequent switching between different simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to enhance compatibility with all simulators, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test case and logic across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to manually add additional dependencies.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
\ No newline at end of file
diff --git a/de/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/de/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ec1c35c
--- /dev/null
+++ b/de/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that interfaces with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an intranet caching or package-management solution for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts remain compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to using SystemVerilog directly for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to manually add additional dependencies.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
\ No newline at end of file
diff --git a/en/modules/ROOT/images/fifo-read.svg b/en/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..c4f8e0d
--- /dev/null
+++ b/en/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,3 @@
+
+
+
\ No newline at end of file
diff --git a/en/modules/ROOT/images/fifo-write.svg b/en/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..ba14e39
--- /dev/null
+++ b/en/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,3 @@
+
+
+
\ No newline at end of file
diff --git a/en/modules/ROOT/nav.adoc b/en/modules/ROOT/nav.adoc
index e62ff7e..5196720 100644
--- a/en/modules/ROOT/nav.adoc
+++ b/en/modules/ROOT/nav.adoc
@@ -2,3 +2,6 @@
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
diff --git a/en/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/en/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..360d80d
--- /dev/null
+++ b/en/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,101 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+The AXI protocol supports **narrow transfers** (8/16/32-bit), in which a transfer may use only part of the data bus widthfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes transferred per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This mismatch necessitates manual reordering and buffering to restore the correct byte order before data is transferred to SDRAM or back to the master.
+
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted, but SDRAM cannot suspend a burst operation: once a burst transaction is initiated, it runs to completion without pause. This behavior must therefore be handled carefully.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
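
The best-case and worst-case figures above can be restated as a small lookup. A minimal Python sketch (an illustrative recap of the analysis, not derived from the RTL; the names are ours):

```python
# (AXI transfer size in bits) -> (best case, worst case) number of 16-bit
# SDRAM reads needed to satisfy one AXI transfer, per the analysis above.
SDRAM_READS_PER_TRANSFER = {
    8:  (1, 1),   # best case: one SDRAM read even serves two 8-bit transfers
    16: (1, 2),
    32: (1, 2),
}

def worst_case_reads(size_bits: int, transfers: int) -> int:
    """Upper bound on SDRAM reads for a run of same-size AXI transfers."""
    return SDRAM_READS_PER_TRANSFER[size_bits][1] * transfers

print(worst_case_reads(16, 4))  # four 16-bit transfers need at most 8 reads
```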
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction whenever the master suspends a transfer. Restarting a burst transfer requires re-sending the `BURST WRITE` command with its address and data, making it equivalent to a single write operation. Therefore, this section uses the single write operation instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address::
+The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+
+Unaligned Address::
+The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+
+. 32-bit
++
+For 32-bit writes, 2 transfers are normally required. In special casesfootnote:unaligned[], a single transfer is sufficient.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we choose to separate the command and data channels. The command channel is driven by the state machine, while the RFIFO connects only to `DQi` and the WFIFO only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Every time two 16-bit data are available in RFIFO2, they are packed and cached into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- it will still send 3 cycles of read data, all of which RFIFO2 caches. When RFIFO1 is empty, the burst operation is restarted.
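
The interaction between the `STOP` state and the CAS latency can be sanity-checked with a small cycle-level model. This is an illustrative sketch under the assumptions stated above (CAS latency of 3 cycles, RFIFO2 depth of 4); it is not the RTL:

```python
from collections import deque

CAS_LATENCY = 3      # assumed, as in the text
RFIFO2_DEPTH = 4

def simulate_read_path(total_words, cycles, stall_cycles=frozenset()):
    """Issue burst reads, STOPping whenever RFIFO2 could no longer absorb
    the beats already in flight; return the number of beats delivered."""
    pipeline = deque([False] * CAS_LATENCY)  # commands awaiting CAS latency
    rfifo2 = deque()
    issued = delivered = 0
    for cycle in range(cycles):
        in_flight = sum(pipeline)
        # Keep bursting only while RFIFO2 has room for everything in flight.
        if issued < total_words and len(rfifo2) + in_flight < RFIFO2_DEPTH:
            pipeline.append(True)
            issued += 1
        else:
            pipeline.append(False)           # STOP: no new read this cycle
        if pipeline.popleft():               # data returns CAS_LATENCY later
            rfifo2.append(1)
        assert len(rfifo2) <= RFIFO2_DEPTH   # no beat is ever dropped
        if rfifo2 and cycle not in stall_cycles:
            rfifo2.popleft()                 # AXI master accepts one beat
            delivered += 1
    return delivered
```

Even when the master stalls for a stretch (`stall_cycles`), the depth-4 RFIFO2 absorbs the up-to-3 beats that land after `STOP`, so the internal assertion never fires.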
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. However, we need to manually control the back-pressure on the `B` channel, so we connect two FIFOs to the `B` channel (to emulate an `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
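
The control conditions above can be summarized as two small predicates. An illustrative sketch (the WFIFO depth comes from the text; `BFIFO1_DEPTH` and the function names are our assumptions, not the RTL):

```python
WFIFO_DEPTH = 4    # from the text
BFIFO1_DEPTH = 2   # assumption for illustration only

def sdram_write_should_stop(wfifo_count: int, bfifo1_count: int) -> bool:
    """Enter the STOP state when there is no data left to write or no room
    left to record a B-channel response."""
    return wfifo_count == 0 or bfifo1_count >= BFIFO1_DEPTH

def axi_w_ready(wfifo_count: int) -> bool:
    """Back-pressure the AXI write request once the write cache is full."""
    return wfifo_count < WFIFO_DEPTH
```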
diff --git a/en/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/en/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..8c12f07
--- /dev/null
+++ b/en/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,39 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test case and logic to the simulators, making migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that interfaces with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an intranet caching or package-management solution for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts remain compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to using SystemVerilog directly for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test case and logic from simulators, as frequent switching between different simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to enhance compatibility with all simulators, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test case and logic across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
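
To give a flavor of the coroutine style cocotb enables, here is a plain-Python miniature of a driver/monitor pair with a scoreboard-style check. It uses only `asyncio` as a stand-in so it runs without a simulator; a real cocotb test is instead decorated with `@cocotb.test()` and awaits simulator triggers such as `RisingEdge`, and the signal names would come from the DUT:

```python
import asyncio

async def driver(queue, words):
    for w in words:                        # drive one word per "cycle"
        await queue.put(w)

async def monitor(queue, n):
    return [await queue.get() for _ in range(n)]

async def testbench():
    queue = asyncio.Queue(maxsize=1)       # rendezvous, like valid/ready
    sent = [0xDEAD, 0xBEEF, 0xCAFE]
    drv = asyncio.create_task(driver(queue, sent))
    received = await monitor(queue, len(sent))
    await drv
    assert received == sent                # scoreboard-style comparison
    return received

asyncio.run(testbench())
```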
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to manually add additional dependencies.
+
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
diff --git a/en/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/en/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ad3304d
--- /dev/null
+++ b/en/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,39 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that interfaces with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an intranet caching or package-management solution for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to directly using SystemVerilog for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to manually add additional dependencies.
+
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
diff --git a/fr/modules/ROOT/images/fifo-read.svg b/fr/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..373f8d9
--- /dev/null
+++ b/fr/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,304 @@
+
+
+
diff --git a/fr/modules/ROOT/images/fifo-write.svg b/fr/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..2475083
--- /dev/null
+++ b/fr/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,233 @@
+
+
+
diff --git a/fr/modules/ROOT/nav.adoc b/fr/modules/ROOT/nav.adoc
index 1d0f328..7a802df 100644
--- a/fr/modules/ROOT/nav.adoc
+++ b/fr/modules/ROOT/nav.adoc
@@ -1,4 +1,7 @@
* xref:index.adoc[README]
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
-** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
\ No newline at end of file
+** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
\ No newline at end of file
diff --git a/fr/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/fr/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..cd10efa
--- /dev/null
+++ b/fr/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,94 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+The AXI protocol supports **narrow transfers** (8/16/32-bit), where a transfer may be narrower than the data bus widthfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes transferred per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This mismatch necessitates manual data reordering and buffering to ensure the correct order before transferring data to SDRAM or back to the master.
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted. However, SDRAM cannot suspend burst operations: once a burst transaction is initiated, it proceeds through the entire burst without pause. This behavior therefore needs to be handled carefully.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
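The stop rule above can be sketched as a tiny model (a hypothetical helper, not the RTL; it assumes a CAS latency of 3 cycles and the 4-deep RFIFO2 described in the Implementation section):

```python
# Sketch of the burst-stop rule for reads (hypothetical helper, not the RTL).
# Assumption: after STOP is issued, up to CAS_LATENCY more words still arrive
# on DQi, so we must stop while RFIFO2 still has that much headroom.

CAS_LATENCY = 3   # cycles of in-flight read data after STOP (assumed)
RFIFO2_DEPTH = 4  # depth of the 16-bit SDRAM-side FIFO (from the text)

def must_stop_burst_read(rfifo2_count: int, axi_ready: bool) -> bool:
    """Stop the SDRAM burst when the master stalls and the remaining
    RFIFO2 headroom could just absorb the in-flight CAS-latency words."""
    headroom = RFIFO2_DEPTH - rfifo2_count
    return (not axi_ready) and headroom <= CAS_LATENCY
```

With these parameters, a stall with one word already buffered (headroom 3) triggers a stop, while a stall with an empty RFIFO2 (headroom 4) still leaves room for one more burst cycle.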
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction when the master suspends a transfer. Restarting a burst transfer requires re-sending the `BURST WRITE` command with data and address, making it equivalent in cost to a single write operation. Therefore, single write operations are used in this section instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address:: The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+Unaligned Address:: The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+. 32-bit
++
+For 32-bit writes, under normal circumstances, 2 writes are sufficient to accommodate one transfer. In special casesfootnote:unaligned[], only a single write is supported.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we separate the command and data channels, with the command channel controlled by the state machine. The RFIFO connects only to `DQi`, and the WFIFO connects only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Whenever two 16-bit words are available in RFIFO2, they are packed and cached into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- the SDRAM will still send 3 cycles of read data, all of which RFIFO2 caches. When RFIFO1 is empty, the burst operation is restarted.
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. However, we need to manually control the back-pressure on the `B` channel, so we connect two FIFOs to the `B` channel (to simulate an `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
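The write-side control conditions above can be summarized in a small sketch (hypothetical helper functions, not the RTL; the FIFO names follow the text):

```python
# Sketch of the write-side control rules (hypothetical helpers, not the RTL).

def must_stop_burst_write(wfifo_empty: bool, bfifo1_full: bool) -> bool:
    """Enter the STOP state when no buffered write data remains, or when
    the B-channel bookkeeping FIFO cannot accept another response."""
    return wfifo_empty or bfifo1_full

def must_backpressure_write(wfifo_full: bool) -> bool:
    """Suspend the AXI write request via back-pressure when WFIFO is full."""
    return wfifo_full
```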
\ No newline at end of file
diff --git a/fr/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/fr/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..c1e4116
--- /dev/null
+++ b/fr/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test case and logic to the simulators, making migration and adaptability more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an Intranet caching solution or package manager for Nix, such as Sonatype Nexus for Maven, syncing dependencies between the Intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to directly using SystemVerilog for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test case and logic from simulators, as frequent switching between different simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to enhance compatibility with all simulators, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test case and logic across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to manually add additional dependencies.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
\ No newline at end of file
diff --git a/fr/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/fr/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ec1c35c
--- /dev/null
+++ b/fr/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptability more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an Intranet caching solution or package manager for Nix, such as Sonatype Nexus for Maven, syncing dependencies between the Intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to directly using SystemVerilog for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to manually add additional dependencies.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
\ No newline at end of file
diff --git a/ja/modules/ROOT/images/fifo-read.svg b/ja/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..373f8d9
--- /dev/null
+++ b/ja/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,304 @@
+
+
+
diff --git a/ja/modules/ROOT/images/fifo-write.svg b/ja/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..2475083
--- /dev/null
+++ b/ja/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,233 @@
+
+
+
diff --git a/ja/modules/ROOT/nav.adoc b/ja/modules/ROOT/nav.adoc
index 1d0f328..7a802df 100644
--- a/ja/modules/ROOT/nav.adoc
+++ b/ja/modules/ROOT/nav.adoc
@@ -1,4 +1,7 @@
* xref:index.adoc[README]
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
-** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
\ No newline at end of file
+** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
\ No newline at end of file
diff --git a/ja/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/ja/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..cd10efa
--- /dev/null
+++ b/ja/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,94 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+The AXI protocol supports **narrow transfers** (8/16/32-bit), where a transfer may be narrower than the data bus widthfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes transferred per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This mismatch necessitates manual data reordering and buffering to ensure the correct order before transferring data to SDRAM or back to the master.
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted. However, SDRAM cannot suspend burst operations: once a burst transaction is initiated, it proceeds through the entire burst without pause. This behavior therefore needs to be handled carefully.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction when the master suspends a transfer. Restarting a burst transfer requires re-sending the `BURST WRITE` command with data and address, making it equivalent in cost to a single write operation. Therefore, single write operations are used in this section instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address:: The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+Unaligned Address:: The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+. 32-bit
++
+For 32-bit writes, under normal circumstances, 2 writes are sufficient to accommodate one transfer. In special casesfootnote:unaligned[], only a single write is supported.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we separate the command and data channels, with the command channel controlled by the state machine. The RFIFO connects only to `DQi`, and the WFIFO connects only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Whenever two 16-bit words are available in RFIFO2, they are packed and cached into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- the SDRAM will still send 3 cycles of read data, all of which RFIFO2 caches. When RFIFO1 is empty, the burst operation is restarted.
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. However, we need to manually control the back-pressure on the `B` channel, so we connect two FIFOs to the `B` channel (to simulate an `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
\ No newline at end of file
diff --git a/ja/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/ja/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..c1e4116
--- /dev/null
+++ b/ja/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test case and logic to the simulators, making migration and adaptability more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1], which allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an Intranet caching solution or package manager for Nix, such as Sonatype Nexus for Maven, syncing dependencies between the Intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach, where higher-level features won`'t affect lower-level Verilog compatibility. This decoupling ensures that Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This is in contrast to directly using SystemVerilog for new features, which could lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test case and logic from simulators, as frequent switching between different simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to enhance compatibility with all simulators, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test case and logic across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to add dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
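
As a small illustration of the pure-Python point above -- a sketch with a hypothetical helper name, not code from our testbench -- a reference model such as an Ethernet FCS can be written with nothing but the standard library and then driven unchanged from a cocotb coroutine:

```python
import zlib

def ethernet_fcs(frame: bytes) -> bytes:
    # The Ethernet FCS is CRC-32 over the frame contents,
    # transmitted least-significant byte first.
    return zlib.crc32(frame).to_bytes(4, "little")

# In a cocotb test the same model would score frames captured from the
# DUT; here we exercise it directly on a minimal broadcast frame.
frame = bytes.fromhex("ffffffffffff") + bytes(6) + b"\x08\x00" + bytes(46)
assert len(ethernet_fcs(frame)) == 4
```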
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
\ No newline at end of file
diff --git a/ja/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/ja/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ec1c35c
--- /dev/null
+++ b/ja/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an intranet caching solution or package manager for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts remain compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to add dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
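
As a small illustration of the pure-Python point above -- a sketch with a hypothetical helper name, not code from our testbench -- a reference model such as an Ethernet FCS can be written with nothing but the standard library and then driven unchanged from a Cocotb coroutine:

```python
import zlib

def ethernet_fcs(frame: bytes) -> bytes:
    # The Ethernet FCS is CRC-32 over the frame contents,
    # transmitted least-significant byte first.
    return zlib.crc32(frame).to_bytes(4, "little")

# In a Cocotb test the same model would score frames captured from the
# DUT; here we exercise it directly on a minimal broadcast frame.
frame = bytes.fromhex("ffffffffffff") + bytes(6) + b"\x08\x00" + bytes(46)
assert len(ethernet_fcs(frame)) == 4
```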
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
\ No newline at end of file
diff --git a/ko/modules/ROOT/images/fifo-read.svg b/ko/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..373f8d9
--- /dev/null
+++ b/ko/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,304 @@
+
+
+
diff --git a/ko/modules/ROOT/images/fifo-write.svg b/ko/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..2475083
--- /dev/null
+++ b/ko/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,233 @@
+
+
+
diff --git a/ko/modules/ROOT/nav.adoc b/ko/modules/ROOT/nav.adoc
index 1d0f328..7a802df 100644
--- a/ko/modules/ROOT/nav.adoc
+++ b/ko/modules/ROOT/nav.adoc
@@ -1,4 +1,7 @@
* xref:index.adoc[README]
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
-** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
\ No newline at end of file
+** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
\ No newline at end of file
diff --git a/ko/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/ko/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..cd10efa
--- /dev/null
+++ b/ko/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,94 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+The AXI protocol supports **narrow transfers** (8/16/32-bit) on a wider data busfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This mismatch necessitates data reordering and buffering before transferring data to SDRAM or back to the master.
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted. SDRAM, however, cannot suspend a burst: once a burst transaction is initiated, it runs to completion without pausing. This behavior therefore needs to be considered carefully.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
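
The best-case and worst-case counts above follow from simple word-span arithmetic. As a sketch (a hypothetical helper, not the RTL), the number of 16-bit SDRAM words touched by `n_bytes` valid bytes starting at byte address `addr` is:

```python
def sdram_words_touched(addr: int, n_bytes: int) -> int:
    # First and last 16-bit word indices covered by the byte range.
    first = addr // 2
    last = (addr + n_bytes - 1) // 2
    return last - first + 1

assert sdram_words_touched(0, 2) == 1  # aligned 16-bit: one SDRAM read
assert sdram_words_touched(1, 2) == 2  # unaligned 16-bit: two SDRAM reads
assert sdram_words_touched(0, 4) == 2  # aligned 32-bit: two SDRAM reads
assert sdram_words_touched(2, 2) == 1  # 32-bit beat carrying only its upper half
```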
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction when the master suspends a transfer. Restarting a burst requires re-sending the `BURST WRITE` command with address and data, which makes it equivalent to a single write operation. Therefore, this section uses the single write operation instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address:: The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+Unaligned Address:: The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+. 32-bit
++
+For 32-bit writes, under normal circumstances one AXI transfer supplies 2 SDRAM writes. In special casesfootnote:unaligned[], it supplies only a single write.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
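
The byte-lane handling implied above can be sketched as follows -- a hypothetical helper, not our RTL, assuming a 32-bit AXI write bus, little-endian byte lanes, and active-high `DQM` masking: one AXI `W` beat is split into at most two 16-bit SDRAM writes, with `WSTRB` translated into `DQM`.

```python
def split_w_beat(addr: int, data: bytes, wstrb: int):
    # Yield (sdram_word_addr, data16, dqm) tuples for one 32-bit W beat.
    base = (addr // 4) * 2              # SDRAM word address of byte lane 0
    for half in range(2):               # two 16-bit halves of the 32-bit lane
        lanes = (wstrb >> (2 * half)) & 0b11
        if lanes:                       # skip halves with no valid bytes
            data16 = int.from_bytes(data[2 * half:2 * half + 2], "little")
            dqm = (~lanes) & 0b11       # a set DQM bit masks that byte
            yield base + half, data16, dqm

# A fully strobed aligned beat becomes two unmasked SDRAM writes.
assert list(split_w_beat(0, b"\x11\x22\x33\x44", 0b1111)) == [
    (0, 0x2211, 0b00), (1, 0x4433, 0b00)]
```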
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we choose to separate the command and data channels; the command channel is controlled by the state machine. The RFIFO connects only to `DQi`, and the WFIFO connects only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Every time two 16-bit words are available in RFIFO2, they are packed into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- it will still return 3 cycles of read data, all of which RFIFO2 caches. When RFIFO1 is empty, the burst operation is restarted.
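
The flow control just described can be checked with a cycle-level sketch (hypothetical names, an assumed CAS latency of 3, not the RTL): a read command is issued only while buffered words plus words still in flight fit in RFIFO2, so data returning during the CAS latency can always be absorbed after a stop.

```python
from collections import deque

CAS_LATENCY = 3  # assumed for illustration, as in the text

def simulate_read(n_words, rfifo2_depth=4, ready=lambda cyc: cyc % 2 == 0):
    # Returns True iff RFIFO2 never overflows and all words are drained.
    in_flight = deque()       # countdown until each issued read returns
    rfifo2 = deque()
    issued = received = drained = 0
    for cycle in range(20 * n_words + 50):
        if issued < n_words and len(rfifo2) + len(in_flight) < rfifo2_depth:
            in_flight.append(CAS_LATENCY)   # reserve RFIFO2 space up front
            issued += 1
        in_flight = deque(c - 1 for c in in_flight)
        if in_flight and in_flight[0] == 0:
            in_flight.popleft()
            if len(rfifo2) >= rfifo2_depth:
                return False                # overflow: read data lost
            rfifo2.append(received)
            received += 1
        if rfifo2 and ready(cycle):         # RFIFO1 / AXI master consumes
            rfifo2.popleft()
            drained += 1
        if drained == n_words:
            return True
    return False

# Even with a stalling consumer, reserving space prevents overflow.
assert simulate_read(16)
```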
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. But we need to manually control the back-pressure in the `B` channel. Therefore, we connect two FIFOs to the `B` channel (to simulate the `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
\ No newline at end of file
diff --git a/ko/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/ko/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..c1e4116
--- /dev/null
+++ b/ko/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test cases and logic to the simulators, making migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved unwieldy. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an intranet caching solution or package manager for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts remain compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test cases and logic from simulators, as frequent switching between simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to improve compatibility with all simulators, but for now our focus is on decoupling the verification process. This allows test cases and logic to be reused across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to add dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
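
As a small illustration of the pure-Python point above -- a sketch with a hypothetical helper name, not code from our testbench -- a reference model such as an Ethernet FCS can be written with nothing but the standard library and then driven unchanged from a cocotb coroutine:

```python
import zlib

def ethernet_fcs(frame: bytes) -> bytes:
    # The Ethernet FCS is CRC-32 over the frame contents,
    # transmitted least-significant byte first.
    return zlib.crc32(frame).to_bytes(4, "little")

# In a cocotb test the same model would score frames captured from the
# DUT; here we exercise it directly on a minimal broadcast frame.
frame = bytes.fromhex("ffffffffffff") + bytes(6) + b"\x08\x00" + bytes(46)
assert len(ethernet_fcs(frame)) == 4
```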
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
\ No newline at end of file
diff --git a/ko/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/ko/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ec1c35c
--- /dev/null
+++ b/ko/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an intranet caching solution or package manager for Nix (analogous to Sonatype Nexus for Maven), syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts remain compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment. This simplifies the writing of tests in a clear and straightforward way. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though it`'s not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to add dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
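
As a small illustration of the pure-Python point above -- a sketch with a hypothetical helper name, not code from our testbench -- a reference model such as an Ethernet FCS can be written with nothing but the standard library and then driven unchanged from a Cocotb coroutine:

```python
import zlib

def ethernet_fcs(frame: bytes) -> bytes:
    # The Ethernet FCS is CRC-32 over the frame contents,
    # transmitted least-significant byte first.
    return zlib.crc32(frame).to_bytes(4, "little")

# In a Cocotb test the same model would score frames captured from the
# DUT; here we exercise it directly on a minimal broadcast frame.
frame = bytes.fromhex("ffffffffffff") + bytes(6) + b"\x08\x00" + bytes(46)
assert len(ethernet_fcs(frame)) == 4
```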
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
\ No newline at end of file
diff --git a/zh-CN/modules/ROOT/images/fifo-read.svg b/zh-CN/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..373f8d9
--- /dev/null
+++ b/zh-CN/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,304 @@
+
+
+
diff --git a/zh-CN/modules/ROOT/images/fifo-write.svg b/zh-CN/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..2475083
--- /dev/null
+++ b/zh-CN/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,233 @@
+
+
+
diff --git a/zh-CN/modules/ROOT/nav.adoc b/zh-CN/modules/ROOT/nav.adoc
index 1d0f328..7a802df 100644
--- a/zh-CN/modules/ROOT/nav.adoc
+++ b/zh-CN/modules/ROOT/nav.adoc
@@ -1,4 +1,7 @@
* xref:index.adoc[README]
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
-** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
\ No newline at end of file
+** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
\ No newline at end of file
diff --git a/zh-CN/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/zh-CN/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..cd10efa
--- /dev/null
+++ b/zh-CN/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,94 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+The AXI protocol supports **narrow transfers** (8/16/32-bit) on a wider data busfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This mismatch necessitates data reordering and buffering before transferring data to SDRAM or back to the master.
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted. SDRAM, however, cannot suspend a burst: once a burst transaction is initiated, it runs to completion without pausing. This behavior therefore needs to be considered carefully.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
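
The best-case and worst-case counts above follow from simple word-span arithmetic. As a sketch (a hypothetical helper, not the RTL), the number of 16-bit SDRAM words touched by `n_bytes` valid bytes starting at byte address `addr` is:

```python
def sdram_words_touched(addr: int, n_bytes: int) -> int:
    # First and last 16-bit word indices covered by the byte range.
    first = addr // 2
    last = (addr + n_bytes - 1) // 2
    return last - first + 1

assert sdram_words_touched(0, 2) == 1  # aligned 16-bit: one SDRAM read
assert sdram_words_touched(1, 2) == 2  # unaligned 16-bit: two SDRAM reads
assert sdram_words_touched(0, 4) == 2  # aligned 32-bit: two SDRAM reads
assert sdram_words_touched(2, 2) == 1  # 32-bit beat carrying only its upper half
```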
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction when the master suspends a transfer. Restarting a burst requires re-sending the `BURST WRITE` command with address and data, which makes it equivalent to a single write operation. Therefore, this section uses the single write operation instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address:: The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+Unaligned Address:: The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+. 32-bit
++
+For 32-bit writes, under normal circumstances one AXI transfer supplies 2 SDRAM writes. In special casesfootnote:unaligned[], it supplies only a single write.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
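
The byte-lane handling implied above can be sketched as follows -- a hypothetical helper, not our RTL, assuming a 32-bit AXI write bus, little-endian byte lanes, and active-high `DQM` masking: one AXI `W` beat is split into at most two 16-bit SDRAM writes, with `WSTRB` translated into `DQM`.

```python
def split_w_beat(addr: int, data: bytes, wstrb: int):
    # Yield (sdram_word_addr, data16, dqm) tuples for one 32-bit W beat.
    base = (addr // 4) * 2              # SDRAM word address of byte lane 0
    for half in range(2):               # two 16-bit halves of the 32-bit lane
        lanes = (wstrb >> (2 * half)) & 0b11
        if lanes:                       # skip halves with no valid bytes
            data16 = int.from_bytes(data[2 * half:2 * half + 2], "little")
            dqm = (~lanes) & 0b11       # a set DQM bit masks that byte
            yield base + half, data16, dqm

# A fully strobed aligned beat becomes two unmasked SDRAM writes.
assert list(split_w_beat(0, b"\x11\x22\x33\x44", 0b1111)) == [
    (0, 0x2211, 0b00), (1, 0x4433, 0b00)]
```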
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we choose to separate the command and data channels; the command channel is controlled by the state machine. The RFIFO connects only to `DQi`, and the WFIFO connects only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Every time two 16-bit words are available in RFIFO2, they are packed into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- it will still return 3 cycles of read data, all of which RFIFO2 caches. When RFIFO1 is empty, the burst operation is restarted.
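
The flow control just described can be checked with a cycle-level sketch (hypothetical names, an assumed CAS latency of 3, not the RTL): a read command is issued only while buffered words plus words still in flight fit in RFIFO2, so data returning during the CAS latency can always be absorbed after a stop.

```python
from collections import deque

CAS_LATENCY = 3  # assumed for illustration, as in the text

def simulate_read(n_words, rfifo2_depth=4, ready=lambda cyc: cyc % 2 == 0):
    # Returns True iff RFIFO2 never overflows and all words are drained.
    in_flight = deque()       # countdown until each issued read returns
    rfifo2 = deque()
    issued = received = drained = 0
    for cycle in range(20 * n_words + 50):
        if issued < n_words and len(rfifo2) + len(in_flight) < rfifo2_depth:
            in_flight.append(CAS_LATENCY)   # reserve RFIFO2 space up front
            issued += 1
        in_flight = deque(c - 1 for c in in_flight)
        if in_flight and in_flight[0] == 0:
            in_flight.popleft()
            if len(rfifo2) >= rfifo2_depth:
                return False                # overflow: read data lost
            rfifo2.append(received)
            received += 1
        if rfifo2 and ready(cycle):         # RFIFO1 / AXI master consumes
            rfifo2.popleft()
            drained += 1
        if drained == n_words:
            return True
    return False

# Even with a stalling consumer, reserving space prevents overflow.
assert simulate_read(16)
```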
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. However, we need to manually control the back-pressure on the `B` channel, so we connect two FIFOs to the `B` channel (to simulate an `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
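
The stop and back-pressure conditions above can be summarized as two small predicates (a sketch; the WFIFO depth follows the text, while the BFIFO1 depth and the function names are assumptions):

```python
WFIFO_DEPTH = 4    # from the text: 4-depth 32-bit Bypass FIFO
BFIFO1_DEPTH = 4   # assumption: the text does not state the B-FIFO depth

def must_stop_burst(wfifo_count, bfifo1_count):
    """Switch the state machine to STOP when there is no write data
    (WFIFO empty) or the B channel cannot absorb more responses
    (BFIFO1 full)."""
    return wfifo_count == 0 or bfifo1_count == BFIFO1_DEPTH

def must_backpressure(wfifo_count):
    """Suspend AXI write requests (deassert ready) when WFIFO is full."""
    return wfifo_count == WFIFO_DEPTH
```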
\ No newline at end of file
diff --git a/zh-CN/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/zh-CN/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..c1e4116
--- /dev/null
+++ b/zh-CN/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test cases and logic to the simulators, making migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved difficult in practice. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an intranet caching or package-management solution for Nix, comparable to Sonatype Nexus for Maven, syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test cases and logic from simulators, as frequent switching between different simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to enhance compatibility with all simulators, but for now, our focus is on decoupling the verification process. This allows test cases and logic to be reused across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
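
As an illustration of the style, a minimal cocotb test might look like the sketch below (the signal names `clk`, `rst`, and `dout` are hypothetical, and running it requires an HDL design plus a supported simulator backend):

```
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import RisingEdge

@cocotb.test()
async def smoke_test(dut):
    """Start a clock, apply reset, then check a (hypothetical) output."""
    cocotb.start_soon(Clock(dut.clk, 10, units="ns").start())
    dut.rst.value = 1
    for _ in range(2):
        await RisingEdge(dut.clk)
    dut.rst.value = 0
    await RisingEdge(dut.clk)
    assert dut.dout.value == 0, "dout should reset to zero"
```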
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment, making tests clear and straightforward to write. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to add additional dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
\ No newline at end of file
diff --git a/zh-CN/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/zh-CN/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ec1c35c
--- /dev/null
+++ b/zh-CN/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an intranet caching or package-management solution for Nix, comparable to Sonatype Nexus for Maven, syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment, making tests clear and straightforward to write. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to add additional dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
\ No newline at end of file
diff --git a/zh-TW/modules/ROOT/images/fifo-read.svg b/zh-TW/modules/ROOT/images/fifo-read.svg
new file mode 100644
index 0000000..373f8d9
--- /dev/null
+++ b/zh-TW/modules/ROOT/images/fifo-read.svg
@@ -0,0 +1,304 @@
+
+
+
diff --git a/zh-TW/modules/ROOT/images/fifo-write.svg b/zh-TW/modules/ROOT/images/fifo-write.svg
new file mode 100644
index 0000000..2475083
--- /dev/null
+++ b/zh-TW/modules/ROOT/images/fifo-write.svg
@@ -0,0 +1,233 @@
+
+
+
diff --git a/zh-TW/modules/ROOT/nav.adoc b/zh-TW/modules/ROOT/nav.adoc
index 1d0f328..7a802df 100644
--- a/zh-TW/modules/ROOT/nav.adoc
+++ b/zh-TW/modules/ROOT/nav.adoc
@@ -1,4 +1,7 @@
* xref:index.adoc[README]
* xref::[Why do we ... ?]
** xref:why/use-antora.adoc[use Antora]
-** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
\ No newline at end of file
+** xref:why/support-unaligned-narrow-transfer.adoc[support unaligned/narrow transfer]
+** xref:why/choose-the-cocotb-framework-for-verification.adoc[choose the cocotb framework for verification]
+* xref::[How do we ... ?]
+** xref:how/implement-data-caching-and-sharding.adoc[implement data caching and sharding]
\ No newline at end of file
diff --git a/zh-TW/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc b/zh-TW/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
new file mode 100644
index 0000000..cd10efa
--- /dev/null
+++ b/zh-TW/modules/ROOT/pages/how/implement-data-caching-and-sharding.adoc
@@ -0,0 +1,94 @@
+= How do we implement data caching and sharding
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+The AXI protocol supports **narrow transfers** (8/16/32-bit) that may be narrower than the data bus widthfootnote:[AXI uses `AxSIZE[2:0\]` to indicate the number of bytes transferred per transfer.], while SDRAM operates with a fixed 16-bit physical interface. This mismatch creates two key challenges:
+
+* **Alignment Overhead**: SDRAM requires 16-bit-aligned addresses, whereas the AXI protocol allows _unaligned transfers_. This misalignment necessitates manual data reordering and buffering to ensure the correct order before transferring data to SDRAM or back to the master.
+* **Transfer Suspension**: AXI transfers can be suspended when the `ready` or `valid` signals are deasserted. SDRAM, however, cannot suspend a burst: once a burst transaction is initiated, it runs to completion without interruption. This behavior must therefore be carefully considered.
+
+== Analysis
+
+=== Read Operation
+
+In this section, we assume that the AXI data bus width is 32-bit, which means we only need to consider 8-bit, 16-bit, and 32-bit read operations.
+
+. 8-bit
++
+For 8-bit reads, in the best case, a single SDRAM read can satisfy 2 AXI transfersfootnote:aligned[If the address is aligned to 16 bits, assuming the address pattern is like `?0`.]. However, in the worst case, a single SDRAM read can only satisfy 1 AXI transfer.
+. 16-bit
++
+For 16-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:aligned[]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+. 32-bit
++
+For 32-bit reads, in the best case, a single SDRAM read can satisfy 1 AXI transferfootnote:[Assuming the address pattern is like `1?`.]. However, in the worst case, 2 SDRAM reads are required to satisfy 1 AXI transfer.
+
+Since we can suspend the AXI read response by deasserting the `valid` signal, **as long as** we can cache the SDRAM read responses, implementing a read cache becomes straightforward.
+
+[CAUTION]
+====
+Caching the SDRAM read responses means caching *all* responses, even during the CAS latency.
+
+Therefore, we should stop the SDRAM burst read transaction when the AXI master `ready` signal is deasserted and reserve sufficient buffer for the CAS latency.
+====
+
+=== Write Operation
+
+Unlike read operations, write operations do not require consideration of CAS latency. As a result, it is possible to transfer both the address and data at the same time.
+
+Similar to read operations, this section only considers 8-bit, 16-bit, and 32-bit write operations.
+
+Because SDRAM`'s CKE signal cannot remain deasserted indefinitely, we must restart the transaction whenever the master suspends a transfer. Restarting a burst transfer requires re-sending the `BURST WRITE` command with data and address, making it effectively equivalent to a single write operation. Therefore, single write operations are used in this section instead.
+
+. 8-bit
++
+For 8-bit writes, typically 2 transfers are required to complete a write operation. In the best casefootnote:unaligned[If the address is unaligned to 16 bits, assuming the address pattern is like `?1`.], 1 transfer is sufficient.
+. 16-bit
++
+For 16-bit writes, 1 transfer can always accommodate a single write.
++
+[TIP]
+====
+Let`'s consider two scenarios.
+
+Aligned Address:: The address is like `?0`. In this scenario, the data is 16-bit, and the address is aligned, allowing for a straightforward write operation.
+Unaligned Address:: The address is like `?1`. In this scenario, the data is 8-bit, so a single write is sufficient. The next address must be aligned.
+
+Therefore, regardless of whether the address is aligned or not, only a single write is needed.
+====
+. 32-bit
++
+For 32-bit writes, 2 writes are normally required. In special casesfootnote:unaligned[], only a single write can be performed.
+
+[NOTE]
+====
+Since a `READ` operation is equivalent to a `STOP BURST` followed by a `READ`, and a `WRITE` is equivalent to a `STOP BURST` followed by a `WRITE`, we can directly send a `WRITE` to start a new burst operation without sending a `STOP BURST` first.
+
+However, if the master`'s `valid` signal is deasserted and we do not have enough write buffer data, we need to manually send a `STOP BURST`. In the worst-case scenario, this may result in an additional idle cycle.
+
+Therefore, we still use a single write operation.
+====
+
+== Implementation
+
+Due to the CAS latency of SDRAM, we choose to separate the command and data channels. The command channel is controlled by the state machine; the RFIFO connects only to `DQi`, and the WFIFO connects only to `DQo`.
+
+=== Read Operation
+
+We use two FIFOs to implement the read cache. RFIFO1 is a 1-depth 32-bit Bypass FIFO that is directly connected to the `R` response channel of AXI. RFIFO2 is a 4-depth 16-bit Bypass FIFO that is directly connected to the `DQi` port of SDRAM.
+
+image::ROOT:fifo-read.svg[RFIFO]
+
+Every time two 16-bit words are available in RFIFO2, they are packed and cached into RFIFO1. When RFIFO1 is full, the state machine automatically switches to the `STOP` state, forcing the SDRAM to stop the burst operation. Due to the CAS latency of SDRAM -- for simplicity, we assume a CAS latency of 3 cycles -- it will still send 3 cycles of read data, which RFIFO2 caches in full. When RFIFO1 is empty, the burst operation is restarted.
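
The flow control described above can be sketched as a small behavioral model (plain Python, not RTL; the FIFO depths follow the text, while the 16-bit word packing order and the method names are assumptions):

```python
from collections import deque

class ReadPath:
    """Behavioral sketch of the two-FIFO read cache (not RTL)."""

    RFIFO1_DEPTH = 1   # 32-bit words toward the AXI R channel
    RFIFO2_DEPTH = 4   # 16-bit words from the SDRAM DQi port

    def __init__(self):
        self.rfifo1 = deque()
        self.rfifo2 = deque()
        self.bursting = True

    def cycle(self, sdram_word=None):
        """One clock: optionally capture a 16-bit word from SDRAM,
        pack pairs into 32-bit words, and update the burst state."""
        if sdram_word is not None:
            assert len(self.rfifo2) < self.RFIFO2_DEPTH, "RFIFO2 overflow"
            self.rfifo2.append(sdram_word & 0xFFFF)
        if len(self.rfifo2) >= 2 and len(self.rfifo1) < self.RFIFO1_DEPTH:
            lo = self.rfifo2.popleft()   # low-word-first packing is an assumption
            hi = self.rfifo2.popleft()
            self.rfifo1.append((hi << 16) | lo)
        # Stop the burst when RFIFO1 is full; restart when it is empty.
        if len(self.rfifo1) == self.RFIFO1_DEPTH:
            self.bursting = False
        elif not self.rfifo1:
            self.bursting = True

    def axi_pop(self):
        """Pop one 32-bit word for the AXI R channel, if available."""
        return self.rfifo1.popleft() if self.rfifo1 else None
```

Modeling the datapath this way makes it easy to check corner cases -- e.g. that words still arriving after a `STOP` are absorbed by RFIFO2 -- before committing to RTL.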
+
+=== Write Operation
+
+Since the AXI protocol supports back-pressure, we can use a single FIFO to implement the write cache. The WFIFO is a 4-depth 32-bit Bypass FIFO that is directly connected to the `DQo` port of SDRAM. However, we need to manually control the back-pressure on the `B` channel, so we connect two FIFOs to the `B` channel (to simulate an `almost_empty` signal).
+
+image::ROOT:fifo-write.svg[WFIFO]
+
+When WFIFO is empty or BFIFO1 is full, the state machine will automatically switch to the `STOP` state, forcing the SDRAM to stop the burst operation. When WFIFO is full, the AXI write request will be suspended through back-pressure.
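
The stop and back-pressure conditions above can be summarized as two small predicates (a sketch; the WFIFO depth follows the text, while the BFIFO1 depth and the function names are assumptions):

```python
WFIFO_DEPTH = 4    # from the text: 4-depth 32-bit Bypass FIFO
BFIFO1_DEPTH = 4   # assumption: the text does not state the B-FIFO depth

def must_stop_burst(wfifo_count, bfifo1_count):
    """Switch the state machine to STOP when there is no write data
    (WFIFO empty) or the B channel cannot absorb more responses
    (BFIFO1 full)."""
    return wfifo_count == 0 or bfifo1_count == BFIFO1_DEPTH

def must_backpressure(wfifo_count):
    """Suspend AXI write requests (deassert ready) when WFIFO is full."""
    return wfifo_count == WFIFO_DEPTH
```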
\ No newline at end of file
diff --git a/zh-TW/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc b/zh-TW/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..c1e4116
--- /dev/null
+++ b/zh-TW/modules/ROOT/pages/why/choose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why do we choose the cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test cases and logic to the simulators, making migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved difficult in practice. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. To make matters worse, working offline (which is common in IC development) introduced further difficulties. Without an intranet caching or package-management solution for Nix, comparable to Sonatype Nexus for Maven, syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test cases and logic from simulators, as frequent switching between different simulators is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and simulators. We plan to reimplement a set of open-source models in the future to enhance compatibility with all simulators, but for now, our focus is on decoupling the verification process. This allows test cases and logic to be reused across different tools, minimizing friction when switching between them.
+
+== Advantages of cocotb
+
+link:https://github.com/cocotb/cocotb[cocotb] is a framework empowering users to write VHDL and Verilog testbenches in Python. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
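
As an illustration of the style, a minimal cocotb test might look like the sketch below (the signal names `clk`, `rst`, and `dout` are hypothetical, and running it requires an HDL design plus a supported simulator backend):

```
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import RisingEdge

@cocotb.test()
async def smoke_test(dut):
    """Start a clock, apply reset, then check a (hypothetical) output."""
    cocotb.start_soon(Clock(dut.clk, 10, units="ns").start())
    dut.rst.value = 1
    for _ in range(2):
        await RisingEdge(dut.clk)
    dut.rst.value = 0
    await RisingEdge(dut.clk)
    assert dut.dout.value == 0, "dout should reset to zero"
```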
+
+. High-Level API
++
+cocotb provides an easy-to-use API that facilitates interaction with the simulation environment, making tests clear and straightforward to write. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though not originally designed for IC verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+cocotb is easy to install and use; there`'s no need to add additional dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+cocotb has a large and active community of developers. There are already several successful examples of using cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+While commercial simulators like Synopsys VCS or Cadence Xcelium provide more comprehensive feature sets, cocotb preserves **the crucial final step** of seamless test case portability to open-source simulators like link:https://github.com/steveicarus/iverilog[ICARUS Verilog] or link:https://github.com/verilator/verilator[Verilator].
\ No newline at end of file
diff --git a/zh-TW/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc b/zh-TW/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
new file mode 100644
index 0000000..ec1c35c
--- /dev/null
+++ b/zh-TW/modules/ROOT/pages/why/chose-the-cocotb-framework-for-verification.adoc
@@ -0,0 +1,36 @@
+= Why we chose the Cocotb framework for verification
+
+:toc:
+:icons: font
+:author: kazutoiris
+
+== Background
+
+Initially, we used Verilator for verification; however, it did not support some of the Verilog features required by our specification model, which prompted us to switch to Synopsys VCS. Since we are an open-source project, Synopsys VCS was only a temporary solution. Our ultimate goal is to implement a fully open-source verification framework, along with an open-source verification model.
+
+Directly using DPI-C for verification would tightly couple the test code to the verification tools, making future migration and adaptation more difficult. To address this, we adopted the same approach as link:https://github.com/chipsalliance/t1/tree/master/difftest[t1/difftest at master · chipsalliance/t1]. This approach allows us to write Rust code that can interface with both Verilator and Synopsys VCS.
+
+However, managing the versions of Rust, Cargo, and their dependencies presented significant challenges. We attempted to use Nix for package management, but it proved to be inconvenient. For instance, whenever we modified Cargo.toml, we had to manually update the checksum and refresh dependencies -- even during development, when version consistency wasn`'t critical. Additionally, working offline (which is common in EDA development) introduced further difficulties. Without an intranet caching or package-management solution for Nix, comparable to Sonatype Nexus for Maven, syncing dependencies between the intranet and the Internet became cumbersome.
+
+Compared with Verilog, Bluespec offers a more modern approach in which higher-level features do not affect lower-level Verilog compatibility. This decoupling ensures that the generated Verilog artifacts are compatible with all Verilog tools, irrespective of how the higher layers are abstracted or reused. This contrasts with using SystemVerilog directly for new features, which can lead to compatibility issues. By adopting Bluespec, we achieve better compatibility without sacrificing abstraction capabilities.
+
+Our goal is to decouple test code from verification tools, as frequent switching between different verification tools is often necessary to adapt to models from various vendors. For example, encrypted Verilog models from vendors are typically adapted to work with Synopsys VCS, Siemens Questa, or Cadence Xcelium. Although these models provide consistent interfaces, differences in encryption methods, Verilog versions, and implementation quirks often lead to tight coupling between the models and verification tools. We plan to reimplement a set of open-source models in the future to enhance compatibility with all verification tools, but for now, our focus is on decoupling the verification process. This would allow the reuse of the test code across different tools, minimizing friction when switching between them.
+
+== Advantages of Cocotb
+
+Cocotb is a Python-based verification framework designed for SystemVerilog and Verilog HDL. It provides a high-level API that simplifies interactions with the simulation environment, enabling users to write tests in a more intuitive and concise manner. It can be integrated with various simulation tools such as Synopsys VCS, Verilator, and others.
+
+. High-Level API
++
+Cocotb provides an easy-to-use API that facilitates interaction with the simulation environment, making tests clear and straightforward to write. We can use pure Python libraries for verification. For example, link:https://github.com/secdev/scapy[Scapy], a Python-based interactive packet manipulation library (though not originally designed for EDA verification), can be employed for Ethernet verification without any modifications.
+. Ease of Use
++
+Cocotb is easy to install and use; there`'s no need to add additional dependencies manually.
+. Lightweight
++
+There are many package managers, such as poetry, pipenv, conda, and others, that can manage Python versions and dependencies. These tools are much lighter than alternatives like Nix.
+. Community Support
++
+Cocotb has a large and active community of developers. There are already several successful examples of using Cocotb for large-scale projects, such as the PCI-E verification framework link:https://github.com/alexforencich/cocotbext-pcie[alexforencich/cocotbext-pcie].
+
+Cocotb is an excellent choice for verifying complex systems due to its flexible framework, extensibility, and strong community support. While commercial simulators like Synopsys VCS or Cadence Xcelium offer a broader range of features and capabilities, Cocotb remains a strong candidate for open-source projects, offering a solid, flexible, and community-backed solution.
\ No newline at end of file