@@ -23,7 +23,10 @@ SupraSeal is an optimized batch sealing implementation for Filecoin that allows
- NVMe drives with high IOPS (10-20M total IOPS recommended)
- GPU for PC2 phase (NVIDIA RTX 3090 or better recommended)
- 1GB hugepages configured (minimum 36 pages)
- - Ubuntu 22.04 or compatible Linux distribution
+ - Ubuntu 22.04 or compatible Linux distribution (gcc-11 required, doesn't need to be system-wide)
+ - At least 256GB RAM, ALL MEMORY CHANNELS POPULATED
+ - Without **all** memory channels populated, sealing **performance will suffer drastically**
+ - NUMA-Per-Socket (NPS) set to 1

## Setup

@@ -68,6 +71,13 @@ LayerNVMEDevices = [
# Add PCIe addresses for all NVMe devices to use
]

+ # Set to your desired batch size (what the batch-cpu command says your CPU supports AND what you have NVMe space for)
+ BatchSealBatchSize = 32
+
+ # Pipelines can be either 1 or 2; 2 pipelines double storage requirements, but in correctly balanced systems they keep
+ # layer hashing running 100% of the time, nearly doubling throughput
+ BatchSealPipelines = 2
+
# Set to true for Zen2 or older CPUs for compatibility
SingleHasherPerThread = false
```
@@ -139,10 +149,64 @@ curio seal start --now --cc --count 32 --actor f01234 --layers cluster --duratio
* Monitor hasher core utilisation

## Troubleshooting
+
+ ### Node doesn't start / isn't visible in the UI
* Ensure hugepages are configured correctly
* Check NVMe device IOPS and capacity
* If spdk setup fails, try to `wipefs -a` the NVMe devices (this will wipe partitions from the devices, be careful!)
- * Benchmark iops with:
+
+ ### Performance issues
+
+ You can monitor performance by looking at "hasher" core utilisation in e.g. `htop`.
+
+ To identify hasher cores, call `curio calc supraseal-config --batch-size 128` (with the correct batch size), and look for `coordinators`:
+
+ ``` go
+ topology:
+ ...
+ {
+   pc1: {
+     writer = 1;
+     ...
+     hashers_per_core = 2;
+
+     sector_configs: (
+       {
+         sectors = 128;
+         coordinators = (
+           { core = 59;
+             hashers = 8; },
+           { core = 64;
+             hashers = 14; },
+           { core = 72;
+             hashers = 14; },
+           { core = 80;
+             hashers = 14; },
+           { core = 88;
+             hashers = 14; }
+         )
+       }
+
+     )
+   },
+
+   pc2: {
+     ...
+   }
+
+ ```
+
+ In this example, cores 59, 64, 72, 80, and 88 are "coordinators", with two hashers per core, meaning that:
+ * In the first group, core 59 is a coordinator and cores 60-63 are hashers (4 hasher cores / 8 hasher threads)
+ * In the second group, core 64 is a coordinator and cores 65-71 are hashers (7 hasher cores / 14 hasher threads)
+ * And so on
+
+ Coordinator cores will usually sit at 100% utilisation. Hasher threads **SHOULD** also sit at 100% utilisation; anything less
+ indicates a bottleneck in the system, such as insufficient NVMe IOPS, insufficient memory bandwidth, or an incorrect NUMA setup.
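To make the coordinator-to-hasher mapping concrete, the arithmetic in the list above can be sketched in a few lines of Python. This is only an illustration, not part of Curio: the function name `hasher_cores` is hypothetical, the coordinator values are copied by hand from the example output, and it assumes (as the example suggests) that a group's hasher cores are the ones numbered directly after its coordinator.

```python
import math


def hasher_cores(coordinators, hashers_per_core=2):
    """Map each coordinator core to the cores assumed to run its hashers.

    Assumption: hasher cores are numbered contiguously, starting right
    after the coordinator core, as in the example topology.
    """
    groups = {}
    for entry in coordinators:
        core, hashers = entry["core"], entry["hashers"]
        # Each physical core runs `hashers_per_core` hasher threads.
        n_cores = math.ceil(hashers / hashers_per_core)
        groups[core] = list(range(core + 1, core + 1 + n_cores))
    return groups


# Values copied from the example `coordinators` section.
coordinators = [
    {"core": 59, "hashers": 8},
    {"core": 64, "hashers": 14},
    {"core": 72, "hashers": 14},
    {"core": 80, "hashers": 14},
    {"core": 88, "hashers": 14},
]

for coord, cores in hasher_cores(coordinators).items():
    if cores:
        print(f"coordinator {coord}: hasher cores {cores[0]}-{cores[-1]}")
```

The printed ranges are the cores to watch in `htop`; any of them sitting well below 100% points at one of the bottlenecks described above.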
+
+ To troubleshoot:
+ * Read the requirements at the top of this page very carefully
+ * Benchmark iops with:
``` bash
cd extern/supra_seal/deps/spdk-v22.09/