SupraSeal #69

Merged
merged 97 commits into from
Aug 9, 2024
97 commits
963f034
Import supraseal extern
magik6k Jun 24, 2024
45bdc73
supraseal test bin
magik6k Jun 24, 2024
c84ae08
supraseal build setup
magik6k Jun 24, 2024
8d9d61e
begin implementing batch seal task
magik6k Jun 25, 2024
9af8447
Minimum viable Batch seal task
magik6k Jun 27, 2024
1b82e8c
Supraseal config
magik6k Jun 27, 2024
2c094aa
supraseal persist batch meta in cache
magik6k Jun 27, 2024
6dff6dd
batch seal: Mastly done C1
magik6k Jun 27, 2024
a2e64e3
binary C1 decode
magik6k Jun 28, 2024
ced1848
better decode
magik6k Jun 28, 2024
5166014
Fixes to raw proof decode
magik6k Jun 28, 2024
ea25f9f
Working SN C1 reader, C2 poc test
magik6k Jun 29, 2024
71bac7c
supraseal slot manager
magik6k Jul 1, 2024
9dbd612
add created_at to batch_sector_refs
magik6k Jul 1, 2024
5c8122d
update supraseal extern ref
magik6k Jul 1, 2024
7aa7a3b
call schedule in supraseal task
magik6k Jul 2, 2024
9e11cb7
supraseal debugging
magik6k Jul 2, 2024
9361bb4
gen, fix task wiring
magik6k Jul 2, 2024
6bb2fc6
supraseal link fixes
magik6k Jul 2, 2024
33a2365
more supraffi ld flags
magik6k Jul 2, 2024
a7f4c4b
try a thing
magik6k Jul 2, 2024
263dbe0
linkers are easy
magik6k Jul 2, 2024
7d8c7ba
debug init quit
magik6k Jul 2, 2024
0c4ac19
better slot size error message
magik6k Jul 2, 2024
8145793
pipelines and pages are different units
magik6k Jul 2, 2024
224fda4
allow batch scheduling
magik6k Jul 2, 2024
b1e5717
pass all deps to NewSupraSeal
magik6k Jul 2, 2024
a9327b9
set treed task in batch seal
magik6k Jul 2, 2024
5c147b3
batch storage fixes
magik6k Jul 2, 2024
e1ff205
set correct path type in batch seal task
magik6k Jul 2, 2024
7064729
hugepage check
magik6k Jul 2, 2024
9bc7e3b
use correct parents file path
magik6k Jul 2, 2024
f7ddd58
log pc2 inputs
magik6k Jul 2, 2024
5dde4bf
pc2 debugging
magik6k Jul 3, 2024
132237a
fixed supra path decode
magik6k Jul 3, 2024
6119615
more supraseal updates
magik6k Jul 3, 2024
09b7a47
setup for real batch run
magik6k Jul 3, 2024
901faa7
batch: fix pipeline machine id getter
magik6k Jul 3, 2024
feededb
fix invalid commr error return
magik6k Jul 3, 2024
ca2b1f7
supraseal with fixed c1
magik6k Jul 4, 2024
b5b1f5e
fix machine host/port in finalize
magik6k Jul 4, 2024
feec7ab
log c1 debug
magik6k Jul 4, 2024
0fec872
oh no databases
magik6k Jul 4, 2024
20e8658
run finalize on batchseal nodes
magik6k Jul 5, 2024
70e889c
fix batch slot allocation
magik6k Jul 5, 2024
25316ce
allow batch.json
magik6k Jul 5, 2024
f7761c1
fix batch slot startup
magik6k Jul 6, 2024
c14e0c7
gen fixes
magik6k Jul 6, 2024
cebbc94
allow c1 out in cache
magik6k Jul 6, 2024
222b55d
fix api verson check
magik6k Jul 10, 2024
361d63c
post-rebase fixes
magik6k Jul 24, 2024
6c943c6
register the supraseal task
magik6k Jul 25, 2024
c6ff986
fix lint
magik6k Jul 25, 2024
c530414
make gen
magik6k Jul 26, 2024
4993cc6
batch build target
magik6k Jul 30, 2024
27560e3
update supraseal to fix sppark conflict
magik6k Jul 30, 2024
b7dc7f4
fix batch task registration
magik6k Jul 30, 2024
7758118
make: Always set LIBRARY_PATH
magik6k Jul 30, 2024
68ddfaf
supraseal with gcc11 default
magik6k Jul 30, 2024
b38ad45
build supraseal dep from make
magik6k Jul 30, 2024
734cacb
minimum sdr tasks setting
magik6k Jul 30, 2024
39d3920
schedule batch seal on sdr tasks which weren't claimed
magik6k Jul 30, 2024
d17bd8a
allow multiple supraseal pipelines
magik6k Jul 31, 2024
5b0cf2b
prometheus metrics endpoint
magik6k Jul 31, 2024
5b9b506
supraseal phase metrics
magik6k Jul 31, 2024
cbc30db
Don't turn supraseal tasks into supraseal tasks
magik6k Jul 31, 2024
ab9c86e
fix LIBRARY_PATH handling
magik6k Jul 31, 2024
6164604
Fix library_path more correctly
magik6k Jul 31, 2024
cd210f2
Prefer user LIBRARY_PATH
magik6k Jul 31, 2024
dcaf3da
Set LIBRARY_PATH only if not already set
magik6k Jul 31, 2024
1350716
Fix sptool build
magik6k Aug 1, 2024
4e57bd8
faster ema
magik6k Aug 1, 2024
a40413b
batch cpu calc
magik6k Aug 2, 2024
e72374c
hasher count is in threads
magik6k Aug 2, 2024
386a600
fix calc assinging one core too many
magik6k Aug 2, 2024
be7641a
supraseal config generator
magik6k Aug 2, 2024
b7363b6
fix calc comment
magik6k Aug 2, 2024
d1853e6
Move batch seal properties to a Seal section
magik6k Aug 2, 2024
a82ac70
make gen
magik6k Aug 2, 2024
44a29da
batch config gen
magik6k Aug 2, 2024
f6c01a5
fix config gen
magik6k Aug 2, 2024
410397b
fix config nvme list
magik6k Aug 2, 2024
d652eba
Add --duration-days to curio seal start --cc
magik6k Aug 2, 2024
0c4e1ff
Set after_synth in supraseal task
magik6k Aug 5, 2024
325cab5
fix: WinPoSt: Prioritize recent tasks, don't care about old mining bases
magik6k Aug 6, 2024
38bffb6
Address review
magik6k Aug 7, 2024
fea733d
make gen
magik6k Aug 7, 2024
e90f202
supraseal docs
magik6k Aug 7, 2024
3ed7b3b
docs on hw
magik6k Aug 7, 2024
a609ce7
supraseal: Allocate cores on first processor
magik6k Aug 7, 2024
5c871ab
count cpu packages correctly
magik6k Aug 7, 2024
20bb284
webui: Fix redirect on sector remove
magik6k Aug 7, 2024
6daa769
improve supraseal config output, more diag info
magik6k Aug 8, 2024
2f66ddf
add some docs on troubleshooting batch seal perf
magik6k Aug 8, 2024
68af4e6
fix lint
magik6k Aug 8, 2024
9ac5e51
improve docs
magik6k Aug 9, 2024
2c9e322
no linter, you are wrong
magik6k Aug 9, 2024
4 changes: 4 additions & 0 deletions .gitmodules
@@ -1,3 +1,7 @@
[submodule "extern/filecoin-ffi"]
path = extern/filecoin-ffi
url = https://github.com/filecoin-project/filecoin-ffi.git
[submodule "extern/supra_seal"]
path = extern/supra_seal
url = https://github.com/magik6k/supra_seal.git
branch = feat/multi-out-paths
52 changes: 51 additions & 1 deletion Makefile
@@ -2,6 +2,8 @@ SHELL=/usr/bin/env bash

GOCC?=go

## FILECOIN-FFI

FFI_PATH:=extern/filecoin-ffi/
FFI_DEPS:=.install-filcrypto
FFI_DEPS:=$(addprefix $(FFI_PATH),$(FFI_DEPS))
@@ -22,6 +24,23 @@ BUILD_DEPS+=ffi-version-check

.PHONY: ffi-version-check

## SUPRA-FFI

ifeq ($(shell uname),Linux)
SUPRA_FFI_PATH:=extern/supra_seal/
SUPRA_FFI_DEPS:=.install-supraseal
SUPRA_FFI_DEPS:=$(addprefix $(SUPRA_FFI_PATH),$(SUPRA_FFI_DEPS))

$(SUPRA_FFI_DEPS): build/.supraseal-install ;

build/.supraseal-install: $(SUPRA_FFI_PATH)
cd $(SUPRA_FFI_PATH) && ./build.sh
@touch $@

MODULES+=$(SUPRA_FFI_PATH)
CLEAN+=build/.supraseal-install
endif

$(MODULES): build/.update-modules ;
# dummy file that marks the last time modules were updated
build/.update-modules:
@@ -30,6 +49,12 @@ build/.update-modules:

# end git modules

## CUDA Library Path
CUDA_PATH := $(shell dirname $$(dirname $$(which nvcc)))
CUDA_LIB_PATH := $(CUDA_PATH)/lib64
LIBRARY_PATH ?= $(CUDA_LIB_PATH)
export LIBRARY_PATH

## MAIN BINARIES

CLEAN+=build/.update-modules
@@ -41,7 +66,7 @@ deps: $(BUILD_DEPS)

curio: $(BUILD_DEPS)
rm -f curio
GOAMD64=v3 $(GOCC) build $(GOFLAGS) -o curio -ldflags " -s -w \
GOAMD64=v3 CGO_LDFLAGS_ALLOW=$(CGO_LDFLAGS_ALLOW) $(GOCC) build $(GOFLAGS) -o curio -ldflags " -s -w \
-X github.com/filecoin-project/curio/build.IsOpencl=$(FFI_USE_OPENCL) \
-X github.com/filecoin-project/curio/build.CurrentCommit=+git_`git log -1 --format=%h_%cI`" \
./cmd/curio
@@ -54,6 +79,31 @@ sptool: $(BUILD_DEPS)
.PHONY: sptool
BINS+=sptool

ifeq ($(shell uname),Linux)

batchdep: build/.supraseal-install
batchdep: $(BUILD_DEPS)
.PHONY: batchdep

batch: GOFLAGS+=-tags=supraseal
batch: CGO_LDFLAGS_ALLOW='.*'
batch: batchdep build
.PHONY: batch

batch-calibnet: GOFLAGS+=-tags=calibnet,supraseal
batch-calibnet: CGO_LDFLAGS_ALLOW='.*'
batch-calibnet: batchdep build
.PHONY: batch-calibnet

else
batch:
@echo "Batch target is only available on Linux systems"
@exit 1

batch-calibnet:
@echo "Batch-calibnet target is only available on Linux systems"
@exit 1
endif

calibnet: GOFLAGS+=-tags=calibnet
calibnet: build
169 changes: 169 additions & 0 deletions cmd/curio/calc.go
@@ -0,0 +1,169 @@
package main

import (
"fmt"

"github.com/fatih/color"
"github.com/urfave/cli/v2"

"github.com/filecoin-project/curio/tasks/sealsupra"
)

var calcCmd = &cli.Command{
Name: "calc",
Usage: "Math Utils",
Flags: []cli.Flag{
&cli.StringFlag{
Name: "actor",
},
},
Subcommands: []*cli.Command{
calcBatchCpuCmd,
calcSuprasealConfigCmd,
},
}

var calcBatchCpuCmd = &cli.Command{
Name: "batch-cpu",
Usage: "Analyze and display the layout of batch sealer threads",
Description: `Analyze and display the layout of batch sealer threads on your CPU.

It provides detailed information about CPU utilization for batch sealing operations, including core allocation and thread
distribution for different batch sizes.`,
Flags: []cli.Flag{
&cli.BoolFlag{Name: "dual-hashers", Value: true},
},
Action: func(cctx *cli.Context) error {
info, err := sealsupra.GetSystemInfo()
if err != nil {
return err
}

fmt.Println("Basic CPU Information")
fmt.Println("")
fmt.Printf("Processor count: %d\n", info.ProcessorCount)
fmt.Printf("Core count: %d\n", info.CoreCount)
fmt.Printf("Thread count: %d\n", info.CoreCount*info.ThreadsPerCore)
fmt.Printf("Threads per core: %d\n", info.ThreadsPerCore)
fmt.Printf("Cores per L3 cache (CCX): %d\n", info.CoresPerL3)
fmt.Printf("L3 cache count (CCX count): %d\n", info.CoreCount/info.CoresPerL3)

ccxFreeCores := info.CoresPerL3 - 1 // one core per ccx goes to the coordinator
ccxFreeThreads := ccxFreeCores * info.ThreadsPerCore
fmt.Printf("Hasher Threads per CCX: %d\n", ccxFreeThreads)

sectorsPerThread := 1
if cctx.Bool("dual-hashers") {
sectorsPerThread = 2
}

sectorsPerCCX := ccxFreeThreads * sectorsPerThread
fmt.Printf("Sectors per CCX: %d\n", sectorsPerCCX)

fmt.Println("---------")

printForBatchSize := func(batchSize int) {
fmt.Printf("Batch Size: %s sectors\n", color.CyanString("%d", batchSize))
fmt.Println()

config, err := sealsupra.GenerateSupraSealConfig(*info, cctx.Bool("dual-hashers"), batchSize, nil)
if err != nil {
fmt.Printf("Error generating config: %s\n", err)
return
}

fmt.Printf("Required Threads: %d\n", config.RequiredThreads)
fmt.Printf("Required CCX: %d\n", config.RequiredCCX)
fmt.Printf("Required Cores: %d hasher (+4 minimum for non-hashers)\n", config.RequiredCores)

enoughCores := config.RequiredCores <= info.CoreCount
if enoughCores {
fmt.Printf("Enough cores available for hashers %s\n", color.GreenString("✔"))
} else {
fmt.Printf("Not enough cores available for hashers %s\n", color.RedString("✘"))
return
}

fmt.Printf("Non-hasher cores: %d\n", info.CoreCount-config.RequiredCores)

if config.P2WrRdOverlap {
color.Yellow("! P2 writer will share a core with P2 reader, performance may be impacted")
}
if config.P2HsP1WrOverlap {
color.Yellow("! P2 hasher will share a core with P1 writer, performance may be impacted")
}
if config.P2HcP2RdOverlap {
color.Yellow("! P2 hasher_cpu will share a core with P2 reader, performance may be impacted")
}

fmt.Println()
fmt.Printf("pc1 writer: %d\n", config.Topology.PC1Writer)
fmt.Printf("pc1 reader: %d\n", config.Topology.PC1Reader)
fmt.Printf("pc1 orchestrator: %d\n", config.Topology.PC1Orchestrator)
fmt.Println()
fmt.Printf("pc2 reader: %d\n", config.Topology.PC2Reader)
fmt.Printf("pc2 hasher: %d\n", config.Topology.PC2Hasher)
fmt.Printf("pc2 hasher_cpu: %d\n", config.Topology.PC2HasherCPU)
fmt.Printf("pc2 writer: %d\n", config.Topology.PC2Writer)
fmt.Printf("pc2 writer_cores: %d\n", config.Topology.PC2WriterCores)
fmt.Println()
fmt.Printf("c1 reader: %d\n", config.Topology.C1Reader)
fmt.Println()

fmt.Printf("Unoccupied Cores: %d\n\n", config.UnoccupiedCores)

fmt.Println("{")
fmt.Printf(" sectors = %d;\n", batchSize)
fmt.Println(" coordinators = (")
for i, coord := range config.Topology.SectorConfigs[0].Coordinators {
fmt.Printf(" { core = %d;\n hashers = %d; }", coord.Core, coord.Hashers)
if i < len(config.Topology.SectorConfigs[0].Coordinators)-1 {
fmt.Println(",")
} else {
fmt.Println()
}
}
fmt.Println(" )")
fmt.Println("}")

fmt.Println("---------")
}

printForBatchSize(16)
printForBatchSize(32)
printForBatchSize(64)
printForBatchSize(128)

return nil
},
}

var calcSuprasealConfigCmd = &cli.Command{
Name: "supraseal-config",
Usage: "Generate a supra_seal configuration",
Description: `Generate a supra_seal configuration for a given batch size.

This command outputs a configuration expected by SupraSeal. The main purpose of this command is debugging and testing.
The config can be used directly with SupraSeal binaries to test it without involving Curio.`,
Flags: []cli.Flag{
&cli.BoolFlag{
Name: "dual-hashers",
Value: true,
Usage: "Zen3 and later support two sectors per thread, set to false for older CPUs",
},
&cli.IntFlag{
Name: "batch-size",
Aliases: []string{"b"},
Required: true,
},
},
Action: func(cctx *cli.Context) error {
cstr, err := sealsupra.GenerateSupraSealConfigString(cctx.Bool("dual-hashers"), cctx.Int("batch-size"), nil)
if err != nil {
return err
}

fmt.Println(cstr)
return nil
},
}
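
Reviewer note: the per-CCX arithmetic at the top of calcBatchCpuCmd can be tried in isolation. The sketch below is a minimal stand-alone reproduction, not the code in tasks/sealsupra. The names ccxLayout and layout are hypothetical, and the RequiredCCX figure is an assumed ceiling division; the PR obtains that value from sealsupra.GenerateSupraSealConfig, whose internals are not shown in this diff.

```go
package main

import "fmt"

// ccxLayout holds the per-CCX figures printed by the batch-cpu command.
// The type name is hypothetical and exists only for this sketch.
type ccxLayout struct {
	HasherThreadsPerCCX int
	SectorsPerCCX       int
	RequiredCCX         int // assumption: ceiling division, not taken from the PR
}

// layout mirrors the arithmetic in calcBatchCpuCmd: one core per CCX is
// reserved for the coordinator, and each remaining thread hashes one sector,
// or two when dualHashers is set.
func layout(coresPerL3, threadsPerCore, batchSize int, dualHashers bool) (ccxLayout, error) {
	if coresPerL3 < 2 || threadsPerCore < 1 || batchSize < 1 {
		return ccxLayout{}, fmt.Errorf("need at least 2 cores per L3, 1 thread per core and 1 sector")
	}

	freeCores := coresPerL3 - 1 // one core per CCX goes to the coordinator
	hasherThreads := freeCores * threadsPerCore

	sectorsPerThread := 1
	if dualHashers {
		sectorsPerThread = 2
	}
	sectorsPerCCX := hasherThreads * sectorsPerThread

	return ccxLayout{
		HasherThreadsPerCCX: hasherThreads,
		SectorsPerCCX:       sectorsPerCCX,
		// Assumed: enough CCXs to cover the batch, rounded up.
		RequiredCCX: (batchSize + sectorsPerCCX - 1) / sectorsPerCCX,
	}, nil
}

func main() {
	// Same batch sizes the command iterates over; 8 cores per L3 and
	// 2 threads per core are example inputs, not detected values.
	for _, batch := range []int{16, 32, 64, 128} {
		l, err := layout(8, 2, batch, true)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("batch=%d hasherThreads/CCX=%d sectors/CCX=%d CCX needed=%d\n",
			batch, l.HasherThreadsPerCCX, l.SectorsPerCCX, l.RequiredCCX)
	}
}
```

With 8 cores per L3 and 2 threads per core, this gives 14 hasher threads and 28 sectors per CCX when dual hashers are enabled.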
1 change: 1 addition & 0 deletions cmd/curio/main.go
@@ -67,6 +67,7 @@ func main() {
marketCmd,
fetchParamCmd,
ffiCmd,
calcCmd,
}

jaeger := tracing.SetupJaegerTracing("curio")
24 changes: 23 additions & 1 deletion cmd/curio/pipeline.go
@@ -9,6 +9,8 @@ import (

"github.com/filecoin-project/go-address"
"github.com/filecoin-project/go-state-types/abi"
"github.com/filecoin-project/go-state-types/builtin"
miner12 "github.com/filecoin-project/go-state-types/builtin/v12/miner"

"github.com/filecoin-project/curio/cmd/curio/guidedsetup"
"github.com/filecoin-project/curio/deps"
@@ -61,6 +63,12 @@ var sealStartCmd = &cli.Command{
Name: "layers",
Usage: "list of layers to be interpreted (atop defaults). Default: base",
},
&cli.IntFlag{
Name: "duration-days",
Aliases: []string{"d"},
Usage: "How long to commit sectors for",
DefaultText: "1278 (3.5 years)",
},
},
Action: func(cctx *cli.Context) error {
if !cctx.Bool("now") {
@@ -118,9 +126,23 @@ var sealStartCmd = &cli.Command{
return xerrors.Errorf("getting seal proof type: %w", err)
}

var userDuration *int64
if cctx.IsSet("duration-days") {
days := cctx.Int("duration-days")
userDuration = new(int64)
*userDuration = int64(days) * builtin.EpochsInDay

if miner12.MaxSectorExpirationExtension < *userDuration {
return xerrors.Errorf("duration exceeds max allowed: %d > %d", *userDuration, miner12.MaxSectorExpirationExtension)
}
if miner12.MinSectorExpiration > *userDuration {
return xerrors.Errorf("duration is too short: %d < %d", *userDuration, miner12.MinSectorExpiration)
}
}

num, err := seal.AllocateSectorNumbers(ctx, dep.Chain, dep.DB, act, cctx.Int("count"), func(tx *harmonydb.Tx, numbers []abi.SectorNumber) (bool, error) {
for _, n := range numbers {
_, err := tx.Exec("insert into sectors_sdr_pipeline (sp_id, sector_number, reg_seal_proof) values ($1, $2, $3)", mid, n, spt)
_, err := tx.Exec("insert into sectors_sdr_pipeline (sp_id, sector_number, reg_seal_proof, user_sector_duration_epochs) values ($1, $2, $3, $4)", mid, n, spt, userDuration)
if err != nil {
return false, xerrors.Errorf("inserting into sectors_sdr_pipeline: %w", err)
}
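
Reviewer note: the --duration-days handling in this hunk converts days to epochs and checks the result against the miner12 bounds. The sketch below is a stand-alone approximation of that check. The function durationEpochs is hypothetical, and the constants are local stand-ins assumed to match builtin.EpochsInDay (2880, from 30-second epochs), miner12.MinSectorExpiration (180 days) and miner12.MaxSectorExpirationExtension (1278 days, matching the flag's DefaultText). Verify them against go-state-types before relying on them.

```go
package main

import "fmt"

// Local stand-ins for the go-state-types constants. Values are assumptions
// labelled as such; the PR itself imports them from builtin and miner12.
const (
	epochsInDay                  int64 = 2880               // assumed: 86400s / 30s epochs
	minSectorExpiration          int64 = 180 * epochsInDay  // assumed lower bound
	maxSectorExpirationExtension int64 = 1278 * epochsInDay // assumed upper bound
)

// durationEpochs converts a day count to epochs and applies the same two
// bounds checks as the sealStartCmd action in this diff.
func durationEpochs(days int) (int64, error) {
	d := int64(days) * epochsInDay
	if maxSectorExpirationExtension < d {
		return 0, fmt.Errorf("duration exceeds max allowed: %d > %d", d, maxSectorExpirationExtension)
	}
	if minSectorExpiration > d {
		return 0, fmt.Errorf("duration is too short: %d < %d", d, minSectorExpiration)
	}
	return d, nil
}

func main() {
	for _, days := range []int{100, 365, 2000} {
		epochs, err := durationEpochs(days)
		if err != nil {
			fmt.Printf("%d days: rejected (%v)\n", days, err)
			continue
		}
		fmt.Printf("%d days: %d epochs\n", days, epochs)
	}
}
```

Under these assumed constants, 365 days maps to 1051200 epochs, while 100 and 2000 days are rejected. When the flag is not set, the PR leaves userDuration as nil, so the column is stored as NULL.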
6 changes: 4 additions & 2 deletions cmd/curio/rpc/rpc.go
@@ -30,14 +30,15 @@ import (
"github.com/filecoin-project/curio/api/client"
"github.com/filecoin-project/curio/build"
"github.com/filecoin-project/curio/deps"
"github.com/filecoin-project/curio/lib/metrics"
"github.com/filecoin-project/curio/lib/paths"
"github.com/filecoin-project/curio/lib/repo"
"github.com/filecoin-project/curio/web"

lapi "github.com/filecoin-project/lotus/api"
cliutil "github.com/filecoin-project/lotus/cli/util"
"github.com/filecoin-project/lotus/lib/rpcenc"
"github.com/filecoin-project/lotus/metrics"
lotusmetrics "github.com/filecoin-project/lotus/metrics"
"github.com/filecoin-project/lotus/metrics/proxy"
"github.com/filecoin-project/lotus/storage/pipeline/piece"
"github.com/filecoin-project/lotus/storage/sealer/fsutil"
@@ -71,6 +72,7 @@ func CurioHandler(
mux.Handle("/rpc/v0", rpcServer)
mux.Handle("/rpc/streams/v0/push/{uuid}", readerHandler)
mux.PathPrefix("/remote").HandlerFunc(remote)
mux.Handle("/debug/metrics", metrics.Exporter())
mux.PathPrefix("/").Handler(http.DefaultServeMux) // pprof

if !permissioned {
@@ -283,7 +285,7 @@ func ListenAndServe(ctx context.Context, dependencies *deps.Deps, shutdownChan c
permissioned),
ReadHeaderTimeout: time.Minute * 3,
BaseContext: func(listener net.Listener) context.Context {
ctx, _ := tag.New(context.Background(), tag.Upsert(metrics.APIInterface, "lotus-worker"))
ctx, _ := tag.New(context.Background(), tag.Upsert(lotusmetrics.APIInterface, "curio"))
return ctx
},
Addr: dependencies.ListenAddr,
9 changes: 1 addition & 8 deletions cmd/curio/run.go
@@ -106,14 +106,7 @@ var runCmd = &cli.Command{
ctxclose()
}()
}
// Register all metric views
/*
if err := view.Register(
metrics.MinerNodeViews...,
); err != nil {
log.Fatalf("Cannot register the view: %v", err)
}
*/

// Set the metric to one so it is published to the exporter
stats.Record(ctx, metrics.LotusInfo.M(1))
