Unable to bootstrap F3 before 900 epochs in 2k devnet #12710

Closed
4 of 11 tasks
parthshah1 opened this issue Nov 20, 2024 · 15 comments
@parthshah1
Contributor

parthshah1 commented Nov 20, 2024

Checklist

  • This is not a security-related bug/issue. If it is, please follow the security policy.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

  • lotus daemon - chain sync
  • lotus fvm/fevm - Lotus FVM and FEVM interactions
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt/WinningPoSt)
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

Lotus Version

Daemon:  1.29.1+2k+git.3a3395d.dirty+api1.5.0
Local: lotus version 1.29.1+2k+git.3a3395d.dirty

I did not make any code changes affecting the F3 functionality.

Repro Steps

  1. Run '...'
  2. Do '...'
  3. See error '...'
    ...

Describe the Bug

We are running a Docker network with multiple miners to test for F3 correctness issues. The problem is that it takes ~4 hours in our simulation to reach the F3 bootstrap epoch (~1 hour in an ideal setup), which really hinders the extensive testing we want to do.

Logging Information

N/A
@parthshah1 parthshah1 added the kind/bug Kind: Bug label Nov 20, 2024
@github-project-automation github-project-automation bot moved this to 📌 Triage in FilOz Nov 20, 2024
@parthshah1
Contributor Author

Feel free to change the label.

cc: @masih

@parthshah1 parthshah1 changed the title Unable to bootstrap F3 before 900 epochs in devnet Unable to bootstrap F3 before 900 epochs in 2k devnet Nov 20, 2024
@BigLep BigLep added this to F3 Feb 14, 2025
@github-project-automation github-project-automation bot moved this to Todo in F3 Feb 14, 2025
@BigLep
Member

BigLep commented Feb 14, 2025

2025-02-14 conversation: @masih is going to take this. Should be a small config change.

Slack thread that triggered this issue creation: https://filecoinproject.slack.com/archives/C0556MSR945/p1731967602211289

@masih
Member

masih commented Feb 14, 2025

Hi @parthshah1, I am not sure if you are still blocked on this. You could make the time much shorter by changing the EC finality configuration for the 2k network.

You can also change F3BootstrapEpoch in params to -1 to stop the static manifest fusing after 1000 epochs for long-running networks.

@parthshah1
Contributor Author

@masih, this is great. It will be beneficial for the simulations.

> You could make the time much shorter by changing the configuration on EC finality for the 2k network?

Do you mean reducing the number of epochs for EC finality in the 2k params?

> You can also change F3BootstrapEpoch in params to -1 to stop the static manifest fusing after 1000 epoch for long running networks

I think this is a good approach. We tried changing it earlier, but it would still wait for 900 epochs before bootstrapping.

@parthshah1
Contributor Author

> I am not sure if you are still blocked on this. You could make the time much shorter by changing the configuration on EC finality for the 2k network?

@masih, looking into this: do we need to hardcode a value in the F3 manifest, or change this to -1?

@masih
Member

masih commented Feb 18, 2025

If you'd like to test the dynamic manifest mechanism, then you'd need to run a manifest server and update the ID of it in the 2k params.

Otherwise, you can directly update the EC finality in the static manifest, in addition to reducing the bootstrap epoch.

@parthshah1
Contributor Author

Testing the dynamic manifest was on our minds, but then I heard that the plan is to remove the dynamic manifest system after F3 launches on mainnet. Ref: https://filecoinproject.slack.com/archives/C0556MSR945/p1733249822989779?thread_ts=1733247288.806609&cid=C0556MSR945

@masih
Member

masih commented Feb 18, 2025

That is correct. Once F3 activates, the dynamic manifest mechanism will be removed. But it has not been removed yet.

Reducing the EC finality and "activating" F3 on a 2k network is probably the easiest path forward for you folks.

@parthshah1
Contributor Author

Do you have an idea of what the minimum value for finality should be?

I get this:

lotus-1 | ERROR: initializing node: starting node: could not build arguments for function "reflect".makeFuncStub (/usr/local/go/src/reflect/asm_amd64.s:28): failed to build *lf3.F3: could not build arguments for function "reflect".makeFuncStub (/usr/local/go/src/reflect/asm_amd64.s:28): failed to build manifest.ManifestProvider: received non-nil error from function "reflect".makeFuncStub (/usr/local/go/src/reflect/asm_amd64.s:28): invalid manifest: bootstrap epoch 11 before finality 900

Do I have to set anything in particular before 900?

@masih
Member

masih commented Feb 18, 2025

Looks like the EC Finality is not updated. Bootstrap epoch cannot be less than EC finality.

Override the finality in your branch to some value smaller than the bootstrap epoch.

@parthshah1
Contributor Author

@masih, coming back to this finally.
This is what I've done so far:

  1. Set env var LOTUS_F3_BOOTSTRAPEPOCH = 21
  2. Changed EC finality here to 20
  3. Overridden EC finality here to 20.

I get these logs non-stop:

lotus-2 | 2025-02-19T16:33:58.579Z INFO f3/manifest-provider manifest/fusing_provider.go:103 delaying fusing manifest switch-over because head is behind the target epoch {"head": 0, "target epoch": 1, "bootstrap epoch": 21}
lotus-2 | 2025-02-19T16:33:58.570Z INFO f3 [email protected]/f3.go:238 waiting for bootstrap epoch {"duration": "12ns"}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3 [email protected]/f3.go:238 waiting for bootstrap epoch {"duration": "12ns"}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3 [email protected]/f3.go:238 waiting for bootstrap epoch {"duration": "4ns"}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3 [email protected]/f3.go:238 waiting for bootstrap epoch {"duration": "17ns"}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3 [email protected]/f3.go:238 waiting for bootstrap epoch {"duration": "9ns"}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3/manifest-provider manifest/fusing_provider.go:103 delaying fusing manifest switch-over because head is behind the target epoch {"head": 0, "target epoch": 1, "bootstrap epoch": 21}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3/manifest-provider manifest/fusing_provider.go:103 delaying fusing manifest switch-over because head is behind the target epoch {"head": 0, "target epoch": 1, "bootstrap epoch": 21}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3/manifest-provider manifest/fusing_provider.go:103 delaying fusing manifest switch-over because head is behind the target epoch {"head": 0, "target epoch": 1, "bootstrap epoch": 21}
lotus-2 | 2025-02-19T16:33:58.579Z INFO f3/manifest-provider manifest/fusing_provider.go:103 delaying fusing manifest switch-over because head is behind the target epoch {"head": 0, "target epoch": 1, "bootstrap epoch": 21}

@parthshah1
Contributor Author

If you are curious about what I've done, feel free to check out https://github.com/FilecoinFoundationWeb/Filecoin-Antithesis/tree/f3-bootstrap-below-900

@masih
Member

masih commented Feb 24, 2025

> @masih, coming back to this finally. This is what I've done so far:
>
>   1. Change env var LOTUS_F3_BOOTSTRAPEPOCH = 21
>   2. Changed EC finality here = 20
>   3. Overriden EC finality here to 20.

Please revert the changes in step 2. That step changed the EC period and is not needed.

@parthshah1
Contributor Author

Worked like a charm. Thanks a lot.

@masih
Member

masih commented Feb 24, 2025

Closing this issue as complete. Please feel free to reopen if there is anything that I might have missed.

@masih masih closed this as completed Feb 24, 2025
@github-project-automation github-project-automation bot moved this from Todo to Done in F3 Feb 24, 2025
@github-project-automation github-project-automation bot moved this from 📌 Triage to 🎉 Done in FilOz Feb 24, 2025
@rjan90 rjan90 moved this from 🎉 Done to ☑️ Done (Archive) in FilOz Feb 24, 2025