Addresses #8218
A simplified version of #8241 for the initial release.
I've tried to minimise the logic change in this PR, although introducing the `NodeCustodyType` enum still result in quite a bit a of diff, but the actual logic change in `CustodyContext` is quite small.
The main changes are in the `CustdoyContext` struct
* ~~combining `validator_custody_count` and `current_is_supernode` fields into a single `custody_group_count_at_head` field. We persist the cgc of the initial cli values into the `custody_group_count_at_head` field and only allow for increase (same behaviour as before).~~
* I noticed the above approach caused a backward compatibility issue, I've [made a fix](15569bc085) and changed the approach slightly (which was actually what I had originally in mind):
* when initialising, only override the `validator_custody_count` value if either flag `--supernode` or `--semi-supernode` is used; otherwise leave it as the existing default `0`. Most other logic remains unchanged.
All existing validator custody unit tests are still all passing, and I've added additional tests to cover semi-supernode, and restoring `CustodyContext` from disk.
Note: I've added a `WARN` if the user attempts to switch to a `--semi-supernode` or `--supernode` - this currently has no effect, but once @eserilev column backfill is merged, we should be able to support this quite easily.
Things to test
- [x] cgc in metadata / enr
- [x] cgc in metrics
- [x] subscribed subnets
- [x] getBlobs endpoint
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
As identified by a researcher during the Fusaka security competition, we were computing the proposer index incorrectly in some places by computing without lookahead.
- [x] Add "low level" checks to computation functions in `consensus/types` to ensure they error cleanly
- [x] Re-work the determination of proposer shuffling decision roots, which are now fork aware.
- [x] Re-work and simplify the beacon proposer cache to be fork-aware.
- [x] Optimise `with_proposer_cache` to use `OnceCell`.
- [x] All tests passing.
- [x] Resolve all remaining `FIXME(sproul)`s.
- [x] Unit tests for `ProtoBlock::proposer_shuffling_root_for_child_block`.
- [x] End-to-end regression test.
- [x] Test on pre-Fulu network.
- [x] Test on post-Fulu network.
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Addresses #7866.
Use Rayon to speed up batch KZG verification during range / backfill sync.
While I was analysing the traces, I also discovered a bug that resulted in only the first 128 columns in a chain segment batch being verified. This PR fixes it, so we might actually observe slower range sync due to more cells being KZG verified.
I've also updated the handling of batch KZG failure to only find the first invalid KZG column when verification fails as this gets very expensive during range/backfill sync.
Adds the required boilerplate code for the Gloas (Glamsterdam) hard fork. This allows PRs testing Gloas-candidate features to test fork transition.
This also includes de-duplication of post-Bellatrix readiness notifiers from #6797 (credit to @dapplion)
Re-export `context_deserialize_derive` inside of `context_deserialize` so they are both available from the same interface, which matches how popular crates (like `serde`) handle this.
This also nests both crates inside a new `context_deserialize` directory which will make it easier to eventually spin out into a different repo (if/when) we decide to do that (plus I prefer it aesthetically).
This PR fixes a bug where wrong columns could get processed immediately after a CGC increase.
Scenario:
- The node's CGC increased due to additional validators attached to it (lets say from 10 to 11)
- The new CGC is advertised and new subnets are subscribed immediately, however the change won't be effective in the data availability check until the next epoch (See [this](ab0e8870b4/beacon_node/beacon_chain/src/validator_custody.rs (L93-L99))). Data availability checker still only require 10 columns for the current epoch.
- During this time, data columns for the additional custody column (lets say column 11) may arrive via gossip as we're already subscribed to the topic, and it may be incorrectly used to satisfy the existing data availability requirement (10 columns), and result in this additional column (instead of a required one) getting persisted, resulting in database inconsistency.
Fix Clippy for recently released Rust 1.90 beta. There may be more changes required when Rust 1.89 stable is released in a few days, but possibly not 🤞
Resolves#6767
This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info regarding number of validators attached to a node and the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of validators attached, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Anytime there's a change in the `custody_group_count` due to addition/removal of validators, the custody context should send an event on a broadcast channnel. The only subscriber for the channel exists in the network service which simply subscribes to more subnets. There can be additional subscribers in sync that will start a backfill once the cgc changes.
TODO
- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS ` passes after updating the current cgc. This event should be picked up by a subscriber which updates the enr and metadata.
- [x] Add more tests
This PR adds the ability to download [nightly reference tests from the consensus-specs repo](https://github.com/ethereum/consensus-specs/actions/workflows/generate_vectors.yml). This will be used by spec maintainers to ensure that there are no unexpected test failures prior to new releases. Also, we will keep track of test compliance with [this website](https://jtraglia.github.io/nyx/); eventually this will be integrated into Hive.
* A new script (`download_test_vectors.sh`) is added to handle downloads.
* The logic for downloading GitHub artifacts is a bit complex.
* Rename the variables which store test versions:
* `TESTS_TAG` to `CONSENSUS_SPECS_TEST_VERSION`.
* `BLS_TEST_TAG` to `BLS_TEST_VERSION`, for consistency.
* Delete tarballs after extracting them.
* I see no need to keep these; they just use extra disk.
* Consolidate `clean` rules into a single rule.
* Do `clean` prior to downloading/extracting tests.
* Remove `CURL` variable with GitHub token; don't need it for downloading releases.
* Do `mkdir -p` when creating directories.
* Probably more small stuff...
Fix clippy lints for `rustc` 1.87
clippy complains about `BeaconChainError` being too large. I went on a bit of a boxing spree because of this. We may instead want to `Box` some of the `BeaconChainError` variants?
Backport of:
- https://github.com/sigp/lighthouse/pull/7067
For:
- https://github.com/sigp/lighthouse/issues/7039
- Prevent writing to state cache when migrating the database
- Add `state-cache-headroom` flag to control pruning
- Prune old epoch boundary states ahead of mid-epoch states
- Never prune head block's state
- Avoid caching ancestor states unless they are on an epoch boundary
- Log when states enter/exit the cache
Co-authored-by: Eitan Seri-Levi <eserilev@ucsc.edu>
Fix a regression introduced in this PR:
- https://github.com/sigp/lighthouse/pull/6361
We were indexing into the `MerkleTree` with raw generalized indices, which was incorrect and triggering `debug_assert` failures, as described here:
- https://github.com/sigp/lighthouse/issues/7005
- Convert `generalized_index` to the correct leaf index prior to proof generation.
- Add sanity checks on indices used in `BeaconState::generate_proof`.
- Remove debug asserts from `MerkleTree::generate_proof` in favour of actual errors. This would have caught the bug earlier.
- Refactor the EF tests so that the merkle validity tests are actually run. They were misconfigured in a way that resulted in them running silently with 0 test cases, and the `check_all_files_accessed.py` script still had an ignore that covered the test files, so this omission wasn't detected.
Update spec tests for recent v1.5.0-beta.2 release. There are no substantial changes for Electra and earlier, and the Fulu test updates to be made are tracked here:
- https://github.com/sigp/lighthouse/issues/6957
- Add `SingleAttestation` SSZ tests
- Add new `deposit_with_reorg` fork choice tests
- Update tag to v1.5.0-beta.2
- Ignore Fulu tests
Addresses #6706
This PR activates PeerDAS at the Fulu fork epoch instead of `EIP_7594_FORK_EPOCH`. This means we no longer support testing PeerDAS with Deneb / Electrs, as it's now part of a hard fork.
No substantial changes in v1.5.0-beta.1, this PR just updates the tests.
The optimisation described in this PR is already implemented in our single-pass epoch processing:
- https://github.com/ethereum/consensus-specs/pull/4081
* Refactor spec testing for features and simplify usage.
* Fix `SszStatic` tests for PeerDAS: exclude eip7594 test vectors when testing Electra types.
* Merge branch 'unstable' into refactor-ef-tests-features
* Fix partial withdrawals count
* Remove get_active_balance
* Remove queue_entire_balance_and_reset_validator
* Switch to compounding when consolidating with source==target
* Queue deposit requests and apply them during epoch processing
* Fix ef tests
* Clear todos
* Fix engine api formatting issues
* Merge branch 'unstable' into electra-alpha7
* Make add_validator_to_registry more in line with the spec
* Address some review comments
* Cleanup
* Update initialize_beacon_state_from_eth1
* Merge branch 'unstable' into electra-alpha7
* Fix rpc decoding for blobs by range/root
* Fix block hash computation
* Fix process_deposits bug
* Merge branch 'unstable' into electra-alpha7
* Fix topup deposit processing bug
* Update builder api for electra
* Refactor mock builder to separate functionality
* Merge branch 'unstable' into electra-alpha7
* Address review comments
* Use copied for reference rather than cloned
* Optimise and simplify PendingDepositsContext::new
* Merge remote-tracking branch 'origin/unstable' into electra-alpha7
* Fix processing of deposits with invalid signatures
* Remove redundant code in genesis init
* Revert "Refactor mock builder to separate functionality"
This reverts commit 6d10456912.
* Revert "Update builder api for electra"
This reverts commit c5c9aca6db.
* Simplify pre-activation sorting
* Fix stale validators used in upgrade_to_electra
* Merge branch 'unstable' into electra-alpha7
* Get blobs from EL.
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* Avoid cloning blobs after fetching blobs.
* Address review comments and refactor code.
* Fix lint.
* Move blob computation metric to the right spot.
* Merge branch 'unstable' into das-fetch-blobs
* Merge branch 'unstable' into das-fetch-blobs
# Conflicts:
# beacon_node/beacon_chain/src/beacon_chain.rs
# beacon_node/beacon_chain/src/block_verification.rs
# beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs
* Merge branch 'unstable' into das-fetch-blobs
# Conflicts:
# beacon_node/beacon_chain/src/beacon_chain.rs
* Gradual publication of data columns for supernodes.
* Recompute head after importing block with blobs from the EL.
* Fix lint
* Merge branch 'unstable' into das-fetch-blobs
* Use blocking task instead of async when computing cells.
* Merge branch 'das-fetch-blobs' of github.com:jimmygchen/lighthouse into das-fetch-blobs
* Merge remote-tracking branch 'origin/unstable' into das-fetch-blobs
* Fix semantic conflicts
* Downgrade error log.
* Merge branch 'unstable' into das-fetch-blobs
# Conflicts:
# beacon_node/beacon_chain/src/data_availability_checker.rs
# beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs
# beacon_node/execution_layer/src/engine_api.rs
# beacon_node/execution_layer/src/engine_api/json_structures.rs
# beacon_node/network/src/network_beacon_processor/gossip_methods.rs
# beacon_node/network/src/network_beacon_processor/mod.rs
# beacon_node/network/src/network_beacon_processor/sync_methods.rs
* Merge branch 'unstable' into das-fetch-blobs
* Publish block without waiting for blob and column proof computation.
* Address review comments and refactor.
* Merge branch 'unstable' into das-fetch-blobs
* Fix test and docs.
* Comment cleanups.
* Merge branch 'unstable' into das-fetch-blobs
* Address review comments and cleanup
* Address review comments and cleanup
* Refactor to de-duplicate gradual publication logic.
* Add more logging.
* Merge remote-tracking branch 'origin/unstable' into das-fetch-blobs
# Conflicts:
# Cargo.lock
* Fix incorrect comparison on `num_fetched_blobs`.
* Implement gradual blob publication.
* Merge branch 'unstable' into das-fetch-blobs
* Inline `publish_fn`.
* Merge branch 'das-fetch-blobs' of github.com:jimmygchen/lighthouse into das-fetch-blobs
* Gossip verify blobs before publishing
* Avoid queries for 0 blobs and error for duplicates
* Gossip verified engine blob before processing them, and use observe cache to detect duplicates before publishing.
* Merge branch 'das-fetch-blobs' of github.com:jimmygchen/lighthouse into das-fetch-blobs
# Conflicts:
# beacon_node/network/src/network_beacon_processor/mod.rs
* Merge branch 'unstable' into das-fetch-blobs
* Fix invalid commitment inclusion proofs in blob sidecars created from EL blobs.
* Only publish EL blobs triggered from gossip block, and not RPC block.
* Downgrade gossip blob log to `debug`.
* Merge branch 'unstable' into das-fetch-blobs
* Merge branch 'unstable' into das-fetch-blobs
* Grammar