Update `c-kzg` from `v1` to `v2`. My motivation here is that `alloy-consensus` now uses `c-kzg` in `v2` and this results in a conflict when using lighthouse in combination with latest alloy. I tried also to disable the `czkg` feature in alloy, but the conflict persisted.
See here for the alloy update to `c-kzg v2`: https://github.com/alloy-rs/alloy/pull/2240
Error:
```
error: failed to select a version for `c-kzg`.
...
versions that meet the requirements `^1` are: 1.0.3, 1.0.2, 1.0.0
the package `c-kzg` links to the native library `ckzg`, but it conflicts with a previous package which links to `ckzg` as well:
package `c-kzg v2.1.0`
... which satisfies dependency `c-kzg = "^2.1"` of package `alloy-consensus v0.13.0`
... which satisfies dependency `alloy-consensus = "^0.13.0"` of package ...
...
```
- Upgrade `alloy-consensus` to `0.14.0` and disable all default features
- Upgrade `c-kzg` to `v2.1.0`
- Upgrade `alloy-primitives` to `1.0.0`
- Adapt the code to the new API `c-kzg`
- There is now `NO_PRECOMPUTE` as my understand from https://github.com/ethereum/c-kzg-4844/pull/545/files we should use `0` here as `new_from_trusted_setup_no_precomp` does not precomp. But maybe it is misleading. For all other places I used `RECOMMENDED_PRECOMP_WIDTH` because `8` is matching the recommendation.
- `BYTES_PER_G1_POINT` and `BYTES_PER_G2_POINT` are no longer public in `c-kzg`
- I adapted two tests that checking for the `Attestation` bitfield size. But I could not pinpoint to what has changed and why now 8 bytes less. I would be happy about any hint, and if this is correct. I found related a PR here: https://github.com/sigp/lighthouse/pull/6915
- Use same fields names, in json, as well as `c-kzg` and `rust_eth_kzg` for `g1_monomial`, `g1_lagrange`, and `g2_monomial`
Anchor wants the `notify` function to run only in certain cases - so the `spawn_notifier` function is unsuitable for us.
Anchor uses it's own `notify` function, which then calls `notifier_service::notify` (in most circumstances). To enable that, `notify` needs to be `pub`.
Closes:
- https://github.com/sigp/lighthouse/issues/7690
Another checkpoint sync related fix! See issue for a description of the bug.
We fix it by just loading the block root of the `oldest_block_slot`, rather than trying to load the slot prior, which will always fail.
Fix a bug involving checkpoint sync from genesis reported by Sunnyside labs.
Ensure that the store's `anchor` is initialised prior to storing the genesis state. In the case of checkpoint sync from genesis, the genesis state will be in the _hot DB_, so we need the hot DB metadata to be initialised in order to store it.
I've extended the existing checkpoint sync tests to cover this case as well. There are some subtleties around what the `state_upper_limit` should be set to in this case. I've opted to just enable state reconstruction from the start in the test so it gets set to 0, which results in an end state more consistent with the other test cases (full state reconstruction). This is required because we can't meaningfully do any state reconstruction when the split slot is 0 (there is no range of frozen slots to reconstruct).
- #6240
- Bring built-in network configs up to date with latest consensus-spec PeerDAS configs.
- Add `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS` and use it to determine data availability window after the Fulu fork.
N/A
This PR switches to using `prepare_beacon_proposer` instead of `beacon_committee_subscriptions` endpoint to register validators with the custody context.
We currently use the `beacon_committee_subscriptions` endpoint for registering validators in the custody context.
Using the subscriptions endpoint has a few disadvantages:
1. The lighthouse VC tries to optimise the number of calls it makes to this endpoint to reduce the load on the subscriptions endpoint. So we would be getting different a subset of the total number of validators in each call. This will lead to a ramp up of the validator custody units instead of a one time bump. For e.g. see these logs
```
Jun 30 22:36:05.012 DEBUG Validator count at head updated old_count: 0, new_count: 19
Jun 30 22:36:11.016 DEBUG Validator count at head updated old_count: 19, new_count: 24
Jun 30 22:36:17.017 DEBUG Validator count at head updated old_count: 24, new_count: 27
Jun 30 22:36:23.020 DEBUG Validator count at head updated old_count: 27, new_count: 32
Jun 30 22:36:29.016 DEBUG Validator count at head updated old_count: 32, new_count: 36
Jun 30 22:36:35.005 DEBUG Validator count at head updated old_count: 36, new_count: 42
Jun 30 22:36:41.014 DEBUG Validator count at head updated old_count: 42, new_count: 44
Jun 30 22:36:47.017 DEBUG Validator count at head updated old_count: 44, new_count: 46
Jun 30 22:36:53.007 DEBUG Validator count at head updated old_count: 46, new_count: 48
Jun 30 22:36:59.009 DEBUG Validator count at head updated old_count: 48, new_count: 49
Jun 30 22:37:05.014 DEBUG Validator count at head updated old_count: 49, new_count: 50
Jun 30 22:37:11.007 DEBUG Validator count at head updated old_count: 50, new_count: 53
Jun 30 22:37:17.007 DEBUG Validator count at head updated old_count: 53, new_count: 55
Jun 30 22:37:35.008 DEBUG Validator count at head updated old_count: 55, new_count: 58
Jun 30 22:37:41.007 DEBUG Validator count at head updated old_count: 58, new_count: 59
Jun 30 22:37:53.010 DEBUG Validator count at head updated old_count: 59, new_count: 60
Jun 30 22:38:05.013 DEBUG Validator count at head updated old_count: 60, new_count: 61
Jun 30 22:38:23.006 DEBUG Validator count at head updated old_count: 61, new_count: 62
Jun 30 22:38:29.009 DEBUG Validator count at head updated old_count: 62, new_count: 63
Jun 30 22:38:41.009 DEBUG Validator count at head updated old_count: 63, new_count: 64
```
2. Different VCs would probably have different behaviours in terms of sending subscriptions
In contrast, the `prepare_beacon_proposer` endpoint usage would be more standard across different VCs without any filtering of validators. Not doing so could mean potentially missing proposals so VCs are incentivised to make this call on any change in the validators managed by them.
Lighthouse calls this endpoint every slot.
Update `SAMPLES_PER_SLOT` to be number of custody groups instead of data columns. This should have no impact on the current implementation as config currently maintains a `group:subnet:column` ratio of `1:1:1`. **In short, this PR doesn't change anything for Fusaka, but ensures compliance with the spec and potential future changes.**
I've added separate methods to compute sampling columns and custody groups for clarity: `spec.sampling_size_columns` and `spec.sampling_size_custod_groups`
See the clarifications in this PR for more details: https://github.com/ethereum/consensus-specs/pull/4251
N/A
Persist the epoch -> cgc values. This is to ensure that `ValidatorRegistrations::latest_validator_custody_requirement` always returns a `Some` value post restart assuming the `epoch_validator_custody_requirements` map has been updated in the previous runs.
This PR implements some heuristics to check for breaking database changes. The goal is to prevent accidental changes to the database schema occurring without a version bump.
The `sync_contributions_across_fork_with_skip_slots` test have been quite flaky recently on CI, we suspect it might be caused by the recent introduction of a `default` timeout in #7400, and CI is failing to consistently process those http requests within 1s likely due to limited resources.
This PR increases the `default` timeout to 2s in tests to avoid flaky tests, but keeps the remaining timeout the same (1s).
https://github.com/sigp/lighthouse/actions/runs/15965113170/job/45023976021
```
FAIL [ 8.945s] http_api::bn_http_api_tests fork_tests::sync_contributions_across_fork_with_skip_slots
stdout ───
running 1 test
test fork_tests::sync_contributions_across_fork_with_skip_slots ... FAILED
failures:
failures:
fork_tests::sync_contributions_across_fork_with_skip_slots
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 175 filtered out; finished in 8.91s
stderr ───
thread 'fork_tests::sync_contributions_across_fork_with_skip_slots' panicked at beacon_node/http_api/tests/fork_tests.rs:239:10:
called `Result::unwrap()` on an `Err` value: HttpClient(url: http://127.0.0.1:41793/, kind: timeout, detail: operation timed out)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
When we removed the eth1 data, I wrote a v25 schema upgrade to delete the data on disk:
- https://github.com/sigp/lighthouse/pull/7133
However, I forgot to update the current schema version, so this change was never actioned.
This PR updates the current schema version to v25 so that the migration runs.
This bug was first found and partially fixed by @VolodymyrBg in #7317 - this PR applies the same fix everywhere else.
The old logic updated the waker when it already matched the context, and did nothing when it was stale:
```rust
if waker.will_wake(cx.waker()) {
self.waker = Some(cx.waker().clone());
}
```
This is the wrong way around. We only want to update the waker if it doesn't match the current context:
```rust
if !waker.will_wake(cx.waker()) {
self.waker = Some(cx.waker().clone());
}
```
I don't think we've ever noticed any issues, but it’s a subtle bug that could lead to missed wakeups.
N/A
For responding to by_range requests , we should ideally only respond with items in the range `req.start_slot()..req.start_slot() + req.count`.
We were not filtering the generated response for blobs and data columns, only for blocks. This PR adds the filtering for the sidecars as well.
Adds a new `/lighthouse` API call to the VC which allows the list of beacon nodes to be updated dynamically at runtime.
An entirely new beacon node list is provided to the VC so it effectively adds, removes or reorders nodes to match the new list.
This can then be used in Siren, which will enable a "drag to reorder" system along with adding and removing beacon nodes while the VC is on. This will make it unnecessary to reboot the VC when users want to simply add or remove a BN from the list.
N/A
After the electra fork which includes EIP 6110, the beacon node no longer needs the eth1 bridging mechanism to include new deposits as they are provided by the EL as a `deposit_request`. So after electra + a transition period where the finalized bridge deposits pre-fork are included through the old mechanism, we no longer need the elaborate machinery we had to get deposit contract data from the execution layer.
Since holesky has already forked to electra and completed the transition period, this PR basically checks to see if removing all the eth1 related logic leads to any surprises.
I think this should resolve#7155
This removes the level field from the instrumenting we were doing across a range of functions. The level will now default to the level of the log.
Partially https://github.com/sigp/lighthouse/issues/6291
This PR removes the reprocess event channel from being externally exposed. All work events are now sent through the single `BeaconProcessorSend` channel. I've introduced a new `Work::Reprocess` enum variant which we then use to schedule jobs for reprocess. I've also created a new scheduler module which will eventually house the different scheduler impls.
This is all needed as an initial step to generalize the beacon processor
A "full" implementation for the generalized beacon processor can be found here
https://github.com/sigp/lighthouse/pull/6448
I'm going to try to break up the full implementation into smaller PR's so it can actually be reviewed
#6970
This allows for us to receive `SingleAttestation` over gossip and process it without converting. There is still a conversion to `Attestation` as a final step in the attestation verification process, but by then the `SingleAttestation` is fully verified.
I've also fully removed the `submitPoolAttestationsV1` endpoint as its been deprecated
I've also pre-emptively deprecated supporting `Attestation` in `submitPoolAttestationsV2` endpoint. See here for more info: https://github.com/ethereum/beacon-APIs/pull/531
I tried to the minimize the diff here by only making the "required" changes. There are some unnecessary complexities with the way we manage the different attestation verification wrapper types. We could probably consolidate this to one wrapper type and refactor this even further. We could leave that to a separate PR if we feel like cleaning things up in the future.
Note that I've also updated the test harness to always submit `SingleAttestation` regardless of fork variant. I don't see a problem in that approach and it allows us to delete more code :)
Giving more context about late block re-orgs would make the concept easier to grasp for newcomers.
Add more context to this section in the Lighthouse Book.
Currently the validator effective balance used for computing PeerDAS custody group count is only updated when the validator subscribes to the BN via `validator/beacon_committee_subscriptions`.
If a validator stops registering with the node, the effective balance gets outdated and stays in the BN memory until the next restart. They are no longer required for CGC computation, as long as the CGC never reduces as per the spec, therefore they can be dropped.
Resolves#6767
This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info regarding number of validators attached to a node and the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of validators attached, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Anytime there's a change in the `custody_group_count` due to addition/removal of validators, the custody context should send an event on a broadcast channnel. The only subscriber for the channel exists in the network service which simply subscribes to more subnets. There can be additional subscribers in sync that will start a backfill once the cgc changes.
TODO
- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS ` passes after updating the current cgc. This event should be picked up by a subscriber which updates the enr and metadata.
- [x] Add more tests
We're seeing slow KZG performance on `fusaka-devnet-0` and looking for optimisations to improve performance.
Zipping the list first then `into_par_iter` shows a 10% improvement in performance benchmark, i suspect this might be even more material when running on a beacon node.
Before:
```
blobs_to_data_column_sidecars_20
time: [11.583 ms 12.041 ms 12.534 ms]
Found 5 outliers among 100 measurements (5.00%)
```
After:
```
blobs_to_data_column_sidecars_20
time: [10.506 ms 10.724 ms 10.982 ms]
change: [-14.925% -10.941% -6.5452%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
```
The libp2p/discv5 logs are not stored when stopping local-testnet.
Store the `beacon/logs` directory to [Kurtosis Files Artifacts](https://docs.kurtosis.com/advanced-concepts/files-artifacts/) so that they are downloaded locally by `kurtosis enclave dump`.
Our basic sim test has been [flaky](https://github.com/sigp/lighthouse/actions/runs/15458818777/job/43515966229) for some time, and seems like it has gotten worse since electra fork was added to it in #7199.
It looks like the github runner is struggling with the load, currently it runs 7 nodes on a 4 CPU runner, which is definitely too much. We could consider moving this to run on our self hosted runner - but I think running 7 nodes is unnecessary and we can probably trim test this down.
Reduce number of basic sim test nodes from 7 (3 BN + 3 Proposer BN + 1 extra) to 4 (2 BN + 1 Proposer BN + 1 extra).
If we want to run more nodes, we'd have to consider running on self hosted runners.