Commit Graph

1603 Commits

Author SHA1 Message Date
Eitan Seri-Levi
bd8a2a8ffb Gossip recently computed light client data (#7023) 2025-07-08 07:07:10 +00:00
Michael Sproul
c7bb3b00e4 Fix lookups of the block at oldest_block_slot (#7693)
Closes:

- https://github.com/sigp/lighthouse/issues/7690


  Another checkpoint sync related fix! See issue for a description of the bug.

We fix it by just loading the block root of the `oldest_block_slot`, rather than trying to load the slot prior, which will always fail.
2025-07-02 23:40:04 +00:00
Michael Sproul
a459a9af98 Fix and test checkpoint sync from genesis (#7689)
Fix a bug involving checkpoint sync from genesis reported by Sunnyside labs.


  Ensure that the store's `anchor` is initialised prior to storing the genesis state. In the case of checkpoint sync from genesis, the genesis state will be in the _hot DB_, so we need the hot DB metadata to be initialised in order to store it.

I've extended the existing checkpoint sync tests to cover this case as well. There are some subtleties around what the `state_upper_limit` should be set to in this case. I've opted to just enable state reconstruction from the start in the test so it gets set to 0, which results in an end state more consistent with the other test cases (full state reconstruction). This is required because we can't meaningfully do any state reconstruction when the split slot is 0 (there is no range of frozen slots to reconstruct).
2025-07-02 04:50:33 +00:00
Jimmy Chen
fcc602a787 Update fulu network configs and add MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS (#7646)
- #6240
- Bring built-in network configs up to date with latest consensus-spec PeerDAS configs.
- Add `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS` and use it to determine data availability window after the Fulu fork.
2025-07-02 02:38:25 +00:00
Jimmy Chen
41742ce2bd Update SAMPLES_PER_SLOT to be number of custody groups instead of data columns (#7683)
Update `SAMPLES_PER_SLOT` to be number of custody groups instead of data columns. This should have no impact on the current implementation as config currently maintains a `group:subnet:column` ratio of `1:1:1`.  **In short, this PR doesn't change anything for Fusaka, but ensures compliance with the spec and potential future changes.**

I've added separate methods to compute sampling columns and custody groups for clarity: `spec.sampling_size_columns` and `spec.sampling_size_custod_groups`

See the clarifications in this PR for more details: https://github.com/ethereum/consensus-specs/pull/4251
2025-07-02 00:08:40 +00:00
Pawan Dhananjay
e305cb1b92 Custody persist fix (#7661)
N/A


  Persist the epoch -> cgc values. This is to ensure that `ValidatorRegistrations::latest_validator_custody_requirement` always returns a `Some` value post restart assuming the `epoch_validator_custody_requirements` map has been updated in the previous runs.
2025-07-01 06:06:37 +00:00
Michael Sproul
c1f94d9b7b Test database schema stability (#7669)
This PR implements some heuristics to check for breaking database changes. The goal is to prevent accidental changes to the database schema occurring without a version bump.
2025-06-30 10:55:47 +00:00
Nikola Ristić
2d759f78be Fix beacon_chain metrics descriptions (#6576)
n/a


  Just fixed a small mistake in the metrics description.
2025-06-30 00:11:14 +00:00
Odinson
6ea5f14b39 feat: better error message for light_client/bootstrap endpoint (#7597)
Fixes #7229
2025-06-28 03:55:39 +00:00
Jimmy Chen
83cad25d98 Fix Rust 1.88 clippy errors & execution engine tests (#7657)
Fix Rust 1.88 clippy errors.
2025-06-27 18:21:17 +00:00
Pawan Dhananjay
9b1f3ed9d1 Add gossip check (#7652)
N/A


  Add an additional gossip condition.
2025-06-27 00:26:38 +00:00
chonghe
8e3c5d1524 Rust 1.89 compiler lint fix (#7644)
Fix lints for Rust 1.89 beta compiler
2025-06-25 05:33:17 +00:00
Eitan Seri-Levi
56b2d4b525 Remove instrumenting log level (#7636)
https://github.com/sigp/lighthouse/issues/7155


  Theres some additional places we set instrumenting log levels that wasn't covered in #7620
2025-06-24 06:29:10 +00:00
Pawan Dhananjay
11bcccb353 Remove all prod eth1 related code (#7133)
N/A


  After the electra fork which includes EIP 6110, the beacon node no longer needs the eth1 bridging mechanism to include new deposits as they are provided by the EL as a `deposit_request`. So after electra + a transition period where the finalized bridge deposits pre-fork are included through the old mechanism, we no longer need the elaborate machinery we had to get deposit contract data from the execution layer.

Since holesky has already forked to electra and completed the transition period, this PR basically checks to see if removing all the eth1 related logic leads to any surprises.
2025-06-23 03:00:07 +00:00
Age Manning
d50924677a Remove instrumenting log level (#7620)
I think this should resolve #7155


This removes the level field from the instrumenting we were doing across a range of functions. The level will now default to the level of the log.
2025-06-20 07:44:59 +00:00
Lion - dapplion
dd98534158 Hierarchical state diffs in hot DB (#6750)
This PR implements https://github.com/sigp/lighthouse/pull/5978 (tree-states) but on the hot DB. It allows Lighthouse to massively reduce its disk footprint during non-finality and overall I/O in all cases.

Closes https://github.com/sigp/lighthouse/issues/6580

Conga into https://github.com/sigp/lighthouse/pull/6744

### TODOs

- [x] Fix OOM in CI https://github.com/sigp/lighthouse/pull/7176
- [x] optimise store_hot_state to avoid storing a duplicate state if the summary already exists (should be safe from races now that pruning is cleaner)
- [x] mispelled: get_ancenstor_state_root
- [x] get_ancestor_state_root should use state summaries
- [x] Prevent split from changing during ancestor calc
- [x] Use same hierarchy for hot and cold

### TODO Good optimization for future PRs

- [ ] On the migration, if the latest hot snapshot is aligned with the cold snapshot migrate the diffs instead of the full states.
```
align slot  time
10485760    Nov-26-2024
12582912    Sep-14-2025
14680064    Jul-02-2026
```

### TODO Maybe things good to have

- [ ] Rename anchor_slot https://github.com/sigp/lighthouse/compare/tree-states-hot-rebase-oom...dapplion:lighthouse:tree-states-hot-anchor-slot-rename?expand=1
- [ ] Make anchor fields not public such that they must be mutated through a method. To prevent un-wanted changes of the anchor_slot

### NOTTODO

- [ ] Use fork-choice and a new method [`descendants_of_checkpoint`](ca2388e196 (diff-046fbdb517ca16b80e4464c2c824cf001a74a0a94ac0065e635768ac391062a8)) to filter only the state summaries that descend of finalized checkpoint]
2025-06-19 02:43:25 +00:00
Eitan Seri-Levi
6786b9d12a Single attestation "Full" implementation (#7444)
#6970


  This allows for us to receive `SingleAttestation` over gossip and process it without converting. There is still a conversion to `Attestation` as a final step in the attestation verification process, but by then the `SingleAttestation` is fully verified.

I've also fully removed the `submitPoolAttestationsV1` endpoint as its been deprecated

I've also pre-emptively deprecated supporting `Attestation` in `submitPoolAttestationsV2` endpoint. See here for more info: https://github.com/ethereum/beacon-APIs/pull/531

I tried to the minimize the diff here by only making the "required" changes. There are some unnecessary complexities with the way we manage the different attestation verification wrapper types. We could probably consolidate this to one wrapper type and refactor this even further. We could leave that to a separate PR if we feel like cleaning things up in the future.

Note that I've also updated the test harness to always submit `SingleAttestation` regardless of fork variant. I don't see a problem in that approach and it allows us to delete more code :)
2025-06-17 09:01:26 +00:00
Jimmy Chen
a65f78222d Drop stale registrations without reducing CGC (#7594)
Currently the validator effective balance used for computing PeerDAS custody group count is only updated when the validator subscribes to the BN via  `validator/beacon_committee_subscriptions`.

If a validator stops registering with the node, the effective balance gets outdated and stays in the BN memory until the next restart. They are no longer required for CGC computation, as long as the CGC never reduces as per the spec, therefore they can be dropped.
2025-06-13 14:30:43 +00:00
Daniel Knopik
5472cb8500 Batch verify KZG proofs for getBlobsV2 (#7582) 2025-06-12 14:35:14 +00:00
Pawan Dhananjay
5f208bb858 Implement basic validator custody framework (no backfill) (#7578)
Resolves #6767


  This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info regarding number of validators attached to a node and  the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of validators attached, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Anytime there's a change in the `custody_group_count` due to addition/removal of validators, the custody context should send an event on a broadcast channnel. The only subscriber for the channel exists in the network service which simply subscribes to more subnets. There can be additional subscribers in sync that will start a backfill once the cgc changes.

TODO

- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS ` passes after updating the current cgc. This event should be picked up by a subscriber which updates the enr and metadata.
- [x] Add more tests
2025-06-11 18:10:06 +00:00
Pawan Dhananjay
076a1c3fae Data column sidecar event (#7587)
N/A


  Implement events for data column sidecar https://github.com/ethereum/beacon-APIs/pull/535
2025-06-11 16:39:22 +00:00
Jimmy Chen
8c6abc0b69 Optimise parallelism in compute cells operations by zipping first (#7574)
We're seeing slow KZG performance on `fusaka-devnet-0` and looking for optimisations to improve performance.

Zipping the list first then `into_par_iter` shows a 10% improvement in performance benchmark, i suspect this might be even more material when running on a beacon node.

Before:
```
blobs_to_data_column_sidecars_20
time:   [11.583 ms 12.041 ms 12.534 ms]
Found 5 outliers among 100 measurements (5.00%)
```

After:
```
blobs_to_data_column_sidecars_20
time:   [10.506 ms 10.724 ms 10.982 ms]
change: [-14.925% -10.941% -6.5452%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
```
2025-06-09 12:41:14 +00:00
ethDreamer
b08d49c4cb Changes for fusaka-devnet-1 (#7559)
Changes for [fusaka-devnet-1](https://notes.ethereum.org/@ethpandaops/fusaka-devnet-1)


  [Consensus Specs v1.6.0-alpha.1](https://github.com/ethereum/consensus-specs/pull/4346)
* [EIP-7917: Deterministic Proposer Lookahead](https://eips.ethereum.org/EIPS/eip-7917)
* [EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)
2025-06-09 09:10:08 +00:00
Jimmy Chen
357a8ccbb9 Checkpoint sync without the blobs from Fulu (#7549)
Lighthouse currently requires checkpoint sync to be performed against a supernode in a PeerDAS network, as only supernodes can serve blobs.

This PR lifts that requirement, enabling Lighthouse to checkpoint sync from either a fullnode or a supernode (See https://github.com/sigp/lighthouse/issues/6837#issuecomment-2933094923)

Missing data columns for the checkpoint block isn't a big issue, but we should be able to easily implement backfill once we have the logic to backfill data columns.
2025-06-04 00:31:27 +00:00
ethDreamer
ae30480926 Implement EIP-7892 BPO hardforks (#7521)
[EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)

#7467
2025-06-02 06:54:42 +00:00
Jimmy Chen
94a1446ac9 Fix unexpected blob error and duplicate import in fetch blobs (#7541)
Getting this error on a non-PeerDAS network:

```
May 29 13:30:13.484 ERROR Error fetching or processing blobs from EL    error: BlobProcessingError(AvailabilityCheck(Unexpected("empty blobs"))), block_root: 0x98aa3927056d453614fefbc79eb1f9865666d1f119d0e8aa9e6f4d02aa9395d9
```

It appears we're passing an empty `Vec` to DA checker, because all blobs were already seen on gossip and filtered out, this causes a `AvailabilityCheckError::Unexpected("empty blobs")`.

I've added equivalent unit tests for `getBlobsV1` to cover all the scenarios we test in `getBlobsV2`. This would have caught the bug if I had added it earlier. It also caught another bug which could trigger duplicate block import.

Thanks Santito for reporting this! 🙏
2025-06-02 01:51:09 +00:00
Jimmy Chen
4d21846aba Prevent AvailabilityCheckError when there's no new custody columns to import (#7533)
Addresses a regression recently introduced when we started gossip verifying data columns from EL blobs

```
failures:
network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 90 filtered out; finished in 16.60s

stderr ───

thread 'network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import' panicked at beacon_node/network/src/network_beacon_processor/tests.rs:829:10:
should put data columns into availability cache: Unexpected("empty columns")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

https://github.com/sigp/lighthouse/actions/runs/15309278812/job/43082341868?pr=7521

If an empty `Vec` is passed to the DA checker, it causes an unexpected error.

This PR addresses it by not passing an empty `Vec` for processing, and not spawning a task to publish.
2025-05-29 02:54:34 +00:00
Mac L
0ddf9a99d6 Remove support for database migrations prior to schema version v22 (#7332)
Remove deprecated database migrations prior to v22 along with v22 migration specific code.
2025-05-28 13:47:21 +00:00
Jimmy Chen
e6ef644db4 Verify getBlobsV2 response and avoid reprocessing imported data columns (#7493)
#7461 and partly #6439.

Desired behaviour after receiving `engine_getBlobs` response:

1. Gossip verify the blobs and proofs, but don't mark them as observed yet. This is because not all blobs are published immediately (due to staggered publishing). If we mark them as observed and not publish them, we could end up blocking the gossip propagation.
2. Blobs are marked as observed _either_ when:
* They are received from gossip and forwarded to the network .
* They are published by the node.

Current behaviour:
-  We only gossip verify `engine_getBlobsV1` responses, but not `engine_getBlobsV2` responses (PeerDAS).
-  After importing EL blobs AND before they're published, if the same blobs arrive via gossip, they will get re-processed, which may result in a re-import.


  1. Perform gossip verification on data columns computed from EL `getBlobsV2` response. We currently only do this for `getBlobsV1` to prevent importing blobs with invalid proofs into the `DataAvailabilityChecker`, this should be done on V2 responses too.
2. Add additional gossip verification to make sure we don't re-process a ~~blob~~ or data column that was imported via the EL `getBlobs` but not yet "seen" on the gossip network. If an "unobserved" gossip blob is found in the availability cache, then we know it has passed verification so we can immediately propagate the `ACCEPT` result and forward it to the network, but without re-processing it.

**UPDATE:** I've left blobs out for the second change mentioned above, as the likelihood and impact is very slow and we haven't seen it enough, but under PeerDAS this issue is a regular occurrence and we do see the same block getting imported many times.
2025-05-26 19:55:58 +00:00
Jimmy Chen
f01dc556d1 Update engine_getBlobsV2 response type and add getBlobsV2 tests (#7505)
Update `engine_getBlobsV2` response type to `Option<Vec<BlobsAndProofV2>>`. See recent spec change [here](https://github.com/ethereum/execution-apis/pull/630).

Added some tests to cover basic fetch blob scenarios.
2025-05-26 04:33:34 +00:00
Akihito Nakano
a2797d4bbd Fix formatting errors from cargo-sort (#7512)
[cargo-sort is currently failing on CI](https://github.com/sigp/lighthouse/actions/runs/15198128212/job/42746931918?pr=7025), likely due to new checks introduced in version [2.0.0](https://github.com/DevinR528/cargo-sort/releases/tag/v2.0.0).


  Fixed the errors by running cargo-sort with formatting enabled.
2025-05-23 05:25:56 +00:00
Akihito Nakano
537fc5bde8 Revive network-test logs files in CI (#7459)
https://github.com/sigp/lighthouse/issues/7187


  This PR adds a writer that implements `tracing_subscriber::fmt::MakeWriter`, which writes logs to separate files for each test.
2025-05-22 02:51:22 +00:00
Michael Sproul
2e96e9769b Use slice.is_sorted now that it's stable (#7507)
Use slice.is_sorted which was stabilised in Rust 1.82.0

I thought there would be more places we could use this, but it seems we often want to check strict monotonicity (i.e. sorted + no duplicates)
2025-05-22 02:14:46 +00:00
Michael Sproul
805c2dc831 Correct reward denominator in op pool (#5047)
Closes #5016


  The op pool was using the wrong denominator when calculating proposer block rewards! This was mostly inconsequential as our studies of Lighthouse's block profitability already showed that it is very close to optimal. The wrong denominator was leftover from phase0 code, and wasn't properly updated for Altair.
2025-05-20 01:06:40 +00:00
ethDreamer
7684d1f866 ContextDeserialize and Beacon API Improvements (#7372)
* #7286
* BeaconAPI is not returning a versioned response when it should for some V1 endpoints
* these [strange functions with vX in the name that still accept `endpoint_version` arguments](https://github.com/sigp/lighthouse/blob/stable/beacon_node/http_api/src/produce_block.rs#L192)

This refactor is a prerequisite to get the fulu EF tests running.
2025-05-19 05:05:16 +00:00
Eitan Seri-Levi
268809a530 Rust clippy 1.87 lint fixes (#7471)
Fix clippy lints for `rustc` 1.87


  clippy complains about `BeaconChainError` being too large. I went on a bit of a boxing spree because of this. We may instead want to `Box` some of the `BeaconChainError` variants?
2025-05-16 05:03:00 +00:00
Michael Sproul
c2c7fb87a8 Make DAG construction more permissive (#7460)
Workaround/fix for:

- https://github.com/sigp/lighthouse/issues/7323


  - Remove the `StateSummariesNotContiguousError`. This allows us to continue with DAG construction and pruning, even in the case where the DAG is disjointed. We will treat any disjoint summaries as roots of their own tree, and prune them (as they are not descended from finalized). This should be safe, as canonical summaries should not be disjoint (if they are, then the DB is already corrupt).
2025-05-15 02:15:35 +00:00
Eitan Seri-Levi
807848bc7a Next sync committee branch bug (#7443)
#7441


  Make sure we're correctly caching light client data
2025-05-13 01:13:15 +00:00
SunnysidedJ
593390162f peerdas-devnet-7: update DataColumnSidecarsByRoot request to use DataColumnsByRootIdentifier (#7399)
Update DataColumnSidecarsByRoot request to use DataColumnsByRootIdentifier #7377


  As described in https://github.com/ethereum/consensus-specs/pull/4284
2025-05-12 00:20:55 +00:00
Jimmy Chen
4b9c16fc71 Add Electra forks to basic sim tests (#7199)
This PR adds transitions to Electra ~~and Fulu~~ fork epochs in the simulator tests.

~~It also covers blob inclusion verification and data column syncing on a full node in Fulu.~~

UPDATE: Remove fulu fork from sim tests due to https://github.com/sigp/lighthouse/pull/7199#issuecomment-2852281176
2025-05-08 08:43:44 +00:00
Jimmy Chen
93ec9df137 Compute proposer shuffling only once in gossip verification (#7304)
When we perform data column gossip verification, we sometimes see multiple proposer shuffling cache miss simultaneously and this results in multiple threads computing the shuffling cache and potentially slows down the gossip verification.

Proposal here is to use a `OnceCell` for each shuffling key to make sure it's only computed once. I have only implemented this in data column verification as a PoC, but this can also be applied to blob and block verification

Related issues:
- https://github.com/sigp/lighthouse/issues/4447
- https://github.com/sigp/lighthouse/issues/7203
2025-05-01 01:30:42 +00:00
Eitan Seri-Levi
9779b4ba2c Optimize validate_data_columns (#7326) 2025-04-30 04:36:50 +00:00
Michael Sproul
e61e92b926 Merge remote-tracking branch 'origin/stable' into unstable 2025-04-22 18:55:06 +10:00
Jean-Baptiste Pinalie
5352d5f78a Update proposer_slashings and attester_slashings amounts for electra. (#7316)
Did not find a specific issue beside https://github.com/sigp/lighthouse/issues/6821


  Leverage `whistleblower_reward_quotient_for_state` to have accurate post-electra `proposer_slashings` and `attester_slashings` fields returned by `/eth/v1/beacon/rewards/blocks/<id>`.
2025-04-17 00:58:36 +00:00
Lion - dapplion
be68dd24d0 Fix wrong custody column count for lookup blocks (#7281)
Fixes
- https://github.com/sigp/lighthouse/issues/7278


  Don't assume 0 columns for `RpcBlockInner::Block`
2025-04-11 22:00:57 +00:00
Mac L
39eb8145f8 Merge branch 'release-v7.0.0' into unstable 2025-04-11 21:32:24 +10:00
Eitan Seri-Levi
aed562abef Downgrade light client errors (#7300)
Downgrade light client errors to debug

Error messages are alarming and usually indicate somethings wrong with the beacon node. The Light Client service is supposed to minimally impact users, and most will not care if the light client server is erroring. Furthermore, the only errors we've seen in the wild are during hard forks, for the first few epochs before the fork finalizes.
2025-04-10 02:17:07 +00:00
SunnysidedJ
d96b73152e Fix for #6296: Deterministic RNG in peer DAS publish block tests (#7192)
#6296: Deterministic RNG in peer DAS publish block tests


  Made test functions to call publish-block APIs with true for the deterministic RNG boolean parameter while production code with false. This will deterministically shuffle columns for unit tests under broadcast_validation_tests.rs.
2025-04-09 15:35:15 +00:00
Jimmy Chen
759b0612b3 Offloading KZG Proof Computation from the beacon node (#7117)
Addresses #7108

- Add EL integration for `getPayloadV5` and `getBlobsV2`
- Offload proof computation and use proofs from EL RPC APIs
2025-04-08 07:37:16 +00:00
Lion - dapplion
70850fe58d Drop head tracker for summaries DAG (#6744)
The head tracker is a persisted piece of state that must be kept in sync with the fork-choice. It has been a source of pruning issues in the past, so we want to remove it
- see https://github.com/sigp/lighthouse/issues/1785

When implementing tree-states in the hot DB we have to change the pruning routine (more details below) so we want to do those changes first in isolation.
- see https://github.com/sigp/lighthouse/issues/6580
- If you want to see the full feature of tree-states hot https://github.com/dapplion/lighthouse/pull/39

Closes https://github.com/sigp/lighthouse/issues/1785


  **Current DB migration routine**

- Locate abandoned heads with head tracker
- Use a roots iterator to collect the ancestors of those heads can be pruned
- Delete those abandoned blocks / states
- Migrate the newly finalized chain to the freezer

In summary, it computes what it has to delete and keeps the rest. Then it migrates data to the freezer. If the abandoned forks routine has a bug it can break the freezer migration.

**Proposed migration routine (this PR)**

- Migrate the newly finalized chain to the freezer
- Load all state summaries from disk
- From those, just knowing the head and finalized block compute two sets: (1) descendants of finalized (2) newly finalized chain
- Iterate all summaries, if a summary does not belong to set (1) or (2), delete

This strategy is more sound as it just checks what's there in the hot DB, computes what it has to keep and deletes the rest. Because it does not rely and 3rd pieces of data we can drop the head tracker and pruning checkpoint. Since the DB migration happens **first** now, as long as the computation of the sets to keep is correct we won't have pruning issues.
2025-04-07 04:23:52 +00:00