Commit Graph

3730 Commits

Author SHA1 Message Date
Michael Sproul
6194dddc5b Persist custody context more readily (#8921)
We received a bug report of a node restarting custody backfill unnecessarily after upgrading to Lighthouse v8.1.1. What happened is:

- User started LH v8.0.1 many months ago, CGC updated 0 -> N but the CGC was not eagerly persisted.
- LH experienced an unclean shutdown (not sure of what type).
- Upon restarting (still running v8.0.1), the custody context read from disk contains CGC=0: `DEBUG Loaded persisted custody context   custody_context: CustodyContext { validator_custody_count: 0, ...`).
- CGC updates again to N, retriggering custody backfill: `DEBUG Validator count at head updated   old_count: 0, new_count: N`.
- Custody backfill does a bunch of downloading for no gain: `DEBUG Imported historical data columns   epoch: Epoch(428433), total_imported: 0`
- While custody backfill is running user updated to v8.1.1, and we see logs for the CGC=N being peristed upon clean shutdown, and then correctly read on startup with v8.1.1.
- Custody backfill keeps running and downloading due to the CGC change still being considered in progress.


  - Call `persist_custody_context` inside the `register_validators` handler so that it is written to disk eagerly whenever it changes. The performance impact of this should be minimal as the amount of data is very small and this call can only happen at most ~128 times (once for each change) in the entire life of a beacon node.
- Call `persist_custody_context` inside `BeaconChainBuilder::build` so that changes caused by CLI flags are persisted (otherwise starting a node with `--semi-supernode` and no validators, then shutting it down uncleanly would cause use to forget the CGC).

These changes greatly reduce the timespan during which an unclean shutdown can create inconsistency. In the worst case, we only lose backfill progress that runs concurrently with the `register_validators` handler (should be extremely minimal, nigh impossible).


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-03-02 06:43:51 +00:00
Michael Sproul
7e69f6ca1c Merge remote-tracking branch 'origin/release-v8.1' into back-merge-v8.1.1 2026-03-02 09:20:25 +11:00
Jimmy Chen
a1ef265c9e Add getBlobsV1 and getBlobsV2 support to mock EL server (#8870)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-25 12:17:49 +00:00
Lion - dapplion
d6bf53834f Remove merge transition code (#8761)
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-02-25 03:20:28 +00:00
Jimmy Chen
e59f1f03ef Add debug spans to DB write paths (#8895)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-24 20:53:33 +00:00
Michael Sproul
886d31fe7e Delete dysfunctional fork_revert feature (#8891)
I found myself having to update this code for Gloas, and figured we may as well delete it seeing as it doesn't work.

See:

- https://github.com/sigp/lighthouse/issues/4198


  Delete all `fork_revert` logic and the accompanying test.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-02-24 06:27:16 +00:00
Jimmy Chen
341682e719 Add unit tests for BatchInfo and fix doc comments (#8873)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-24 00:15:39 +00:00
Eitan Seri-Levi
dcc43e3d20 Implement gloas block gossip verification changes (#8878)
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
2026-02-23 06:17:24 +00:00
0xMushow
2b214175d5 Enforce stricter checks on certain constants (#8500)
Which issue # does this PR address?
None


  All of these are performing a check, and adding a batch, or creating a new lookup, or a new query, etc..
Hence all of these limits would be off by one.

Example:

```rust
// BACKFILL_BATCH_BUFFER_SIZE = 5
if self.batches.iter().filter(...).count() >= BACKFILL_BATCH_BUFFER_SIZE {
return None;  // ← REJECT
}
// ... later adds batch via Entry::Vacant(entry).insert(...)
```

Without the `>` being changed to a `>=` , we would allow 6. The same idea applies to all changes proposed.


Co-Authored-By: Antoine James <antoine@ethereum.org>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-23 02:02:56 +00:00
Jimmy Chen
8d4af658bd Remove unreachable void pattern for ConnectionLimits (#8871)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-20 04:27:33 +00:00
Jimmy Chen
4588971085 Add sync batch state metrics (#8847)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-19 01:57:53 +00:00
Pawan Dhananjay
561898fc1c Process head_chains in descending order of number of peers (#8859)
N/A


  Another find by @gitToki. Sort the preferred_ids in descending order as originally intended from the comment in the function.


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2026-02-19 00:38:56 +00:00
Michael Sproul
fab77f4fc9 Skip payload_invalidation tests prior to Bellatrix (#8856)
Fix the failure of the beacon-chain tests for phase0/altair, which now only runs nightly.


  Just skip the payload invalidation tests, they don't make any sense prior to Bellatrix anyway.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-02-18 21:55:13 +00:00
Jimmy Chen
da141a8c49 Merge remote-tracking branch 'origin/release-v8.1' into unstable 2026-02-18 17:54:48 +11:00
Pawan Dhananjay
c5b4580e37 Return correct variant for snappy errors (#8841)
N/A


  Handle snappy crate errors as InvalidData instead of IoError.


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2026-02-18 04:17:07 +00:00
Jimmy Chen
691c8cf8e6 Fix duplicate data columns in DataColumnsByRange responses (#8843)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-18 04:16:57 +00:00
Akihito Nakano
c61665b3a1 Penalize peers that send an invalid rpc request (#6986)
Since https://github.com/sigp/lighthouse/pull/6847, invalid `BlocksByRange`/`BlobsByRange` requests, which do not comply with the spec, are [handled in the Handler](3d16d1080f/beacon_node/lighthouse_network/src/rpc/handler.rs (L880-L911)). Any peer that sends an invalid request is penalized and disconnected.

However, other kinds of invalid rpc request, which result in decoding errors, are just dropped. No penalty is applied and the connection with the peer remains.


  I have added handling for the `ListenUpgradeError` event to notify the application of an `RPCError:InvalidData` error and disconnect to the peer that sent the invalid rpc request.

I also added tests for handling invalid rpc requests.


Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>
2026-02-18 14:43:59 +11:00
0xMushow
9065e4a56e fix(beacon_node): add pruning of observed_column_sidecars (#8531)
None


  I noticed that `observed_column_sidecars` is missing its prune call in the finalization handler, which results in a memory leak on long-running nodes (very slow (**7MB/day**)) :

13dfa9200f/beacon_node/beacon_chain/src/canonical_head.rs (L940-L959)

Both caches use the same generic type `ObservedDataSidecars<T>:`
22ec4b3271/beacon_node/beacon_chain/src/beacon_chain.rs (L413-L416)

The type's documentation explicitly requires manual pruning:

>  "*The cache supports pruning based upon the finalized epoch. It does not automatically prune, you must call Self::prune manually.*"


b4704eab4a/beacon_node/beacon_chain/src/observed_data_sidecars.rs (L66-L74)

Currently:
- `observed_blob_sidecars` => pruned
- `observed_column_sidecars` => **NOT** pruned

Without pruning, the underlying HashMap accumulates entries indefinitely, causing continuous memory growth until the node restarts.


Co-Authored-By: Antoine James <antoine@ethereum.org>
2026-02-18 12:38:05 +11:00
Jimmy Chen
4625cb6ab6 Gloas local block building cleanup (#8834)
Continuation of #8754, some small cleanups and address TODOs


  


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-02-17 09:22:16 +00:00
Eitan Seri-Levi
eec0700f94 Gloas local block building MVP (#8754)
The flow for local block building is
1. Create execution payload and bid
2. Construct beacon block
3. Sign beacon block and publish
4. Sign execution payload and publish

This PR adds the beacon block v4 flow , GET payload envelope and POST payload envelope (local block building only). The spec for these endpoints can be found here:  https://github.com/ethereum/beacon-APIs/pull/552  and is subject to change.

We needed a way to store the unsigned execution payload envelope associated to the execution payload bid that was included in the block. I introduced a new cache that stores these unsigned execution payload envelopes. the GET payload envelope queries this cache directly so that a proposer, after publishing a block, can fetch the payload envelope + sign and publish it.

I kept payload signing and publishing within the validators block service to keep things simple for now. The idea was to build out a block production MVP for devnet 0, try not to affect any non gloas code paths and build things out in such a way that it will be easy to deprecate pre-gloas code paths later on (for example block production v2 and v3).

We will eventually need to track which beacon node was queried for the block so that we can later query it for the payload. But thats not needed for the devnet.


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2026-02-17 02:09:35 +00:00
Mac L
945f6637c5 Remove reqwest re-exports from eth2 (#8829)
Remove `reqwest` from being re-exported within `eth2`


Co-Authored-By: Mac L <mjladson@pm.me>
2026-02-16 16:05:54 +00:00
Michael Sproul
48a2b2802d Delete OnDiskConsensusContext (#8824)
Remove the `OnDiskConsensusContext` type which is no longer used at all after the merge of:

- https://github.com/sigp/lighthouse/pull/8724

This type was not necessary since the merge of Lion's change which removed the on-disk storage:

- https://github.com/sigp/lighthouse/pull/5891


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-02-16 02:49:46 +00:00
Eitan Seri-Levi
68ad9758a3 Gloas attestation verification (#8705)
https://github.com/ethereum/consensus-specs/blob/master/specs/gloas/p2p-interface.md#attestation-subnets

Implements attestation verification logic for Gloas and adds a few gloas related tests. Note that a few of these tests rely on gloas test harness block production which hasn't been built out yet. So for now those tests are ignored.


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2026-02-13 21:39:56 +00:00
Lion - dapplion
f4a6b8d9b9 Tree-sync friendly lookup sync tests (#8592)
- Step 0 of the tree-sync roadmap https://github.com/sigp/lighthouse/issues/7678

Current lookup sync tests are written in an explicit way that assume how the internals of lookup sync work. For example the test would do:

- Emit unknown block parent message
- Expect block request for X
- Respond with successful block request
- Expect block processing request for X
- Response with successful processing request
- etc..

This is unnecessarily verbose. And it will requires a complete re-write when something changes in the internals of lookup sync (has happened a few times, mostly for deneb and fulu).

What we really want to assert is:

- WHEN: we receive an unknown block parent message
- THEN: Lookup sync can sync that block
- ASSERT: Without penalizing peers, without unnecessary retries


  Keep all existing tests and add new cases but written in the new style described above. The logic to serve and respond to request is in this function `fn simulate` 2288a3aeb1/beacon_node/network/src/sync/tests/lookups.rs (L301)
- It controls peer behavior based on a `CompleteStrategy` where you can set for example "respond to BlocksByRoot requests with empty"
- It actually runs beacon processor messages running their clousures. Now sync tests actually import blocks, increasing the test coverage to the interaction of sync and the da_checker.
- To achieve the above the tests create real blocks with the test harness. To make the tests as fast as before, I disabled crypto with `TestConfig`

Along the way I found a couple bugs, which I documented on the diff.


Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-02-13 04:24:51 +00:00
Mac L
c59e4a0cee Disable legacy-arith by default in consensus/types (#8695)
Currently, `consensus/types` cannot build with `no-default-features` since we use "legacy" standard arithmetic operations.


  - Remove the offending arithmetic to fix compilation.
- Rename `legacy-arith` to `saturating-arith` and disable it by default.


Co-Authored-By: Mac L <mjladson@pm.me>
2026-02-12 20:51:39 +00:00
Mac L
036ba1f221 Add network feature to eth2 (#8558)
This reverts some of the changes from #8524 by adding back the typed network endpoints with an optional `network` feature. Without the `network` feature, these endpoints (and associated dependencies) will not be built.

This means the `enr`, `multiaddr` and `libp2p-identity` dependencies have returned but are now optional


Co-Authored-By: Mac L <mjladson@pm.me>
2026-02-12 20:51:26 +00:00
Sergey Yakovlev
96bc5617d0 fix: auto-populate ENR UDP port from discovery listen port (#8804)
Co-Authored-By: Sergey Yakovlev <selfuryon@pm.me>
2026-02-12 19:33:00 +00:00
radik878
711971f269 fix: cache slot in check_block_relevancy to prevent TOCTOU (#8776)
Co-Authored-By: radik878 <radikpadik76@gmail.com>
2026-02-11 23:45:50 +00:00
Lion - dapplion
d7c78a7f89 rename --reconstruct-historic-states to --archive (#8795)
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-02-11 23:45:48 +00:00
Lion - dapplion
8d72cc34eb Add sync request metrics (#7790)
Add error rates metrics on unstable to benchmark against tree-sync. In my branch there are frequent errors but mostly connections errors as the node is still finding it set of stable peers.

These metrics are very useful and unstable can benefit from them ahead of tree-sync


  Add three new metrics:
- sync_rpc_requests_success_total: Total count of sync RPC requests successes
- sync_rpc_requests_error_total: Total count of sync RPC requests errors
- sync_rpc_request_duration_sec: Time to complete a successful sync RPC requesst


Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-02-10 23:40:01 +00:00
Akihito Nakano
889946c04b Remove pending requests from ready_requests (#6625)
Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>

Co-Authored-By: João Oliveira <hello@jxs.pt>
2026-02-10 23:14:28 +00:00
Akihito Nakano
e1d3dcc8dc Penalize peers that send an invalid rpc request (#6986)
Since https://github.com/sigp/lighthouse/pull/6847, invalid `BlocksByRange`/`BlobsByRange` requests, which do not comply with the spec, are [handled in the Handler](3d16d1080f/beacon_node/lighthouse_network/src/rpc/handler.rs (L880-L911)). Any peer that sends an invalid request is penalized and disconnected.

However, other kinds of invalid rpc request, which result in decoding errors, are just dropped. No penalty is applied and the connection with the peer remains.


  I have added handling for the `ListenUpgradeError` event to notify the application of an `RPCError:InvalidData` error and disconnect to the peer that sent the invalid rpc request.

I also added tests for handling invalid rpc requests.


Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>
2026-02-10 22:49:20 +00:00
Eitan Seri-Levi
56eb81a5e0 Implement weak subjectivity safety checks (#7347)
Closes #7273


  https://github.com/ethereum/consensus-specs/pull/4179


Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>

Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-02-10 21:52:52 +00:00
João Oliveira
0c9f97f015 Remove libp2p multiaddress (#8683)
Co-Authored-By: João Oliveira <hello@jxs.pt>

Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>
2026-02-09 23:31:49 +00:00
Michael Sproul
a3f04bd2d2 Merge v8.1.0 (stable) into unstable 2026-02-10 08:27:43 +11:00
Pawan Dhananjay
b8d098685f Record metrics for only valid gossip blocks (#8723)
N/A


  Fixes the issue where we were setting block observed timings for blocks that were potentially gossip invalid.
Thanks @gitToki for the find


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
2026-02-09 05:23:44 +00:00
Michael Sproul
c4437d2927 Fix regression test for #8528 (#8771)
I accidentally broke `unstable` while merging some missed commits from `release-v8.0`. The merge was clean but semantically broken, and I didn't notice because I pushed without running CI 😬


  - Fix the regression test added for #8528, for compatibility with the recent `RpcBlock` changes. I'm passing `is_available = false` which seems correct for this test.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-02-09 01:31:50 +00:00
Michael Sproul
532d8918c8 Merge remote-tracking branch 'origin/release-v8.0' into unstable 2026-02-09 09:18:10 +11:00
Mac L
2ad02510cd Remove syn version 1 (#8678)
#8547


  This updates a few of our crates to remove the older `syn 1` crate.

This updates:
- `criterion` -> `0.8`
- `itertools` -> `0.14`

And also certain `sigp` crates:
- `xdelta3` -> [`fe39066`](fe3906605c)
- `superstruct` -> `0.10.1`
- `ethereum_ssz` -> `0.10.1`
- `tree_hash` -> `0.12.1`
- `metastruct` -> `0.1.4`
- `context_deserialize` -> `0.2.1`
- `compare_fields` -> `0.1.1`


Co-Authored-By: Mac L <mjladson@pm.me>
2026-02-07 01:18:04 +00:00
Mac L
e4bc650097 Use if-addrs instead of local_ip_address (#8659)
Swaps out the `local_ip_address` dependency for `if-addrs`. The reason for this is that is that `local_ip_address` is a relatively heavy dependency (depends on `neli`) compared to `if-addrs` and we only use it to check the presence of an IPv6 interface. This is an experiment to see if we can use the more lightweight `if-addrs` instead.


Co-Authored-By: Mac L <mjladson@pm.me>
2026-02-06 15:40:09 +00:00
Pawan Dhananjay
e50bab098e Remove state lru cache (#8724)
N/A


  In https://github.com/sigp/lighthouse/pull/4801 , we added a state lru cache to avoid having too many states in memory which was a concern with 200mb+ states pre tree-states.
With https://github.com/sigp/lighthouse/pull/5891 , we made the overflow cache a simpler in memory lru cache that can only hold 32 pending states at the most and doesn't flush anything to disk. As noted in #5891, we can always fetch older blocks which never became available over rpc if they become available later.

Since we merged tree states, I don't think the state lru cache is relevant anymore. Instead of having the `DietAvailabilityPendingExecutedBlock` that stores only the state root, we can just store the full state in the `AvailabilityPendingExecutedBlock`.
Given entries in the cache can span max 1 epoch (cache size is 32), the underlying `BeaconState` objects in the cache share most of their memory. The state_lru_cache is one level of indirection that doesn't give us any benefit.
Please check me on this cc @dapplion


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2026-02-04 04:55:53 +00:00
Jimmy Chen
1dd0f7bcbb Remove kzg_commitments from DataColumnSidecarGloas (#8739)
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
2026-02-04 03:37:05 +00:00
Eitan Seri-Levi
39727aa406 Move KZG commitments from payload envelope to payload bid and spec alpha.2 (#8725)
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
2026-02-04 02:52:40 +00:00
hopinheimer
bd1966353a Use events API to eager send attestations (#7892)
Co-Authored-By: hopinheimer <knmanas6@gmail.com>

Co-Authored-By: hopinheimer <48147533+hopinheimer@users.noreply.github.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
2026-02-04 01:40:16 +00:00
Michael Sproul
d42327bb86 Implement Gloas withdrawals and refactor (#8692)
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
2026-02-03 07:36:20 +00:00
Eitan Seri-Levi
ed7354d460 Payload envelope db operations (#8717)
Adds support for payload envelopes in the db. This is the minimum we'll need to store and fetch payloads.


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2026-02-03 05:46:10 +00:00
Michael Sproul
940fa81a5b Fast path for /eth/v1/beacon/blocks/head/root (#8729)
Closes:

- https://github.com/sigp/lighthouse/issues/8667


  Use the `early_attester_cache` to serve the head block root (if present). This should be faster than waiting for the head to finish importing.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-02-02 06:41:43 +00:00
Eitan Seri-Levi
3ecf964385 Replace INTERVALS_PER_SLOT with explicit slot component times (#7944)
https://github.com/ethereum/consensus-specs/pull/4476


  


Co-Authored-By: Barnabas Busa <barnabas.busa@ethereum.org>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2026-02-02 05:58:42 +00:00
Jimmy Chen
cd8049a696 Emit NewHead SSE event earlier in block import (#8718)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-01-29 07:39:05 +00:00
Eitan Seri-Levi
b202e98dd9 Gloas gossip boilerplate (#8700)
All the required boilerplate for gloas gossip. We'll include the gossip message processing logic in a separate PR


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-01-28 07:12:48 +00:00