Commit Graph

7108 Commits

Author SHA1 Message Date
Eitan Seri-Levi
33e21634cb Custody backfill sync (#7907)
#7603


#### Custody backfill sync service
Similar in many ways to the current backfill service. There may be ways to unify the two services. The difficulty there is that the current backfill service tightly couples blocks and their associated blobs/data columns. Any attempts to unify the two services should be left to a separate PR in my opinion.

#### `SyncNetworkContext`
`SyncNetworkContext` manages custody-sync data-columns-by-range requests separately from other sync RPC requests. I think this is a nice separation considering that custody backfill is its own service.

#### Data column import logic
The import logic verifies KZG commitments and checks that each data column's block root matches the block root in the node's store before importing the columns.
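As a rough illustration of those two checks, here is a minimal Rust sketch; the types, store shape, and `verify_kzg` are illustrative stubs, not Lighthouse's actual API:

```rust
use std::collections::HashMap;

type Hash256 = [u8; 32];

struct DataColumn {
    block_root: Hash256,
    // cells, kzg commitments and proofs elided
}

#[derive(Debug)]
enum ImportError {
    UnknownBlock,
    BlockRootMismatch,
}

fn verify_kzg(_columns: &[DataColumn]) -> Result<(), ImportError> {
    Ok(()) // placeholder for batched KZG proof verification
}

fn import_custody_columns(
    store: &mut HashMap<Hash256, Vec<DataColumn>>,
    known_block_roots: &[Hash256],
    block_root: Hash256,
    columns: Vec<DataColumn>,
) -> Result<(), ImportError> {
    // Backfill imports columns only for blocks already in the store.
    if !known_block_roots.contains(&block_root) {
        return Err(ImportError::UnknownBlock);
    }
    // Every column must reference the block root we expect.
    if columns.iter().any(|c| c.block_root != block_root) {
        return Err(ImportError::BlockRootMismatch);
    }
    // Verify KZG proofs before persisting anything.
    verify_kzg(&columns)?;
    store.insert(block_root, columns);
    Ok(())
}

fn main() {
    let mut store = HashMap::new();
    let root = [0u8; 32];
    assert!(import_custody_columns(&mut store, &[root], root, vec![]).is_ok());
}
```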

#### New channel to send messages to `SyncManager`
Now external services can communicate with the `SyncManager`. In this PR this channel is used to trigger a custody sync. Alternatively we may be able to use the existing `mpsc` channel that the `SyncNetworkContext` uses to communicate with the `SyncManager`. I will spend some time reviewing this.
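For illustration, a minimal sketch of this message path using a std channel (Lighthouse's actual channel type will differ, and the `SyncMessage` enum and its variant here are hypothetical):

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
enum SyncMessage {
    StartCustodyBackfill,
}

fn main() {
    let (tx, rx) = mpsc::channel::<SyncMessage>();

    // An external service holds a sender and can now trigger sync work.
    tx.send(SyncMessage::StartCustodyBackfill).unwrap();

    // The SyncManager owns the receiver and drains messages in its loop.
    let manager = thread::spawn(move || {
        while let Ok(msg) = rx.recv() {
            println!("sync manager received: {msg:?}");
            break;
        }
    });
    drop(tx); // close the channel so the receiver doesn't block forever
    manager.join().unwrap();
}
```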


Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2025-10-22 03:51:34 +00:00
Eitan Seri-Levi
46dde9afee Fix data column rpc request (#8247)
Fixes an issue mentioned in this comment regarding data column rpc requests:
https://github.com/sigp/lighthouse/issues/6572#issuecomment-3400076236


Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Michael Sproul <micsproul@gmail.com>
2025-10-21 23:54:35 +00:00
Michael Sproul
21bab0899a Improve block header signature handling (#8253)
Closes:

- https://github.com/sigp/lighthouse/issues/7650


Reject blob and data column sidecars from RPC with invalid signatures.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-21 13:58:12 +00:00
chonghe
040d992132 Add version to the response of beacon API getPendingConsolidations (#8251)
* #7440


Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>
2025-10-21 13:58:10 +00:00
Jimmy Chen
66f88f6bb4 Use millis_from_slot_start when comparing against reconstruction deadline (#8246)
The recent PR below changed the max reconstruction delay to be a function of slot time. However, it uses `seconds_from_slot_start` for the comparison (dropping the sub-second part), so it can delay reconstruction on networks where the slot time isn't a multiple of 4 seconds; e.g. on gnosis reconstruction only triggers at 2s instead of 1.25s:
- https://github.com/sigp/lighthouse/pull/8067#discussion_r2443875068


Use `millis_from_slot_start` when comparing against the reconstruction deadline.

Also added some tests for reconstruction delay.
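A minimal self-contained illustration of the truncation issue, assuming gnosis's 5s slots and a deadline of a quarter slot (1.25s):

```rust
use std::time::Duration;

fn main() {
    // Deadline of slot_duration / 4 on a 5s-slot network = 1.25s.
    let deadline = Duration::from_millis(1250);
    let elapsed = Duration::from_millis(1900); // 1.9s into the slot

    // Second-granularity comparison (old behaviour): 1.9s truncates to 1s,
    // still below 1.25s, so reconstruction waits until the 2s mark.
    let secs_reached = Duration::from_secs(elapsed.as_secs()) >= deadline;

    // Millisecond-granularity comparison (the fix): 1900ms >= 1250ms.
    let millis_reached = elapsed >= deadline;

    assert!(!secs_reached && millis_reached);
}
```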


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-21 02:24:43 +00:00
Pawan Dhananjay
092aaae961 Sync cleanups (#8230)
N/A


1. In the batch retry logic, we were failing to set the batch state to `AwaitingDownload` before attempting a retry. This PR sets it to `AwaitingDownload` before the retry and sets it back to `Downloading` if the retry succeeded in sending out a request (see the sketch below).
2. Remove all peer scoring logic from retrying and rely on just de-prioritizing the failed peer. I finally concede the point to @dapplion 😄
3. Changes `block_components_by_range_request` to accept `block_peers` and `column_peers`. This ensures we use the full synced peer set when requesting columns, to avoid splitting the column peers among multiple head chains. During forward sync, we want the block peers to be the peers from the syncing chain and the column peers to be all synced peers from the peer DB.
Also fixes a typo and calls `attempt_send_awaiting_download_batches` from more places.
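A minimal sketch of the retry transition in (1); `BatchState` and the method shape here are illustrative, not the actual sync types:

```rust
#[derive(Debug, PartialEq)]
enum BatchState {
    AwaitingDownload,
    Downloading,
}

struct Batch {
    state: BatchState,
}

impl Batch {
    /// Retry a failed batch download.
    fn retry(&mut self, send_request: impl Fn() -> bool) {
        // Reset to `AwaitingDownload` *before* attempting the retry...
        self.state = BatchState::AwaitingDownload;
        // ...and only move back to `Downloading` if a request went out.
        if send_request() {
            self.state = BatchState::Downloading;
        }
    }
}

fn main() {
    let mut batch = Batch { state: BatchState::Downloading };
    batch.retry(|| false); // request failed to send
    assert_eq!(batch.state, BatchState::AwaitingDownload);
    batch.retry(|| true); // request sent
    assert_eq!(batch.state, BatchState::Downloading);
}
```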


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2025-10-20 11:50:00 +00:00
Jimmy Chen
c012f46cb9 Fix get_header JSON deserialization. (#8228)
#8224


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-20 07:10:40 +00:00
chonghe
2b30c96f16 Avoid attempting to serve blobs after Fulu fork (#7756)
* #7122


Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>

Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>
2025-10-20 06:29:21 +00:00
Jimmy Chen
da93b89e90 Feature gate test CLI flags (#8231)
Closes #6980


I think these flags may be useful in future peerdas / das testing, so they're worth keeping. Hence I've gated them behind a `testing` feature flag.
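A minimal sketch of the gating pattern, assuming a `testing` cargo feature; the flag names below are illustrative, not the actual gated set:

```rust
#[cfg(feature = "testing")]
fn testing_flags() -> Vec<&'static str> {
    // Compiled only when built with `--features testing`.
    vec!["--delay-data-column-publishing"]
}

#[cfg(not(feature = "testing"))]
fn testing_flags() -> Vec<&'static str> {
    Vec::new() // release builds expose no test flags
}

fn main() {
    let mut flags = vec!["--supernode"];
    flags.extend(testing_flags());
    println!("available flags: {flags:?}");
}
```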


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-20 03:14:16 +00:00
Michael Sproul
2f8587301d More proposer shuffling cleanup (#8130)
Addressing more review comments from:

- https://github.com/sigp/lighthouse/pull/8101

I've also tweaked a few more things that I think are minor bugs.


- Instrument `ensure_state_can_determine_proposers_for_epoch`
- Fix `block_root` usage in `compute_proposer_duties_from_head`. This was a regression introduced in #8101 😬.
- Update the `state_advance_timer` to prime the next-epoch proposer cache post-Fulu.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-20 03:14:14 +00:00
Odinson
79716f6ec1 Max reconstruction delay as a function of slot time (#8067)
Fixes #8054


Co-Authored-By: PoulavBhowmick03 <bpoulav@gmail.com>
2025-10-17 08:49:13 +00:00
Jimmy Chen
76a37a0aef Revert incorrect fix made in #8179 (#8215)
This PR reverts #8179.

It turns out that the fix was invalid, because an unknown root is never considered a finalized descendant:

522bd9e9c6/consensus/proto_array/src/proto_array.rs (L976-L979)

so for any data column with an unknown parent, we always penalise the gossip peer and disconnect it pretty quickly. On a small network, the node may lose all of its peers.

The impact is pretty obvious when the peer count is small and sync speed is slow, and is therefore easily reproducible by running a fresh supernode on devnet-3.

This isn't as obvious on live testnets like holesky / sepolia, where we haven't noticed it, probably due to their high peer count and sync speed: the nodes may be able to reach head quickly before losing too many peers.


The previous behaviour isn't ideal, but it is safe: trigger an unknown-parent lookup and penalise the peer if it turns out to be malicious or faulty. So for now it's safer to revert the change and plan a proper fix after the v8 release.
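For intuition, a minimal sketch of why the reverted check misfires: a root fork choice doesn't know about can never be shown to descend from the finalized root, so the check reports `false` (names illustrative):

```rust
use std::collections::HashSet;

/// Stand-in for proto_array's check: an unknown root is always `false`,
/// because it cannot be proven to descend from the finalized root.
fn is_finalized_descendant(known_descendants: &HashSet<u64>, root: u64) -> bool {
    known_descendants.contains(&root)
}

fn main() {
    let known: HashSet<u64> = [1u64, 2, 3].into_iter().collect();
    let unknown_parent_root = 42;
    // Under the #8179 ordering, this `false` triggers an immediate REJECT
    // and peer penalty, even though the parent is merely not-yet-synced.
    assert!(!is_finalized_descendant(&known, unknown_parent_root));
}
```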


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-16 23:25:30 +00:00
Mac L
f13d0615fd Add eip_3076 crate (#8206)
#7894


Moves the `Interchange` format out of `slashing_protection`, and thus removes the dependency on `slashing_protection` from `eth2`, which can now just depend on the slimmer `eip_3076` crate.


Co-Authored-By: Mac L <mjladson@pm.me>
2025-10-16 16:10:42 +00:00
SunnysidedJ
d1e06dc40d #6853 Adding store tests for data column pruning (#7228)
#6853 Update store tests to cover data column pruning


Created a helper function `check_data_column_existence`, which is a copy of `check_blob_existence` but checks data columns instead.
The helper function is then used to check that data columns are also pruned when blobs are pruned, if PeerDAS is enabled.


Co-Authored-By: SunnysidedJ <j@testinprod.io>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-16 15:20:26 +00:00
Pawan Dhananjay
73e75e3e69 Ignore extra columns in da cache (#8201)
N/A


Found this issue on sepolia. Note: the custody requirement for this node is 100.
```
Oct 14 11:25:40.053 DEBUG Reconstructed columns                         count: 28, block_root: 0x4d7946dec0ab59f2afd46610d7c54af555cb4c2851d9eea7d83dd17cf6e96aae, slot: 8725628
Oct 14 11:25:45.568 WARN  Internal availability check failure           block_root: 0x4d7946dec0ab59f2afd46610d7c54af555cb4c2851d9eea7d83dd17cf6e96aae, error: Unexpected("too many columns got 128 expected 100")
```

So if any of the block components arrives late, we reconstruct all 128 columns and try to add them to the da cache, ending up with more columns in the cache than needed for availability.

There are 2 ways I can think of fixing this:
1. Pass only the required columns to the da cache after reconstruction here 60df5f4ab6/beacon_node/beacon_chain/src/data_availability_checker.rs (L647-L648)
2. Ensure that we only add columns that we need to sample to the da cache. I think this is safer since we can add columns to the cache from multiple code paths, and this fixes it at the source.

~~This PR implements (2).~~ Having thought more about it, I think (1) is cleaner, since we also filter gossip and rpc columns before calling `put_kzg_verified_data_columns`.
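A minimal sketch of option (1), filtering the reconstructed columns down to the node's custody set before inserting into the cache (indices and names illustrative):

```rust
use std::collections::HashSet;

/// Keep only the node's custody columns out of a full reconstructed set.
fn filter_to_custody(reconstructed: Vec<u64>, custody: &HashSet<u64>) -> Vec<u64> {
    reconstructed
        .into_iter()
        .filter(|index| custody.contains(index))
        .collect()
}

fn main() {
    // e.g. a node with a custody requirement of 100 out of 128 columns.
    let custody: HashSet<u64> = (0u64..100).collect();
    let kept = filter_to_custody((0u64..128).collect(), &custody);
    // The da cache never receives more columns than availability needs.
    assert_eq!(kept.len(), 100);
}
```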


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2025-10-16 09:25:44 +00:00
Mac L
345faf52cb Remove safe_arith and import from crates.io (#8191)
Use the recently published `safe_arith` and remove it from Lighthouse
https://crates.io/crates/safe_arith


Co-Authored-By: Mac L <mjladson@pm.me>
2025-10-15 06:03:46 +00:00
Jimmy Chen
5886a48d96 Add max_blobs_per_block check to data column gossip validation (#8198)
Addresses this spec change
https://github.com/ethereum/consensus-specs/pull/4650

Add the `max_blobs_per_block` check to data column gossip validation so we reject large columns before processing (we currently do this check during processing).
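A minimal sketch of the early check, assuming one KZG commitment per blob in the sidecar (names and the max value are illustrative):

```rust
struct DataColumnSidecar {
    kzg_commitments_len: usize, // one commitment per blob in the block
}

fn validate_column_len(
    column: &DataColumnSidecar,
    max_blobs_per_block: usize,
) -> Result<(), &'static str> {
    // A column carrying more commitments than max_blobs_per_block can
    // never be valid, so REJECT before any expensive processing.
    if column.kzg_commitments_len > max_blobs_per_block {
        return Err("too many kzg commitments");
    }
    Ok(())
}

fn main() {
    let column = DataColumnSidecar { kzg_commitments_len: 128 };
    // With an illustrative max of 48 blobs, this column is rejected early.
    assert!(validate_column_len(&column, 48).is_err());
}
```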


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-15 01:52:35 +00:00
Eitan Seri-Levi
60df5f4ab6 Downgrade light client error logs (#8196)
Temporary stopgap for #7002


Downgrade light client errors to debug.

We should eventually fix our light client objects so they can represent data across forks.


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2025-10-14 03:18:50 +00:00
Jimmy Chen
1fb94ce432 Release v8.0.0-rc.1 (#8185) v8.0.0-rc.1 2025-10-13 20:32:43 +11:00
Pawan Dhananjay
2c328e32a6 Persist only custody columns in db (#8188)
* Only persist custody columns

* Get claude to write tests

* lint

* Address review comments and fix tests.

* Use supernode only when building chain segments

* Clean up

* Rewrite tests.

* Fix tests

* Clippy

---------

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2025-10-13 20:32:13 +11:00
Jimmy Chen
178df7a7d6 Fix duplicate fields being logged when the field exists in both the span and the event (#8183)
Closes #7995.

Fix duplicate fields being logged when the field exists in both the span and the event. Prefer event fields when this happens.

```
Sep 15 08:13:46.339 WARN State cache missed state_root: 0xc34826ff7794de63a553832b7aff13572d1c716b9e03d5ef7b29649adf98abe2, block_root: 0xf16d3f5b4cc6ec876b7faeccd9f2d4102dc56ed32e828754b62601637910ec1f, state_root: 0xc34826ff7794de63a553832b7aff13572d1c716b9e03d5ef7b29649adf98abe2, block_root: 0xf16d3f5b4cc6ec876b7faeccd9f2d4102dc56ed32e828754b62601637910ec1f
```

becomes

```
Sep 15 08:13:46.339 WARN State cache missed state_root: 0xc34826ff7794de63a553832b7aff13572d1c716b9e03d5ef7b29649adf98abe2, block_root: 0xf16d3f5b4cc6ec876b7faeccd9f2d4102dc56ed32e828754b62601637910ec1f
```
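A minimal sketch of the de-duplication rule, assuming the formatter collects span fields and event fields into maps (this is not the actual tracing-layer code):

```rust
use std::collections::BTreeMap;

/// Merge span fields with event fields, preferring the event's value
/// whenever the same key appears in both.
fn merge_fields(
    span_fields: BTreeMap<&'static str, String>,
    event_fields: BTreeMap<&'static str, String>,
) -> BTreeMap<&'static str, String> {
    let mut merged = span_fields;
    merged.extend(event_fields); // later inserts overwrite per key
    merged
}

fn main() {
    let span = BTreeMap::from([("state_root", "0xc348...".to_string())]);
    let event = BTreeMap::from([("state_root", "0xc348...".to_string())]);
    // The duplicated key is logged exactly once.
    assert_eq!(merge_fields(span, event).len(), 1);
}
```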


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-13 01:12:46 +00:00
Michael Sproul
0c9fdea28d Update ForkName::latest_stable to Fulu for tests (#8181)
Update `ForkName::latest_stable` to Fulu, reflecting our plan to stabilise Fulu in the immediate future!

This will lead to some more tests running with Fulu rather than Electra.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-09 13:53:51 +00:00
Jimmy Chen
538b70495c Reject data columns that do not descend from the finalized root instead of ignoring them (#8179)
This issue was identified during the fusaka audit competition.

The [`verify_parent_block_and_finalized_descendant`](62d9302e0f/beacon_node/beacon_chain/src/data_column_verification.rs (L606-L627)) function in data column gossip verification currently loads the parent before checking whether the column descends from the finalized root.

However, the `fork_choice.get_block(&block_parent_root)` function also makes the same check internally:

8a4f6cf0d5/consensus/fork_choice/src/fork_choice.rs (L1242-L1249)

Therefore, if the column does not descend from the finalized root, we return an `UnknownParent` error before hitting the `is_finalized_checkpoint_or_descendant` check just below.

This means we `IGNORE` the gossip message instead of `REJECT`ing it, and the gossip peer is not _immediately_ penalised. This deviates from the spec.

However, it's worth noting that Lighthouse will currently attempt to request the parent from this peer, and if the peer is not able to serve the parent, it gets penalised with a `LowToleranceError` and will get banned after ~5 occurrences.

ffa7b2b2b9/beacon_node/network/src/sync/network_context.rs (L1530-L1532)

This PR will penalise the bad peer immediately instead of performing block lookups before penalising it.
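A minimal sketch of the re-ordered checks (the verdicts mirror gossipsub's REJECT/IGNORE; everything else is a stub, and note the caveat in #8215 above that the two conditions are not independent in practice):

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Reject,              // penalise the peer immediately
    IgnoreUnknownParent, // trigger a parent lookup instead
    Accept,
}

fn verify_parent_and_finalized_descendant(
    descends_from_finalized: bool,
    parent_known: bool,
) -> Verdict {
    // Check descent from the finalized root *before* the parent lookup,
    // so columns on pruned forks are REJECTed per the spec...
    if !descends_from_finalized {
        return Verdict::Reject;
    }
    // ...and only genuinely unknown parents fall through to IGNORE.
    if !parent_known {
        return Verdict::IgnoreUnknownParent;
    }
    Verdict::Accept
}

fn main() {
    assert_eq!(
        verify_parent_and_finalized_descendant(false, false),
        Verdict::Reject
    );
    assert_eq!(
        verify_parent_and_finalized_descendant(true, false),
        Verdict::IgnoreUnknownParent
    );
}
```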


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-09 07:32:43 +00:00
chonghe
3110ca325b Implement /eth/v1/beacon/blobs endpoint (#8103)
* #8085


Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>

Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>
2025-10-09 05:01:30 +00:00
Pawan Dhananjay
8e382ceed9 Bump kzg library versions (#8174)
N/A


Update c-kzg and rust-eth-kzg to their latest versions. Also remove the patch version hardcoding in Cargo.toml.


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2025-10-09 01:47:05 +00:00
Michael Sproul
13dfa9200f Block proposal optimisations (#8156)
Closes:

- https://github.com/sigp/lighthouse/issues/4412

This should reduce Lighthouse's block proposal times on Holesky and prevent us getting reorged.


- [x] Allow the head state to be advanced further than 1 slot. This lets us avoid epoch processing on hot paths, including block production, by having new epoch boundaries pre-computed and available in the state cache.
- [x] Use the finalized state to prune the op pool. We were previously using the head state and trying to infer slashing/exit relevance based on `exit_epoch`. However, some exit epochs are far in the future despite the exits occurring recently.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-08 06:09:12 +00:00
Jimmy Chen
2a433bc406 Remove deprecated CLI flags and references for v8.0.0 (#8142)
Closes #8131

- [x] Remove deprecated flags from beacon_node/src/cli.rs:
  - [x] eth1-purge-cache
  - [x] eth1-blocks-per-log-query
  - [x] eth1-cache-follow-distance
  - [x] disable-deposit-contract-sync
  - [x] light-client-server
- [x] Remove deprecated flags from lighthouse/src/main.rs:
  - [x] logfile
  - [x] terminal-total-difficulty-override
  - [x] terminal-block-hash-override
  - [x] terminal-block-hash-epoch-override
  - [x] safe-slots-to-import-optimistically
- [x] Remove references to deprecated flags in config.rs files
- [x] Remove warning messages for deprecated flags in main.rs
- [x] Update/remove related tests in beacon_node.rs


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
2025-10-08 01:52:41 +00:00
Michael Sproul
b5c2a9668e Quote BeaconState::proposer_lookahead in JSON repr (#8167)
Use quoted integers for `state.proposer_lookahead` when serializing JSON. This is standard for all integer fields, but was missed for the newly added proposer lookahead. I noticed this issue while inspecting the head state on a local devnet.

I'm glad we found this before someone reported it :P
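A minimal sketch of the quoting convention using plain serde/serde_json (Lighthouse has its own serde helpers for this; the serializer below is illustrative):

```rust
use serde::Serialize;

// Consensus-spec JSON represents integers as decimal strings, so a
// `Vec<u64>` field must serialize as ["7", "8"], not [7, 8].
fn quoted_u64_vec<S: serde::Serializer>(v: &Vec<u64>, s: S) -> Result<S::Ok, S::Error> {
    s.collect_seq(v.iter().map(|x| x.to_string()))
}

#[derive(Serialize)]
struct StateFragment {
    #[serde(serialize_with = "quoted_u64_vec")]
    proposer_lookahead: Vec<u64>,
}

fn main() {
    let state = StateFragment { proposer_lookahead: vec![7, 8] };
    assert_eq!(
        serde_json::to_string(&state).unwrap(),
        r#"{"proposer_lookahead":["7","8"]}"#
    );
}
```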


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-08 00:05:41 +00:00
Pawan Dhananjay
a4ad3e492f Fallback to getPayload v1 if v2 fails (#8163)
N/A


Post-Fulu, we should be calling the v2 API on the relays, which doesn't return the blobs/data columns.

However, we decided to start hitting the v2 API as soon as Fulu is scheduled, to avoid unexpected surprises at the fork.
On the ACDT call, it seemed like most clients are calling v2 only after the Fulu fork.
This PR aims to be the best of both worlds: we fall back to hitting the v1 API if v2 fails. This way, we know beforehand if relays don't support it and can potentially alert them.
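A hedged sketch of the fallback; the `Relay` type and method names are placeholders, not Lighthouse's builder API:

```rust
struct Relay;
struct Payload;
#[derive(Debug)]
struct RelayError(&'static str);

impl Relay {
    fn get_payload_v2(&self) -> Result<Payload, RelayError> {
        Err(RelayError("v2 unsupported")) // pretend this relay lacks v2
    }
    fn get_payload_v1(&self) -> Result<Payload, RelayError> {
        Ok(Payload)
    }
}

fn get_payload_with_fallback(relay: &Relay) -> Result<Payload, RelayError> {
    relay.get_payload_v2().or_else(|e| {
        // Surface the v2 failure so unsupported relays are visible before
        // the fork, then retry with v1.
        eprintln!("getPayload v2 failed ({e:?}), falling back to v1");
        relay.get_payload_v1()
    })
}

fn main() {
    assert!(get_payload_with_fallback(&Relay).is_ok());
}
```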


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2025-10-07 14:32:41 +00:00
Eitan Seri-Levi
4eb89604f8 Fulu ASCII art (#8151)
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2025-10-07 14:32:35 +00:00
Jimmy Chen
ff8b514b3f Remove unnecessary warning logs and update logging levels (#8145)
@michaelsproul noticed this warning on a devnet-3 node

```
Oct 01 16:37:29.896 WARN  Error when importing rpc custody columns      error: ParentUnknown { parent_root: 0xe4cc85a2137b76eb083d7076255094a90f10caaec0afc8fd36807db742f6ff13 }, block_hash: 0x43ce63b2344990f5f4d8911b8f14e3d3b6b006edc35bbc833360e667df0edef7
```

We're also seeing similar `WARN` logs for blobs on our live nodes.

It's normal to get an unknown parent in lookups, and it's handled here
a134d43446/beacon_node/network/src/sync/block_lookups/mod.rs (L611-L619)

These shouldn't be `WARN`s, and we also log the same error in block lookups at `DEBUG` level here:
a134d43446/beacon_node/network/src/sync/block_lookups/mod.rs (L643-L648)

So I've removed these extra WARN logs.

I've also lowered the level of an `ERROR` log emitted when we're unable to serve data column root requests; it's unexpected, but unlikely to impact the node's performance, so I think we can downgrade it.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-06 19:26:37 +00:00
Michael Sproul
26575c594c Improve spec compliance for /eth/v1/config/spec API (#8144)
- [x] Remove the unnecessary `_MILLIS` suffix from `MAXIMUM_GOSSIP_CLOCK_DISPARITY`
- [x] Add missing Deneb preset `KZG_COMMITMENT_INCLUSION_PROOF_DEPTH`, not to be confused with `KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH` (plural) from Fulu...


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-10-01 09:29:15 +00:00
Mac L
af5cbfbd44 Bump superstruct to 0.10.0 (#8133)
Bump `superstruct` to the latest release `0.10.0`.
This version uses a later version of `darling` which is helpful for https://github.com/sigp/lighthouse/pull/8125


Co-Authored-By: Mac L <mjladson@pm.me>
2025-09-30 07:42:27 +00:00
Michael Sproul
9c6d33110b Update book for DB schema v28 (#8132)
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-30 05:10:42 +00:00
Jimmy Chen
e5b4983d6b Release v8.0.0 rc.0 (#8127)
Testnet release for the upcoming Fusaka fork.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
v8.0.0-rc.0
2025-09-29 02:17:30 +00:00
Michael Sproul
38fdaf791c Fix proposer shuffling decision slot at boundary (#8128)
Follow-up to the bug fixed in:

- https://github.com/sigp/lighthouse/pull/8121

This fixes the root cause of that bug, which was introduced by me in:

- https://github.com/sigp/lighthouse/pull/8101

Lion identified the issue here:

- https://github.com/sigp/lighthouse/pull/8101#discussion_r2382710356


In the methods that compute the proposer shuffling decision root, ensure we don't use lookahead for the Fulu fork epoch itself. This is accomplished by checking if Fulu is enabled at `epoch - 1`, i.e. if `epoch > fulu_fork_epoch`.

I haven't updated the methods that _compute_ shufflings to use these new corrected bounds (e.g. `BeaconState::compute_proposer_indices`), although we could make this change in future. The `get_beacon_proposer_indices` method already gracefully handles the Fulu boundary case by using the `proposer_lookahead` field (if initialised).
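A minimal sketch of the boundary condition (`Epoch` is just a `u64` here, not the actual type):

```rust
type Epoch = u64;

/// Lookahead-based decision roots apply only when the *previous* epoch is
/// already in Fulu, i.e. strictly after the fork epoch itself.
fn uses_fulu_lookahead(epoch: Epoch, fulu_fork_epoch: Option<Epoch>) -> bool {
    match fulu_fork_epoch {
        Some(fork_epoch) => epoch > fork_epoch,
        None => false,
    }
}

fn main() {
    let fork = Some(10);
    assert!(!uses_fulu_lookahead(10, fork)); // fork epoch itself: no lookahead
    assert!(uses_fulu_lookahead(11, fork)); // one epoch later: lookahead
}
```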


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-29 01:13:33 +00:00
Pawan Dhananjay
edcfee636c Fix bug in fork calculation at fork boundaries (#8121)
N/A


In #8101, when we modified the logic to get the proposer index post-Fulu, we seem to have missed advancing the state at the fork boundary to get the right `Fork` for signature verification.
This led to Lighthouse failing all gossip verification right after transitioning to Fulu, which was observed on the holesky shadow fork:
```
Sep 26 14:24:00.088 DEBUG Rejected gossip block                         error: "InvalidSignature(ProposerSignature)", graffiti: "grandine-geth-super-1", slot: 640
Sep 26 14:24:00.099 WARN  Could not verify block for gossip. Rejecting the block  error: InvalidSignature(ProposerSignature)
```

I'm not completely sure this is the correct fix, but it fixes the issue with `InvalidProposerSignature` on the holesky shadow fork.

Thanks to @eserilev for helping debug this


Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
2025-09-28 04:03:25 +00:00
Michael Sproul
c754234b2c Fix bugs in proposer calculation post-Fulu (#8101)
As identified by a researcher during the Fusaka security competition, we were computing the proposer index incorrectly in some places, by computing it without lookahead.


- [x] Add "low level" checks to computation functions in `consensus/types` to ensure they error cleanly
- [x] Re-work the determination of proposer shuffling decision roots, which are now fork-aware.
- [x] Re-work and simplify the beacon proposer cache to be fork-aware.
- [x] Optimise `with_proposer_cache` to use `OnceCell`.
- [x] All tests passing.
- [x] Resolve all remaining `FIXME(sproul)`s.
- [x] Unit tests for `ProtoBlock::proposer_shuffling_root_for_child_block`.
- [x] End-to-end regression test.
- [x] Test on pre-Fulu network.
- [x] Test on post-Fulu network.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-26 14:44:50 +00:00
Eitan Seri-Levi
20c6ce4553 Fulu testnet configs (#8117)
Holesky - #8096
Hoodi - #8097
Sepolia - #8099


Testnet configs for Holesky, Hoodi and Sepolia

Holesky - https://github.com/eth-clients/holesky/pull/132
Hoodi - https://github.com/eth-clients/hoodi/pull/21
Sepolia - https://github.com/eth-clients/sepolia/pull/111


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2025-09-26 09:12:47 +00:00
Lion - dapplion
ffa7b2b2b9 Only mark block lookups as pending if block is importing from gossip (#8112)
- PR https://github.com/sigp/lighthouse/pull/8045 introduced a regression in how lookup sync interacts with the `da_checker`.

Now in `unstable`, block import from the HTTP API also inserts the block into the `da_checker` while the block is being execution-verified. If lookup sync finds the block in the `da_checker` in the `NotValidated` state, it expects a `GossipBlockProcessResult` message some time later. That message is only sent after block import via gossip.

I confirmed in our node's logs that 4/4 cases of stuck lookups are caused by this sequence of events:
- Receive block through API, insert into da_checker in fn process_block in put_pre_execution_block
- Create lookup and leave in AwaitingDownload(block in processing cache) state
- Block from HTTP API finishes importing
- Lookup is left stuck

Closes https://github.com/sigp/lighthouse/issues/8104


- https://github.com/sigp/lighthouse/pull/8110 was my initial solution attempt, but we can't send the `GossipBlockProcessResult` event from the `http_api` crate without adding new channels, which seems messy.

For a given node it's rare that a lookup is created at the same time that a block is being published. This PR solves https://github.com/sigp/lighthouse/issues/8104 by allowing lookup sync to import the block twice in that case.


Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2025-09-25 03:52:27 +00:00
Jimmy Chen
79b33214ea Only send data column subnet discovery requests after peerdas is scheduled (#8109)
#8105 (to be confirmed)

I noticed a large number of failed discovery requests after deploying the latest `unstable` to some of our testnet and mainnet nodes. This is because of a recent PeerDAS change that attempts to maintain sufficient peers across data column subnets; this shouldn't be enabled on networks without PeerDAS scheduled, otherwise we keep retrying discovery on these subnets and never succeed.

Also removed some unused files.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
2025-09-25 02:52:07 +00:00
Eitan Seri-Levi
af274029e8 Run reconstruction inside a scoped rayon pool (#8075)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-09-24 06:37:34 +00:00
Antonio Viggiano
d80c0ff5b5 Use HTTPS for xdelta3 in Cargo.toml (#8094)
No issue


Use HTTPS for the dependency.


Co-Authored-By: Antonio Viggiano <agfviggiano@gmail.com>
2025-09-24 01:20:10 +00:00
Eitan Seri-Levi
7a7fe9663c Reduce TARGET_BACKFILL_SLOTS in checkpoint sync test (#8102)
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2025-09-23 04:37:33 +00:00
Michael Sproul
1dbc4f861b Refine HTTP status logs (#8098)
Ensure that we don't log a warning for HTTP 202s, which are expected on the blinded block endpoints after Fulu.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-22 05:03:47 +00:00
Michael Sproul
c1fb060ae1 Merge remote-tracking branch 'origin/stable' into unstable 2025-09-22 11:03:46 +10:00
Jimmy Chen
366fb0ee0d Always upload sim test logs (#8082)
This CI job failed

https://github.com/sigp/lighthouse/actions/runs/17815533375/job/50647915897

But we lost the logs because they aren't uploaded when the job fails. This PR changes the step to always upload logs, even when the job fails.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-09-19 12:58:46 +00:00
Jimmy Chen
4efe47b3c3 Rename --subscribe-all-data-column-subnets to --supernode and make it visible in help (#8083)
Rename `--subscribe-all-data-column-subnets` to `--supernode` as it's now been officially accepted in the spec. Also make it visible in help in preparation for the fusaka release.

https://github.com/ethereum/consensus-specs/blob/dev/specs/fulu/p2p-interface.md#supernodes


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-09-19 07:01:16 +00:00
Jimmy Chen
78d330e4b7 Consolidate reqresp_pre_import_cache into data_availability_checker (#8045)
This PR consolidates the `reqresp_pre_import_cache` into the `data_availability_checker` for the following reasons:
- the `reqresp_pre_import_cache` suffers from the same TOCTOU bug we had with the `data_availability_checker` earlier, and leads to an unbounded memory leak, which we have observed on some nodes over the last 6 months.
- the `reqresp_pre_import_cache` is no longer necessary, because we now hold blocks in the `data_availability_checker` for longer (since #7961), and recent blocks can be served from the DA checker.

This PR also maintains the following functionality:
- Serving pre-executed blocks over RPC; they're now served from the `data_availability_checker` instead.
- Using the cache for de-duplicating lookup requests.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
2025-09-19 07:01:13 +00:00
Jimmy Chen
4111bcb39b Use scoped rayon pool for backfill chain segment processing (#7924)
Part of #7866

- Continuation of #7921

In the above PR, we enabled rayon for batch KZG verification in chain segment processing. However, using the global rayon thread pool for backfill is likely to create resource contention with higher-priority beacon processor work.


This PR introduces a dedicated low-priority rayon thread pool `LOW_PRIORITY_RAYON_POOL` and uses it for processing backfill chain segments.

This prevents backfill KZG verification from using the global rayon thread pool and competing with high-priority beacon processor tasks for CPU resources.

However, this PR by itself doesn't prevent CPU oversubscription, because other tasks could still fill up the global rayon thread pool, and having an extra thread pool could make things worse. To address this we need the beacon processor to coordinate total CPU allocation across all tasks, which is covered in:
- #7789
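For illustration, a dedicated pool can be built with rayon's `ThreadPoolBuilder` and used via `install`; the sizing and naming below are illustrative, not Lighthouse's actual configuration:

```rust
use rayon::ThreadPoolBuilder;

fn main() {
    // A separate pool for backfill work, sized below the machine's core
    // count so it can't starve higher-priority tasks.
    let low_priority_pool = ThreadPoolBuilder::new()
        .num_threads(2)
        .thread_name(|i| format!("low-pri-rayon-{i}"))
        .build()
        .expect("failed to build rayon pool");

    // Work submitted via `install` runs on this pool instead of the
    // global one, so global-pool tasks are unaffected.
    let sum: u64 = low_priority_pool.install(|| {
        use rayon::prelude::*;
        (0..1_000u64).into_par_iter().sum()
    });
    assert_eq!(sum, 499_500);
}
```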


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-09-18 07:10:23 +00:00