Commit Graph

3474 Commits

Author SHA1 Message Date
Jimmy Chen
3826fe91f4 Improve data column KZG verification metric buckets (#7717)
The current data column KZG verification buckets are not giving us useful info as the upper bound is too low. We see most measurements above 70ms for batch verification, so we don't know how much time it really takes.

This PR improves the buckets based on the numbers we got from testing. Exponential buckets seem like a good candidate here, given we're expecting to increase the blob count in a similar fashion (possibly 2x each fork if it goes well).
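
As a rough illustration of the direction (not the exact helper, metric name, or bounds used in the PR), exponential buckets can be built with the `prometheus` crate, which Lighthouse's metrics layer wraps; all names and values below are placeholders:

```rust
// Minimal sketch: a histogram with exponential buckets instead of linear ones.
// Metric name, help text, and bucket bounds are illustrative only.
use prometheus::{exponential_buckets, Histogram, HistogramOpts};

fn kzg_verification_histogram() -> prometheus::Result<Histogram> {
    // 8 buckets starting at 5ms, doubling each step: 5ms, 10ms, ..., 640ms.
    let buckets = exponential_buckets(0.005, 2.0, 8)?;
    let opts = HistogramOpts::new(
        "data_column_kzg_verification_seconds",
        "Time taken to batch-verify data column KZG proofs",
    )
    .buckets(buckets);
    Histogram::with_opts(opts)
}
```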
2025-07-10 20:59:45 +00:00
Michael Sproul
538067f1ff Merge remote-tracking branch 'origin/stable' into unstable 2025-07-10 15:53:45 +10:00
Michael Sproul
cfb1f73310 Release v7.1.0 (#7609)
Post-Pectra release for tree-states hot 🎉

Already merged to `release-v7.1.0`:

- https://github.com/sigp/lighthouse/pull/7444
- https://github.com/sigp/lighthouse/pull/6750
- https://github.com/sigp/lighthouse/pull/7437
- https://github.com/sigp/lighthouse/pull/7133
- https://github.com/sigp/lighthouse/pull/7620
- https://github.com/sigp/lighthouse/pull/7663
2025-07-10 01:44:46 +00:00
João Oliveira
8b5ccacac9 Error from RPC send_response when request doesn't exist on the active inbound requests (#7663)
Lighthouse is currently logging a lot of errors in the `RPC` behaviour whenever a response is received for a `request_id` that no longer exists in `active_inbound_requests`. This is likely due to a data race or timing issue (e.g., the peer disconnecting before the response is handled).
This PR addresses that by removing the error logging from the RPC layer. Instead, RPC::send_response now simply returns an Err, shifting the responsibility to the main service. The main service can then determine whether the peer is still connected and only log an error if the peer remains connected.
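
As a rough sketch of the pattern (hypothetical types, not the actual Lighthouse RPC API), the RPC layer reports the failure and the caller decides whether it is worth logging:

```rust
// Hypothetical types sketching the pattern described above.
use std::collections::HashMap;

enum RpcSendError {
    UnknownRequestId,
}

/// The RPC layer no longer logs here; it just reports the failure to the caller.
fn send_response(
    active_inbound_requests: &mut HashMap<u64, ()>,
    request_id: u64,
) -> Result<(), RpcSendError> {
    if active_inbound_requests.remove(&request_id).is_none() {
        return Err(RpcSendError::UnknownRequestId);
    }
    Ok(())
}

/// The main service decides whether the failure is worth logging, e.g. only if
/// the peer is still connected.
fn handle_send_result(peer_connected: bool, result: Result<(), RpcSendError>) {
    if result.is_err() && peer_connected {
        eprintln!("failed to send RPC response for a connected peer");
    }
    // Otherwise the peer has disconnected and the error is expected; stay quiet.
}
```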
Thanks @ackintosh for helping debug!
2025-07-09 14:26:51 +00:00
Michael Sproul
8e55684b06 Reintroduce --logfile with deprecation warning (#7723)
Reintroduce the `--logfile` flag with a deprecation warning so that it doesn't prevent nodes from starting. This is considered preferable to breaking node startup in order to force users to fix the flag, even though it means the `--logfile` flag is completely ineffective.

The flag was initially removed in:

- https://github.com/sigp/lighthouse/pull/6339
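
For illustration only (hypothetical code, not Lighthouse's actual CLI definition), a removed flag can be kept parseable but inert with clap, emitting a warning when supplied; the help and warning strings are placeholders:

```rust
// Sketch: accept the deprecated flag so startup still succeeds, but ignore it.
use clap::{Arg, Command};

fn main() {
    let matches = Command::new("lighthouse")
        .arg(
            Arg::new("logfile")
                .long("logfile")
                .value_name("FILE")
                .hide(true) // keep it out of --help, but still parse it
                .help("DEPRECATED: this flag is ignored"),
        )
        .get_matches();

    if matches.get_one::<String>("logfile").is_some() {
        eprintln!("WARN: --logfile is deprecated and has no effect");
    }
}
```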
2025-07-09 08:08:27 +00:00
Michael Sproul
7b2f138ca7 Merge remote-tracking branch 'origin/stable' into release-v7.1.0 2025-07-09 11:19:16 +10:00
Michael Sproul
b9c1a2b0c0 Fix description of DB read bytes metric (#7716)
Fix a trivial typo that mixed up reads and writes.
2025-07-08 08:50:15 +00:00
Eitan Seri-Levi
bd8a2a8ffb Gossip recently computed light client data (#7023) 2025-07-08 07:07:10 +00:00
Jimmy Chen
56485cc986 Remove unneeded spans that caused debug logs to appear when level is set to info (#7707)
Fixes #7155.

It turns out the issue is caused by calling a function that creates an info span (`chain.id()` here), e.g.

```rust
debug!(id = chain.id(), ?sync_type, reason = ?remove_reason, op, "Chain removed");
```

I've removed all unneeded spans, especially on getter functions - there's little reason for spans there and they often get used in logging. We should also revisit all the spans after the release - I think we could make them more useful than they are today.

I've let it run for a while and am no longer seeing any `DEBUG` logs.
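
A minimal sketch of the class of fix, with hypothetical types (the real change removes spans across many functions, and the attribute in the comment only stands in for whatever instrumentation the getter previously carried): keep getters span-free so that logging their return value doesn't open a span.

```rust
use tracing::debug;

struct SyncingChain {
    id: u64,
}

impl SyncingChain {
    // Before: the getter created its own (info-level) span on every call, e.g.
    // via something like `#[instrument(level = "info", skip(self))]`.
    //
    // After: a plain getter with no span.
    fn id(&self) -> u64 {
        self.id
    }
}

fn remove_chain(chain: &SyncingChain) {
    // The log call itself is unchanged; it no longer enters a span via `chain.id()`.
    debug!(id = chain.id(), "Chain removed");
}
```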
2025-07-08 00:37:54 +00:00
Pawan Dhananjay
0f895f3066 Bump default gas limit (#7695)
N/A


  Bump the default gas limit to 45 million based on recommendations from EL teams (https://x.com/vdWijden/status/1939234101631856969) and ethPandaOps (https://ethpandaops.io/posts/gaslimit-scaling/).
2025-07-04 22:54:30 +00:00
Michael Sproul
c7bb3b00e4 Fix lookups of the block at oldest_block_slot (#7693)
Closes:

- https://github.com/sigp/lighthouse/issues/7690


  Another checkpoint sync related fix! See issue for a description of the bug.

We fix it by just loading the block root of the `oldest_block_slot`, rather than trying to load the slot prior, which will always fail.
2025-07-02 23:40:04 +00:00
Jimmy Chen
b35854b71f Record v2 beacon blocks http api metrics separately (#7692)
This PR adds v2 beacon block paths to the function that records http api usage, so they don't just get recorded as "/v2/beacon" like below:

<img width="934" alt="image" src="https://github.com/user-attachments/assets/8b669f0a-2821-46ee-a30a-0e344d3e63c1" />
2025-07-02 08:47:35 +00:00
Michael Sproul
a459a9af98 Fix and test checkpoint sync from genesis (#7689)
Fix a bug involving checkpoint sync from genesis reported by Sunnyside labs.


  Ensure that the store's `anchor` is initialised prior to storing the genesis state. In the case of checkpoint sync from genesis, the genesis state will be in the _hot DB_, so we need the hot DB metadata to be initialised in order to store it.

I've extended the existing checkpoint sync tests to cover this case as well. There are some subtleties around what the `state_upper_limit` should be set to in this case. I've opted to just enable state reconstruction from the start in the test so it gets set to 0, which results in an end state more consistent with the other test cases (full state reconstruction). This is required because we can't meaningfully do any state reconstruction when the split slot is 0 (there is no range of frozen slots to reconstruct).
2025-07-02 04:50:33 +00:00
Jimmy Chen
fcc602a787 Update fulu network configs and add MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS (#7646)
- #6240
- Bring built-in network configs up to date with latest consensus-spec PeerDAS configs.
- Add `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS` and use it to determine the data availability window after the Fulu fork (a simplified boundary calculation is sketched below).
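
For illustration, a plausible shape for such a boundary calculation, under the assumption that data column sidecars are only expected for the most recent `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS` epochs and never before the Fulu fork epoch; names, signatures, and the constant value are placeholders, not Lighthouse's implementation:

```rust
// Illustrative value only.
const MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS: u64 = 4096;

/// Earliest epoch for which data column sidecars are expected to be available.
fn data_column_availability_boundary(current_epoch: u64, fulu_fork_epoch: u64) -> u64 {
    let retention_start =
        current_epoch.saturating_sub(MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS);
    // Never expect columns from before the fork that introduced them.
    retention_start.max(fulu_fork_epoch)
}
```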
2025-07-02 02:38:25 +00:00
Pawan Dhananjay
69c9c7038a Use prepare_beacon_proposer endpoint for validator custody registration (#7681)
N/A


  This PR switches to using the `prepare_beacon_proposer` endpoint instead of the `beacon_committee_subscriptions` endpoint to register validators with the custody context.

We currently use the `beacon_committee_subscriptions` endpoint for registering validators in the custody context.
Using the subscriptions endpoint has a few disadvantages:
1. The Lighthouse VC tries to optimise the number of calls it makes to this endpoint to reduce the load on the subscriptions endpoint, so we would be getting a different subset of the total number of validators in each call. This leads to a ramp-up of the validator custody units instead of a one-time bump. For example, see these logs:
```
Jun 30 22:36:05.012 DEBUG Validator count at head updated               old_count: 0, new_count: 19
Jun 30 22:36:11.016 DEBUG Validator count at head updated               old_count: 19, new_count: 24
Jun 30 22:36:17.017 DEBUG Validator count at head updated               old_count: 24, new_count: 27
Jun 30 22:36:23.020 DEBUG Validator count at head updated               old_count: 27, new_count: 32
Jun 30 22:36:29.016 DEBUG Validator count at head updated               old_count: 32, new_count: 36
Jun 30 22:36:35.005 DEBUG Validator count at head updated               old_count: 36, new_count: 42
Jun 30 22:36:41.014 DEBUG Validator count at head updated               old_count: 42, new_count: 44
Jun 30 22:36:47.017 DEBUG Validator count at head updated               old_count: 44, new_count: 46
Jun 30 22:36:53.007 DEBUG Validator count at head updated               old_count: 46, new_count: 48
Jun 30 22:36:59.009 DEBUG Validator count at head updated               old_count: 48, new_count: 49
Jun 30 22:37:05.014 DEBUG Validator count at head updated               old_count: 49, new_count: 50
Jun 30 22:37:11.007 DEBUG Validator count at head updated               old_count: 50, new_count: 53
Jun 30 22:37:17.007 DEBUG Validator count at head updated               old_count: 53, new_count: 55
Jun 30 22:37:35.008 DEBUG Validator count at head updated               old_count: 55, new_count: 58
Jun 30 22:37:41.007 DEBUG Validator count at head updated               old_count: 58, new_count: 59
Jun 30 22:37:53.010 DEBUG Validator count at head updated               old_count: 59, new_count: 60
Jun 30 22:38:05.013 DEBUG Validator count at head updated               old_count: 60, new_count: 61
Jun 30 22:38:23.006 DEBUG Validator count at head updated               old_count: 61, new_count: 62
Jun 30 22:38:29.009 DEBUG Validator count at head updated               old_count: 62, new_count: 63
Jun 30 22:38:41.009 DEBUG Validator count at head updated               old_count: 63, new_count: 64
```
2. Different VCs would probably have different behaviours in terms of sending subscriptions

In contrast, usage of the `prepare_beacon_proposer` endpoint should be more standard across different VCs, with no filtering of validators: not making this call could mean missing proposals, so VCs are incentivised to call it on any change in the validators they manage.
Lighthouse calls this endpoint every slot.
2025-07-02 02:38:22 +00:00
Jimmy Chen
41742ce2bd Update SAMPLES_PER_SLOT to be number of custody groups instead of data columns (#7683)
Update `SAMPLES_PER_SLOT` to be number of custody groups instead of data columns. This should have no impact on the current implementation as config currently maintains a `group:subnet:column` ratio of `1:1:1`.  **In short, this PR doesn't change anything for Fusaka, but ensures compliance with the spec and potential future changes.**

I've added separate methods to compute sampling columns and custody groups for clarity: `spec.sampling_size_columns` and `spec.sampling_size_custody_groups`.

See the clarifications in this PR for more details: https://github.com/ethereum/consensus-specs/pull/4251
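
A hedged sketch of the two views, based only on the description above (the constant value, names, and method bodies are assumptions, not the actual spec or Lighthouse code):

```rust
// Illustrative: after this PR, SAMPLES_PER_SLOT is expressed in custody groups.
const SAMPLES_PER_SLOT: u64 = 8;

fn sampling_size_custody_groups(custody_group_count: u64) -> u64 {
    // Sample at least SAMPLES_PER_SLOT groups, or everything we custody if that's more.
    SAMPLES_PER_SLOT.max(custody_group_count)
}

fn sampling_size_columns(custody_group_count: u64, columns_per_group: u64) -> u64 {
    // With the current 1:1:1 group:subnet:column ratio, columns_per_group == 1 and
    // the two views coincide, which is why this PR changes nothing for Fusaka.
    sampling_size_custody_groups(custody_group_count) * columns_per_group
}
```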
2025-07-02 00:08:40 +00:00
Pawan Dhananjay
e305cb1b92 Custody persist fix (#7661)
N/A


  Persist the epoch -> cgc values. This is to ensure that `ValidatorRegistrations::latest_validator_custody_requirement` always returns a `Some` value post restart, assuming the `epoch_validator_custody_requirements` map was updated in previous runs.
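
A minimal sketch of the intended behaviour, reusing the names from the description but with otherwise hypothetical types (the actual persistence to disk is elided):

```rust
use std::collections::BTreeMap;

#[derive(Default)]
struct ValidatorRegistrations {
    // epoch -> custody group count contributed by validators registered at that epoch
    epoch_validator_custody_requirements: BTreeMap<u64, u64>,
}

impl ValidatorRegistrations {
    fn record(&mut self, epoch: u64, cgc: u64) {
        self.epoch_validator_custody_requirements.insert(epoch, cgc);
        // In the real change, this map is also persisted so it survives a restart.
    }

    fn latest_validator_custody_requirement(&self) -> Option<u64> {
        // Some(..) as long as any previous run recorded a value.
        self.epoch_validator_custody_requirements
            .values()
            .last()
            .copied()
    }
}
```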
2025-07-01 06:06:37 +00:00
Michael Sproul
c1f94d9b7b Test database schema stability (#7669)
This PR implements some heuristics to check for breaking database changes. The goal is to prevent accidental changes to the database schema occurring without a version bump.
2025-06-30 10:55:47 +00:00
Jimmy Chen
e45ba846ae Increase http client default timeout to 2s in http-api tests. (#7673)
The `sync_contributions_across_fork_with_skip_slots` test has been quite flaky recently on CI. We suspect it might be caused by the recent introduction of a `default` timeout in #7400, with CI failing to consistently process those http requests within 1s, likely due to limited resources.

This PR increases the `default` timeout to 2s in tests to avoid flaky tests, but keeps the remaining timeout the same (1s).

https://github.com/sigp/lighthouse/actions/runs/15965113170/job/45023976021

```
FAIL [   8.945s] http_api::bn_http_api_tests fork_tests::sync_contributions_across_fork_with_skip_slots
stdout ───

running 1 test
test fork_tests::sync_contributions_across_fork_with_skip_slots ... FAILED

failures:

failures:
fork_tests::sync_contributions_across_fork_with_skip_slots

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 175 filtered out; finished in 8.91s

stderr ───

thread 'fork_tests::sync_contributions_across_fork_with_skip_slots' panicked at beacon_node/http_api/tests/fork_tests.rs:239:10:
called `Result::unwrap()` on an `Err` value: HttpClient(url: http://127.0.0.1:41793/, kind: timeout, detail: operation timed out)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
2025-06-30 10:47:49 +00:00
Michael Sproul
6be646ca11 Bump DB schema to v25 (#7666)
When we removed the eth1 data, I wrote a v25 schema upgrade to delete the data on disk:

- https://github.com/sigp/lighthouse/pull/7133

However, I forgot to update the current schema version, so this change was never actioned.


  This PR updates the current schema version to v25 so that the migration runs.
2025-06-30 05:52:28 +00:00
Nikola Ristić
2d759f78be Fix beacon_chain metrics descriptions (#6576)
n/a


  Just fixed a small mistake in the metrics description.
2025-06-30 00:11:14 +00:00
Odinson
6ea5f14b39 feat: better error message for light_client/bootstrap endpoint (#7597)
Fixes #7229
2025-06-28 03:55:39 +00:00
Jimmy Chen
522e00f48d Fix incorrect waker update condition (#7656)
This bug was first found and partially fixed by @VolodymyrBg in #7317 - this PR applies the same fix everywhere else.

The old logic updated the waker when it already matched the context, and did nothing when it was stale:

```rust
if waker.will_wake(cx.waker()) {
    self.waker = Some(cx.waker().clone());
}
```

This is the wrong way around. We only want to update the waker if it doesn't match the current context:

```rust
if !waker.will_wake(cx.waker()) {
    self.waker = Some(cx.waker().clone());
}
```

I don't think we've ever noticed any issues, but it’s a subtle bug that could lead to missed wakeups.
2025-06-27 19:01:46 +00:00
Jimmy Chen
83cad25d98 Fix Rust 1.88 clippy errors & execution engine tests (#7657)
Fix Rust 1.88 clippy errors.
2025-06-27 18:21:17 +00:00
Pawan Dhananjay
9b1f3ed9d1 Add gossip check (#7652)
N/A


  Add an additional gossip condition.
2025-06-27 00:26:38 +00:00
chonghe
8e3c5d1524 Rust 1.89 compiler lint fix (#7644)
Fix lints for Rust 1.89 beta compiler
2025-06-25 05:33:17 +00:00
Eitan Seri-Levi
56b2d4b525 Remove instrumenting log level (#7636)
https://github.com/sigp/lighthouse/issues/7155


  There are some additional places where we set instrumenting log levels that weren't covered in #7620.
2025-06-24 06:29:10 +00:00
chonghe
cef04ee2ee Implement validator_identities Beacon API endpoint (#7462)
* #7442
2025-06-23 08:37:49 +00:00
Pawan Dhananjay
3fefda68e5 Send byrange responses in the correct requested range (#7611)
N/A


  For responding to by_range requests, we should ideally only respond with items in the range `req.start_slot()..req.start_slot() + req.count`.
We were not filtering the generated response for blobs and data columns, only for blocks. This PR adds the filtering for the sidecars as well.
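
A minimal sketch of that filtering, with a hypothetical sidecar type (the real code operates on blob and data column sidecars):

```rust
struct Sidecar {
    slot: u64,
}

/// Keep only sidecars whose slot falls in start_slot..start_slot + count.
fn filter_by_range(sidecars: Vec<Sidecar>, start_slot: u64, count: u64) -> Vec<Sidecar> {
    let range = start_slot..start_slot + count;
    sidecars
        .into_iter()
        .filter(|sidecar| range.contains(&sidecar.slot))
        .collect()
}
```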
2025-06-23 06:57:37 +00:00
Pawan Dhananjay
11bcccb353 Remove all prod eth1 related code (#7133)
N/A


  After the Electra fork, which includes EIP-6110, the beacon node no longer needs the eth1 bridging mechanism to include new deposits, as they are provided by the EL as a `deposit_request`. So after Electra plus a transition period, in which the finalized pre-fork bridge deposits are included through the old mechanism, we no longer need the elaborate machinery we had to fetch deposit contract data from the execution layer.

Since Holesky has already forked to Electra and completed the transition period, this PR basically checks whether removing all the eth1 related logic leads to any surprises.
2025-06-23 03:00:07 +00:00
Age Manning
d50924677a Remove instrumenting log level (#7620)
I think this should resolve #7155


This removes the level field from the instrumenting we were doing across a range of functions. The level will now default to the level of the log.
2025-06-20 07:44:59 +00:00
Eitan Seri-Levi
f67084a571 Remove reprocess channel (#7437)
Partially https://github.com/sigp/lighthouse/issues/6291


  This PR removes the reprocess event channel from being externally exposed. All work events are now sent through the single `BeaconProcessorSend` channel. I've introduced a new `Work::Reprocess` enum variant which we then use to schedule jobs for reprocessing. I've also created a new scheduler module which will eventually house the different scheduler impls.

This is all needed as an initial step to generalize the beacon processor

A "full" implementation for the generalized beacon processor can be found here
https://github.com/sigp/lighthouse/pull/6448

I'm going to try to break up the full implementation into smaller PRs so it can actually be reviewed.
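
A rough sketch of the shape of this change (the variant and channel names follow the description above; the fields and everything else are invented):

```rust
use tokio::sync::mpsc;

enum Work {
    GossipBlock { root: [u8; 32] },
    // Reprocess jobs are now just another kind of work.
    Reprocess { root: [u8; 32] },
}

struct BeaconProcessorSend(mpsc::Sender<Work>);

impl BeaconProcessorSend {
    async fn schedule_reprocess(&self, root: [u8; 32]) {
        // Previously reprocess events went over a separate, externally exposed
        // channel; now they travel on the same sender as all other work events.
        let _ = self.0.send(Work::Reprocess { root }).await;
    }
}
```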
2025-06-20 02:52:16 +00:00
Lion - dapplion
dd98534158 Hierarchical state diffs in hot DB (#6750)
This PR implements https://github.com/sigp/lighthouse/pull/5978 (tree-states) but on the hot DB. It allows Lighthouse to massively reduce its disk footprint during non-finality and overall I/O in all cases.

Closes https://github.com/sigp/lighthouse/issues/6580

Conga into https://github.com/sigp/lighthouse/pull/6744

### TODOs

- [x] Fix OOM in CI https://github.com/sigp/lighthouse/pull/7176
- [x] optimise store_hot_state to avoid storing a duplicate state if the summary already exists (should be safe from races now that pruning is cleaner)
- [x] misspelled: get_ancenstor_state_root
- [x] get_ancestor_state_root should use state summaries
- [x] Prevent split from changing during ancestor calc
- [x] Use same hierarchy for hot and cold

### TODO Good optimization for future PRs

- [ ] On the migration, if the latest hot snapshot is aligned with the cold snapshot, migrate the diffs instead of the full states.
```
align slot  time
10485760    Nov-26-2024
12582912    Sep-14-2025
14680064    Jul-02-2026
```

### TODO Maybe things good to have

- [ ] Rename anchor_slot https://github.com/sigp/lighthouse/compare/tree-states-hot-rebase-oom...dapplion:lighthouse:tree-states-hot-anchor-slot-rename?expand=1
- [ ] Make anchor fields non-public so that they must be mutated through a method, to prevent unwanted changes to the anchor_slot

### NOTTODO

- [ ] Use fork-choice and a new method [`descendants_of_checkpoint`](ca2388e196 (diff-046fbdb517ca16b80e4464c2c824cf001a74a0a94ac0065e635768ac391062a8)) to filter only the state summaries that descend from the finalized checkpoint
2025-06-19 02:43:25 +00:00
Eitan Seri-Levi
6786b9d12a Single attestation "Full" implementation (#7444)
#6970


  This allows us to receive `SingleAttestation` over gossip and process it without converting. There is still a conversion to `Attestation` as a final step in the attestation verification process, but by then the `SingleAttestation` is fully verified.

I've also fully removed the `submitPoolAttestationsV1` endpoint as it's been deprecated.

I've also pre-emptively deprecated support for `Attestation` in the `submitPoolAttestationsV2` endpoint. See here for more info: https://github.com/ethereum/beacon-APIs/pull/531

I tried to minimize the diff here by only making the "required" changes. There are some unnecessary complexities in the way we manage the different attestation verification wrapper types. We could probably consolidate these into one wrapper type and refactor even further, but we could leave that to a separate PR if we feel like cleaning things up in the future.

Note that I've also updated the test harness to always submit `SingleAttestation` regardless of fork variant. I don't see a problem in that approach and it allows us to delete more code :)
2025-06-17 09:01:26 +00:00
Jimmy Chen
3d2d65bf8d Advertise --advertise-false-custody-group-count for testing PeerDAS (#7593)
#6973
2025-06-16 11:10:28 +00:00
Jimmy Chen
6135f417a2 Add data columns sidecars debug beacon API (#7591)
Beacon API spec PR: https://github.com/ethereum/beacon-APIs/pull/537
2025-06-15 14:20:16 +00:00
Akihito Nakano
dc5f5af3eb Fix flaky test_rpc_block_reprocessing (#7595)
The test occasionally fails, likely because the 10ms fixed delay after block processing isn't sufficient when the system is under load.

https://github.com/sigp/lighthouse/pull/7522#issuecomment-2914595667


  Replace single assertion with retry loop.
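
A minimal sketch of the retry-loop idea (the condition, retry budget, and timings are placeholders, not the actual test code):

```rust
// Poll for the expected condition instead of assuming a single fixed delay is
// always enough under load.
async fn wait_until<F: Fn() -> bool>(condition: F) -> Result<(), &'static str> {
    for _ in 0..50 {
        if condition() {
            return Ok(());
        }
        tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    }
    Err("condition not met within retry budget")
}
```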
2025-06-14 00:54:19 +00:00
Daniel Knopik
ccd99c138c Wait before column reconstruction (#7588) 2025-06-13 18:19:06 +00:00
Jimmy Chen
a65f78222d Drop stale registrations without reducing CGC (#7594)
Currently the validator effective balance used for computing the PeerDAS custody group count is only updated when the validator subscribes to the BN via `validator/beacon_committee_subscriptions`.

If a validator stops registering with the node, its effective balance gets outdated and stays in BN memory until the next restart. Since the CGC never reduces (as per the spec), these stale registrations are no longer required for CGC computation and can therefore be dropped.
2025-06-13 14:30:43 +00:00
Daniel Knopik
5472cb8500 Batch verify KZG proofs for getBlobsV2 (#7582) 2025-06-12 14:35:14 +00:00
Pawan Dhananjay
9803d69d80 Implement status v2 version (#7590)
N/A


  Implements status v2 as defined in https://github.com/ethereum/consensus-specs/pull/4374/
2025-06-12 07:17:06 +00:00
Pawan Dhananjay
5f208bb858 Implement basic validator custody framework (no backfill) (#7578)
Resolves #6767


  This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info regarding the number of validators attached to a node and the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of validators attached, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Anytime there's a change in the `custody_group_count` due to the addition/removal of validators, the custody context sends an event on a broadcast channel (sketched below). The only subscriber for the channel currently lives in the network service, which simply subscribes to more subnets. There could be additional subscribers in sync that start a backfill once the cgc changes.
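
As a rough illustration of that event flow (hypothetical API, not the actual `CustodyContext`), using a tokio broadcast channel:

```rust
use tokio::sync::broadcast;

struct CustodyContext {
    cgc_tx: broadcast::Sender<u64>,
}

impl CustodyContext {
    fn new() -> (Self, broadcast::Receiver<u64>) {
        let (cgc_tx, rx) = broadcast::channel(16);
        (Self { cgc_tx }, rx)
    }

    /// Called whenever validator additions/removals change the custody group count.
    fn on_custody_group_count_change(&self, new_cgc: u64) {
        // Subscribers (e.g. the network service) react by subscribing to more
        // subnets; sync could later use the same event to trigger a backfill.
        let _ = self.cgc_tx.send(new_cgc);
    }
}
```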

TODO

- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS` passes after updating the current cgc. This event should be picked up by a subscriber which updates the enr and metadata.
- [x] Add more tests
2025-06-11 18:10:06 +00:00
Pawan Dhananjay
076a1c3fae Data column sidecar event (#7587)
N/A


  Implement events for data column sidecar https://github.com/ethereum/beacon-APIs/pull/535
2025-06-11 16:39:22 +00:00
Jimmy Chen
8c6abc0b69 Optimise parallelism in compute cells operations by zipping first (#7574)
We're seeing slow KZG performance on `fusaka-devnet-0` and are looking for optimisations to improve it.

Zipping the list first and then calling `into_par_iter` shows a ~10% improvement in the performance benchmark; I suspect this might be even more material when running on a beacon node.

Before:
```
blobs_to_data_column_sidecars_20
time:   [11.583 ms 12.041 ms 12.534 ms]
Found 5 outliers among 100 measurements (5.00%)
```

After:
```
blobs_to_data_column_sidecars_20
time:   [10.506 ms 10.724 ms 10.982 ms]
change: [-14.925% -10.941% -6.5452%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
```
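
Sketched roughly (hypothetical data and per-item work, not the actual KZG code), the change is to pair the inputs sequentially and only then hand the combined `Vec` to rayon:

```rust
use rayon::prelude::*;

fn process_pairs(blobs: Vec<Vec<u8>>, commitments: Vec<[u8; 48]>) -> Vec<usize> {
    // Zip first (cheap, sequential)...
    let zipped: Vec<(Vec<u8>, [u8; 48])> = blobs.into_iter().zip(commitments).collect();

    // ...then parallelise over the already-paired items.
    zipped
        .into_par_iter()
        .map(|(blob, _commitment)| blob.len()) // stand-in for the per-blob KZG work
        .collect()
}
```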
2025-06-09 12:41:14 +00:00
ethDreamer
b08d49c4cb Changes for fusaka-devnet-1 (#7559)
Changes for [fusaka-devnet-1](https://notes.ethereum.org/@ethpandaops/fusaka-devnet-1)


  [Consensus Specs v1.6.0-alpha.1](https://github.com/ethereum/consensus-specs/pull/4346)
* [EIP-7917: Deterministic Proposer Lookahead](https://eips.ethereum.org/EIPS/eip-7917)
* [EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)
2025-06-09 09:10:08 +00:00
Lion - dapplion
d457ceeaaf Don't create child lookup if parent is faulty (#7118)
Issue discovered on PeerDAS devnet (node `lighthouse-geth-2.peerdas-devnet-5.ethpandaops.io`). Summary:

- A lookup is created for block root `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- That block or a parent is faulty and `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` is added to the failed chains cache
- We later receive a block that is a child of a child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- We create a lookup, which attempts to process the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` and hits a processor error `UnknownParent` at this line:

bf955c7543/beacon_node/network/src/sync/block_lookups/mod.rs (L686-L688)

`search_parent_of_child` does not create a parent lookup because the parent root is in the failed chain cache. However, we have **already** marked the child as awaiting the parent. This results in an inconsistent state of lookup sync, as there's a lookup awaiting a parent that doesn't exist.

Now we have a lookup (the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`) that is awaiting a parent lookup that doesn't exist: hence stuck.

### Impact

This bug can affect Mainnet as well as PeerDAS devnets.

This bug may stall lookup sync for a few minutes (up to `LOOKUP_MAX_DURATION_STUCK_SECS = 15 min`) until the stuck prune routine deletes it. By that time the root will be cleared from the failed chain cache and sync should succeed. During that time the user will see a lot of `WARN` logs when attempting to add each peer to the inconsistent lookup. We may also sync the block through range sync if we fall behind by more than 2 epochs. We may also create the parent lookup successfully after the failed cache clears and complete the child lookup.

This bug is triggered if:
- We have a lookup that fails and its root is added to the failed chain cache (much more likely to happen in PeerDAS networks)
- We receive a block that builds on a child of the block added to the failed chain cache


  Ensure that we never create (or leave in place) a lookup that references a non-existent parent.

I added `must_use` lints to the functions that create lookups. To fix the specific bug, we must recursively drop the child lookup if the parent is not created. So if `search_parent_of_child` returns `false`, we now return `LookupRequestError::Failed` instead of `LookupResult::Pending`.

As a bonus, I have added more logging and reason strings to the errors.
2025-06-05 08:53:43 +00:00
Jimmy Chen
357a8ccbb9 Checkpoint sync without the blobs from Fulu (#7549)
Lighthouse currently requires checkpoint sync to be performed against a supernode in a PeerDAS network, as only supernodes can serve blobs.

This PR lifts that requirement, enabling Lighthouse to checkpoint sync from either a fullnode or a supernode (See https://github.com/sigp/lighthouse/issues/6837#issuecomment-2933094923)

Missing data columns for the checkpoint block aren't a big issue, but we should be able to easily implement backfill once we have the logic to backfill data columns.
2025-06-04 00:31:27 +00:00
ethDreamer
ae30480926 Implement EIP-7892 BPO hardforks (#7521)
[EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)

#7467
2025-06-02 06:54:42 +00:00
Jimmy Chen
94a1446ac9 Fix unexpected blob error and duplicate import in fetch blobs (#7541)
Getting this error on a non-PeerDAS network:

```
May 29 13:30:13.484 ERROR Error fetching or processing blobs from EL    error: BlobProcessingError(AvailabilityCheck(Unexpected("empty blobs"))), block_root: 0x98aa3927056d453614fefbc79eb1f9865666d1f119d0e8aa9e6f4d02aa9395d9
```

It appears we're passing an empty `Vec` to the DA checker because all blobs were already seen on gossip and filtered out; this causes an `AvailabilityCheckError::Unexpected("empty blobs")`.

I've added equivalent unit tests for `getBlobsV1` to cover all the scenarios we test in `getBlobsV2`. This would have caught the bug if I had added it earlier. It also caught another bug which could trigger duplicate block import.

Thanks Santito for reporting this! 🙏
2025-06-02 01:51:09 +00:00
Jimmy Chen
4d21846aba Prevent AvailabilityCheckError when there's no new custody columns to import (#7533)
Addresses a regression recently introduced when we started gossip verifying data columns from EL blobs:

```
failures:
network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 90 filtered out; finished in 16.60s

stderr ───

thread 'network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import' panicked at beacon_node/network/src/network_beacon_processor/tests.rs:829:10:
should put data columns into availability cache: Unexpected("empty columns")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

https://github.com/sigp/lighthouse/actions/runs/15309278812/job/43082341868?pr=7521

If an empty `Vec` is passed to the DA checker, it causes an unexpected error.

This PR addresses it by not passing an empty `Vec` for processing, and not spawning a task to publish.
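
A minimal sketch of the guard described above (hypothetical function and types):

```rust
fn import_custody_columns(new_columns: Vec<u64>) {
    if new_columns.is_empty() {
        // Passing an empty Vec to the availability checker triggers
        // Unexpected("empty columns"), so return early instead and skip the
        // publish task as well.
        return;
    }
    // ... put columns into the availability cache and spawn the publish task ...
}
```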
2025-05-29 02:54:34 +00:00