Commit Graph

1028 Commits

Lion - dapplion
d457ceeaaf Don't create child lookup if parent is faulty (#7118)
Issue discovered on PeerDAS devnet (node `lighthouse-geth-2.peerdas-devnet-5.ethpandaops.io`). Summary:

- A lookup is created for block root `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- That block or a parent is faulty and `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` is added to the failed chains cache
- We later receive a block that is a child of a child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- We create a lookup, which attempts to process the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` and hits a processor error `UnknownParent`, reaching this line:

bf955c7543/beacon_node/network/src/sync/block_lookups/mod.rs (L686-L688)

`search_parent_of_child` does not create a parent lookup because the parent root is in the failed chain cache. However, we have **already** marked the child as awaiting the parent. This results in an inconsistent state of lookup sync, as there's a lookup awaiting a parent that doesn't exist.

Now we have a lookup (the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`) that is awaiting a parent lookup that doesn't exist: hence stuck.

### Impact

This bug can affect Mainnet as well as PeerDAS devnets.

This bug may stall lookup sync for a few minutes (up to `LOOKUP_MAX_DURATION_STUCK_SECS = 15 min`) until the stuck prune routine deletes it. By that time the root will be cleared from the failed chain cache and sync should succeed. During that time the user will see a lot of `WARN` logs when attempting to add each peer to the inconsistent lookup. We may also sync the block through range sync if we fall behind by more than 2 epochs. We may also create the parent lookup successfully after the failed cache clears and complete the child lookup.

This bug is triggered if:
- We have a lookup that fails and its root is added to the failed chain cache (much more likely to happen in PeerDAS networks)
- We receive a block that builds on a child of the block added to the failed chain cache


  Ensure that we never create (or leave in place) a lookup that references a non-existent parent.

I added `must_use` lints to the functions that create lookups. To fix the specific bug we must recursively drop the child lookup if the parent is not created. So if `search_parent_of_child` returns `false`, we now return `LookupRequestError::Failed` instead of `LookupResult::Pending`.

As a bonus, I have added more logging and reason strings to the errors.
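Below is a minimal, self-contained sketch of the invariant this fix enforces. `search_parent_of_child`, `LookupRequestError::Failed`, and `LookupResult::Pending` come from the PR; everything else is simplified stand-in code, not Lighthouse's real implementation.

```rust
use std::collections::HashSet;

type Root = u64; // placeholder for the real Hash256 block root

enum LookupResult {
    Pending,
}

#[derive(Debug)]
enum LookupRequestError {
    Failed(&'static str),
}

struct BlockLookups {
    failed_chains: HashSet<Root>,
}

impl BlockLookups {
    /// Returns `false` if no parent lookup was created, e.g. because the
    /// parent root sits in the failed-chains cache.
    #[must_use]
    fn search_parent_of_child(&mut self, parent_root: Root) -> bool {
        // (The real function would also create and register the lookup.)
        !self.failed_chains.contains(&parent_root)
    }

    fn on_unknown_parent(&mut self, parent_root: Root) -> Result<LookupResult, LookupRequestError> {
        if self.search_parent_of_child(parent_root) {
            Ok(LookupResult::Pending) // child legitimately awaits its parent
        } else {
            // Previously this path still returned Pending, stranding the
            // child lookup behind a parent that was never created.
            Err(LookupRequestError::Failed("parent chain previously failed"))
        }
    }
}
```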
2025-06-05 08:53:43 +00:00
ethDreamer
ae30480926 Implement EIP-7892 BPO hardforks (#7521)
[EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)

#7467
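EIP-7892 keys blob-parameter changes (e.g. max blobs per block) to activation epochs rather than to full forks. A hedged, illustrative sketch of resolving such a schedule follows (these are not Lighthouse's actual types):

```rust
// The active entry is the latest one whose epoch is <= the queried epoch.
struct BlobScheduleEntry {
    epoch: u64,
    max_blobs_per_block: u64,
}

fn max_blobs_at_epoch(schedule: &[BlobScheduleEntry], epoch: u64, pre_schedule_max: u64) -> u64 {
    schedule
        .iter()
        .filter(|entry| entry.epoch <= epoch)
        .max_by_key(|entry| entry.epoch)
        .map(|entry| entry.max_blobs_per_block)
        // No schedule entry active yet: fall back to the fork's default.
        .unwrap_or(pre_schedule_max)
}
```

For example, with entries at epochs 100 and 200, a block in epoch 150 uses the epoch-100 parameters.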
2025-06-02 06:54:42 +00:00
Michael Sproul
7c89b970af Handle attestation validation errors (#7382)
Partly addresses:

- https://github.com/sigp/lighthouse/issues/7379


  Handle attestation validation errors from `get_attesting_indices`: instead of emitting an error log, downscore the peer and reject the message.
2025-05-27 01:55:17 +00:00
Jimmy Chen
e6ef644db4 Verify getBlobsV2 response and avoid reprocessing imported data columns (#7493)
#7461 and partly #6439.

Desired behaviour after receiving `engine_getBlobs` response:

1. Gossip verify the blobs and proofs, but don't mark them as observed yet. This is because not all blobs are published immediately (due to staggered publishing): if we mark them as observed but never publish them, we could end up blocking gossip propagation.
2. Blobs are marked as observed _either_ when:
* They are received from gossip and forwarded to the network.
* They are published by the node.

Current behaviour:
-  We only gossip verify `engine_getBlobsV1` responses, but not `engine_getBlobsV2` responses (PeerDAS).
-  After importing EL blobs AND before they're published, if the same blobs arrive via gossip, they will get re-processed, which may result in a re-import.


  1. Perform gossip verification on data columns computed from the EL `getBlobsV2` response. We currently only do this for `getBlobsV1` (to prevent importing blobs with invalid proofs into the `DataAvailabilityChecker`); this should be done on V2 responses too.
2. Add additional gossip verification to make sure we don't re-process a ~~blob~~ or data column that was imported via the EL `getBlobs` but not yet "seen" on the gossip network. If an "unobserved" gossip blob is found in the availability cache, then we know it has passed verification so we can immediately propagate the `ACCEPT` result and forward it to the network, but without re-processing it.

**UPDATE:** I've left blobs out of the second change mentioned above, as the likelihood and impact are very low and we haven't seen it much in practice; but under PeerDAS this issue is a regular occurrence and we do see the same block getting imported many times.
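A hedged, self-contained sketch of the desired flow (all names are hypothetical stand-ins for Lighthouse's real types): columns computed from an `engine_getBlobsV2` response are verified and cached, but only marked "observed" once they arrive via gossip or are published by us.

```rust
use std::collections::{HashMap, HashSet};

type ColumnId = (u64, u64); // placeholder for (block_root, column_index)

enum GossipOutcome {
    Ignore,                    // duplicate, already seen on gossip
    AcceptWithoutReprocessing, // verified earlier via the EL
    FullVerification,          // normal gossip path (elided)
}

#[derive(Default)]
struct ColumnCaches {
    availability: HashMap<ColumnId, Vec<u8>>, // verified, possibly unpublished
    observed: HashSet<ColumnId>,              // seen on gossip or published by us
}

impl ColumnCaches {
    /// Handle a column computed from the EL's getBlobsV2 response.
    fn on_el_column(&mut self, id: ColumnId, data: Vec<u8>, proofs_valid: bool) {
        // Gossip-equivalent verification, but do NOT mark as observed:
        // with staggered publishing, an "observed but never published"
        // column would block gossip propagation.
        if proofs_valid {
            self.availability.insert(id, data);
        }
    }

    /// Handle the same column arriving on gossip.
    fn on_gossip_column(&mut self, id: ColumnId) -> GossipOutcome {
        if !self.observed.insert(id) {
            return GossipOutcome::Ignore;
        }
        if self.availability.contains_key(&id) {
            // Already verified and imported via the EL: ACCEPT and forward
            // to peers, but skip re-processing/re-import.
            GossipOutcome::AcceptWithoutReprocessing
        } else {
            GossipOutcome::FullVerification
        }
    }
}
```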
2025-05-26 19:55:58 +00:00
Akihito Nakano
a2797d4bbd Fix formatting errors from cargo-sort (#7512)
[cargo-sort is currently failing on CI](https://github.com/sigp/lighthouse/actions/runs/15198128212/job/42746931918?pr=7025), likely due to new checks introduced in version [2.0.0](https://github.com/DevinR528/cargo-sort/releases/tag/v2.0.0).


  Fixed the errors by running cargo-sort with formatting enabled.
2025-05-23 05:25:56 +00:00
Akihito Nakano
537fc5bde8 Revive network-test logs files in CI (#7459)
https://github.com/sigp/lighthouse/issues/7187


  This PR adds a writer that implements `tracing_subscriber::fmt::MakeWriter`, which writes logs to separate files for each test.
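A hedged sketch of the approach: `tracing_subscriber::fmt::MakeWriter` is the real trait, while the struct and path handling below are illustrative. Each test supplies its own writer, so its logs land in a dedicated file.

```rust
use std::fs::{File, OpenOptions};
use std::path::PathBuf;

use tracing_subscriber::fmt::MakeWriter;

struct PerTestWriter {
    log_path: PathBuf, // e.g. "<logs_dir>/<test_name>.log"
}

impl<'a> MakeWriter<'a> for PerTestWriter {
    type Writer = File;

    fn make_writer(&'a self) -> Self::Writer {
        // Append so repeated subscriber calls within one test keep writing
        // to the same per-test file.
        OpenOptions::new()
            .create(true)
            .append(true)
            .open(&self.log_path)
            .expect("failed to open per-test log file")
    }
}
```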
2025-05-22 02:51:22 +00:00
Eitan Seri-Levi
268809a530 Rust clippy 1.87 lint fixes (#7471)
Fix clippy lints for `rustc` 1.87


  clippy complains about `BeaconChainError` being too large. I went on a bit of a boxing spree because of this. We may instead want to `Box` some of the `BeaconChainError` variants?
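For context, the lint in play is clippy's `result_large_err`, which fires when a `Result`'s error variant is large. A hedged illustration of the boxing approach (the enum here is a stand-in, not the real `BeaconChainError`):

```rust
#[derive(Debug)]
enum BeaconChainError {
    // Imagine many variants, some embedding large payloads.
    Example([u8; 256]),
}

// Boxing keeps every Result at pointer size regardless of the enum's
// largest variant.
fn fallible() -> Result<(), Box<BeaconChainError>> {
    Err(Box::new(BeaconChainError::Example([0; 256])))
}

fn main() {
    assert!(fallible().is_err());
}
```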
2025-05-16 05:03:00 +00:00
SunnysidedJ
593390162f peerdas-devnet-7: update DataColumnSidecarsByRoot request to use DataColumnsByRootIdentifier (#7399)
Update DataColumnSidecarsByRoot request to use DataColumnsByRootIdentifier #7377


  As described in https://github.com/ethereum/consensus-specs/pull/4284
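A hedged sketch of the shape change (illustrative types, not the exact SSZ definitions): instead of one entry per `(block_root, index)` pair, each entry now groups all requested column indices under a single block root.

```rust
type Root = [u8; 32];
type ColumnIndex = u64;

// Old: the by-root request carried a flat list of one entry per column.
struct DataColumnIdentifier {
    block_root: Root,
    index: ColumnIndex,
}

// New: one identifier per block, listing every wanted column index.
struct DataColumnsByRootIdentifier {
    block_root: Root,
    columns: Vec<ColumnIndex>,
}
```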
2025-05-12 00:20:55 +00:00
Lion - dapplion
a497ec601c Retry custody requests after peer metadata updates (#6975)
Closes https://github.com/sigp/lighthouse/issues/6895

We need sync to retry custody requests when a peer CGC updates. A higher CGC can result in a data column subnet peer count increasing from 0 to 1, allowing requests to happen.


  Add a new sync event, `SyncMessage::UpdatedPeerCgc`. It's sent by the router when a metadata response updates the known CGC.
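A hedged sketch of the event flow; apart from `SyncMessage::UpdatedPeerCgc`, all names below are hypothetical stubs.

```rust
type PeerId = String; // placeholder for libp2p's PeerId

enum SyncMessage {
    // Sent by the router when a METADATA response raises a peer's
    // custody group count (CGC).
    UpdatedPeerCgc(PeerId),
}

struct SyncManager;

impl SyncManager {
    fn handle_message(&mut self, msg: SyncMessage) {
        match msg {
            SyncMessage::UpdatedPeerCgc(peer_id) => {
                // The peer may now serve more data column subnets, so custody
                // requests that previously had no usable peer are retried.
                self.retry_custody_requests_for(&peer_id);
            }
        }
    }

    fn retry_custody_requests_for(&mut self, _peer_id: &PeerId) {
        // Re-evaluate pending custody-column requests (elided).
    }
}
```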
2025-05-09 08:27:17 +00:00
Jimmy Chen
0f13029c7d Don't publish data columns reconstructed from RPC columns to the gossip network (#7409)
Don't publish data columns reconstructed from RPC columns to the gossip network, as this may result in peer downscoring if we're sending columns from past slots.
2025-05-07 23:24:48 +00:00
Lion - dapplion
beb0ce68bd Make range sync peer loadbalancing PeerDAS-friendly (#6922)
- Re-opens https://github.com/sigp/lighthouse/pull/6864 targeting unstable

Range sync and backfill sync still assume that each batch request is served by a single peer. This assumption breaks with PeerDAS, where we request custody columns from N peers.

Issues with current unstable:

- Peer prioritization counts batch requests per peer. This accounting is now broken: data columns by range requests are not counted
- Peer selection for data columns by range ignores the set of peers on a syncing chain, and instead draws from the global pool of peers
- The implementation is very strict when we have no peers to request from. With PeerDAS this case is very common, and we want to handle it more gracefully than hard-failing everything


  - [x] Upstream peer prioritization to the network context, which knows exactly how many active requests a peer has (including columns by range)
- [x] Upstream peer selection to the network context; now `block_components_by_range_request` gets a set of peers to choose from instead of a single peer. If it can't find a peer, it returns the error `RpcRequestSendError::NoPeer` (sketched below)
- [ ] Range sync and backfill sync handle `RpcRequestSendError::NoPeer` explicitly
- [ ] Range sync: leaves the batch in `AwaitingDownload` state and does nothing. **TODO**: we should have some mechanism to fail the chain if it's stale for too long - **EDIT**: Not done in this PR
- [x] Backfill sync: pauses the sync until another peer joins - **EDIT**: Same logic as unstable

### TODOs

- [ ] Add tests :)
- [x] Manually test backfill sync

Note: this touches the mainnet path!
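A hedged sketch of the upstreamed peer selection; apart from `RpcRequestSendError::NoPeer`, the names are hypothetical stand-ins.

```rust
use std::collections::{HashMap, HashSet};

type PeerId = String; // placeholder for libp2p's PeerId

#[derive(Debug)]
enum RpcRequestSendError {
    NoPeer,
}

/// Choose, from the syncing chain's peer set, the peer with the fewest
/// active requests; fail with `NoPeer` if the set is empty.
fn select_batch_peer(
    chain_peers: &HashSet<PeerId>,
    active_requests: &HashMap<PeerId, usize>,
) -> Result<PeerId, RpcRequestSendError> {
    chain_peers
        .iter()
        .min_by_key(|peer| active_requests.get(*peer).copied().unwrap_or(0))
        .cloned()
        .ok_or(RpcRequestSendError::NoPeer)
}
```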
2025-05-07 02:03:07 +00:00
Lion - dapplion
2aa5d5c25e Make sure to log SyncingChain ID (#7359)
While debugging a sync issue from @pawanjay176 I'm missing some key info: instead of logging the ID of the SyncingChain we just log "Finalized" (the sync type). This looks like a typo, or something that was lost in translation when refactoring.

```
Apr 17 12:12:00.707 DEBUG Syncing new finalized chain                   chain: Finalized, component: "range_sync"
```

This log should include more info about the new chain, but it too just logs "Finalized":

```
Apr 17 12:12:00.810 DEBUG New chain added to sync                       peer_id: "16Uiu2HAmHP8QLYQJwZ4cjMUEyRgxzpkJF87qPgNecLTpUdruYbdA", sync_type: Finalized, new_chain: Finalized, component: "range_sync"
```


  - Remove the Display impl and log the ID explicitly for all logs.
- Log more details when creating a new SyncingChain
2025-05-01 19:53:29 +00:00
Jimmy Chen
476f3a593c Add MAX_BLOBS_PER_BLOCK_FULU config (#7161)
Add `MAX_BLOBS_PER_BLOCK_FULU` config.
2025-04-15 00:20:46 +00:00
Lion - dapplion
be68dd24d0 Fix wrong custody column count for lookup blocks (#7281)
Fixes
- https://github.com/sigp/lighthouse/issues/7278


  Don't assume 0 columns for `RpcBlockInner::Block`
2025-04-11 22:00:57 +00:00
Mac L
39eb8145f8 Merge branch 'release-v7.0.0' into unstable 2025-04-11 21:32:24 +10:00
SunnysidedJ
d96b73152e Fix for #6296: Deterministic RNG in peer DAS publish block tests (#7192)
#6296: Deterministic RNG in peer DAS publish block tests


  Made test functions call the publish-block APIs with `true` for the deterministic-RNG boolean parameter, while production code passes `false`. This deterministically shuffles columns for the unit tests under `broadcast_validation_tests.rs`.
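A hedged sketch of the pattern (hypothetical function; the real signatures differ): a boolean flag selects a fixed-seed RNG so tests shuffle columns deterministically while production keeps real entropy.

```rust
use rand::rngs::StdRng;
use rand::seq::SliceRandom;
use rand::SeedableRng;

fn shuffle_columns(columns: &mut [u64], deterministic_rng: bool) {
    let mut rng: StdRng = if deterministic_rng {
        StdRng::seed_from_u64(0) // fixed seed: identical order on every test run
    } else {
        StdRng::from_entropy() // production path: real entropy
    };
    columns.shuffle(&mut rng);
}
```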
2025-04-09 15:35:15 +00:00
Jimmy Chen
759b0612b3 Offloading KZG Proof Computation from the beacon node (#7117)
Addresses #7108

- Add EL integration for `getPayloadV5` and `getBlobsV2`
- Offload proof computation and use proofs from EL RPC APIs
2025-04-08 07:37:16 +00:00
Jimmy Chen
e924264e17 Fullnodes to publish data columns from EL getBlobs (#7258)
Previously, only supernodes contributed to data column publishing in Lighthouse.

Recently we've [updated the spec](https://github.com/ethereum/consensus-specs/pull/4183) to have full nodes publish data columns as well, ensuring all nodes contribute to propagation.

This also prevents already-imported data columns from being imported again (because we don't "observe" them), and ensures columns that are observed in the [gossip seen cache](d60c24ef1c/beacon_node/beacon_chain/src/data_column_verification.rs (L492)) are forwarded to peers rather than being ignored.
2025-04-08 03:20:31 +00:00
Lion - dapplion
d511ca0494 Compute roots for unfinalized by_range requests with fork-choice (#7098)
Includes PRs

- https://github.com/sigp/lighthouse/pull/7058
- https://github.com/sigp/lighthouse/pull/7066

Cleaner for the `release-v7.0.0` branch
2025-04-07 03:16:41 +00:00
Jimmy Chen
7cc64cab83 Add missing error log and remove redundant id field from lookup logs (#6990)
Partially #6989.

This PR adds the missing error log when a batch fails due to issues with converting the response into `RpcBlock`. See the above linked issue for more details.

Adding this log reveals that we're completing range requests with missing columns, hence causing the batch to fail. It looks like we've hit the case where we've received enough stream terminations, but not all columns are returned.

```
Feb 12 06:12:16.558 DEBG Failed to convert range block components into RpcBlock, error: No column for block 0xc5b6c7fa02f5ef603d45819c08c6519f1dba661fd5d44a2fc849d3e7028b6007 index 18, id: 3456/RangeSync/116/3432, service: sync, module: network::sync::network_context:488
```

I've also removed some redundant `id` logging, as the `id` debug representation is difficult to read, and is now being logged as part of `req_id` in a more succinct format (relevant PR: #6914)
2025-04-04 09:01:42 +00:00
Mac L
0e6da0fcaf Merge branch 'release-v7.0.0' into v7-backmerge 2025-04-04 13:32:58 +11:00
Mac L
82d1674455 Rust 1.86.0 lints (#7254)
Implement lints for the new Rust compiler version 1.86.0.
2025-04-04 02:30:22 +00:00
Age Manning
d6cd049a45 RPC RequestId Cleanup (#7238)
I've been working on updating another library to the latest Lighthouse and got very confused with RPC request IDs.

There were types that had fields called `request_id` and `id`, which could interchangeably have the types `PeerRequestId`, `rpc::RequestId`, `AppRequestId`, `api_types::RequestId`, or even `Request.id`.

I couldn't keep track of which Id was linked to what and what each type meant.

So this PR mainly does a few things:
- Changes the field naming to match the actual type. So any field that holds an `AppRequestId` will be named `app_request_id` rather than `id` or `request_id`, for example.
- I simplified the types. I removed the two different `RequestId` types (one in lighthouse_network, the other in the rpc) and grouped them into one. It has one downside though: I had to add a few unreachable lines of code in the beacon processor, which the extra type would prevent, but I feel like it might be worth it. Happy to add an extra type to avoid those few lines.
- I also removed the concept of `PeerRequestId`, which sometimes went alongside a `request_id`. There were times where we had a `PeerRequest` and a `Request` being returned, both of which contain a `RequestId`, so we had redundant information. I've simplified the logic by removing `PeerRequestId` and adding a `ResponseId`. If you look at the code changes, this simplifies things a bit and removes the redundant extra info.

I think with this PR it's a little easier to reason about what is going on with all these RPC IDs.

NOTE: I did this with the help of AI, so it should probably be checked
2025-04-03 10:10:15 +00:00
Jimmy Chen
80626e58d2 Attempt to fix flaky network tests (#7244) 2025-04-03 04:01:34 +00:00
Michael Sproul
bde0f1ef0b Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-03-29 13:01:58 +11:00
Pawan Dhananjay
54aef2d043 Admin add/remove peer (#7198)
N/A


  Adds endpoints to add and remove trusted peers via the HTTP API. Added peers are trusted peers, so they won't be disconnected for bad scores. We try to maintain a connection to each such peer: if it disconnects from us, we try to dial it again every heartbeat.
2025-03-28 12:59:09 +00:00
Lion - dapplion
6f31d44343 Remove CGC from data_availability checker (#7033)
- Part of https://github.com/sigp/lighthouse/issues/6767

Validator custody makes the CGC and set of sampling columns dynamic. Right now this information is stored twice:
- in the data availability checker
- in the network globals

If that state becomes dynamic we must either keep both copies in sync when updating, or guard it behind a mutex. However, I noted that we don't really need to keep the CGC inside the data availability checker: all consumers can read it from the network globals, and we can update `make_available` to read the expected count of data columns from the block.
2025-03-26 05:19:51 +00:00
Eitan Seri-Levi
2f37bf4de5 Fix more merge conflicts between unstable and release-v7.0.0 2025-03-24 09:13:28 +11:00
Eitan Seri-Levi
cbf1c04a14 resolve merge conflicts between unstable and release-v7.0.0 2025-03-23 11:09:02 -06:00
Lion - dapplion
ca237652f1 Track request IDs in RangeBlockComponentsRequest (#6998)
Part of
- https://github.com/sigp/lighthouse/issues/6258


  `RangeBlockComponentsRequest` handles a set of by_range requests. It's quite loose about these requests, not tracking them by ID. We want to implement individual request retries, so we must make `RangeBlockComponentsRequest` aware of its request IDs: we don't want the result of a prior by_range request to affect the state of a future retry. Lookup sync already uses this mechanism.

Now `RangeBlockComponentsRequest` tracks:

```rust
pub struct RangeBlockComponentsRequest<E: EthSpec> {
    blocks_request: ByRangeRequest<BlocksByRangeRequestId, Vec<Arc<SignedBeaconBlock<E>>>>,
    block_data_request: RangeBlockDataRequest<E>,
}

enum RangeBlockDataRequest<E: EthSpec> {
    NoData,
    Blobs(ByRangeRequest<BlobsByRangeRequestId, Vec<Arc<BlobSidecar<E>>>>),
    DataColumns {
        requests: HashMap<
            DataColumnsByRangeRequestId,
            ByRangeRequest<DataColumnsByRangeRequestId, DataColumnSidecarList<E>>,
        >,
        expected_custody_columns: Vec<ColumnIndex>,
    },
}

enum ByRangeRequest<I: PartialEq + std::fmt::Display, T> {
    Active(I),
    Complete(T),
}
```

I have merged `is_finished` and `into_responses` into the same function. Otherwise, we would need to duplicate the logic that figures out whether the requests are done.
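A hedged sketch of what that merge could look like, built on the `ByRangeRequest` enum above (the real method differs): a single accessor yields the responses only once the request is complete, so the "is it done?" check isn't duplicated.

```rust
impl<I: PartialEq + std::fmt::Display, T> ByRangeRequest<I, T> {
    fn into_responses(self) -> Option<T> {
        match self {
            ByRangeRequest::Active(_) => None, // still in flight
            ByRangeRequest::Complete(responses) => Some(responses),
        }
    }
}
```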
2025-03-21 04:07:30 +00:00
Eitan Seri-Levi
27aabe8159 Pseudo finalization endpoint (#7103)
This is a backport of:
-  https://github.com/sigp/lighthouse/pull/7059
- https://github.com/sigp/lighthouse/pull/7071

For:
- https://github.com/sigp/lighthouse/issues/7039


  Introduce a new Lighthouse endpoint that allows a user to force a pseudo-finalization. This migrates data to the freezer DB and prunes sidechains, which may help reduce disk-space issues on non-finalized networks like Holesky.

We also ban peers that send us blocks that conflict with the manually finalized checkpoint.

There were some CI fixes in https://github.com/sigp/lighthouse/pull/7071 that I tried including here

Co-authored with: @jimmygchen  @pawanjay176 @michaelsproul
2025-03-18 05:21:05 +00:00
Eitan Seri-Levi
8ce9edc584 Add block ban flag --invalid-block-roots (#7042) 2025-03-17 13:18:22 +00:00
João Oliveira
c095a0a58f update gossipsub to the latest upstream revision (#7130)
I feel it's preferable to do this explicitly by updating the revision on `Cargo.toml` rather than implicitly by letting `Cargo.lock` control the revision of the branch.
2025-03-17 00:22:19 +00:00
Lion - dapplion
a6bdc474db Log range sync download errors (#6991)
Currently, range sync download error logs just say `error: rpc_error`, which isn't helpful. The actual error is suppressed unless it is logged somewhere else.


  Log the actual error that caused the batch download to fail as part of the log that states that the batch download failed.
2025-03-13 13:26:43 +00:00
ThreeHrSleep
d60c24ef1c Integrate tracing (#6339)
Tracing Integration
- [reference](5bbf1859e9/projects/project-ideas.md (L297))


  - [x] replace slog & log with tracing throughout the codebase
- [x] implement custom crit log
- [x] make relevant changes in the formatter
- [x] replace sloggers
- [x] re-write SSE logging components

cc: @macladson @eserilev
2025-03-12 22:31:05 +00:00
João Oliveira
f23f984f85 switch to upstream gossipsub (#7057)
We forked `gossipsub` into the lighthouse repo some time ago so that we could iterate quicker on implementing back pressure and IDONTWANT.
Meanwhile, we have pushed all our changes upstream, and we are now the main maintainers of `rust-libp2p`, which allows us to use upstream `gossipsub` again.
Nonetheless, we still use our forked repo to give us the freedom to experiment with features before submitting them upstream.
2025-03-11 12:59:16 +00:00
Michael Sproul
7d598ed8a5 Optimise status processing (#7082)
This is a backport from `holesky-rescue`.

Part of:

- https://github.com/sigp/lighthouse/issues/7039

Original PR to `holesky-rescue`:

- https://github.com/sigp/lighthouse/pull/7054


  Avoid doing database lookups for slots that lie in the hot database when processing status messages. This avoids a DoS vector during non-finality, as loading hot states to iterate block roots is very expensive.
2025-03-10 04:51:49 +00:00
Michael Sproul
e5e43ecd81 Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-02-24 13:59:40 +11:00
Pawan Dhananjay
b3b6aea1c5 Rust 1.85 lints (#7019)
N/A


  2 changes:
1. Replace `Option::map_or(true, ...)` with `is_none_or(...)`
2. Remove unnecessary `Into::into` calls where the type conversion is apparent from the types
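An example of the first rewrite: `Option::is_none_or` (stable since Rust 1.82) replaces `map_or(true, ...)` and reads as "the option is `None`, or its value satisfies the predicate".

```rust
fn main() {
    let slot: Option<u64> = Some(5);
    // Both sides are true when `slot` is None or when its value exceeds 3.
    assert_eq!(slot.map_or(true, |s| s > 3), slot.is_none_or(|s| s > 3));
}
```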
2025-02-24 02:36:13 +00:00
Lion - dapplion
0055af56b6 Unsubscribe blob topics at Fulu fork (#6932)
Addresses #6854.

PeerDAS requires unsubscribing a Gossip topic at a fork boundary. This is not possible with our current topic machinery.


  Instead of defining which topics have to be **added** at a given fork, we define the complete set of topics at a given fork. The new star of the show and key function is:

```rust
pub fn core_topics_to_subscribe<E: EthSpec>(
    fork_name: ForkName,
    opts: &TopicConfig,
    spec: &ChainSpec,
) -> Vec<GossipKind> {
    // ...
    if fork_name.deneb_enabled() && !fork_name.fulu_enabled() {
        // All of deneb's blob topics are core topics
        for i in 0..spec.blob_sidecar_subnet_count(fork_name) {
            topics.push(GossipKind::BlobSidecar(i));
        }
    }
    // ...
}
```

`core_topics_to_subscribe` only returns the blob topics if `fork < Fulu`. Then at the fork boundary, we subscribe with the new fork digest to `core_topics_to_subscribe(next_fork)`, which excludes the blob topics.

I added `is_fork_non_core_topic` to carry the aggregator topics for attestations and sync committee messages over to the next fork. This approach is future-proof if those topics ever become fork-dependent.

Closes https://github.com/sigp/lighthouse/issues/6854
2025-02-11 23:40:14 +00:00
Lion - dapplion
d60388134d Add PeerDAS metrics to track subnets without peers (#6928)
Currently we track a key metric, `PEERS_PER_COLUMN_SUBNET`, in the getter `good_peers_on_sampling_subnets`. Another PR, https://github.com/sigp/lighthouse/pull/6922, deletes that function, so we have to move the metric anyway. This PR moves the metric computation to the spawned metrics task, which is refreshed every 5 seconds.

I also added a few more useful metrics. The total set and intended usage is:

- `sync_peers_per_column_subnet`: Track health of overall subnets in your node
- `sync_peers_per_custody_column_subnet`: Track health of the subnets your node needs. We should track this metric closely in our dashboards with a heatmap and bar plot
- ~~`sync_column_subnets_with_zero_peers`: Is equivalent to the Grafana query `count(sync_peers_per_column_subnet == 0) by (instance)`. We may prefer to skip it, but I believe it's the most important metric as if `sync_column_subnets_with_zero_peers > 0` your node stalls.~~
- ~~`sync_custody_column_subnets_with_zero_peers`: `count(sync_peers_per_custody_column_subnet == 0) by (instance)`~~
2025-02-11 06:56:38 +00:00
Lion - dapplion
3992d6ba74 Fix misc PeerDAS todos (#6862)
Address misc PeerDAS TODOs that are not too big for a dedicated PR


  I'll justify each TODO in an inline comment
2025-02-11 06:07:13 +00:00
Lion - dapplion
d5a03c9d86 Add more range sync tests (#6872)
Currently we have very poor coverage of range sync with unit tests. With the event driven test framework we could cover much more ground and be confident when modifying the code.


  Add two basic cases:
- Happy path, complete a finalized sync for 2 epochs
- Post-PeerDAS case where we start without enough custody peers and later we find enough

⚠️  If you have ideas for more test cases, please let me know! I'll write them
2025-02-10 07:55:22 +00:00
Lion - dapplion
f35213ebe7 Sync active request byrange ids logs (#6914)
- Re-opened PR from https://github.com/sigp/lighthouse/pull/6869

While writing and running tests I noted that the sync RPC request IDs are very verbose now.

`DataColumnsByRootRequestId { id: 123, requester: Custody(CustodyId { requester: CustodyRequester(SingleLookupReqId { req_id: 121, lookup_id: 101 }) }) }`

Since this ID is logged rather often, I believe there's value in:
1. Making them more succinct, for log verbosity
2. Making them a string that's easy to copy and to search in Elastic


  Write custom `Display` implementations to render IDs in a more developer-friendly format

_DataColumnsByRootRequestId with a block lookup_

```
123/Custody/121/Lookup/101
```

_DataColumnsByRangeRequestId_

```
123/122/RangeSync/0/5492900659401505034
```

- This one will be shorter after https://github.com/sigp/lighthouse/pull/6868

Also made the log format and text consistent across all methods.
2025-02-10 01:27:05 +00:00
Michael Sproul
2bd5bbdffb Optimise and refine SingleAttestation conversion (#6934)
Closes

- https://github.com/sigp/lighthouse/issues/6805


  - Use a new `WorkEvent::GossipAttestationToConvert` to handle the conversion from `SingleAttestation` to `Attestation` _on_ the beacon processor (prevents a Tokio thread being blocked).
- Improve the error handling for single attestations. I think previously we had no ability to reprocess single attestations for unknown blocks -- we would just error. This seemed to be the case in both gossip processing and processing of `SingleAttestation`s from the HTTP API.
- Move the `SingleAttestation -> Attestation` conversion function into `beacon_chain` so that it can return the `attestation_verification::Error` type, which has well-defined error handling and peer penalties. The now-unused variants of `types::Attestation::Error` have been removed.
2025-02-07 23:18:57 +00:00
chonghe
d6596dbe21 Keep execution payload during historical backfill when prune-payloads set to false (#6766)
- #6510


  - Keep execution payload during historical backfill when `--prune-payloads false` is set
- Add a field to the historical backfill debug log to indicate whether the execution payload is kept
- Add a test to check that historical blocks have execution payloads when `--prune-payloads false` is set
- Very minor typo correction that I noticed when working on this
2025-02-07 09:19:29 +00:00
Lion - dapplion
921d95217d Remove un-used batch sync error condition (#6917)
- PR https://github.com/sigp/lighthouse/pull/6497 made some consistency checks inside the batch obsolete

I forgot to remove the consumers of those errors


  Remove the unused batch sync error condition, which was a nested `Result<_, Result<_, E>>`
2025-02-07 07:48:58 +00:00
Lion - dapplion
2193f6a4d4 Add individual by_range sync requests (#6497)
Part of
- https://github.com/sigp/lighthouse/issues/6258

To address PeerDAS sync issues we need to make individual by_range requests within a batch retriable. We should adopt the same pattern used for lookup sync, where each request (block/blobs/columns) is tracked individually within a "meta" request that groups them all and handles retry logic.


  - Building on https://github.com/sigp/lighthouse/pull/6398

The second step is to add individual request accumulators for `blocks_by_range`, `blobs_by_range`, and `data_columns_by_range`. This will allow each request to progress independently and be retried separately.

Most of the logic is just piping, excuse the large diff. This PR does not change the logic of how requests are handled or retried. This will be done in a future PR changing the logic of `RangeBlockComponentsRequest`.

### Before

- Sync manager receives block with `SyncRequestId::RangeBlockAndBlobs`
- Insert block into `SyncNetworkContext::range_block_components_requests`
- (If received stream terminators of all requests)
- Return `Vec<RpcBlock>`, and insert into `range_sync`

### Now

- Sync manager receives block with `SyncRequestId::RangeBlockAndBlobs`
- Insert block into `SyncNetworkContext::blocks_by_range_requests`
- (If received stream terminator of this request)
- Return `Vec<SignedBlock>`, and insert into `SyncNetworkContext::components_by_range_requests`
- (If received a result for all requests)
- Return `Vec<RpcBlock>`, and insert into `range_sync`
2025-02-05 07:08:28 +00:00
Lion - dapplion
95cec45c38 Use data column batch verification consistently (#6851)
Resolve a `TODO(das)` to use KZG batch verification in `put_rpc_custody_columns`


  Uses `verify_kzg_for_data_column_list_with_scoring` in all paths that handle more than one column, to get batch verification while retaining attributability of which peer sent a bad column.

This requires moving `verify_kzg_for_data_column_list_with_scoring` into the type's module so it can convert to the KZG-verified type.
2025-02-03 06:07:45 +00:00
Lion - dapplion
55d1e754b4 Subscribe to PeerDAS topics on Fulu fork (#6849)
Resolves a `TODO(das)`: now that PeerDAS is scheduled in a hard fork, we can subscribe to its topics at fork activation. In current stable we subscribe to PeerDAS topics as soon as the node starts if PeerDAS is scheduled.

This PR adds another TODO to unsubscribe from blob topics at the fork. The PR below included a solution for that, but I can include it in a separate PR:
- https://github.com/sigp/lighthouse/pull/5899/files


  Include PeerDAS topics as part of Fulu fork in `fork_core_topics`.
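A hedged, self-contained sketch of the change (illustrative names, not Lighthouse's exact types): Fulu's entry in `fork_core_topics` contributes one gossip topic per data column sidecar subnet.

```rust
#[derive(Debug)]
enum GossipKind {
    DataColumnSidecar(u64),
}

fn fulu_core_topics(data_column_sidecar_subnet_count: u64) -> Vec<GossipKind> {
    // One topic per data column subnet, subscribed at fork activation.
    (0..data_column_sidecar_subnet_count)
        .map(GossipKind::DataColumnSidecar)
        .collect()
}
```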
2025-02-03 06:07:39 +00:00