Commit Graph

3439 Commits

Author SHA1 Message Date
Eitan Seri-Levi
6786b9d12a Single attestation "Full" implementation (#7444)
#6970


  This allows us to receive `SingleAttestation` over gossip and process it without converting. There is still a conversion to `Attestation` as a final step in the attestation verification process, but by then the `SingleAttestation` is fully verified.

I've also fully removed the `submitPoolAttestationsV1` endpoint as it's been deprecated.

I've also pre-emptively deprecated supporting `Attestation` in `submitPoolAttestationsV2` endpoint. See here for more info: https://github.com/ethereum/beacon-APIs/pull/531

I tried to minimize the diff here by only making the "required" changes. There are some unnecessary complexities in the way we manage the different attestation verification wrapper types. We could probably consolidate these into one wrapper type and refactor further; we could leave that to a separate PR if we feel like cleaning things up in the future.

Note that I've also updated the test harness to always submit `SingleAttestation` regardless of fork variant. I don't see a problem in that approach and it allows us to delete more code :)
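The conversion step described above can be sketched as follows. This is a minimal illustration with hypothetical simplified structs (the real Lighthouse types carry signatures, attestation data, and fork-aware bitfields): a `SingleAttestation` names its attester directly, and only after verification is it expanded into an `Attestation` with committee aggregation bits.

```rust
// Hypothetical simplified types; the real ones live in Lighthouse's `types` crate.
#[derive(Debug, PartialEq)]
struct SingleAttestation {
    committee_index: u64,
    attester_index: u64,
    slot: u64,
}

#[derive(Debug, PartialEq)]
struct Attestation {
    aggregation_bits: Vec<bool>, // one bit per committee member
    committee_index: u64,
    slot: u64,
}

/// Convert a fully verified `SingleAttestation` into an `Attestation`,
/// given the committee it belongs to. Returns `None` if the attester
/// is not a member of the committee.
fn into_attestation(single: &SingleAttestation, committee: &[u64]) -> Option<Attestation> {
    let position = committee.iter().position(|v| *v == single.attester_index)?;
    let mut bits = vec![false; committee.len()];
    bits[position] = true;
    Some(Attestation {
        aggregation_bits: bits,
        committee_index: single.committee_index,
        slot: single.slot,
    })
}

fn main() {
    let single = SingleAttestation { committee_index: 3, attester_index: 42, slot: 100 };
    let committee = [7, 42, 99];
    let att = into_attestation(&single, &committee).expect("attester in committee");
    assert_eq!(att.aggregation_bits, vec![false, true, false]);
}
```

Deferring this conversion to the end of verification is what lets the rest of the pipeline stay conversion-free.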
2025-06-17 09:01:26 +00:00
Jimmy Chen
3d2d65bf8d Advertise --advertise-false-custody-group-count for testing PeerDAS (#7593)
#6973
2025-06-16 11:10:28 +00:00
Jimmy Chen
6135f417a2 Add data columns sidecars debug beacon API (#7591)
Beacon API spec PR: https://github.com/ethereum/beacon-APIs/pull/537
2025-06-15 14:20:16 +00:00
Akihito Nakano
dc5f5af3eb Fix flaky test_rpc_block_reprocessing (#7595)
The test occasionally fails, likely because the 10ms fixed delay after block processing isn't sufficient when the system is under load.

https://github.com/sigp/lighthouse/pull/7522#issuecomment-2914595667


  Replace single assertion with retry loop.
2025-06-14 00:54:19 +00:00
Daniel Knopik
ccd99c138c Wait before column reconstruction (#7588) 2025-06-13 18:19:06 +00:00
Jimmy Chen
a65f78222d Drop stale registrations without reducing CGC (#7594)
Currently the validator effective balance used for computing the PeerDAS custody group count is only updated when the validator subscribes to the BN via `validator/beacon_committee_subscriptions`.

If a validator stops registering with the node, its effective balance becomes outdated and stays in BN memory until the next restart. Stale registrations are not needed for CGC computation (per the spec the CGC never decreases), so they can be dropped.
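The "drop without reducing" rule can be sketched as a monotonic floor on the advertised CGC. This is a hypothetical simplified model (names and the one-group-per-32-ETH rule are illustrative, not Lighthouse's actual computation):

```rust
use std::collections::HashMap;

/// Hypothetical sketch: per-validator effective balances feed the custody
/// group count, but dropping a stale registration must never shrink the
/// advertised CGC, so we keep a monotonic floor.
struct CustodyTracker {
    balances: HashMap<u64, u64>, // validator index -> effective balance (ETH)
    advertised_cgc: u64,         // never decreases
}

impl CustodyTracker {
    fn new() -> Self {
        Self { balances: HashMap::new(), advertised_cgc: 0 }
    }

    fn register(&mut self, validator: u64, balance: u64) {
        self.balances.insert(validator, balance);
        self.advertised_cgc = self.advertised_cgc.max(self.compute_cgc());
    }

    /// Stale registrations can be dropped freely: the CGC floor stays put.
    fn drop_stale(&mut self, validator: u64) {
        self.balances.remove(&validator);
    }

    /// Illustrative rule only: one custody group per 32 ETH of attached balance.
    fn compute_cgc(&self) -> u64 {
        self.balances.values().sum::<u64>() / 32
    }

    fn cgc(&self) -> u64 {
        self.advertised_cgc.max(self.compute_cgc())
    }
}

fn main() {
    let mut tracker = CustodyTracker::new();
    tracker.register(1, 32);
    tracker.register(2, 32);
    assert_eq!(tracker.cgc(), 2);
    tracker.drop_stale(1); // balance gone, but the advertised CGC holds
    assert_eq!(tracker.cgc(), 2);
}
```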
2025-06-13 14:30:43 +00:00
Daniel Knopik
5472cb8500 Batch verify KZG proofs for getBlobsV2 (#7582) 2025-06-12 14:35:14 +00:00
Pawan Dhananjay
9803d69d80 Implement status v2 version (#7590)
N/A


  Implements status v2 as defined in https://github.com/ethereum/consensus-specs/pull/4374/
2025-06-12 07:17:06 +00:00
Pawan Dhananjay
5f208bb858 Implement basic validator custody framework (no backfill) (#7578)
Resolves #6767


  This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info on the number of validators attached to a node and the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of validators attached, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Any time there's a change in the `custody_group_count` due to addition/removal of validators, the custody context sends an event on a broadcast channel. The only subscriber for the channel exists in the network service, which simply subscribes to more subnets. There can be additional subscribers in sync that will start a backfill once the cgc changes.

TODO

- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS ` passes after updating the current cgc. This event should be picked up by a subscriber which updates the enr and metadata.
- [x] Add more tests
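The event flow described above (cgc change in the custody context, broadcast to a network-service subscriber) can be sketched like this. All names are hypothetical, and std's `mpsc` stands in for the broadcast channel the real implementation would use:

```rust
use std::sync::mpsc;

// Hypothetical shape: when the custody group count grows, the context
// emits an event; a network-service subscriber reacts by subscribing to
// more subnets. Decreases are ignored, matching the "cgc never reduces"
// behaviour described above.
struct CustodyContext {
    cgc: u64,
    tx: mpsc::Sender<u64>,
}

impl CustodyContext {
    fn on_validator_update(&mut self, new_cgc: u64) {
        if new_cgc > self.cgc {
            self.cgc = new_cgc;
            // Notify subscribers of the increased custody requirement.
            let _ = self.tx.send(new_cgc);
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let mut ctx = CustodyContext { cgc: 4, tx };
    ctx.on_validator_update(8); // increase -> event sent
    ctx.on_validator_update(6); // decrease -> no event
    assert_eq!(rx.try_recv().unwrap(), 8);
    assert!(rx.try_recv().is_err());
}
```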
2025-06-11 18:10:06 +00:00
Pawan Dhananjay
076a1c3fae Data column sidecar event (#7587)
N/A


  Implement events for data column sidecar https://github.com/ethereum/beacon-APIs/pull/535
2025-06-11 16:39:22 +00:00
Jimmy Chen
8c6abc0b69 Optimise parallelism in compute cells operations by zipping first (#7574)
We're seeing slow KZG performance on `fusaka-devnet-0` and looking for optimisations to improve performance.

Zipping the list first then calling `into_par_iter` shows a 10% improvement in the performance benchmark; I suspect this might be even more material when running on a beacon node.

Before:
```
blobs_to_data_column_sidecars_20
time:   [11.583 ms 12.041 ms 12.534 ms]
Found 5 outliers among 100 measurements (5.00%)
```

After:
```
blobs_to_data_column_sidecars_20
time:   [10.506 ms 10.724 ms 10.982 ms]
change: [-14.925% -10.941% -6.5452%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
```
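The "zip first, then parallelise" shape can be sketched as below. The real code uses rayon's `into_par_iter`; here std scoped threads stand in so the example stays dependency-free, and a multiplication stands in for the per-cell KZG work. The point is that zipping up front gives each worker a self-contained `(blob, proof)` pair, with no indexed lookups inside the hot loop:

```rust
use std::thread;

/// Sketch only: zip the inputs into pairs first, then split the pairs
/// across workers. `std::thread::scope` stands in for rayon.
fn process_pairs(blobs: Vec<u64>, proofs: Vec<u64>) -> Vec<u64> {
    let pairs: Vec<(u64, u64)> = blobs.into_iter().zip(proofs).collect();
    let chunk = pairs.len().div_ceil(4).max(1);
    let mut out = vec![0; pairs.len()];
    thread::scope(|s| {
        for (pairs_chunk, out_chunk) in pairs.chunks(chunk).zip(out.chunks_mut(chunk)) {
            s.spawn(move || {
                for ((b, p), o) in pairs_chunk.iter().zip(out_chunk.iter_mut()) {
                    *o = b * p; // stand-in for the per-cell KZG computation
                }
            });
        }
    });
    out
}

fn main() {
    assert_eq!(process_pairs(vec![1, 2, 3], vec![4, 5, 6]), vec![4, 10, 18]);
}
```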
2025-06-09 12:41:14 +00:00
ethDreamer
b08d49c4cb Changes for fusaka-devnet-1 (#7559)
Changes for [fusaka-devnet-1](https://notes.ethereum.org/@ethpandaops/fusaka-devnet-1)


  [Consensus Specs v1.6.0-alpha.1](https://github.com/ethereum/consensus-specs/pull/4346)
* [EIP-7917: Deterministic Proposer Lookahead](https://eips.ethereum.org/EIPS/eip-7917)
* [EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)
2025-06-09 09:10:08 +00:00
Lion - dapplion
d457ceeaaf Don't create child lookup if parent is faulty (#7118)
Issue discovered on PeerDAS devnet (node `lighthouse-geth-2.peerdas-devnet-5.ethpandaops.io`). Summary:

- A lookup is created for block root `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- That block or a parent is faulty and `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` is added to the failed chains cache
- We later receive a block that is a child of a child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- We create a lookup, which attempts to process the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` and hits a processor error `UnknownParent` at this line

bf955c7543/beacon_node/network/src/sync/block_lookups/mod.rs (L686-L688)

`search_parent_of_child` does not create a parent lookup because the parent root is in the failed chain cache. However, we have **already** marked the child as awaiting the parent. This results in an inconsistent state of lookup sync, as there's a lookup awaiting a parent that doesn't exist.

Now we have a lookup (the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`) that is awaiting a parent lookup that doesn't exist: hence stuck.

### Impact

This bug can affect Mainnet as well as PeerDAS devnets.

This bug may stall lookup sync for a few minutes (up to `LOOKUP_MAX_DURATION_STUCK_SECS = 15 min`) until the stuck prune routine deletes it. By that time the root will be cleared from the failed chain cache and sync should succeed. During that time the user will see a lot of `WARN` logs when attempting to add each peer to the inconsistent lookup. We may also sync the block through range sync if we fall behind by more than 2 epochs. We may also create the parent lookup successfully after the failed cache clears and complete the child lookup.

This bug is triggered if:
- We have a lookup that fails and its root is added to the failed chain cache (much more likely to happen in PeerDAS networks)
- We receive a block that builds on a child of the block added to the failed chain cache


  Ensure that we never create (or leave existing) a lookup that references a non-existing parent.

I added `must_use` lints to the functions that create lookups. To fix the specific bug we must recursively drop the child lookup if the parent is not created. So if `search_parent_of_child` returns `false`, we now return `LookupRequestError::Failed` instead of `LookupResult::Pending`.

As a bonus I have added more logging and reason strings to the errors.
2025-06-05 08:53:43 +00:00
Jimmy Chen
357a8ccbb9 Checkpoint sync without the blobs from Fulu (#7549)
Lighthouse currently requires checkpoint sync to be performed against a supernode in a PeerDAS network, as only supernodes can serve blobs.

This PR lifts that requirement, enabling Lighthouse to checkpoint sync from either a fullnode or a supernode (See https://github.com/sigp/lighthouse/issues/6837#issuecomment-2933094923)

Missing data columns for the checkpoint block aren't a big issue, and we should be able to implement backfill easily once we have the logic to backfill data columns.
2025-06-04 00:31:27 +00:00
ethDreamer
ae30480926 Implement EIP-7892 BPO hardforks (#7521)
[EIP-7892: Blob Parameter Only Hardforks](https://eips.ethereum.org/EIPS/eip-7892)

#7467
2025-06-02 06:54:42 +00:00
Jimmy Chen
94a1446ac9 Fix unexpected blob error and duplicate import in fetch blobs (#7541)
Getting this error on a non-PeerDAS network:

```
May 29 13:30:13.484 ERROR Error fetching or processing blobs from EL    error: BlobProcessingError(AvailabilityCheck(Unexpected("empty blobs"))), block_root: 0x98aa3927056d453614fefbc79eb1f9865666d1f119d0e8aa9e6f4d02aa9395d9
```

It appears we're passing an empty `Vec` to the DA checker because all blobs were already seen on gossip and filtered out; this causes an `AvailabilityCheckError::Unexpected("empty blobs")`.

I've added equivalent unit tests for `getBlobsV1` to cover all the scenarios we test in `getBlobsV2`. This would have caught the bug if I had added it earlier. It also caught another bug which could trigger duplicate block import.

Thanks Santito for reporting this! 🙏
2025-06-02 01:51:09 +00:00
Jimmy Chen
4d21846aba Prevent AvailabilityCheckError when there's no new custody columns to import (#7533)
Addresses a regression recently introduced when we started gossip verifying data columns from EL blobs

```
failures:
network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 90 filtered out; finished in 16.60s

stderr ───

thread 'network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import' panicked at beacon_node/network/src/network_beacon_processor/tests.rs:829:10:
should put data columns into availability cache: Unexpected("empty columns")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

https://github.com/sigp/lighthouse/actions/runs/15309278812/job/43082341868?pr=7521

If an empty `Vec` is passed to the DA checker, it causes an unexpected error.

This PR addresses it by not passing an empty `Vec` for processing, and not spawning a task to publish.
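The fix pattern here is a plain early return before calling the checker. A minimal sketch with hypothetical simplified signatures (the real DA checker takes verified column sidecars, not integers):

```rust
#[derive(Debug, PartialEq)]
enum AvailabilityError {
    Unexpected(&'static str),
}

/// Stand-in for the DA checker: it treats an empty input as a logic error.
fn put_columns(columns: &[u64]) -> Result<usize, AvailabilityError> {
    if columns.is_empty() {
        return Err(AvailabilityError::Unexpected("empty columns"));
    }
    Ok(columns.len())
}

/// The fix: the caller returns early instead of handing the checker an
/// empty `Vec` when gossip already consumed every column.
fn import_new_columns(columns: Vec<u64>) -> Result<usize, AvailabilityError> {
    if columns.is_empty() {
        return Ok(0);
    }
    put_columns(&columns)
}

fn main() {
    assert_eq!(import_new_columns(vec![]), Ok(0));
    assert_eq!(import_new_columns(vec![1, 2]), Ok(2));
    assert!(put_columns(&[]).is_err());
}
```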
2025-05-29 02:54:34 +00:00
Akihito Nakano
5cda6a6f9e Mitigate flakiness in test_delayed_rpc_response (#7522)
https://github.com/sigp/lighthouse/issues/7466


  Expanded the margin from 100ms to 500ms.
2025-05-29 01:37:04 +00:00
Mac L
0ddf9a99d6 Remove support for database migrations prior to schema version v22 (#7332)
Remove deprecated database migrations prior to v22 along with v22 migration specific code.
2025-05-28 13:47:21 +00:00
Akihito Nakano
8989ef8fb1 Enable arithmetic lint in rate-limiter (#7025)
https://github.com/sigp/lighthouse/issues/6875


  - Enabled the linter in rate-limiter and fixed errors.
- Changed the type of `Quota::max_tokens` from `u64` to `NonZeroU64` because `max_tokens` cannot be zero.
- Added a test to ensure that a large value for `tokens`, which causes an overflow, is handled properly.
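The type change above can be illustrated with a small sketch (field and method names are hypothetical, not the rate-limiter's actual API): `NonZeroU64` makes "max_tokens cannot be zero" a compile-time guarantee, and saturating/min arithmetic keeps huge inputs from overflowing.

```rust
use std::num::NonZeroU64;

/// Hypothetical simplified quota: the type itself rules out a zero bucket.
struct Quota {
    max_tokens: NonZeroU64,
}

impl Quota {
    /// Tokens charged for a request: capped at `max_tokens`, and safe even
    /// if `requested` is `u64::MAX`.
    fn charge(&self, requested: u64) -> u64 {
        requested.min(self.max_tokens.get())
    }
}

fn main() {
    let quota = Quota { max_tokens: NonZeroU64::new(128).unwrap() };
    assert_eq!(quota.charge(10), 10);
    assert_eq!(quota.charge(u64::MAX), 128); // a huge value saturates at the cap
}
```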
2025-05-27 15:43:22 +00:00
Michael Sproul
7c89b970af Handle attestation validation errors (#7382)
Partly addresses:

- https://github.com/sigp/lighthouse/issues/7379


  Handle attestation validation errors from `get_attesting_indices` to prevent an error log, downscore the peer, and reject the message.
2025-05-27 01:55:17 +00:00
Jimmy Chen
e6ef644db4 Verify getBlobsV2 response and avoid reprocessing imported data columns (#7493)
#7461 and partly #6439.

Desired behaviour after receiving `engine_getBlobs` response:

1. Gossip verify the blobs and proofs, but don't mark them as observed yet. This is because not all blobs are published immediately (due to staggered publishing). If we mark them as observed and not publish them, we could end up blocking the gossip propagation.
2. Blobs are marked as observed _either_ when:
* They are received from gossip and forwarded to the network, or
* They are published by the node.

Current behaviour:
-  We only gossip verify `engine_getBlobsV1` responses, but not `engine_getBlobsV2` responses (PeerDAS).
-  After importing EL blobs AND before they're published, if the same blobs arrive via gossip, they will get re-processed, which may result in a re-import.


  1. Perform gossip verification on data columns computed from the EL `getBlobsV2` response. We currently only do this for `getBlobsV1` to prevent importing blobs with invalid proofs into the `DataAvailabilityChecker`; this should be done on V2 responses too.
2. Add additional gossip verification to make sure we don't re-process a ~~blob~~ or data column that was imported via the EL `getBlobs` but not yet "seen" on the gossip network. If an "unobserved" gossip blob is found in the availability cache, then we know it has passed verification so we can immediately propagate the `ACCEPT` result and forward it to the network, but without re-processing it.

**UPDATE:** I've left blobs out of the second change mentioned above, as the likelihood and impact are very low and we haven't seen it enough, but under PeerDAS this issue is a regular occurrence and we do see the same block getting imported many times.
2025-05-26 19:55:58 +00:00
Jimmy Chen
f01dc556d1 Update engine_getBlobsV2 response type and add getBlobsV2 tests (#7505)
Update `engine_getBlobsV2` response type to `Option<Vec<BlobsAndProofV2>>`. See recent spec change [here](https://github.com/ethereum/execution-apis/pull/630).

Added some tests to cover basic fetch blob scenarios.
2025-05-26 04:33:34 +00:00
Akihito Nakano
a2797d4bbd Fix formatting errors from cargo-sort (#7512)
[cargo-sort is currently failing on CI](https://github.com/sigp/lighthouse/actions/runs/15198128212/job/42746931918?pr=7025), likely due to new checks introduced in version [2.0.0](https://github.com/DevinR528/cargo-sort/releases/tag/v2.0.0).


  Fixed the errors by running cargo-sort with formatting enabled.
2025-05-23 05:25:56 +00:00
ethDreamer
6af8c187e0 Publish EL Info in Metrics (#7052)
Since we now know the EL version, we should publish this to our metrics periodically.
2025-05-22 02:51:30 +00:00
Akihito Nakano
cf0f959855 Improve log readability during rpc_tests (#7180)
It is unclear from the logs during rpc_tests whether the output comes from the sender or the receiver.

```
2025-03-20T11:21:50.038868Z DEBUG rpc_tests: Sending message 2
2025-03-20T11:21:50.041129Z DEBUG rpc_tests: Sender received a response
2025-03-20T11:21:50.041242Z DEBUG rpc_tests: Chunk received
2025-03-20T11:21:51.040837Z DEBUG rpc_tests: Sending message 3
2025-03-20T11:21:51.043635Z DEBUG rpc_tests: Sender received a response
2025-03-20T11:21:51.043855Z DEBUG rpc_tests: Chunk received
2025-03-20T11:21:52.043427Z DEBUG rpc_tests: Sending message 4
2025-03-20T11:21:52.052831Z DEBUG rpc_tests: Sender received a response
2025-03-20T11:21:52.052953Z DEBUG rpc_tests: Chunk received
2025-03-20T11:21:53.045589Z DEBUG rpc_tests: Sending message 5
2025-03-20T11:21:53.052718Z DEBUG rpc_tests: Sender received a response
2025-03-20T11:21:53.052825Z DEBUG rpc_tests: Chunk received
2025-03-20T11:21:54.049157Z DEBUG rpc_tests: Sending message 6
2025-03-20T11:21:54.058072Z DEBUG rpc_tests: Sender received a response
2025-03-20T11:21:54.058603Z DEBUG rpc_tests: Chunk received
2025-03-20T11:21:55.018822Z DEBUG Swarm::poll: libp2p_gossipsub::behaviour: Starting heartbeat
2025-03-20T11:21:55.018953Z DEBUG Swarm::poll: libp2p_gossipsub::behaviour: Completed Heartbeat
2025-03-20T11:21:55.027100Z DEBUG Swarm::poll: libp2p_gossipsub::behaviour: Starting heartbeat
2025-03-20T11:21:55.027199Z DEBUG Swarm::poll: libp2p_gossipsub::behaviour: Completed Heartbeat
```


  Added `info_span` to both the sender and receiver in each test.

```
2025-03-20T11:20:04.172699Z DEBUG Receiver: rpc_tests: Sending message 2
2025-03-20T11:20:04.179147Z DEBUG Sender: rpc_tests: Sender received a response
2025-03-20T11:20:04.179281Z DEBUG Sender: rpc_tests: Chunk received
2025-03-20T11:20:05.175300Z DEBUG Receiver: rpc_tests: Sending message 3
2025-03-20T11:20:05.177202Z DEBUG Sender: rpc_tests: Sender received a response
2025-03-20T11:20:05.177292Z DEBUG Sender: rpc_tests: Chunk received
2025-03-20T11:20:06.176868Z DEBUG Receiver: rpc_tests: Sending message 4
2025-03-20T11:20:06.179379Z DEBUG Sender: rpc_tests: Sender received a response
2025-03-20T11:20:06.179460Z DEBUG Sender: rpc_tests: Chunk received
2025-03-20T11:20:07.179257Z DEBUG Receiver: rpc_tests: Sending message 5
2025-03-20T11:20:07.181386Z DEBUG Sender: rpc_tests: Sender received a response
2025-03-20T11:20:07.181503Z DEBUG Sender: rpc_tests: Chunk received
2025-03-20T11:20:08.181428Z DEBUG Receiver: rpc_tests: Sending message 6
2025-03-20T11:20:08.190231Z DEBUG Sender: rpc_tests: Sender received a response
2025-03-20T11:20:08.190358Z DEBUG Sender: rpc_tests: Chunk received
2025-03-20T11:20:09.151699Z DEBUG Sender:Swarm::poll: libp2p_gossipsub::behaviour: Starting heartbeat
2025-03-20T11:20:09.151748Z DEBUG Sender:Swarm::poll: libp2p_gossipsub::behaviour: Completed Heartbeat
2025-03-20T11:20:09.160244Z DEBUG Receiver:Swarm::poll: libp2p_gossipsub::behaviour: Starting heartbeat
2025-03-20T11:20:09.160288Z DEBUG Receiver:Swarm::poll: libp2p_gossipsub::behaviour: Completed Heartbeat
```
2025-05-22 02:51:25 +00:00
Akihito Nakano
537fc5bde8 Revive network-test logs files in CI (#7459)
https://github.com/sigp/lighthouse/issues/7187


  This PR adds a writer that implements `tracing_subscriber::fmt::MakeWriter`, which writes logs to separate files for each test.
2025-05-22 02:51:22 +00:00
Pawan Dhananjay
817f14c349 Send execution_requests in fulu (#7500)
N/A


  Sends execution requests with fulu builder bid.
2025-05-22 02:51:20 +00:00
Akihito Nakano
a8035d7395 Enable stdout logging in rpc_tests (#7506)
Currently `test_delayed_rpc_response` is flaky (possibly specific to Windows?), but I'm not sure why.


  Enabled stdout logging in rpc_tests. Note that in nextest, std output is only displayed when a test fails.
2025-05-22 02:14:48 +00:00
Michael Sproul
2e96e9769b Use slice.is_sorted now that it's stable (#7507)
Use slice.is_sorted which was stabilised in Rust 1.82.0

I thought there would be more places we could use this, but it seems we often want to check strict monotonicity (i.e. sorted + no duplicates)
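The distinction is easy to show: `slice::is_sorted` (stable since Rust 1.82) allows equal neighbours, so strict monotonicity still needs an explicit pairwise check.

```rust
/// Strict monotonicity: sorted AND no duplicates.
fn is_strictly_increasing(xs: &[u64]) -> bool {
    xs.windows(2).all(|w| w[0] < w[1])
}

fn main() {
    let with_dup = [1u64, 2, 2, 3];
    assert!(with_dup.is_sorted());               // duplicates allowed
    assert!(!is_strictly_increasing(&with_dup)); // duplicates rejected
    assert!(is_strictly_increasing(&[1, 2, 3]));
}
```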
2025-05-22 02:14:46 +00:00
chonghe
7e2df6b602 Empty list [] to return all validators balances (#7474)
The endpoint `/eth/v1/beacon/states/head/validator_balances` returns no data when the request body is `[]`. According to the beacon API spec, it should return the balances of all validators:

Reference: https://ethereum.github.io/beacon-APIs/#/Beacon/postStateValidatorBalances
`If the supplied list is empty (i.e. the body is []) or no body is supplied then balances will be returned for all validators.`


  This PR changes so that: `curl -X 'POST' 'http://localhost:5052/eth/v1/beacon/states/head/validator_balances' -d '[]' | jq` returns balances of all validators.
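The spec behaviour ("empty or absent list means all validators") boils down to a filter that special-cases the empty input. A sketch with hypothetical simplified types (real balances are keyed by validator index or pubkey):

```rust
/// Hypothetical sketch: `ids` of `None` (no body) or `Some(&[])` (body `[]`)
/// both mean "return every balance", per the beacon API spec.
fn filter_balances(balances: &[(u64, u64)], ids: Option<&[u64]>) -> Vec<(u64, u64)> {
    match ids {
        None | Some([]) => balances.to_vec(),
        Some(ids) => balances
            .iter()
            .filter(|(index, _)| ids.contains(index))
            .copied()
            .collect(),
    }
}

fn main() {
    let balances = [(0u64, 32_000_000_000u64), (1, 31_500_000_000)];
    assert_eq!(filter_balances(&balances, Some(&[])).len(), 2); // empty list -> all
    assert_eq!(filter_balances(&balances, None).len(), 2);      // no body -> all
    assert_eq!(filter_balances(&balances, Some(&[1])), vec![(1, 31_500_000_000)]);
}
```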
2025-05-20 07:18:29 +00:00
Michael Sproul
805c2dc831 Correct reward denominator in op pool (#5047)
Closes #5016


  The op pool was using the wrong denominator when calculating proposer block rewards! This was mostly inconsequential as our studies of Lighthouse's block profitability already showed that it is very close to optimal. The wrong denominator was leftover from phase0 code, and wasn't properly updated for Altair.
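For context, the Altair relationship this fix restores: the proposer receives `PROPOSER_WEIGHT / (WEIGHT_DENOMINATOR - PROPOSER_WEIGHT)` = 8/56 = 1/7 of the attester rewards, whereas phase0 divided by `PROPOSER_REWARD_QUOTIENT` = 8. The weights below are the real consensus-spec constants; the function is a simplified sketch that ignores the spec's exact numerator/denominator bookkeeping.

```rust
// Altair participation weights from the consensus spec.
const PROPOSER_WEIGHT: u64 = 8;
const WEIGHT_DENOMINATOR: u64 = 64;

/// Simplified Altair proposer reward for including attestations worth
/// `total_attester_reward` Gwei to the attesters: 1/7 of that amount.
fn proposer_reward(total_attester_reward: u64) -> u64 {
    total_attester_reward * PROPOSER_WEIGHT / (WEIGHT_DENOMINATOR - PROPOSER_WEIGHT)
}

fn main() {
    assert_eq!(proposer_reward(56), 8);   // exactly 1/7
    assert_eq!(proposer_reward(112), 16);
}
```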
2025-05-20 01:06:40 +00:00
ethDreamer
7684d1f866 ContextDeserialize and Beacon API Improvements (#7372)
* #7286
* BeaconAPI is not returning a versioned response when it should for some V1 endpoints
* these [strange functions with vX in the name that still accept `endpoint_version` arguments](https://github.com/sigp/lighthouse/blob/stable/beacon_node/http_api/src/produce_block.rs#L192)

This refactor is a prerequisite to get the fulu EF tests running.
2025-05-19 05:05:16 +00:00
Pawan Dhananjay
23ad833747 Change default EngineState to online (#7417)
Resolves https://github.com/sigp/lighthouse/issues/7414


  The health endpoint returns a 503 if the engine state is offline. The default state for the engine is `Offline`. So until the first request to the EL is made and the state is updated, the health endpoint will keep returning 503s.

This PR changes the default state to Online to avoid that. I don't think this causes any issues because in case the EL is actually offline, the first fcu will set the state to offline.

Pending testing on kurtosis.
2025-05-16 19:04:30 +00:00
Eitan Seri-Levi
268809a530 Rust clippy 1.87 lint fixes (#7471)
Fix clippy lints for `rustc` 1.87


  clippy complains about `BeaconChainError` being too large. I went on a bit of a boxing spree because of this. We may instead want to `Box` some of the `BeaconChainError` variants?
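The boxing trade-off clippy is pointing at is easy to demonstrate: a Rust enum is as large as its largest variant, so one bulky variant inflates every `Result` that carries the error type. Boxing moves the payload to the heap and shrinks the enum (the types below are illustrative, not `BeaconChainError` itself):

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum LargeError {
    Big([u8; 256]), // this variant sets the size for the whole enum
    Small(u8),
}

#[allow(dead_code)]
enum BoxedError {
    Big(Box<[u8; 256]>), // payload moved to the heap; variant is pointer-sized
    Small(u8),
}

fn main() {
    assert!(size_of::<LargeError>() > 256);
    assert!(size_of::<BoxedError>() <= 16);
    assert!(size_of::<BoxedError>() < size_of::<LargeError>());
}
```

Boxing individual large variants (rather than the whole error) keeps the common small variants cheap while still shrinking the enum, which is the alternative floated at the end of the message above.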
2025-05-16 05:03:00 +00:00
Odinson
1853d836b7 Added E::slots_per_epoch() to deneb time calculation (#7458)
Closes #7457


  Added `E::slots_per_epoch()`, which ensures epochs are converted to slots when calculating the Deneb time.
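The arithmetic being fixed: a fork is scheduled by *epoch*, so its timestamp needs an epochs-to-slots conversion before multiplying by seconds per slot. A sketch with mainnet-style constants (the function name and signature are hypothetical):

```rust
// Mainnet values; in Lighthouse these come from the spec/EthSpec, e.g.
// `E::slots_per_epoch()`.
const SLOTS_PER_EPOCH: u64 = 32;
const SECONDS_PER_SLOT: u64 = 12;

/// Timestamp at which a fork scheduled for `fork_epoch` activates.
/// The buggy version omitted SLOTS_PER_EPOCH, i.e. treated epochs as slots.
fn fork_timestamp(genesis_time: u64, fork_epoch: u64) -> u64 {
    genesis_time + fork_epoch * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
}

fn main() {
    assert_eq!(fork_timestamp(0, 1), 384);   // one epoch = 32 slots * 12 s
    assert_eq!(fork_timestamp(100, 2), 868);
}
```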
2025-05-15 07:31:31 +00:00
Michael Sproul
c2c7fb87a8 Make DAG construction more permissive (#7460)
Workaround/fix for:

- https://github.com/sigp/lighthouse/issues/7323


  - Remove the `StateSummariesNotContiguousError`. This allows us to continue with DAG construction and pruning, even in the case where the DAG is disjointed. We will treat any disjoint summaries as roots of their own tree, and prune them (as they are not descended from finalized). This should be safe, as canonical summaries should not be disjoint (if they are, then the DB is already corrupt).
2025-05-15 02:15:35 +00:00
Eitan Seri-Levi
807848bc7a Next sync committee branch bug (#7443)
#7441


  Make sure we're correctly caching light client data
2025-05-13 01:13:15 +00:00
SunnysidedJ
593390162f peerdas-devnet-7: update DataColumnSidecarsByRoot request to use DataColumnsByRootIdentifier (#7399)
Update DataColumnSidecarsByRoot request to use DataColumnsByRootIdentifier #7377


  As described in https://github.com/ethereum/consensus-specs/pull/4284
2025-05-12 00:20:55 +00:00
Lion - dapplion
a497ec601c Retry custody requests after peer metadata updates (#6975)
Closes https://github.com/sigp/lighthouse/issues/6895

We need sync to retry custody requests when a peer CGC updates. A higher CGC can result in a data column subnet peer count increasing from 0 to 1, allowing requests to happen.


  Add new sync event `SyncMessage::UpdatedPeerCgc`. It's sent by the router when a metadata response updates the known CGC
2025-05-09 08:27:17 +00:00
Jimmy Chen
4b9c16fc71 Add Electra forks to basic sim tests (#7199)
This PR adds transitions to Electra ~~and Fulu~~ fork epochs in the simulator tests.

~~It also covers blob inclusion verification and data column syncing on a full node in Fulu.~~

UPDATE: Remove fulu fork from sim tests due to https://github.com/sigp/lighthouse/pull/7199#issuecomment-2852281176
2025-05-08 08:43:44 +00:00
Jimmy Chen
0f13029c7d Don't publish data columns reconstructed from RPC columns to the gossip network (#7409)
Don't publish data columns reconstructed from RPC columns to the gossip network, as this may result in peer downscoring if we're sending columns from past slots.
2025-05-07 23:24:48 +00:00
Lion - dapplion
beb0ce68bd Make range sync peer loadbalancing PeerDAS-friendly (#6922)
- Re-opens https://github.com/sigp/lighthouse/pull/6864 targeting unstable

Range sync and backfill sync still assume that each batch request is done by a single peer. This assumption breaks with PeerDAS, where we request custody columns to N peers.

Issues with current unstable:

- Peer prioritization counts batch requests per peer. This accounting is now broken: data columns by range requests are not accounted for
- Peer selection for data columns by range ignores the set of peers on a syncing chain, instead drawing from the global pool of peers
- The implementation is very strict when we have no peers to request from. After PeerDAS this case is very common, and we want to be flexible and handle it better than just hard-failing everything.


  - [x] Upstream peer prioritization to the network context, which knows exactly how many active requests each peer has (including columns by range)
- [x] Upstream peer selection to the network context, now `block_components_by_range_request` gets a set of peers to choose from instead of a single peer. If it can't find a peer, it returns the error `RpcRequestSendError::NoPeer`
- [ ] Range sync and backfill sync handle `RpcRequestSendError::NoPeer` explicitly
- [ ] Range sync: leaves the batch in `AwaitingDownload` state and does nothing. **TODO**: we should have some mechanism to fail the chain if it's stale for too long - **EDIT**: Not done in this PR
- [x] Backfill sync: pauses the sync until another peer joins - **EDIT**: Same logic as unstable

### TODOs

- [ ] Add tests :)
- [x] Manually test backfill sync

Note: this touches the mainnet path!
2025-05-07 02:03:07 +00:00
Lion - dapplion
2aa5d5c25e Make sure to log SyncingChain ID (#7359)
While debugging a sync issue from @pawanjay176 I found I'm missing some key info: instead of logging the ID of the SyncingChain we just log "Finalized" (the sync type). This looks like a typo, or something lost in translation when refactoring.

```
Apr 17 12:12:00.707 DEBUG Syncing new finalized chain                   chain: Finalized, component: "range_sync"
```

This log should include more info about the new chain but just logs "Finalized"

```
Apr 17 12:12:00.810 DEBUG New chain added to sync                       peer_id: "16Uiu2HAmHP8QLYQJwZ4cjMUEyRgxzpkJF87qPgNecLTpUdruYbdA", sync_type: Finalized, new_chain: Finalized, component: "range_sync"
```


  - Remove the Display impl and log the ID explicitly for all logs.
- Log more details when creating a new SyncingChain
2025-05-01 19:53:29 +00:00
Jimmy Chen
93ec9df137 Compute proposer shuffling only once in gossip verification (#7304)
When we perform data column gossip verification, we sometimes see multiple proposer shuffling cache misses simultaneously; this results in multiple threads computing the same shuffling and potentially slows down gossip verification.

The proposal here is to use a `OnceCell` for each shuffling key to make sure it's only computed once. I have only implemented this in data column verification as a PoC, but it can also be applied to blob and block verification.

Related issues:
- https://github.com/sigp/lighthouse/issues/4447
- https://github.com/sigp/lighthouse/issues/7203
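The per-key once-cell idea can be sketched with std's `OnceLock` standing in for the crate-provided `OnceCell` (cache shape and names are hypothetical): many threads may race on the same shuffling key, but only one actually computes; the rest block on the cell and reuse the result.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

type Shuffling = Vec<u64>;

/// Fetch the shuffling for `epoch`, computing it at most once even under
/// concurrent callers. `computations` counts how many times the (stand-in)
/// expensive computation actually ran.
fn get_or_compute(
    cache: &Mutex<HashMap<u64, Arc<OnceLock<Shuffling>>>>,
    epoch: u64,
    computations: &Mutex<u64>,
) -> Shuffling {
    // Hold the map lock only long enough to grab (or insert) the per-key cell.
    let cell = cache.lock().unwrap().entry(epoch).or_default().clone();
    // Compute outside the map lock; concurrent callers for the same epoch
    // block here and reuse the single result instead of recomputing.
    cell.get_or_init(|| {
        *computations.lock().unwrap() += 1;
        (0..4).map(|i| epoch * 10 + i).collect() // stand-in for the shuffling
    })
    .clone()
}

fn main() {
    let cache = Mutex::new(HashMap::new());
    let count = Mutex::new(0u64);
    thread::scope(|s| {
        for _ in 0..8 {
            s.spawn(|| {
                get_or_compute(&cache, 7, &count);
            });
        }
    });
    assert_eq!(*count.lock().unwrap(), 1); // computed exactly once
    assert_eq!(get_or_compute(&cache, 7, &count), vec![70, 71, 72, 73]);
}
```

Keeping the expensive work outside the map lock is the key design point: the map lock is held only briefly, so misses on *different* keys still proceed in parallel.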
2025-05-01 01:30:42 +00:00
Eitan Seri-Levi
9779b4ba2c Optimize validate_data_columns (#7326) 2025-04-30 04:36:50 +00:00
Akihito Nakano
1324d3d3c4 Delayed RPC Send Using Tokens (#5923)
closes https://github.com/sigp/lighthouse/issues/5785


  The diagram below shows the differences in how the receiver (responder) behaves before and after this PR. The following sentences will detail the changes.

```mermaid
flowchart TD

subgraph "*** After ***"
Start2([START]) --> AA[Receive request]
AA --> COND1{Is there already an active request <br> with the same protocol?}
COND1 --> |Yes| CC[Send error response]
CC --> End2([END])
%% COND1 --> |No| COND2{Request is too large?}
%% COND2 --> |Yes| CC
COND1 --> |No| DD[Process request]
DD --> EE{Rate limit reached?}
EE --> |Yes| FF[Wait until tokens are regenerated]
FF --> EE
EE --> |No| GG[Send response]
GG --> End2
end

subgraph "*** Before ***"
Start([START]) --> A[Receive request]
A --> B{Rate limit reached <br> or <br> request is too large?}
B -->|Yes| C[Send error response]
C --> End([END])
B -->|No| E[Process request]
E --> F[Send response]
F --> End
end
```

### `Is there already an active request with the same protocol?`

This check is not performed in `Before`. This is taken from the PR in the consensus-spec, which proposes updates regarding rate limiting and response timeout.
https://github.com/ethereum/consensus-specs/pull/3767/files
> The requester MUST NOT make more than two concurrent requests with the same ID.

The PR mentions the requester side. In this PR, I introduced the `ActiveRequestsLimiter` for the `responder` side to restrict more than two requests from running simultaneously on the same protocol per peer. If the limiter disallows a request, the responder sends a rate-limited error and penalizes the requester.



### `Rate limit reached?` and `Wait until tokens are regenerated`

UPDATE: I moved the limiter logic to the behaviour side. https://github.com/sigp/lighthouse/pull/5923#issuecomment-2379535927

~~The rate limiter is shared between the behaviour and the handler.  (`Arc<Mutex<RateLimiter>>>`) The handler checks the rate limit and queues the response if the limit is reached. The behaviour handles pruning.~~

~~I considered not sharing the rate limiter between the behaviour and the handler, and performing all of these either within the behaviour or handler. However, I decided against this for the following reasons:~~

- ~~Regarding performing everything within the behaviour: The behaviour is unable to recognize the response protocol when `RPC::send_response()` is called, especially when the response is `RPCCodedResponse::Error`. Therefore, the behaviour can't rate limit responses based on the response protocol.~~
- ~~Regarding performing everything within the handler: When multiple connections are established with a peer, there could be multiple handlers interacting with that peer. Thus, we cannot enforce rate limiting per peer solely within the handler. (Any ideas? 🤔 )~~
2025-04-24 03:46:16 +00:00
chonghe
c13e069c9c Revise logging when queue is full (#7324) 2025-04-22 22:46:30 +00:00
Michael Sproul
e61e92b926 Merge remote-tracking branch 'origin/stable' into unstable 2025-04-22 18:55:06 +10:00
Michael Sproul
54f7bc5b2c Release v7.0.0 (#7288)
New v7.0.0 release for Electra on mainnet.
2025-04-22 09:21:03 +10:00