lighthouse

mirror of https://github.com/sigp/lighthouse.git synced 2026-07-01 11:54:40 +00:00

Author	SHA1	Message	Date
Daniel Knopik	560f90611e	Fix and improve handling of empty columns after getBlobs response (#9361 ) This PR fixes two issues: 1. This condition is inverted: `dfb259171a/beacon_node/network/src/network_beacon_processor/gossip_methods.rs (L1507-L1508)` We are supposed to filter out incomplete columns when we DON'T have local blobs yet! 2. When the EL returns no blobs, we never store a partial in the assembler, and this code fails to publish our need to the network, as no partials are returned: `dfb259171a/beacon_node/network/src/network_beacon_processor/mod.rs (L1038-L1050)` The simple fix for 1 would be to invert the condition, but we can improve the flow here: Instead of not publishing anything, we can publish what we got, but not request anything. This ties into the fix for 2: After get blobs completes, we not only publish anything in the partial assembler, but also for every missing custody column in there, publish an empty column and a request for all cells. In particular: - When sending a partial message to `network`, allow specifying a request bitmap instead of hardcoding an all-ones bitmap. - For clarity and to prepare for Gloas integration, add a `PubsubPartialMessage` enum with a `DataColumnFulu` variant. - On republishing after merging a gossip column: always publish, but only request cells if local blobs are known or get blobs is disabled. This also prepares us to request only some cells, e.g. in cases where we are aware of the blobs that the EL is going to send us, e.g. via `engine_hasBlobs`. - Move guards in `fetch_engine_blobs_and_publish` to ensure everything works fine if there are no blobs or if get_blobs is disabled. Co-Authored-By: Daniel Knopik <daniel@dknopik.de>	2026-06-19 00:50:24 +00:00
Lion - dapplion	a46620155b	Gloas attestation payload reprocess (#9440 ) Handle payload-present attestations before the payload is seen (gloas) A gloas beacon_attestation with index == 1 claims a past block's payload is already present. If we haven't seen that block's payload envelope yet, we shouldn't reject it the envelope may just be in flight. So instead we IGNORE it (new AttnError::UnknownPayloadEnvelope), ask sync to fetch the envelope, and park the attestation in the reprocess queue. When the envelope is imported, the parked attestations are released and re-verified. The envelope lookup itself is stubbed here and wired up in #9155 or a follow up PR Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2026-06-17 12:44:21 +00:00
Mac L	e0ff3b5709	Use `hashlink` over `lru` for `LruCache` (#8911 ) Use the `LruCache` implementation provided by `hashlink` instead of the current `lru` one. This is mostly a 1-to-1 swap with only slight API incompatibilities. I have decided to leave some config files which previously used `NonZeroUsize` but they may not be required anymore and could potentially switch to `usize`. Co-Authored-By: Mac L <mjladson@pm.me>	2026-06-16 06:54:11 +00:00
Eitan Seri-Levi	8396dc87d0	Deprecate gossip blobs (#9126 ) #9124 Deprecate unneeded pre-Fulu blob features - blob gossip - blob lookup sync - engine getBlobsV1 Also deprecates some tests and cleans up production code paths I think this is blocked until gnosis forks to fulu? Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com> Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: Daniel Knopik <daniel@dknopik.de> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>	2026-05-29 02:59:23 +00:00
Daniel Knopik	1a68631180	Gloas payload cache (#9209 ) In Gloas, beacon blocks are imported into fork choice immediately - the payload envelope and data columns arrive separately. KZG commitments moved from the column sidecar into the execution payload bid, so the existing `DataAvailabilityChecker` (which assumes block and data are coupled) can't be used for Gloas. * Introduced `PendingPayloadCache` to keep track of payload and data columns per block root. * Added gossip column verification * Added support for Gloas data column reconstruction * Payload envelope verification simplified: removed `MaybeAvailableEnvelope`, `ExecutedEnvelope`, `EnvelopeImportData` Not yet implemented (tracked with TODOs): - Proper lookup sync for Gloas columns arriving before blocks - Partial column merging for Gloas - Moving `load_gloas_payload_bid` disk reads off the async runtime - Backfill/range sync for Gloas Based on @eserilev's PR and work in progress. See also #9202 for verification. Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Daniel Knopik <daniel@dknopik.de> Co-Authored-By: Daniel Knopik <107140945+dknopik@users.noreply.github.com> Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2026-05-13 07:03:34 +00:00
Daniel Knopik	8a384ff445	Cell Dissemination (Partial messages) (#8314 ) - https://github.com/ethereum/consensus-specs/pull/4558 - https://eips.ethereum.org/EIPS/eip-8136 Co-Authored-By: Daniel Knopik <daniel@dknopik.de> Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2026-04-23 18:52:28 +00:00
Mac L	7c2dcfc0d6	Refactor `timestamp_now` (#9094 ) #9077 Where possible replaces all instances of `validator_monitor::timestamp_now` with `chain.slot_clock.now_duration().unwrap_or_default()`. Where chain/slot_clock is not available, instead replace it with a convenience function `slot_clock::timestamp_now`. Remove the `validator_monitor::timestamp_now` function. Co-Authored-By: Mac L <mjladson@pm.me>	2026-04-09 08:41:02 +00:00
Eitan Seri-Levi	99f5a92b98	Automatically pass spans into blocking handles (#8158 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2026-04-01 02:13:20 +00:00
Eitan Seri-Levi	9bec8df37a	Add Gloas data column support (#8682 ) Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2026-01-28 04:52:12 +00:00
Mac L	3903e1c67f	More `consensus/types` re-export cleanup (#8665 ) Remove more of the temporary re-exports from `consensus/types` Co-Authored-By: Mac L <mjladson@pm.me>	2026-01-16 04:43:05 +00:00
Pawan Dhananjay	c91345782a	Get blobs v2 metrics (#8641 ) N/A Add standardized metrics for getBlobsV2 from https://github.com/ethereum/beacon-metrics/pull/14. Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>	2026-01-13 07:50:40 +00:00
Mac L	f5809aff87	Bump `ssz_types` to `v0.12.2` (#8032 ) https://github.com/sigp/lighthouse/issues/8012 Replace all instances of `VariableList::from` and `FixedVector::from` to their `try_from` variants. While I tried to use proper error handling in most cases, there were certain situations where adding an `expect` for situations where `try_from` can trivially never fail avoided adding a lot of extra complexity. Co-Authored-By: Mac L <mjladson@pm.me> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-10-28 04:01:09 +00:00
chonghe	522bd9e9c6	Update Rust Edition to 2024 (#7766 ) * #7749 Thanks @dknopik and @michaelsproul for your help!	2025-08-13 03:04:31 +00:00
Jimmy Chen	40c2fd5ff4	Instrument tracing spans for block processing and import (#7816 ) #7815 - removes all existing spans, so some span fields that appear in logs like `service_name` may be lost. - instruments a few key code paths in the beacon node, starting from root spans named below: * Gossip block and blobs * `process_gossip_data_column_sidecar` * `process_gossip_blob` * `process_gossip_block` * Rpc block and blobs * `process_rpc_block` * `process_rpc_blobs` * `process_rpc_custody_columns` * Rpc blocks (range and backfill) * `process_chain_segment` * `PendingComponents` lifecycle * `pending_components` To test locally: * Run Grafana and Tempo with https://github.com/sigp/lighthouse-metrics/pull/57 * Run Lighthouse BN with `--telemetry-collector-url http://localhost:4317` Some captured traces can be found here: https://hackmd.io/@jimmygchen/r1sLOxPPeg Removing the old spans seem to have reduced the memory usage quite a lot - i think we were using them on long running tasks and too excessively: <img width="910" height="495" alt="image" src="https://github.com/user-attachments/assets/5208bbe4-53b2-4ead-bc71-0b782c788669" />	2025-08-08 05:32:22 +00:00
Jimmy Chen	8bc6693dac	Fix wrong columns getting processed on a CGC change (#7792 ) This PR fixes a bug where wrong columns could get processed immediately after a CGC increase. Scenario: - The node's CGC increased due to additional validators attached to it (lets say from 10 to 11) - The new CGC is advertised and new subnets are subscribed immediately, however the change won't be effective in the data availability check until the next epoch (See [this](`ab0e8870b4/beacon_node/beacon_chain/src/validator_custody.rs (L93-L99)`)). Data availability checker still only require 10 columns for the current epoch. - During this time, data columns for the additional custody column (lets say column 11) may arrive via gossip as we're already subscribed to the topic, and it may be incorrectly used to satisfy the existing data availability requirement (10 columns), and result in this additional column (instead of a required one) getting persisted, resulting in database inconsistency.	2025-08-07 00:45:04 +00:00
Jimmy Chen	2aae08a8aa	Remove KZG verification on blobs fetched from the EL (#7771 ) Continuation of #7713, addresses comment about skipping KZG verification on EL fetched blobs: https://github.com/sigp/lighthouse/pull/7713#discussion_r2198542501	2025-07-25 06:49:50 +00:00
Jimmy Chen	b48879a566	Remove KZG verification from local block production and blobs fetched from the EL (#7713 ) #7700 As described in title, the EL already performs KZG verification on all blobs when they entered the mempool, so it's redundant to perform extra validation on blobs returned from the EL. This PR removes - KZG verification for both blobs and data columns during block production - KZG verification for data columns after fetch engine blobs call. I have not done this for blobs because it requires extra changes to check the observed cache, and doesn't feel like it's a worthy optimisation given the number of blobs per block. This PR does not remove KZG verification on the block publishing path yet.	2025-07-22 10:48:49 +00:00
Daniel Knopik	5472cb8500	Batch verify KZG proofs for getBlobsV2 (#7582 )	2025-06-12 14:35:14 +00:00
Jimmy Chen	94a1446ac9	Fix unexpected blob error and duplicate import in fetch blobs (#7541 ) Getting this error on a non-PeerDAS network: ``` May 29 13:30:13.484 ERROR Error fetching or processing blobs from EL error: BlobProcessingError(AvailabilityCheck(Unexpected("empty blobs"))), block_root: 0x98aa3927056d453614fefbc79eb1f9865666d1f119d0e8aa9e6f4d02aa9395d9 ``` It appears we're passing an empty `Vec` to DA checker, because all blobs were already seen on gossip and filtered out, this causes a `AvailabilityCheckError::Unexpected("empty blobs")`. I've added equivalent unit tests for `getBlobsV1` to cover all the scenarios we test in `getBlobsV2`. This would have caught the bug if I had added it earlier. It also caught another bug which could trigger duplicate block import. Thanks Santito for reporting this! 🙏	2025-06-02 01:51:09 +00:00
Jimmy Chen	4d21846aba	Prevent `AvailabilityCheckError` when there's no new custody columns to import (#7533 ) Addresses a regression recently introduced when we started gossip verifying data columns from EL blobs ``` failures: network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 90 filtered out; finished in 16.60s stderr ─── thread 'network_beacon_processor::tests::accept_processed_gossip_data_columns_without_import' panicked at beacon_node/network/src/network_beacon_processor/tests.rs:829:10: should put data columns into availability cache: Unexpected("empty columns") note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` https://github.com/sigp/lighthouse/actions/runs/15309278812/job/43082341868?pr=7521 If an empty `Vec` is passed to the DA checker, it causes an unexpected error. This PR addresses it by not passing an empty `Vec` for processing, and not spawning a task to publish.	2025-05-29 02:54:34 +00:00
Jimmy Chen	e6ef644db4	Verify `getBlobsV2` response and avoid reprocessing imported data columns (#7493 ) #7461 and partly #6439. Desired behaviour after receiving `engine_getBlobs` response: 1. Gossip verify the blobs and proofs, but don't mark them as observed yet. This is because not all blobs are published immediately (due to staggered publishing). If we mark them as observed and not publish them, we could end up blocking the gossip propagation. 2. Blobs are marked as observed _either_ when: * They are received from gossip and forwarded to the network . * They are published by the node. Current behaviour: - ❗ We only gossip verify `engine_getBlobsV1` responses, but not `engine_getBlobsV2` responses (PeerDAS). - ❗ After importing EL blobs AND before they're published, if the same blobs arrive via gossip, they will get re-processed, which may result in a re-import. 1. Perform gossip verification on data columns computed from EL `getBlobsV2` response. We currently only do this for `getBlobsV1` to prevent importing blobs with invalid proofs into the `DataAvailabilityChecker`, this should be done on V2 responses too. 2. Add additional gossip verification to make sure we don't re-process a ~~blob~~ or data column that was imported via the EL `getBlobs` but not yet "seen" on the gossip network. If an "unobserved" gossip blob is found in the availability cache, then we know it has passed verification so we can immediately propagate the `ACCEPT` result and forward it to the network, but without re-processing it. UPDATE: I've left blobs out for the second change mentioned above, as the likelihood and impact is very slow and we haven't seen it enough, but under PeerDAS this issue is a regular occurrence and we do see the same block getting imported many times.	2025-05-26 19:55:58 +00:00
Jimmy Chen	f01dc556d1	Update `engine_getBlobsV2` response type and add `getBlobsV2` tests (#7505 ) Update `engine_getBlobsV2` response type to `Option<Vec<BlobsAndProofV2>>`. See recent spec change [here](https://github.com/ethereum/execution-apis/pull/630). Added some tests to cover basic fetch blob scenarios.	2025-05-26 04:33:34 +00:00

22 Commits