lighthouse

mirror of https://github.com/sigp/lighthouse.git synced 2026-04-19 22:08:30 +00:00

Author	SHA1	Message	Date
Pawan Dhananjay	092aaae961	Sync cleanups (#8230 ) N/A 1. In the batch retry logic, we were failing to set the batch state to `AwaitingDownload` before attempting a retry. This PR sets it to `AwaitingDownload` before the retry and sets it back to `Downloading` if the retry succeeded in sending out a request 2. Remove all peer scoring logic from retrying and rely on just deprioritizing the failed peer. I finally concede the point to @dapplion 😄 3. Changes `block_components_by_range_request` to accept `block_peers` and `column_peers`. This is to ensure that we use the full synced peerset for requesting columns in order to avoid splitting the column peers among multiple head chains. During forward sync, we want the block peers to be the peers from the syncing chain and column peers to be all synced peers from the peerdb. Also, fixes a typo and calls `attempt_send_awaiting_download_batches` from more places Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>	2025-10-20 11:50:00 +00:00
Jimmy Chen	c012f46cb9	Fix `get_header` JSON deserialization. (#8228 ) #8224 Please list or describe the changes introduced by this PR. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-10-20 07:10:40 +00:00
chonghe	2b30c96f16	Avoid attempting to serve blobs after Fulu fork (#7756 ) * #7122 Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io> Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>	2025-10-20 06:29:21 +00:00
Jimmy Chen	da93b89e90	Feature gate test CLI flags (#8231 ) Closes #6980 I think these flags may be useful in future peerdas / das testing, and would be useful to keep. Hence I've gated them behind a `testing` feature flag. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-10-20 03:14:16 +00:00
Michael Sproul	2f8587301d	More proposer shuffling cleanup (#8130 ) Addressing more review comments from: - https://github.com/sigp/lighthouse/pull/8101 I've also tweaked a few more things that I think are minor bugs. - Instrument `ensure_state_can_determine_proposers_for_epoch` - Fix `block_root` usage in `compute_proposer_duties_from_head`. This was a regression introduced in 8101 😬 . - Update the `state_advance_timer` to prime the next-epoch proposer cache post-Fulu. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-10-20 03:14:14 +00:00
Odinson	79716f6ec1	Max reconstruction delay as a function of slot time (#8067 ) Fixes #8054 Co-Authored-By: PoulavBhowmick03 <bpoulav@gmail.com>	2025-10-17 08:49:13 +00:00
Jimmy Chen	76a37a0aef	Revert incorrect fix made in #8179 (#8215 ) This PR reverts #8179. It turns out that the fix was invalid because an unknown root is always not a finalized descendant: `522bd9e9c6/consensus/proto_array/src/proto_array.rs (L976-L979)` so for any data columns with unknown parents, it will always penalise the gossip peer and disconnect it pretty quickly. On a small network, the node may lose all of its peers. The impact is pretty obvious when the peer count is small and sync speed is slow, and is therefore easily reproducible by running a fresh supernode on devnet-3. This isn't as obvious on a live testnet like holesky / sepolia, we haven't noticed this, probably due to its high peer count and sync speed - the nodes might be able to reach head quickly before losing too many peers. The previous behaviour isn't ideal but safe: triggering unknown parent lookup and penalise the bad peer if it happens to be malicious or faulty. So for now it's safer to revert the change and plan for a proper fix after the v8 release. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-10-16 23:25:30 +00:00
SunnysidedJ	d1e06dc40d	#6853 Adding store tests for data column pruning (#7228 ) #6853 Update store tests to cover data column pruning Created a helper function `check_data_column_existence` which is a copy of `check_blob_existence` but checking data columns instead. The helper function is then used to check whether data columns are also pruned when blobs are pruned if PeerDAS is enabled. Co-Authored-By: SunnysidedJ <j@testinprod.io> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-10-16 15:20:26 +00:00
Pawan Dhananjay	73e75e3e69	Ignore extra columns in da cache (#8201 ) N/A Found this issue in sepolia. Note: the custody requirement for this node is 100. ``` Oct 14 11:25:40.053 DEBUG Reconstructed columns count: 28, block_root: 0x4d7946dec0ab59f2afd46610d7c54af555cb4c2851d9eea7d83dd17cf6e96aae, slot: 8725628 Oct 14 11:25:45.568 WARN Internal availability check failure block_root: 0x4d7946dec0ab59f2afd46610d7c54af555cb4c2851d9eea7d83dd17cf6e96aae, error: Unexpected("too many columns got 128 expected 100") ``` So if any of the block components arrives late, then we reconstruct all 128 columns and try to add them to the da cache, and end up with more columns in the cache than are needed for availability. There are 2 ways I can think of fixing this: 1. pass only the required columns to the da cache after reconstruction here `60df5f4ab6/beacon_node/beacon_chain/src/data_availability_checker.rs (L647-L648)` 2. Ensure that we add only columns that we need to sample in the da cache. I think this is safer since we can add columns to the cache from multiple code paths and this fixes it at the source. ~~This PR implements (2).~~ Thought more about it, I think (1) is cleaner since we filter gossip and rpc columns also before calling `put_kzg_verified_data_columns`. Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>	2025-10-16 09:25:44 +00:00
Jimmy Chen	5886a48d96	Add `max_blobs_per_block` check to data column gossip validation (#8198 ) Addresses this spec change https://github.com/ethereum/consensus-specs/pull/4650 Add `max_blobs_per_block` to gossip data column check so we reject large columns before processing. (we currently do this check during processing) Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-10-15 01:52:35 +00:00
Eitan Seri-Levi	60df5f4ab6	Downgrade light client error logs (#8196 ) Temporary stop gap for #7002 Downgrade light client errors to debug We eventually should fix our light client objects so they can consist of data across forks. Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-10-14 03:18:50 +00:00
Jimmy Chen	1fb94ce432	Release v8.0.0-rc.1 (#8185 )	2025-10-13 20:32:43 +11:00
Pawan Dhananjay	2c328e32a6	Persist only custody columns in db (#8188 ) * Only persist custody columns * Get claude to write tests * lint * Address review comments and fix tests. * Use supernode only when building chain segments * Clean up * Rewrite tests. * Fix tests * Clippy --------- Co-authored-by: Jimmy Chen <jchen.tc@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2025-10-13 20:32:13 +11:00
Jimmy Chen	538b70495c	Reject data columns that do not descend from the finalized root instead of ignoring them (#8179 ) This issue was identified during the fusaka audit competition. The [`verify_parent_block_and_finalized_descendant`](`62d9302e0f/beacon_node/beacon_chain/src/data_column_verification.rs (L606-L627)`) in data column gossip verification currently loads the parent first before checking if the column descends from the finalized root. However, the `fork_choice.get_block(&block_parent_root)` function also makes the same check internally: `8a4f6cf0d5/consensus/fork_choice/src/fork_choice.rs (L1242-L1249)` Therefore, if the column does not descend from the finalized root, we return an `UnknownParent` error before hitting the `is_finalized_checkpoint_or_descendant` check just below. This means we `IGNORE` the gossip message instead of `REJECT`ing it, and the gossip peer is not _immediately_ penalised. This deviates from the spec. However, it is worth noting that lighthouse will currently attempt to request the parent from this peer, and if the peer is not able to serve the parent, it gets penalised with a `LowToleranceError`, and will get banned after ~5 occurrences. `ffa7b2b2b9/beacon_node/network/src/sync/network_context.rs (L1530-L1532)` This PR will penalise the bad peer immediately instead of performing block lookups before penalising it. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-10-09 07:32:43 +00:00
chonghe	3110ca325b	Implement `/eth/v1/beacon/blobs` endpoint (#8103 ) * #8085 Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io> Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>	2025-10-09 05:01:30 +00:00
Michael Sproul	13dfa9200f	Block proposal optimisations (#8156 ) Closes: - https://github.com/sigp/lighthouse/issues/4412 This should reduce Lighthouse's block proposal times on Holesky and prevent us getting reorged. - [x] Allow the head state to be advanced further than 1 slot. This lets us avoid epoch processing on hot paths including block production, by having new epoch boundaries pre-computed and available in the state cache. - [x] Use the finalized state to prune the op pool. We were previously using the head state and trying to infer slashing/exit relevance based on `exit_epoch`. However some exit epochs are far in the future, despite occurring recently. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-10-08 06:09:12 +00:00
Jimmy Chen	2a433bc406	Remove deprecated CLI flags and references for v8.0.0 (#8142 ) Closes #8131 - [x] Remove deprecated flags from beacon_node/src/cli.rs: - [x] eth1-purge-cache - [x] eth1-blocks-per-log-query - [x] eth1-cache-follow-distance - [x] disable-deposit-contract-sync - [x] light-client-server - [x] Remove deprecated flags from lighthouse/src/main.rs: - [x] logfile - [x] terminal-total-difficulty-override - [x] terminal-block-hash-override - [x] terminal-block-hash-epoch-override - [x] safe-slots-to-import-optimistically - [x] Remove references to deprecated flags in config.rs files - [x] Remove warning messages for deprecated flags in main.rs - [x] Update/remove related tests in beacon_node.rs Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-10-08 01:52:41 +00:00
Pawan Dhananjay	a4ad3e492f	Fallback to getPayload v1 if v2 fails (#8163 ) N/A Post fulu, we should be calling the v2 api on the relays that doesn't return the blobs/data columns. However, we decided to start hitting the v2 api as soon as fulu is scheduled to avoid unexpected surprises at the fork. In the ACDT call, it seems like most clients are calling v2 only after the fulu fork. This PR aims to be the best of both worlds where we fallback to hitting v1 api if v2 fails. This way, we know beforehand if relays don't support it and can potentially alert them. Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>	2025-10-07 14:32:41 +00:00
Eitan Seri-Levi	4eb89604f8	Fulu ASCII art (#8151 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-10-07 14:32:35 +00:00
Jimmy Chen	ff8b514b3f	Remove unnecessary warning logs and update logging levels (#8145 ) @michaelsproul noticed this warning on a devnet-3 node ``` Oct 01 16:37:29.896 WARN Error when importing rpc custody columns error: ParentUnknown { parent_root: 0xe4cc85a2137b76eb083d7076255094a90f10caaec0afc8fd36807db742f6ff13 }, block_hash: 0x43ce63b2344990f5f4d8911b8f14e3d3b6b006edc35bbc833360e667df0edef7 ``` We're also seeing similar `WARN` logs for blobs on our live nodes. It's normal to get parent unknown in lookups and it's handled here `a134d43446/beacon_node/network/src/sync/block_lookups/mod.rs (L611-L619)` These shouldn't be a `WARN`, and we also log the same error in block lookups at `DEBUG` level here: `a134d43446/beacon_node/network/src/sync/block_lookups/mod.rs (L643-L648)` So I've removed these extra WARN logs. I've also lowered the level of an `ERROR` log when unable to serve data column root requests - it's unexpected, but is unlikely to impact the node's performance, so I think we can downgrade this. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-10-06 19:26:37 +00:00
Jimmy Chen	e5b4983d6b	Release v8.0.0 rc.0 (#8127 ) Testnet release for the upcoming Fusaka fork. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-29 02:17:30 +00:00
Michael Sproul	38fdaf791c	Fix proposer shuffling decision slot at boundary (#8128 ) Follow-up to the bug fixed in: - https://github.com/sigp/lighthouse/pull/8121 This fixes the root cause of that bug, which was introduced by me in: - https://github.com/sigp/lighthouse/pull/8101 Lion identified the issue here: - https://github.com/sigp/lighthouse/pull/8101#discussion_r2382710356 In the methods that compute the proposer shuffling decision root, ensure we don't use lookahead for the Fulu fork epoch itself. This is accomplished by checking if Fulu is enabled at `epoch - 1`, i.e. if `epoch > fulu_fork_epoch`. I haven't updated the methods that _compute_ shufflings to use these new corrected bounds (e.g. `BeaconState::compute_proposer_indices`), although we could make this change in future. The `get_beacon_proposer_indices` method already gracefully handles the Fulu boundary case by using the `proposer_lookahead` field (if initialised). Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-29 01:13:33 +00:00
Pawan Dhananjay	edcfee636c	Fix bug in fork calculation at fork boundaries (#8121 ) N/A In #8101 , when we modified the logic to get the proposer index post fulu, we seem to have missed advancing the state at the fork boundaries to get the right `Fork` for signature verification. This led to lighthouse failing all gossip verification right after transitioning to fulu that was observed on the holesky shadow fork ``` Sep 26 14:24:00.088 DEBUG Rejected gossip block error: "InvalidSignature(ProposerSignature)", graffiti: "grandine-geth-super-1", slot: 640 Sep 26 14:24:00.099 WARN Could not verify block for gossip. Rejecting the block error: InvalidSignature(ProposerSignature) ``` I'm not completely sure this is the correct fix, but this fixes the issue with `InvalidProposerSignature` on the holesky shadow fork. Thanks to @eserilev for helping debug this Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>	2025-09-28 04:03:25 +00:00
Michael Sproul	c754234b2c	Fix bugs in proposer calculation post-Fulu (#8101 ) As identified by a researcher during the Fusaka security competition, we were computing the proposer index incorrectly in some places by computing without lookahead. - [x] Add "low level" checks to computation functions in `consensus/types` to ensure they error cleanly - [x] Re-work the determination of proposer shuffling decision roots, which are now fork aware. - [x] Re-work and simplify the beacon proposer cache to be fork-aware. - [x] Optimise `with_proposer_cache` to use `OnceCell`. - [x] All tests passing. - [x] Resolve all remaining `FIXME(sproul)`s. - [x] Unit tests for `ProtoBlock::proposer_shuffling_root_for_child_block`. - [x] End-to-end regression test. - [x] Test on pre-Fulu network. - [x] Test on post-Fulu network. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-26 14:44:50 +00:00
Lion - dapplion	ffa7b2b2b9	Only mark block lookups as pending if block is importing from gossip (#8112 ) - PR https://github.com/sigp/lighthouse/pull/8045 introduced a regression of how lookup sync interacts with the da_checker. Now in unstable block import from the HTTP API also insert the block in the da_checker while the block is being execution verified. If lookup sync finds the block in the da_checker in `NotValidated` state it expects a `GossipBlockProcessResult` message sometime later. That message is only sent after block import in gossip. I confirmed in our node's logs for 4/4 cases of stuck lookups are caused by this sequence of events: - Receive block through API, insert into da_checker in fn process_block in put_pre_execution_block - Create lookup and leave in AwaitingDownload(block in processing cache) state - Block from HTTP API finishes importing - Lookup is left stuck Closes https://github.com/sigp/lighthouse/issues/8104 - https://github.com/sigp/lighthouse/pull/8110 was my initial solution attempt but we can't send the `GossipBlockProcessResult` event from the `http_api` crate without adding new channels, which seems messy. For a given node it's rare that a lookup is created at the same time that a block is being published. This PR solves https://github.com/sigp/lighthouse/issues/8104 by allowing lookup sync to import the block twice in that case. Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>	2025-09-25 03:52:27 +00:00
Jimmy Chen	79b33214ea	Only send data column subnet discovery requests after peerdas is scheduled (#8109 ) #8105 (to be confirmed) I noticed a large number of failed discovery requests after deploying latest `unstable` to some of our testnet and mainnet nodes. This is because of a recent PeerDAS change to attempt to maintain sufficient peers across data column subnets - this shouldn't be enabled on networks without PeerDAS scheduled, otherwise it will keep retrying discovery on these subnets and never succeed. Also removed some unused files. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-25 02:52:07 +00:00
Eitan Seri-Levi	af274029e8	Run reconstruction inside a scoped rayon pool (#8075 ) Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-24 06:37:34 +00:00
Michael Sproul	1dbc4f861b	Refine HTTP status logs (#8098 ) Ensure that we don't log a warning for HTTP 202s, which are expected on the blinded block endpoints after Fulu. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-22 05:03:47 +00:00
Jimmy Chen	4efe47b3c3	Rename `--subscribe-all-data-column-subnets` to `--supernode` and make it visible in help (#8083 ) Rename `--subscribe-all-data-column-subnets` to `--supernode` as it's now been officially accepted in the spec. Also make it visible in help in preparation for the fusaka release. https://github.com/ethereum/consensus-specs/blob/dev/specs/fulu/p2p-interface.md#supernodes Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-19 07:01:16 +00:00
Jimmy Chen	78d330e4b7	Consolidate `reqresp_pre_import_cache` into `data_availability_checker` (#8045 ) This PR consolidates the `reqresp_pre_import_cache` into the `data_availability_checker` for the following reasons: - the `reqresp_pre_import_cache` suffers from the same TOCTOU bug we had with `data_availability_checker` earlier, and leads to unbounded memory leak, which we have observed over the last 6 months on some nodes. - the `reqresp_pre_import_cache` is no longer necessary, because we now hold blocks in the `data_availability_checker` for longer since (#7961), and recent blocks can be served from the DA checker. This PR also maintains the following functionalities - Serving pre-executed blocks over RPC, and they're now served from the `data_availability_checker` instead. - Using the cache for de-duplicating lookup requests. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-19 07:01:13 +00:00
Jimmy Chen	4111bcb39b	Use scoped rayon pool for backfill chain segment processing (#7924 ) Part of #7866 - Continuation of #7921 In the above PR, we enabled rayon for batch KZG verification in chain segment processing. However, using the global rayon thread pool for backfill is likely to create resource contention with higher-priority beacon processor work. This PR introduces a dedicated low-priority rayon thread pool `LOW_PRIORITY_RAYON_POOL` and uses it for processing backfill chain segments. This prevents backfill KZG verification from using the global rayon thread pool and competing with high-priority beacon processor tasks for CPU resources. However, this PR by itself doesn't prevent CPU oversubscription because other tasks could still fill up the global rayon thread pool, and having an extra thread pool could make things worse. To address this we need the beacon processor to coordinate total CPU allocation across all tasks, which is covered in: - #7789 Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-18 07:10:23 +00:00
Michael Sproul	51321daabb	Make the block cache optional (#8066 ) Address contention on the store's `block_cache` by allowing it to be disabled when `--block-cache-size 0` is provided, and also making this the default. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-18 07:10:18 +00:00
Michael Sproul	3543a20192	Add experimental complete-blob-backfill flag (#7751 ) A different (and complementary) approach for: - https://github.com/sigp/lighthouse/issues/5391 This PR adds a flag to set the DA boundary to the Deneb fork. The effect of this change is that Lighthouse will try to backfill _all_ blobs. Most peers do not have this data, but I'm thinking that combined with `trusted-peers` this could be quite effective. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-18 05:17:03 +00:00
Michael Sproul	684632df73	Fix reprocess queue memory leak (#8065 ) Fix a memory leak in the reprocess queue. If the vec of attestation IDs for a block is never evicted from the reprocess queue by a `BlockImported` event, then it stays in the map forever consuming memory. The fix is to remove the entry when its last attestation times out. We do similarly for light client updates. In practice this will only occur if there is a race between adding an attestation to the queue and processing the `BlockImported` event, or if there are attestations for block roots that we never import (e.g. random block roots, block roots of invalid blocks). Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-18 05:16:59 +00:00
Eitan Seri-Levi	521be2b757	Prevent silently dropping cell proof chunks (#8023 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-18 01:33:42 +00:00
Toki	5928407ce4	fix(rate_limiter): add missing prune calls for light client protocols (#8058 ) Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io> Co-Authored-By: gitToki <tokipro@proton.me>	2025-09-17 04:51:43 +00:00
Lion - dapplion	b7d78a91e0	Don't penalize peers for extending ignored chains (#8042 ) Lookup sync has a cache of block roots "failed_chains". If a peer triggers a lookup for a block or descendant of a root in failed_chains the lookup is dropped and the peer penalized. However blocks are inserted into failed_chains for a single reason: - If a chain is longer than 32 blocks the lookup is dropped to prevent OOM risks. However the peer is not at fault, since discovering an unknown chain longer than 32 blocks is not malicious. We just drop the lookup to sync the blocks from range forward sync. This discrepancy is probably an oversight when changing old code. Before we used to add blocks that failed too many times to process to that cache. However, we don't do that anymore. Adding a block that fails too many times to process is an optimization to save resources in rare cases where peers keep sending us invalid blocks. In case that happens, today we keep trying to process the block, downscoring the peers and eventually disconnecting them. _IF_ we found that optimization to be necessary we should merge this PR (_Stricter match of BlockError in lookup sync_) first. IMO we are fine without the failed_chains cache and the ignored_chains cache will be obsolete with [tree sync](https://github.com/sigp/lighthouse/issues/7678) as the OOM risk of long lookup chains does not exist anymore. Closes https://github.com/sigp/lighthouse/issues/7577 Rename `failed_chains` for `ignored_chains` and don't penalize peers that trigger lookups for those blocks Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>	2025-09-17 01:02:29 +00:00
Jimmy Chen	3de646c8b3	Enable reconstruction for nodes custodying more than 50% of columns and instrument tracing (#8052 ) Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-16 08:17:43 +00:00
Eitan Seri-Levi	242bdfcf12	Add instrumentation to `recompute_head_at_slot` (#8049 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:31 +00:00
Eitan Seri-Levi	aba3627099	Reduce reconstruction queue capacity (#8053 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:28 +00:00
Eitan Seri-Levi	4409500f63	Remove column reconstruction when processing rpc requests (#8051 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:25 +00:00
Michael Sproul	f04d5ecddd	Another check to prevent duplicate block imports (#8050 ) Attempt to address performance issues caused by importing the same block multiple times. - Check fork choice "after" obtaining the fork choice write lock in `BeaconChain::import_block`. We actually use an upgradable read lock, but this is semantically equivalent (the upgradable read has the advantage of not excluding regular reads). The hope is that this change has several benefits: 1. By preventing duplicate block imports we save time repeating work inside `import_block` that is unnecessary, e.g. writing the state to disk. Although the store itself now takes some measures to avoid re-writing diffs, it is even better if we avoid a disk write entirely. 2. By returning `DuplicateFullyImported`, we reduce some duplicated work downstream. E.g. if multiple threads importing columns trigger `import_block`, now only _one_ of them will get a notification of the block import completing successfully, and only this one will run `recompute_head`. This should help avoid a situation where multiple beacon processor workers are consumed by threads blocking on the `recompute_head_lock`. However, a similar block-fest is still possible with the upgradable fork choice lock (a large number of threads can be blocked waiting for the first thread to complete block import). Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-16 04:10:42 +00:00
Jimmy Chen	b8178515cd	Update engine methods in notifier (#8038 ) Fulu uses `getPayloadV5`, this PR updates the notifier logging prior to the fork. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-14 23:41:12 +00:00
Eitan Seri-Levi	aef8291f94	Add max delay to reconstruction (#7976 ) #7697 If we're three seconds into the current slot just trigger reconstruction. I don't know what the correct reconstruction deadline number is, but it should probably be at least half a second before the attestation deadline Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-12 06:05:42 +00:00
Jimmy Chen	fb77ce9e19	Add missing event in `PendingComponent` span and clean up sync logs (#8033 ) I was looking into some long `PendingComponents` spans and noticed the block event wasn't added to the span, so it wasn't possible to see when the block was added from the trace view. This PR fixes this. Additionally, I've noticed a lot of noise and confusion in sync logs due to the initial `peer_id` being included as part of the syncing chain span, causing all logs under the span to have that `peer_id`, which may not be accurate for some sync logs. I've removed `peer_id` from the `SyncingChain` span, and also cleaned up a bunch of spans to use `%` (display) for slots and epochs to make logs easier to read. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-12 05:11:30 +00:00
Michael Sproul	a080bb5cee	Increase HTTP timeouts on CI (#8031 ) Since we re-enabled HTTP API tests on CI (https://github.com/sigp/lighthouse/pull/7943) there have been a few spurious failures: - https://github.com/sigp/lighthouse/actions/runs/17608432465/job/50024519938?pr=7783 That error is awkward, but running locally with a short timeout confirms it to be a timeout. Change the request timeout to 5s everywhere. We had kept it shorter to try to detect performance regressions, but I think this is better suited to being done with metrics & traces. On CI we really just want things to pass reliably without flakiness, so I think a longer timeout to handle slower test code (like mock-builder) and overworked CI boxes makes sense. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-11 00:47:39 +00:00
kevaundray	f71d69755d	chore: add comment to PendingComponents (#7979 ) Adds doc comment Co-Authored-By: Kevaundray Wedderburn <kevtheappdev@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-10 13:48:11 +00:00
Daniel Knopik	ee1b6bc81b	Create `network_utils` crate (#7761 ) Anchor currently depends on `lighthouse_network` for a few types and utilities that live within. As we use our own libp2p behaviours, we actually do not use the core logic in that crate. This makes us transitively depend on a bunch of unneeded crates (even a whole separate libp2p if the versions mismatch!) Move things we require into its own lightweight crate. Co-Authored-By: Daniel Knopik <daniel@dknopik.de>	2025-09-10 12:59:24 +00:00
Eitan Seri-Levi	caa1df6fc3	Skip column gossip verification logic during block production (#7973 ) #7950 Skip column gossip verification logic during block production as it's redundant and potentially computationally expensive. Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-10 12:29:56 +00:00
hopinheimer	38205192ca	Fix http api tests ci (#7943 ) Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Michael Sproul <micsproul@gmail.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: hopinheimer <knmanas6@gmail.com>	2025-09-10 06:46:48 +00:00
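
The reprocess queue fix described in 684632df73 (#8065) above can be illustrated with a minimal sketch. This is not Lighthouse's actual code: `ReprocessQueue`, `BlockRoot`, `AttestationId` and every method name below are hypothetical stand-ins, assuming only what the commit message states (a map from block root to a vec of attestation IDs, whose entry should be removed when its last attestation times out).

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration only; not Lighthouse's real types.
type BlockRoot = [u8; 32];
type AttestationId = u64;

/// Toy queue of attestations waiting on a block that has not been imported.
#[derive(Default)]
struct ReprocessQueue {
    awaiting_block: HashMap<BlockRoot, Vec<AttestationId>>,
}

impl ReprocessQueue {
    /// Park an attestation until its block shows up.
    fn queue_attestation(&mut self, root: BlockRoot, id: AttestationId) {
        self.awaiting_block.entry(root).or_default().push(id);
    }

    /// A `BlockImported`-style event drains every attestation for the root.
    fn block_imported(&mut self, root: &BlockRoot) -> Vec<AttestationId> {
        self.awaiting_block.remove(root).unwrap_or_default()
    }

    /// A queued attestation timed out before its block was imported.
    fn attestation_timed_out(&mut self, root: &BlockRoot, id: AttestationId) {
        let now_empty = match self.awaiting_block.get_mut(root) {
            Some(ids) => {
                ids.retain(|queued| *queued != id);
                ids.is_empty()
            }
            None => false,
        };
        // Dropping the entry once its last attestation is gone is the point of
        // the sketch: otherwise an empty vec stays in the map for every root
        // that is never imported.
        if now_empty {
            self.awaiting_block.remove(root);
        }
    }

    /// Number of block roots currently tracked.
    fn tracked_roots(&self) -> usize {
        self.awaiting_block.len()
    }
}

fn main() {
    let mut queue = ReprocessQueue::default();
    let never_imported = [0xab; 32];

    queue.queue_attestation(never_imported, 1);
    queue.queue_attestation(never_imported, 2);

    queue.attestation_timed_out(&never_imported, 1);
    assert_eq!(queue.tracked_roots(), 1); // one attestation still waiting

    queue.attestation_timed_out(&never_imported, 2);
    assert_eq!(queue.tracked_roots(), 0); // entry removed with the last timeout

    assert!(queue.block_imported(&never_imported).is_empty());
}
```

The sketch covers the case named in the commit message, attestations for block roots that are never imported: with the removal in `attestation_timed_out`, the map shrinks back to empty even though no import event ever arrives.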


3684 Commits