lighthouse

mirror of https://github.com/sigp/lighthouse.git synced 2026-04-17 04:48:21 +00:00

Author	SHA1	Message	Date
Michael Sproul	c1fb060ae1	Merge remote-tracking branch 'origin/stable' into unstable	2025-09-22 11:03:46 +10:00
Jimmy Chen	366fb0ee0d	Always upload sim test logs (#8082) This CI job failed: https://github.com/sigp/lighthouse/actions/runs/17815533375/job/50647915897 We lost the logs because they aren't uploaded when the job fails. This PR changes the step to always upload the logs, even when the job fails. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-19 12:58:46 +00:00
Jimmy Chen	4efe47b3c3	Rename `--subscribe-all-data-column-subnets` to `--supernode` and make it visible in help (#8083 ) Rename `--subscribe-all-data-column-subnets` to `--supernode` as it's now been officially accepted in the spec. Also make it visible in help in preparation for the fusaka release. https://github.com/ethereum/consensus-specs/blob/dev/specs/fulu/p2p-interface.md#supernodes Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-19 07:01:16 +00:00
Jimmy Chen	78d330e4b7	Consolidate `reqresp_pre_import_cache` into `data_availability_checker` (#8045) This PR consolidates the `reqresp_pre_import_cache` into the `data_availability_checker` for the following reasons: - The `reqresp_pre_import_cache` suffers from the same TOCTOU bug we had with `data_availability_checker` earlier, and leads to an unbounded memory leak, which we have observed over the last 6 months on some nodes. - The `reqresp_pre_import_cache` is no longer necessary, because we now hold blocks in the `data_availability_checker` for longer (since #7961), and recent blocks can be served from the DA checker. This PR also maintains the following functionality: - Serving pre-executed blocks over RPC; they're now served from the `data_availability_checker` instead. - Using the cache for de-duplicating lookup requests. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-19 07:01:13 +00:00
Jimmy Chen	4111bcb39b	Use scoped rayon pool for backfill chain segment processing (#7924 ) Part of #7866 - Continuation of #7921 In the above PR, we enabled rayon for batch KZG verification in chain segment processing. However, using the global rayon thread pool for backfill is likely to create resource contention with higher-priority beacon processor work. This PR introduces a dedicated low-priority rayon thread pool `LOW_PRIORITY_RAYON_POOL` and uses it for processing backfill chain segments. This prevents backfill KZG verification from using the global rayon thread pool and competing with high-priority beacon processor tasks for CPU resources. However, this PR by itself doesn't prevent CPU oversubscription because other tasks could still fill up the global rayon thread pool, and having an extra thread pool could make things worse. To address this we need the beacon processor to coordinate total CPU allocation across all tasks, which is covered in: - #7789 Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-18 07:10:23 +00:00
Michael Sproul	51321daabb	Make the block cache optional (#8066 ) Address contention on the store's `block_cache` by allowing it to be disabled when `--block-cache-size 0` is provided, and also making this the default. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-18 07:10:18 +00:00
Jimmy Chen	92f60b8fd2	Add release helper script to list PRs and breaking changes (#7737 ) Output for 7.1.0 release: ``` # Commit SHA PR Number Has backwards-incompat Label PR Title --- ------------ ----------- ------------------------------ -------------------------------------------- 1 `d5a03c9d86` 6872 False Add more range sync tests (#6872) 2 `ec2fe3812e` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/release-v7.0.0-beta.0' into unstable 3 `3992d6ba74` 6862 False Fix misc PeerDAS todos (#6862) 4 `d60388134d` 6928 False Add PeerDAS metrics to track subnets without peers (#6928) 5 `431dd7c398` 6917 False Remove un-used batch sync error condition (#6917) 6 `0055af56b6` 6932 False Unsubscribe blob topics at Fulu fork (#6932) 7 `6ab6eae40c` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/release-v7.0.0-beta.0' into unstable 8 `193061ff73` 6634 False Use RpcSend on RPC::self_limiter::ready_requests (#6634) 9 `e5e43ecd81` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 10 `b4be514182` 7012 False Add spamoor_blob in network_params.yaml (#7012) 11 `01df433dfd` 7021 False update codeowners, to be more specific (#7021) 12 `60964fc7b5` 6829 False Expose blst internals (#6829) 13 `3fab6a2c0b` 6866 False Block availability data enum (#6866) 14 `6e11bddd4b` 6947 False feat: adds CLI flags to delay publishing for edge case testing on PeerDAS devnets (#6947) 15 `454c7d05c4` 7017 False Remove LC server config from HTTP API (#7017) 16 `54b4150a62` 7030 False Add test flag to override `SYNC_TOLERANCE_EPOCHS` for range sync testing (#7030) 17 `cf4104abe5` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 18 `8a772520a5` 7034 False Cache validator registration only after successful publish (#7034) 19 `1235d44802` 7048 False Remove `watch` (#7048) 20 `3bc5f1f2a5` 7081 False Validator Registration ssz support (#7081) 21 `b4e79edf2a` - - [NO PR MATCH]: Merge remote-tracking branch 
'origin/release-v7.0.0' into unstable 22 `8d1abce26e` 6915 False Bump SSZ version for larger bitfield `SmallVec` (#6915) 23 `1916a2ac5a` 7020 False chore: update to rust-eth-kzg to 0.5.4 (#7020) 24 `1a08e6f0a0` 7109 False Remove duplicate sync_tolerance_epochs config (#7109) 25 `f23f984f85` 7057 False switch to upstream gossipsub (#7057) 26 `d60c24ef1c` 6339 True Integrate tracing (#6339) 27 `a6bdc474db` 6991 False Log range sync download errors (#6991) 28 `574b204bdb` 6680 False decouple `eth2` from `store` and `lighthouse_network` (#6680) 29 `c095a0a58f` 7130 False update gossipsub to the latest upstream revision (#7130) 30 `5cda1641ea` 7137 False Log `file appender` initialization errors properly (#7137) 31 `d96123b028` 7149 False Remove unnecessary `filter_layer` in logger builder (#7149) 32 `a1b1d7ae58` 7150 False Remove `discv5` logs from logfile output (#7150) 33 `ca237652f1` 6998 False Track request IDs in RangeBlockComponentsRequest (#6998) 34 `d323699fde` 7183 False Add missing `osaka-time` lcli param (#7183) 35 `cbf1c04a14` - - [NO PR MATCH]: resolve merge conflicts between untstable and release-v7.0.0 36 `2f37bf4de5` - - [NO PR MATCH]: Fix more merge conflicts between unstable and release-v7.0.0 37 `3f6c11db0e` 6995 False Some updates to Lighthouse book (#6995) 38 `9dce729cb6` 7182 False Ensure sqlite and rusqlite are optional in `consensus/types` (#7182) 39 `6f31d44343` 7033 False Remove CGC from data_availability checker (#7033) 40 `ca8eaea116` 7169 True Remove `crit` as an option from the CLI entirely (#7169) 41 `bde0f1ef0b` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 42 `fb7ec0d151` 7112 False Change `genesis-state-url-timeout` (#7112) 43 `4839ed620f` 7168 False Tracing cleanup (#7168) 44 `578db67755` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/release-v7.0.0' into backmerge-apr-2 45 `80626e58d2` 7244 False Attempt to fix flaky network tests (#7244) 46 `d6cd049a45` 7238 False RPC RequestId 
Cleanup (#7238) 47 `0e6da0fcaf` - - [NO PR MATCH]: Merge branch 'release-v7.0.0' into v7-backmerge 48 `57abffcd99` 7240 False Disable log color when running in non-interactive mode (#7240) 49 `6a75f24ab1` 7188 False Fix the `getBlobs` metric and ensure it is recorded promptly to prevent miscounts (#7188) 50 `7cc64cab83` 6990 False Add missing error log and remove redundant id field from lookup logs (#6990) 51 `591fb7df14` - - [NO PR MATCH]: Merge branch 'release-v7.0.0' into backmerge-for-openssl 52 `e77fb01a06` 7265 False Remove CLI conflict for secrets-dir and datadir (#7265) 53 `b5d40e3db0` 7256 False Align logs (#7256) 54 `70850fe58d` 6744 True Drop head tracker for summaries DAG (#6744) 55 `47a85cd118` 7269 False Bump version to v7.1.0-beta.0 (not a release) (#7269) 56 `e924264e17` 7258 False Fullnodes to publish data columns from EL `getBlobs` (#7258) 57 `759b0612b3` 7117 False Offloading KZG Proof Computation from the beacon node (#7117) 58 `d96b73152e` 7192 False Fix for #6296: Deterministic RNG in peer DAS publish block tests (#7192) 59 `39eb8145f8` - - [NO PR MATCH]: Merge branch 'release-v7.0.0' into unstable 60 `70f8ab9a6f` 7309 False Add riscv64 build support (#7309) 61 `be68dd24d0` 7281 False Fix wrong custody column count for lookup blocks (#7281) 62 `08882c64ca` 6996 False Fix execution engine integration tests with latest geth version (#6996) 63 `476f3a593c` 7161 False Add `MAX_BLOBS_PER_BLOCK_FULU` config (#7161) 64 `c32569ab83` 7225 False Restore HTTP API logging and add more metrics (#7225) 65 `410af7c5f5` 7279 False feat: update mainnet bootnodes (#7279) 66 `80fe133d2c` 7280 False Update Lighthouse Book for Electra features (#7280) 67 `9f4b0cdc28` 7343 False Fix Kurtosis doppelganger CI (#7343) 68 `e61e92b926` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/stable' into unstable 69 `5527125f5e` 7340 False Fix GitHub releases page looks bad in GitHub dark theme (#7340) 70 `c13e069c9c` 7324 False Revise logging when `queue is full` 
(#7324) 71 `1dd37048b9` 7346 False Enable cross-compiling for riscv64 architecture (#7346) 72 `402a81cdd7` 7350 False Fix Kurtosis testnet (#7350) 73 `1324d3d3c4` 5923 False Delayed RPC Send Using Tokens (#5923) 74 `6fad18644b` 6747 False feat: presign for validator account (#6747) 75 `2e2b0d2176` 7351 False Revise consolidation info in Lighthouse book (#7351) 76 `63a10eaaea` 6956 True Changing `boot_enr.yaml` to expect `bootstap_nodes.yaml` for pectra devnet (#6956) 77 `34a6c3a930` 6897 True vc: increase default gas limit (#6897) 78 `94ccd7608e` 6653 False Add documentation for VC API `/lighthouse/beacon/health` (#6653) 79 `9779b4ba2c` 7326 False Optimize `validate_data_columns` (#7326) 80 `93ec9df137` 7304 False Compute proposer shuffling only once in gossip verification (#7304) 81 `2aa5d5c25e` 7359 False Make sure to log SyncingChain ID (#7359) 82 `c8224c8d5e` 7387 False docs: fix broken link to voluntary exit guide (#7387) 83 `43c38a6fa0` 7378 False Change slog to tracing in comments (#7378) 84 `beb0ce68bd` 6922 False Make range sync peer loadbalancing PeerDAS-friendly (#6922) 85 `3d92e3663b` 6705 False Modularize validator store (#6705) 86 `058dae0641` 7405 False Add requires --http when using vc subcommands --http-port (#7405) 87 `0f13029c7d` 7409 False Don't publish data columns reconstructed from RPC columns to the gossip network (#7409) 88 `8dc3d23af0` 7400 False Add a default timeout to all `BeaconNodeHttpClient` requests (#7400) 89 `e90fcbe657` 7416 False Add ARM binary for macOS in release (#7416) 90 `4b9c16fc71` 7199 False Add Electra forks to basic sim tests (#7199) 91 `a497ec601c` 6975 False Retry custody requests after peer metadata updates (#6975) 92 `e0c1f27e13` 7394 False simulator: Persist beacon logs (#7394) 93 `92391cdac6` 7284 False update gossipsub to the latest upstream revision (#7284) 94 `593390162f` 7399 False `peerdas-devnet-7`: update `DataColumnSidecarsByRoot` request to use `DataColumnsByRootIdentifier` (#7399) 95 `5b25a48af3` 7404 
False Siren installation improvement (#7404) 96 `e051c7ca89` 7396 False Siren Pectra Feature Updates (#7396) 97 `0a917989b2` 7370 False impl test random for some types (#7370) 98 `807848bc7a` 7443 False Next sync committee branch bug (#7443) 99 `851ee2bced` 7454 False Extract get_domain for VoluntaryExit (#7454) 100 `c2c7fb87a8` 7460 False Make DAG construction more permissive (#7460) 101 `b1138c28fb` 7451 False Add additional mergify rules to automate triaging (#7451) 102 `cc6ae9d3f0` 7463 False Fix mergify infinite loop. (#7463) 103 `1853d836b7` 7458 False Added E::slots_per_epoch() to deneb time calculation (#7458) 104 `c4182e362b` 7433 False simulator: Write dependency logs to separate files (#7433) 105 `e0ee148d6a` 7470 False Prevent mergify from updating labels while CI is still running. (#7470) 106 `e21198c08b` 7472 False One more attempt to fix mergify condition. (#7472) 107 `268809a530` 7471 False Rust clippy 1.87 lint fixes (#7471) 108 `b051a5d6cc` 7469 False Delete `at-most` in `lighthouse vm create` (#7469) 109 `1d27855db7` 7369 False impl from hash256 for `ExecutionBlockHash` (#7369) 110 `23ad833747` 7417 False Change default EngineState to online (#7417) 111 `fcfcbf9a11` 7481 False Update mdlint to disable descriptive-link-text (#7481) 112 `7684d1f866` 7372 False ContextDeserialize and Beacon API Improvements (#7372) 113 `5393d33af8` 7411 False Silence `Uninitialized` warn log on start-up (#7411) 114 `1e6cdeb88a` 6799 False feat: Add docker reproducible builds (#6799) 115 `50dbfdf612` 7455 False Some updates to Lighthouse book (#7455) 116 `af87135e30` 7484 False Move MD059 rule to configuration file (#7484) 117 `805c2dc831` 5047 False Correct reward denominator in op pool (#5047) 118 `7e2df6b602` 7474 False Empty list `[]` to return all validators balances (#7474) 119 `f06d1d0346` 7495 False Fix blob download from checkpointz servers (#7495) 120 `0688932de2` 7497 False Pass blobs into `ValidatorStore::sign_block` (#7497) 121 `e29b607257` 7427 False 
Move notifier and latency service to `validator_services` (#7427) 122 `7759cb8f91` 7494 False Update mergify rule to not evaluate PRs that are not ready for review - to reduce noise and avoid updating stale PRs. (#7494) 123 `2e96e9769b` 7507 False Use slice.is_sorted now that it's stable (#7507) 124 `a8035d7395` 7506 False Enable stdout logging in rpc_tests (#7506) 125 `817f14c349` 7500 False Send execution_requests in fulu (#7500) 126 `537fc5bde8` 7459 False Revive network-test logs files in CI (#7459) 127 `cf0f959855` 7180 False Improve log readability during rpc_tests (#7180) 128 `ce8d0814ad` 7246 False Ensure logfile permissions are maintained after rotation (#7246) 129 `6af8c187e0` 7052 False Publish EL Info in Metrics (#7052) 130 `a2797d4bbd` 7512 False Fix formatting errors from cargo-sort (#7512) 131 `f01dc556d1` 7505 False Update `engine_getBlobsV2` response type and add `getBlobsV2` tests (#7505) 132 `e6ef644db4` 7493 False Verify `getBlobsV2` response and avoid reprocessing imported data columns (#7493) 133 `7c89b970af` 7382 False Handle attestation validation errors (#7382) 134 `8dde5bdb44` - - [NO PR MATCH]: Update mergify rules so that I can add `waiting-on-author` on a PR that's passing CI. Remove noisy comments. 
135 `8989ef8fb1` 7025 False Enable arithmetic lint in rate-limiter (#7025) 136 `b7fc03437b` - - [NO PR MATCH]: Fix condition 137 `9e9c51be6f` - - [NO PR MATCH]: Remove redundant `and` 138 `999b04517e` - - [NO PR MATCH]: Merge pull request #7525 from jimmygchen/mergify-again 139 `0ddf9a99d6` 7332 False Remove support for database migrations prior to schema version v22 (#7332) 140 `5cda6a6f9e` 7522 False Mitigate flakiness in test_delayed_rpc_response (#7522) 141 `4d21846aba` 7533 False Prevent `AvailabilityCheckError` when there's no new custody columns to import (#7533) 142 `39744df93f` 7393 False simulator: Fix `Failed to initialize dependency logging` (#7393) 143 `38a5f338fa` 7529 False Add `console-subscriber` feature for debugging (#7529) 144 `886ceb7e25` 6882 False Run Assertoor tests in CI (#6882) 145 `94a1446ac9` 7541 False Fix unexpected blob error and duplicate import in fetch blobs (#7541) 146 `ae30480926` 7521 False Implement EIP-7892 BPO hardforks (#7521) 147 `f67068e1ec` 7518 False Update `staking-deposit-cli` to `ethstaker-deposit-cli` (#7518) 148 `cd83d8d95d` 7544 False Add a name to the Tokio task (#7544) 149 `357a8ccbb9` 7549 False Checkpoint sync without the blobs from Fulu (#7549) 150 `2d9fc34d43` 7540 False Fulu EF tests v1.6.0-alpha.0 (#7540) 151 `dcee76c0dc` 7548 False Update key generation in validator manager (#7548) 152 `9a4972053e` 7530 False Add e2e sync tests to CI (#7530) 153 `d457ceeaaf` 7118 False Don't create child lookup if parent is faulty (#7118) 154 `2f807e21be` 7538 False Add support for nightly tests (#7538) 155 `e098f66738` 7570 False Update kurtosis config and EL images (#7570) 156 `b2e8b67e34` 7566 False Reduce number of basic sim test nodes from 7 to 4 (#7566) 157 `170cd0f587` 7579 False Store the libp2p/discv5 logs when stopping local-testnet (#7579) 158 `b08d49c4cb` 7559 False Changes for `fusaka-devnet-1` (#7559) 159 `8c6abc0b69` 7574 False Optimise parallelism in compute cells operations by zipping first (#7574) 160 
`7416d06dce` 7561 False Add genesis sync test to CI (#7561) 161 `076a1c3fae` 7587 False Data column sidecar event (#7587) 162 `5f208bb858` 7578 True Implement basic validator custody framework (no backfill) (#7578) 163 `9803d69d80` 7590 False Implement status v2 version (#7590) 164 `5472cb8500` 7582 False Batch verify KZG proofs for getBlobsV2 (#7582) 165 `a65f78222d` 7594 False Drop stale registrations without reducing CGC (#7594) 166 `ccd99c138c` 7588 False Wait before column reconstruction (#7588) 167 `dc5f5af3eb` 7595 False Fix flaky test_rpc_block_reprocessing (#7595) 168 `4fc0665ccd` 7592 False Add more context to Late Block Re-orgs (#7592) 169 `6135f417a2` 7591 False Add data columns sidecars debug beacon API (#7591) 170 `3d2d65bf8d` 7593 False Advertise `--advertise-false-custody-group-count` for testing PeerDAS (#7593) 171 `6786b9d12a` 7444 True Single attestation "Full" implementation (#7444) 172 `dd98534158` 6750 True Hierarchical state diffs in hot DB (#6750) 173 `f67084a571` 7437 False Remove reprocess channel (#7437) 174 `d50924677a` 7620 False Remove instrumenting log level (#7620) 175 `11bcccb353` 7133 True Remove all prod eth1 related code (#7133) 176 `e34a9a0c65` 6551 False Allow the `--beacon-nodes` list to be updated at runtime (#6551) 177 `3fefda68e5` 7611 False Send byrange responses in the correct requested range (#7611) 178 `cef04ee2ee` 7462 False Implement `validator_identities` Beacon API endpoint (#7462) 179 `fd643c310c` 7632 False Un-ignore EF test for v1.6.0-alpha.1 (#7632) 180 `56b2d4b525` 7636 False Remove instrumenting log level (#7636) 181 `8e3c5d1524` 7644 False Rust 1.89 compiler lint fix (#7644) 182 `a0a6b9300f` 7551 False Do not compute sync selection proofs for the sync duty at the current slot (#7551) 183 `9b1f3ed9d1` 7652 False Add gossip check (#7652) 184 `83cad25d98` 7657 False Fix Rust 1.88 clippy errors & execution engine tests (#7657) 185 `522e00f48d` 7656 False Fix incorrect `waker` update condition (#7656) 186 
`6ea5f14b39` 7597 False feat: better error message for light_client/bootstrap endpoint (#7597) 187 `2d759f78be` 6576 False Fix beacon_chain metrics descriptions (#6576) 188 `6be646ca11` 7666 True Bump DB schema to v25 (#7666) 189 `e45ba846ae` 7673 False Increase http client default timeout to 2s in `http-api` tests. (#7673) 190 `25ea8a83b7` 7667 False Add Michael as codeowner for store crate (#7667) 191 `c1f94d9b7b` 7669 False Test database schema stability (#7669) 192 `257d270718` 6612 False Add voluntary exit via validator manager (#6612) 193 `e305cb1b92` 7661 True Custody persist fix (#7661) 194 `41742ce2bd` 7683 False Update `SAMPLES_PER_SLOT` to be number of custody groups instead of data columns (#7683) 195 `69c9c7038a` 7681 False Use prepare_beacon_proposer endpoint for validator custody registration (#7681) 196 `fcc602a787` 7646 False Update fulu network configs and add `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS` (#7646) 197 `a459a9af98` 7689 False Fix and test checkpoint sync from genesis (#7689) 198 `b35854b71f` 7692 False Record v2 beacon blocks http api metrics separately (#7692) 199 `c7bb3b00e4` 7693 False Fix lookups of the block at `oldest_block_slot` (#7693) 200 `0f895f3066` 7695 False Bump default gas limit (#7695) 201 `56485cc986` 7707 False Remove unneeded spans that caused debug logs to appear when level is set to `info` (#7707) 202 `bd8a2a8ffb` 7023 False Gossip recently computed light client data (#7023) 203 `7b2f138ca7` - - [NO PR MATCH]: Merge remote-tracking branch 'origin/stable' into release-v7.1.0 204 `8e55684b06` 7723 False Reintroduce `--logfile` with deprecation warning (#7723) 205 `8b5ccacac9` 7663 False Error from RPC `send_response` when request doesn't exist on the active inbound requests (#7663) 206 `cfb1f73310` 7609 False Release v7.1.0 (#7609) ``` Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-18 06:13:27 +00:00
Michael Sproul	3543a20192	Add experimental complete-blob-backfill flag (#7751 ) A different (and complementary) approach for: - https://github.com/sigp/lighthouse/issues/5391 This PR adds a flag to set the DA boundary to the Deneb fork. The effect of this change is that Lighthouse will try to backfill _all_ blobs. Most peers do not have this data, but I'm thinking that combined with `trusted-peers` this could be quite effective. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-18 05:17:03 +00:00
Michael Sproul	684632df73	Fix reprocess queue memory leak (#8065 ) Fix a memory leak in the reprocess queue. If the vec of attestation IDs for a block is never evicted from the reprocess queue by a `BlockImported` event, then it stays in the map forever consuming memory. The fix is to remove the entry when its last attestation times out. We do similarly for light client updates. In practice this will only occur if there is a race between adding an attestation to the queue and processing the `BlockImported` event, or if there are attestations for block roots that we never import (e.g. random block roots, block roots of invalid blocks). Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-18 05:16:59 +00:00
Eitan Seri-Levi	521be2b757	Prevent silently dropping cell proof chunks (#8023 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-18 01:33:42 +00:00
Jimmy Chen	3cb7e59be2	Update issue template (#7938 ) * Update issue template * Delete old issue template	2025-09-18 11:17:31 +10:00
Toki	5928407ce4	fix(rate_limiter): add missing prune calls for light client protocols (#8058 ) Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io> Co-Authored-By: gitToki <tokipro@proton.me>	2025-09-17 04:51:43 +00:00
Lion - dapplion	b7d78a91e0	Don't penalize peers for extending ignored chains (#8042) Lookup sync has a cache of block roots "failed_chains". If a peer triggers a lookup for a block or descendant of a root in failed_chains, the lookup is dropped and the peer penalized. However, blocks are inserted into failed_chains for a single reason: - If a chain is longer than 32 blocks the lookup is dropped to prevent OOM risks. However, the peer is not at fault, since discovering an unknown chain longer than 32 blocks is not malicious. We just drop the lookup to sync the blocks from range forward sync. This discrepancy is probably an oversight when changing old code. Previously we added blocks that failed to process too many times to that cache. However, we don't do that anymore. Adding a block that fails too many times to process is an optimization to save resources in rare cases where peers keep sending us invalid blocks. In case that happens, today we keep trying to process the block, downscoring the peers and eventually disconnecting them. _IF_ we found that optimization to be necessary we should merge this PR (_Stricter match of BlockError in lookup sync_) first. IMO we are fine without the failed_chains cache, and the ignored_chains cache will be obsolete with [tree sync](https://github.com/sigp/lighthouse/issues/7678) as the OOM risk of long lookup chains does not exist anymore. Closes https://github.com/sigp/lighthouse/issues/7577 Rename `failed_chains` to `ignored_chains` and don't penalize peers that trigger lookups for those blocks. Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>	2025-09-17 01:02:29 +00:00
jking-aus	191570e4a1	chore: Bump discv5 and remove generic DefaultProtocolId in metrics (#8056 ) Bump discv5 version Co-Authored-By: Josh King <josh@sigmaprime.io>	2025-09-16 18:27:37 +00:00
Jimmy Chen	3de646c8b3	Enable reconstruction for nodes custodying more than 50% of columns and instrument tracing (#8052 ) Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-16 08:17:43 +00:00
Eitan Seri-Levi	242bdfcf12	Add instrumentation to `recompute_head_at_slot` (#8049 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:31 +00:00
Eitan Seri-Levi	aba3627099	Reduce reconstruction queue capacity (#8053 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:28 +00:00
Eitan Seri-Levi	4409500f63	Remove column reconstruction when processing rpc requests (#8051 ) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:25 +00:00
Michael Sproul	f04d5ecddd	Another check to prevent duplicate block imports (#8050 ) Attempt to address performance issues caused by importing the same block multiple times. - Check fork choice "after" obtaining the fork choice write lock in `BeaconChain::import_block`. We actually use an upgradable read lock, but this is semantically equivalent (the upgradable read has the advantage of not excluding regular reads). The hope is that this change has several benefits: 1. By preventing duplicate block imports we save time repeating work inside `import_block` that is unnecessary, e.g. writing the state to disk. Although the store itself now takes some measures to avoid re-writing diffs, it is even better if we avoid a disk write entirely. 2. By returning `DuplicateFullyImported`, we reduce some duplicated work downstream. E.g. if multiple threads importing columns trigger `import_block`, now only _one_ of them will get a notification of the block import completing successfully, and only this one will run `recompute_head`. This should help avoid a situation where multiple beacon processor workers are consumed by threads blocking on the `recompute_head_lock`. However, a similar block-fest is still possible with the upgradable fork choice lock (a large number of threads can be blocked waiting for the first thread to complete block import). Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-16 04:10:42 +00:00
Jimmy Chen	b8178515cd	Update engine methods in notifier (#8038 ) Fulu uses `getPayloadV5`, this PR updates the notifier logging prior to the fork. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-14 23:41:12 +00:00
Eitan Seri-Levi	aef8291f94	Add max delay to reconstruction (#7976 ) #7697 If we're three seconds into the current slot just trigger reconstruction. I don't know what the correct reconstruction deadline number is, but it should probably be at least half a second before the attestation deadline Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-12 06:05:42 +00:00
Jimmy Chen	fb77ce9e19	Add missing event in `PendingComponent` span and clean up sync logs (#8033) I was looking into some long `PendingComponents` spans and noticed the block event wasn't added to the span, so it wasn't possible to see when the block was added from the trace view. This PR fixes this. <img width="637" height="430" alt="image" src="https://github.com/user-attachments/assets/65040b1c-11e7-43ac-951b-bdfb34b665fb" /> Additionally, I've noticed a lot of noise and confusion in sync logs due to the initial `peer_id` being included as part of the syncing chain span, causing all logs under the span to have that `peer_id`, which may not be accurate for some sync logs. I've removed `peer_id` from the `SyncingChain` span, and also cleaned up a bunch of spans to use `%` (display) for slots and epochs to make logs easier to read. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-12 05:11:30 +00:00
Michael Sproul	87ae301d09	Remove unused logging metrics (#7997 ) @chong-he noticed that the INFO/WARN/ERRO log counts on our dashboards had stopped working. Since switching to `tracing` we are now tracking total events _per crate_, and the global counters are unused. Per-crate metrics are here: `cfb1f73310/common/logging/src/tracing_metrics_layer.rs (L61-L63)` Delete the unused global counters from the source. We can sum across the per-crate metric in our dashboards to restore the previous functionality. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-12 02:48:49 +00:00
Daniel Knopik	58156815f1	Expose functions to do preliminary slashing checks (#7783 ) Co-Authored-By: Daniel Knopik <daniel@dknopik.de> Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-11 06:11:58 +00:00
Michael Sproul	a080bb5cee	Increase HTTP timeouts on CI (#8031 ) Since we re-enabled HTTP API tests on CI (https://github.com/sigp/lighthouse/pull/7943) there have been a few spurious failures: - https://github.com/sigp/lighthouse/actions/runs/17608432465/job/50024519938?pr=7783 That error is awkward, but running locally with a short timeout confirms it to be a timeout. Change the request timeout to 5s everywhere. We had kept it shorter to try to detect performance regressions, but I think this is better suited to being done with metrics & traces. On CI we really just want things to pass reliably without flakiness, so I think a longer timeout to handle slower test code (like mock-builder) and overworked CI boxes makes sense. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-11 00:47:39 +00:00
Jimmy Chen	02d519e957	Fixed orphaned `verify_cell_proof_chunk` span. (#8026 ) Fixed orphaned kzg verify cell proof chunk spans. See screenshot: <img width="1898" height="574" alt="image" src="https://github.com/user-attachments/assets/d60d8768-f995-407d-b7af-59722429e175" /> The parent span needs to be passed explicitly to the chunk verification span as parent, as rayon runs the function in a separate thread. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-10 21:02:27 +00:00
Daniel Ramirez-Chiquillo	2ecbb7f90b	Remove cargo test targets, use nextest exclusively (#7874 ) Fixes #7835 - Remove cargo test-based Make targets (`test-release`, `test-debug`, `run-ef-tests`) - Update aliases (`test`, `test-full`, `test-ef`) to use existing nextest equivalents - Update contributing documentation to use nextest examples - Fix example commands that previously referenced non-existing packages (`ssz`/`eth2_ssz`) Co-Authored-By: Daniel Ramirez-Chiquillo <hi@danielrachi.com>	2025-09-10 13:52:34 +00:00
kevaundray	f71d69755d	chore: add comment to PendingComponents (#7979 ) Adds doc comment Co-Authored-By: Kevaundray Wedderburn <kevtheappdev@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-10 13:48:11 +00:00
Daniel Knopik	ee1b6bc81b	Create `network_utils` crate (#7761) Anchor currently depends on `lighthouse_network` for a few types and utilities that live within. As we use our own libp2p behaviours, we actually do not use the core logic in that crate. This makes us transitively depend on a bunch of unneeded crates (even a whole separate libp2p if the versions mismatch!) Move the things we require into their own lightweight crate. Co-Authored-By: Daniel Knopik <daniel@dknopik.de>	2025-09-10 12:59:24 +00:00
Eitan Seri-Levi	caa1df6fc3	Skip column gossip verification logic during block production (#7973) #7950 Skip column gossip verification logic during block production as it's redundant and potentially computationally expensive. Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-10 12:29:56 +00:00
hopinheimer	38205192ca	Fix http api tests ci (#7943 ) Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Michael Sproul <micsproul@gmail.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: hopinheimer <knmanas6@gmail.com>	2025-09-10 06:46:48 +00:00
Jimmy Chen	811eccdf34	Reduce noise in `Debug` impl of `RuntimeVariableList` (#8007 ) The default debug output of these types contains a lot of unnecessary noise making it hard to read. This PR removes the type and extra fields from debug output to make logs easier to read. `len` could be potentially useful in some cases, but this gives us flexibility to only log it separately if we need it. Related PR in `ssz_types`: - https://github.com/sigp/ssz_types/pull/57 Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-10 04:59:22 +00:00
Jimmy Chen	8a4f6cf0d5	Instrument tracing on block production code path (#8017 ) Partially #7814. Instrument block production code path. New root spans: * `produce_block_v3` * `produce_block_v2` Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-10 03:30:51 +00:00
Odinson	2b22903fba	fix: extra fields in logs (#8009 ) Potentially fixes #7995. Changed `span_data` to a `HashMap` and added a new check to remove span fields whose base names are already present on the event. Co-Authored-By: PoulavBhowmick03 <bpoulav@gmail.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io>	2025-09-09 08:09:03 +00:00
Jimmy Chen	ee734d1456	Fix stuck data column lookups by improving peer selection and retry logic (#8005 ) Fixes the issue described in #7980 where Lighthouse repeatedly sends `DataColumnsByRoot` requests to the same peers that return empty responses, causing sync to get stuck. The root cause was that we didn't count empty responses as failures, leading to excessive retries to unresponsive peers. - Track per-peer attempts to limit retries per peer (`MAX_CUSTODY_PEER_ATTEMPTS = 3`) - Replaced random peer selection with hashing within each lookup to prevent splitting a lookup into too many small requests and improve request batching efficiency. - Added `single_block_lookup` root span to track all lookups created and added more debug logs. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-09 06:18:05 +00:00
Eitan Seri-Levi	8ec2640e04	Don't penalize peers if locally constructed light client data is stale (#7996 ) #7994 We seem to be penalizing peers in situations where locally constructed light client data is stale. This PR ignores incoming light client data if our locally constructed light client data isn't up to date. Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-05 03:23:34 +00:00
Jimmy Chen	fd10b63274	Add co-author to mergify commits (#7993 ) * Add co-author to mergify commits. * Remove unnecessary pull request rules from mergify config. * Revert automation removals	2025-09-05 07:54:30 +10:00
Jimmy Chen	9d2f55a399	Fix data column reconstruction error (#7998 ) Addresses #7991	2025-09-04 20:17:52 +00:00
Jimmy Chen	677de70025	Fix incorrect prune test logic (#7999 ) I just noticed that one of the tests I added in #7915 is incorrect, after it was running flaky for a bit. This PR fixes the scenario and ensures the outcome will always be the same.	2025-09-04 19:53:38 +00:00
Pawan Dhananjay	84ec209eba	Allow AwaitingDownload to be a valid in-between state (#7984 ) N/A Extracts (3) from https://github.com/sigp/lighthouse/pull/7946. Prior to peerdas, a batch should never have been in `AwaitingDownload` state because we immediately try to move from `AwaitingDownload` to `Downloading` state by sending batches. This was always possible as long as we had peers in the `SyncingChain` in the pre-peerdas world. However, this is no longer the case as a batch can be stuck waiting in `AwaitingDownload` state if we have no peers to request the columns from. This PR makes `AwaitingDownload` an allowable in-between state. If a batch is found to be in this state, then we attempt to send the batch instead of erroring like before. Note to reviewer: We need to make sure that this doesn't lead to a bunch of batches stuck in `AwaitingDownload` state if the chain can be progressed. Backfill already retries all batches in `AwaitingDownload` state so we just need to make `AwaitingDownload` a valid state during processing and validation. This PR explicitly adds the same logic for forward sync to download batches stuck in `AwaitingDownload`. Apart from that, we also force download of the `processing_target` when sync stops progressing. This is required in cases where `self.batches` has > `BATCH_BUFFER_SIZE` batches that are waiting to get processed but the `processing_batch` has repeatedly failed at download/processing stage. This leads to sync getting stuck and never recovering.	2025-09-04 07:39:16 +00:00
Jimmy Chen	c2a92f1a8c	Maintain peers across all data column subnets (#7915 ) Closes: - #7865 - #7855 Changes extracted from earlier PR #7876 This PR fixes two main things with a few other improvements mentioned below: - Prevent Lighthouse from repeatedly sending `DataColumnByRoot` requests to an unsynced peer, causing lookup sync to get stuck - Allows Lighthouse to send discovery requests if there aren't enough synced peers in the required sampling subnets - this fixes the stuck sync scenario where there aren't enough usable peers in a sampling subnet but no discovery is attempted. - Make peer discovery queries if custody subnet peer count drops below the minimum threshold - Update peer pruning logic to prioritise uniform distribution across all data column subnets and avoid pruning sampling peers if the count is below the target threshold (2) - Check sync status when making discovery requests, to make sure we don't ignore requests if there aren't enough synced peers in the required sampling subnets - Optimise some of the `PeerDB` functions checking custody peers - Only send lookup requests to peers that are synced or advanced	2025-09-04 05:36:20 +00:00
Michael Sproul	76adedff27	Simplify length methods on BeaconBlockBody (#7989 ) Just the low-hanging fruit from: - https://github.com/sigp/lighthouse/pull/7988	2025-09-04 00:08:29 +00:00
Jimmy Chen	10e72df331	Add `tls-roots` feature to `opentelemetry_otlp` to support exporting traces over https (#7987 )	2025-09-03 08:05:09 +00:00
chonghe	a93cafee08	Implement `selections` Beacon API endpoints to support DVT middleware (#7016 ) * #6610 - [x] Add `beacon_committee_selections` endpoint - [x] Test beacon committee aggregator and confirmed working - [x] Add `sync_committee_selections` endpoint - [x] Test sync committee aggregator and confirmed working	2025-09-03 03:50:41 +00:00
Akihito Nakano	7b5be8b1e7	Remove ttfb_timeout and resp_timeout (#7925 ) `TTFB_TIMEOUT` was deprecated in https://github.com/ethereum/consensus-specs/pull/3767. Remove `ttfb_timeout` from `InboundUpgrade` and other related structs. (Update) Also removed `resp_timeout` and also removed the `NetworkParams` struct since its fields are no longer used. https://github.com/sigp/lighthouse/pull/7925#issuecomment-3226886352	2025-09-03 02:00:15 +00:00
Pawan Dhananjay	a9db8523a2	Update tracing (#7981 ) Update tracing subscriber for cargo audit failure https://rustsec.org/advisories/RUSTSEC-2025-0055	2025-09-03 02:00:12 +00:00
Jimmy Chen	eef02afc93	Fix data availability checker race condition causing partial data columns to be served over RPC (#7961 ) Partially resolves #6439, a simpler alternative to #7931. Race condition occurs when RPC data columns arrive after a block has been imported and removed from the DA checker: 1. Block becomes available via gossip 2. RPC columns arrive and pass fork choice check (block hasn't been imported) 3. Block import completes (removing block from DA checker) 4. RPC data columns finish verification and get imported into DA checker This causes two issues: 1. Partial data serving: Already imported components get re-inserted, potentially causing LH to serve incomplete data 2. State cache misses: Leads to state reconstruction, holding the availability cache write lock longer and increasing race likelihood ### Proposed Changes 1. Never manually remove pending components from DA checker. Components are only removed via LRU eviction as finality advances. This makes sure we don't run into the issue described above. 2. Use `get` instead of `pop` when recovering the executed block; this prevents cache misses in the race condition and should reduce its likelihood. 3. Refactor DA checker to drop write lock as soon as components are added. This should also reduce the likelihood of the race condition. Trade-offs: This solution eliminates a few nasty race conditions while keeping things simple, at the cost of allowing block re-import (already existing). The increase in memory in DA checker can be partially offset by a reduction in block cache size if this really becomes an issue (as we now serve recent blocks from DA checker).	2025-09-02 07:18:23 +00:00
Jimmy Chen	979ed2557c	Remove `expect` usage in `kzg_utils` (#7957 ) Remove `expect` usage in `kzg_utils` to handle the case where EL sends us invalid proof size instead of crashing.	2025-09-01 09:21:26 +00:00
kevaundray	9cc3c0553b	chore: small refactor of `epoch` method (#7902 ) Stylistic; mostly using early returns to avoid the nested logic.	2025-09-01 09:21:23 +00:00
Eitan Seri-Levi	c7492f1c27	Update to `1.6.0 alpha.6` spec (#7967 ) Upgrade `rust_eth_kzg` library to `0.9` to support the new cell index sorting tests in `recover_cells_and_kzg_proofs` https://github.com/ethereum/consensus-specs/releases https://github.com/crate-crypto/rust-eth-kzg/compare/v0.8.1...v0.9.0	2025-09-01 08:56:25 +00:00

7063 Commits