Commit Graph

6685 Commits

Author SHA1 Message Date
Eitan Seri-Levi
27aabe8159 Pseudo finalization endpoint (#7103)
This is a backport of:
-  https://github.com/sigp/lighthouse/pull/7059
- https://github.com/sigp/lighthouse/pull/7071

For:
- https://github.com/sigp/lighthouse/issues/7039


  Introduce a new lighthouse endpoint that allows a user to force a pseudo finalization. This migrates data to the freezer db and prunes sidechains which may help reduce disk space issues on non finalized networks like Holesky

We also ban peers that send us blocks that conflict with the manually finalized checkpoint.

There were some CI fixes in https://github.com/sigp/lighthouse/pull/7071 that I tried including here

Co-authored with: @jimmygchen  @pawanjay176 @michaelsproul
2025-03-18 05:21:05 +00:00
Michael Sproul
58482586f5 Support Hoodi testnet (#7145)
Hardcode config for the upcoming `hoodi` testnet so we can run with `--network hoodi`.
2025-03-18 02:10:24 +00:00
Michael Sproul
4de062626b State cache tweaks (#7095)
Backport of:

- https://github.com/sigp/lighthouse/pull/7067

For:

- https://github.com/sigp/lighthouse/issues/7039


  - Prevent writing to state cache when migrating the database
- Add `state-cache-headroom` flag to control pruning
- Prune old epoch boundary states ahead of mid-epoch states
- Never prune head block's state
- Avoid caching ancestor states unless they are on an epoch boundary
- Log when states enter/exit the cache

Co-authored-by: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-03-18 02:10:21 +00:00
Eitan Seri-Levi
8ce9edc584 Add block ban flag --invalid-block-roots (#7042) 2025-03-17 13:18:22 +00:00
Eitan Seri-Levi
9db29b023b Ensure finalized block is the correct fork variant when constructing light client updates (#7085) 2025-03-17 02:39:35 +00:00
Jun Song
50b5a72c58 feat: implement new beacon APIs(accessors for pending_deposits/pending_partial_withdrawals) (#7006)
Resolves #7003


  Added two endpoints as https://github.com/ethereum/beacon-APIs/pull/500 proposed:
- `/eth/v1/beacon/states/{state_id}/pending_deposits`
- `/eth/v1/beacon/states/{state_id}/pending_partial_withdrawals`
2025-03-17 01:46:50 +00:00
Eitan Seri-Levi
3a555f571f Address cargo audit failure RUSTSEC-2024-0437 (#7114)
Resolves #7091


  The `prometheus` crate pulls in `protobuf 2.x` which fails cargo audit. We actually dont use any `protobuf` related features in LH. By disabling default features for `prometheus`, we no longer pull in the `protobuf` crate
2025-03-13 03:17:33 +00:00
Eitan Seri-Levi
2c40f0b004 Set epochs-per-blob-prune default to 256 (#7113)
Partially #7100


  Set blob pruning to default to once per day
2025-03-13 02:43:07 +00:00
Jimmy Chen
9c4fc6eac2 Change state cache size default to 32 (#7101)
Cherry-picking #7055 from `holesky-rescue` branch to the clean `release-v7.0.0` branch.
2025-03-11 01:21:50 +00:00
Eitan Seri-Levi
0f5e680149 Address cargo audit failure RUSTSEC-2025-0009 (#7086) 2025-03-10 23:58:58 +00:00
Michael Sproul
7d598ed8a5 Optimise status processing (#7082)
This is a backport from `holesky-rescue`.

Part of:

- https://github.com/sigp/lighthouse/issues/7039

Original PR to `holesky-rescue`:

- https://github.com/sigp/lighthouse/pull/7054


  Avoid doing database lookups for slots that lie in the hot database when processing status messages. This avoids a DoS vector during non-finality, as loading hot states to iterate block roots is very expensive.
2025-03-10 04:51:49 +00:00
Michael Sproul
166f6df5a9 Temporarily ignore cargo audit failures (#7092)
Unblock CI by ignoring cargo audit failures. IMO we need to merge some stuff to get our PR backlog under control and can't wait for audit fixes. I've opened issues to address the actual audit failures:

- https://github.com/sigp/lighthouse/issues/7090
- https://github.com/sigp/lighthouse/issues/7091
2025-03-10 02:58:57 +00:00
Jimmy Chen
09849e841b Use sync_tolerance_epochs flag to control the proposer prep routines (#7044)
Replace the `2 + 2 == 5` hacks from `holesky-rescue` and use the existing `sync_tolerance_epochs` flag to control the proposer prep routines.
2025-03-06 03:50:42 +00:00
Eitan Seri-Levi
066f28770f Schedule Chiado testnet Electra hard fork (#7074) 2025-03-05 06:21:13 +00:00
Ryan Schneider
efa6ba37bb Make ExecutionBlock::total_difficulty Optional (#7050)
This change makes the `total_difficulty` field in `ExecutionBlock` an `Option<Uint256>` since newer clients are no longer including the `totalDifficulty` field.

I think this will fix https://github.com/sigp/lighthouse/issues/6937 but I was actually more focused on the builder registration case described below.

In our [builder-playground](https://github.com/flashbots/builder-playground) we setup a local devnet using lighthouse, reth, and mev-boost-relay.  After upgrading to reth 1.2.0 and lighthouse v7.0.0.beta.0 for Pectra, we noticed that the validator registration process was _sometimes_ failing with:

```
Feb 25 23:35:25.038 ERRO Unable to publish proposer preparation to all beacon nodes, error: Some endpoints failed, num_failed: 1 http://localhost:3500/ => RequestFailed(ServerMessage(ErrorMessage { code: 400, message: "BAD_REQUEST: error updating proposer preparations: ForkchoiceUpdate(EngineError(Api { error: Json(Error(\"missing field `totalDifficulty`\", line: 0, column: 0)) }))", stacktraces: [] })), service: preparation
Feb 25 23:35:25.099 WARN Unable to publish validator registrations to the builder network, error: Some endpoints failed, num_failed: 1 http://localhost:3500/ => RequestFailed(ServerMessage(ErrorMessage { code: 400, message: "BAD_REQUEST: error updating proposer preparations: ForkchoiceUpdate(EngineError(Api { error: Json(Error(\"missing field `totalDifficulty`\", line: 0, column: 0)) }))", stacktraces: [] })), service: preparation
```

What was even more confusing, was that it was sometimes working, which actually led to a wild goose chase thinking it was a networking issue.  However, when tracing through the LH code, I came across this comment:

70194dfc6a/beacon_node/beacon_chain/src/beacon_chain.rs (L6048-L6049)

This explained why it sometimes worked, in our playground we run lighthouse with `--prepare-payload-lookahead 8000` thus there was always a 4-second window where the call wasn't made.

But, if the call was made, then this code would 100% fail with updated reth:

https://github.com/sigp/lighthouse/blob/unstable/beacon_node/execution_layer/src/lib.rs#L1688-L1692

Which would then mapped to a `Error::ForkchoiceUpdate` in `update_execution_engine_forkchoice`.


  Anyways, the fix was to make `total_difficulty` Optional, and then to update any code paths where it was used.  In doing so, I assume that if the EL doesn't include total difficulty then the chain is already post-merge.
2025-03-05 01:53:00 +00:00
Mac L
29a295a134 Add --long-timeouts-multiplier CLI flag (#7047)
Adds the `--long-timeouts-multiplier` flag.
Allows granular control for VC timeouts which has proved useful in Holesky.
2025-03-05 01:52:57 +00:00
Michael Sproul
80cd8bd911 Add --disable-attesting flag to validator client (#7046)
Cleaned up and isolated version of the `--disable-attesting` flag for the VC, from the `holesky-rescue` branch:

- https://github.com/sigp/lighthouse/pull/7041

I figured we don't need the `--disable-attesting` flag on the BN for now, and it was a much more invasive impl.
2025-02-26 13:07:16 +00:00
Jimmy Chen
fe0cf9cb67 Add test flag to override SYNC_TOLERANCE_EPOCHS for range sync testing (#7030)
Related to #6880, an issue that's usually observed on local devnets with small number of nodes.

When testing range sync, I usually shutdown a node for some period of time and restart it again. However, if it's within `SYNC_TOLERANCE_EPOCHS` (8), Lighthouse would consider the node as synced, and if it may attempt to produce a block if requested by a validator - on a local devnet, nodes frequently produce blocks - when this happens, the node ends up producing a block that would revert finality and would get disconnected from peers immediately.

NOTE: This is PR#7030 cherry-picked from `unstable` to `release-v7.0.0`.

Run Lighthouse BN with this flag to override:

```
--sync-tolerance--epoch 0
```
2025-02-26 18:29:49 +11:00
Pawan Dhananjay
522b3cbaab Fix builder API headers (#7009)
Resolves https://github.com/sigp/lighthouse/issues/7000


  Set the accept header on builder to the correct value when requesting ssz.

This PR also adds a flag to disable ssz over the builder api altogether. In the case that builders/relays have an ssz bug, we can react quickly by asking clients to restart their nodes with the `--disable-ssz-builder` flag to force json. I'm not fully convinced if this is useful so open to removing it or opening another PR for it.

Testing this currently.
2025-02-24 03:39:13 +00:00
Pawan Dhananjay
b3b6aea1c5 Rust 1.85 lints (#7019)
N/A


  2 changes:
1. Replace Option::map_or(true, ...) with is_none_or(...)
2. Remove unnecessary `Into::into` blocks where the type conversion is apparent from the types
2025-02-24 02:36:13 +00:00
Michael Sproul
ff739d56be Fix light client merkle proofs (#7007)
Fix a regression introduced in this PR:

- https://github.com/sigp/lighthouse/pull/6361

We were indexing into the `MerkleTree` with raw generalized indices, which was incorrect and triggering `debug_assert` failures, as described here:

- https://github.com/sigp/lighthouse/issues/7005


  - Convert `generalized_index` to the correct leaf index prior to proof generation.
- Add sanity checks on indices used in `BeaconState::generate_proof`.
- Remove debug asserts from `MerkleTree::generate_proof` in favour of actual errors. This would have caught the bug earlier.
- Refactor the EF tests so that the merkle validity tests are actually run. They were misconfigured in a way that resulted in them running silently with 0 test cases, and the `check_all_files_accessed.py` script still had an ignore that covered the test files, so this omission wasn't detected.
2025-02-18 00:39:49 +00:00
Michael Sproul
1888be554c Release v7.0.0-beta.0 (#6962)
New release for Electra on Holesky and Sepolia.

Includes PRs:

- https://github.com/sigp/lighthouse/pull/6808
- https://github.com/sigp/lighthouse/pull/6914
- https://github.com/sigp/lighthouse/pull/6949
- https://github.com/sigp/lighthouse/pull/6950
- https://github.com/sigp/lighthouse/pull/6958
v7.0.0-beta.0
2025-02-13 03:06:20 +00:00
Michael Sproul
25f804a111 Fix light client plumbing in beacon processor (#6993)
Our Holesky nodes running with the light client enabled were logging messages about full queues:

> Feb 12 22:09:28.949 ERRO Work queue is full                      queue: unknown_light_client_optimistic_update, queue_len: 128, msg: the system has insufficient resources for load, service: bproc

I thought this might be genuine overload, but it turns out this queue was never being read from!


  - [x] Rename light-client related queues in the beacon processor for clarity.
- [x] Ensure all light-client related queues are being popped from.
2025-02-13 00:50:17 +00:00
Eitan Seri-Levi
ed8086c897 Ensure GET v2/validator/aggregate_attestation is backwards compatible (#6984)
Closes #6983


  `GET v2/validator/aggregate_attestation` is not backwards compatible. It only works for post electra attestations. This PR adds backwards compatibility and additional test coverage. We should include this in the upcoming 7.0 beta release if possible
2025-02-12 00:13:05 +00:00
Jimmy Chen
0728140086 Address cargo audit failure RUSTSEC-2025-0006 (#6972)
PR to fix cargo audit failure and bump `hickory-proto` version:
https://github.com/sigp/lighthouse/actions/runs/13252660830/job/36993664960
2025-02-11 04:17:02 +00:00
Age Manning
62a0f25f97 IPv6 By Default (#6808) 2025-02-10 01:58:11 +00:00
Michael Sproul
ceb5ecf349 Update EF tests to spec v1.5.0-beta.2 (#6958)
Update spec tests for recent v1.5.0-beta.2 release. There are no substantial changes for Electra and earlier, and the Fulu test updates to be made are tracked here:

- https://github.com/sigp/lighthouse/issues/6957


  - Add `SingleAttestation` SSZ tests
- Add new `deposit_with_reorg` fork choice tests
- Update tag to v1.5.0-beta.2
- Ignore Fulu tests
2025-02-10 01:58:08 +00:00
Lion - dapplion
f35213ebe7 Sync active request byrange ids logs (#6914)
- Re-opened PR from https://github.com/sigp/lighthouse/pull/6869

Writing and running tests I noted that the sync RPC requests are very verbose now.

`DataColumnsByRootRequestId { id: 123, requester: Custody(CustodyId { requester: CustodyRequester(SingleLookupReqId { req_id: 121, lookup_id: 101 }) }) }`

Since this Id is logged rather often I believe there's value in
1. Making them more succinct for log verbosity
2. Make them a string that's easy to copy and work with elastic


  Write custom `Display` implementations to render Ids in a more DX format

_ DataColumnsByRootRequestId with a block lookup_

```
123/Custody/121/Lookup/101
```

_DataColumnsByRangeRequestId_

```
123/122/RangeSync/0/5492900659401505034
```

- This one will be shorter after https://github.com/sigp/lighthouse/pull/6868

Also made the logs format and text consistent across all methods
2025-02-10 01:27:05 +00:00
Eitan Seri-Levi
afdda83798 Enable Light Client server by default (#6950) 2025-02-10 01:27:03 +00:00
Eitan Seri-Levi
e3e21f7516 Schedule Sepolia and Holesky Electra forks (#6949) 2025-02-10 01:27:00 +00:00
Michael Sproul
0344f68cfd Update attestation rewards API for Electra (#6819)
Closes:

- https://github.com/sigp/lighthouse/issues/6818


  Use `MAX_EFFECTIVE_BALANCE_ELECTRA` (2048) for attestation reward calculations involving Electra.

Add a new `InteropGenesisBuilder` that tries to provide a more flexible way to build genesis states. Unfortunately due to lifetime jank, it is quite unergonomic at present. We may want to refactor this builder in future to make it easier to use.
2025-02-09 10:15:33 +00:00
Eitan Seri-Levi
6032f15890 Fix aggregate attestation v2 response (#6926) 2025-02-09 10:15:30 +00:00
Lion - dapplion
e3c721817e Remove duplicated fork_epoch and fork_version implementation (#6953)
This PR adds an implementation to get fork_version and fork_epoch given a `ForkName`. I didn't realize that this is already implemented in the `ChainSpec` sorry

- https://github.com/sigp/lighthouse/pull/6933


  Remove duplicated fork_epoch and fork_version implementation
2025-02-08 00:38:24 +00:00
Michael Sproul
2bd5bbdffb Optimise and refine SingleAttestation conversion (#6934)
Closes

- https://github.com/sigp/lighthouse/issues/6805


  - Use a new `WorkEvent::GossipAttestationToConvert` to handle the conversion from `SingleAttestation` to `Attestation` _on_ the beacon processor (prevents a Tokio thread being blocked).
- Improve the error handling for single attestations. I think previously we had no ability to reprocess single attestations for unknown blocks -- we would just error. This seemed to be the case in both gossip processing and processing of `SingleAttestation`s from the HTTP API.
- Move the `SingleAttestation -> Attestation` conversion function into `beacon_chain` so that it can return the `attestation_verification::Error` type, which has well-defined error handling and peer penalties. The now-unused variants of `types::Attestation::Error` have been removed.
2025-02-07 23:18:57 +00:00
Michael Sproul
cb117f859d Fix fetch blobs in all-null case (#6940)
Fix another issue with fetch-blobs, similar to:

- https://github.com/sigp/lighthouse/pull/6911


  Check if the list of blobs returned is all `None`, and if so, do not proceed any further.

This prevents an ugly error like:

> Feb 03 17:32:12.384 ERRO Error fetching or processing blobs from EL, block_root: 0x7326fe2dc1cb9036c9de7a07a662c86a339085597849016eadf061b70b7815ba, error: BlobProcessingError(AvailabilityCheck(Unexpected)), module
: network::network_beacon_processor:1011
2025-02-07 09:19:32 +00:00
chonghe
d6596dbe21 Keep execution payload during historical backfill when prune-payloads set to false (#6766)
- #6510


  - Keep execution payload during historical backfill when `--prune-payloads false` is set
- Add a field in the historical backfill debug log to indicate if execution payload is kept
- Add a test to check historical blocks has execution payload when `--prune-payloads false is set
- Very minor typo correction that I notice when working on this
2025-02-07 09:19:29 +00:00
Lion - dapplion
921d95217d Remove un-used batch sync error condition (#6917)
- PR https://github.com/sigp/lighthouse/pull/6497 made obsolete some consistency checks inside the batch

I forgot to remove the consumers of those errors


  Remove un-used batch sync error condition, which was a nested `Result<_, Result<_, E>>`
2025-02-07 07:48:58 +00:00
Akihito Nakano
7408719de8 Remove unused metrics (#6817)
N/A


  Removed metrics that were defined but not used anywhere.
2025-02-07 07:48:52 +00:00
Lion - dapplion
59afe41d61 Reduce ForkName boilerplate in fork-context (#6933)
Noted that there's a bit of fork boiler plate in fork context.


  If we list a mapping of ForkName -> fork_version in the ForkName enum we can get rid of it :)

Not much, but should make the next fork a tiny tit less annoying
2025-02-07 05:38:36 +00:00
Jimmy Chen
9c45a0e8c1 Use old geth version due to breaking changes. (#6936)
A temporary workaround for the failing execution tests for geth.

https://github.com/sigp/lighthouse/actions/runs/13192297954

The test is broken due to the following breaking changes in geth that requires updating our tests:
1. removal of `personal` namespace in v1.14.12: See #30704
2. removal of `totalDifficulty` field from RPC in v1.14.11. See #30386.

Using an older version for now (` 1.14.10`) as we need to get things merged for the upcoming release.

Will create a separate issue to fix this.
2025-02-07 04:53:14 +00:00
Michael Sproul
364a978f12 Fix attestation queue length metric (#6924)
We were using the wrong queue length for attestation work event metrics.
2025-02-06 07:08:20 +00:00
Krishang Shah
a4e3f361bf Update metrics.rs (#6863)
Fixes #5206, a low-hanging fruit.
2025-02-06 05:19:51 +00:00
Lion - dapplion
2193f6a4d4 Add individual by_range sync requests (#6497)
Part of
- https://github.com/sigp/lighthouse/issues/6258

To address PeerDAS sync issues we need to make individual by_range requests within a batch retriable. We should adopt the same pattern for lookup sync where each request (block/blobs/columns) is tracked individually within a "meta" request that group them all and handles retry logic.


  - Building on https://github.com/sigp/lighthouse/pull/6398

second step is to add individual request accumulators for `blocks_by_range`, `blobs_by_range`, and `data_columns_by_range`. This will allow each request to progress independently and be retried separately.

Most of the logic is just piping, excuse the large diff. This PR does not change the logic of how requests are handled or retried. This will be done in a future PR changing the logic of `RangeBlockComponentsRequest`.

### Before

- Sync manager receives block with `SyncRequestId::RangeBlockAndBlobs`
- Insert block into `SyncNetworkContext::range_block_components_requests`
- (If received stream terminators of all requests)
- Return `Vec<RpcBlock>`, and insert into `range_sync`

### Now

- Sync manager receives block with `SyncRequestId::RangeBlockAndBlobs`
- Insert block into `SyncNetworkContext:: blocks_by_range_requests`
- (If received stream terminator of this request)
- Return `Vec<SignedBlock>`, and insert into `SyncNetworkContext::components_by_range_requests `
- (If received a result for all requests)
- Return `Vec<RpcBlock>`, and insert into `range_sync`
2025-02-05 07:08:28 +00:00
Pawan Dhananjay
7bfdb33729 Return error if getBlobs not supported (#6911)
N/A


  Previously, we were returning an empty vec of Nones if get_blobs was not supported in the EL. This results in confusing logging where we try to process the empty list of blobs and log a bunch of Unexpected errors. See
```
Feb 03 17:32:12.383 DEBG Fetching blobs from the EL, num_expected_blobs: 6, block_root: 0x7326fe2dc1cb9036c9de7a07a662c86a339085597849016eadf061b70b7815ba, service: fetch_engine_blobs, service: beacon, module: beac
on_chain::fetch_blobs:84
Feb 03 17:32:12.384 DEBG Processing engine blobs, num_fetched_blobs: 0, block_root: 0x7326fe2dc1cb9036c9de7a07a662c86a339085597849016eadf061b70b7815ba, service: fetch_engine_blobs, service: beacon, module: beacon_c
hain::fetch_blobs:197
Feb 03 17:32:12.384 ERRO Error fetching or processing blobs from EL, block_root: 0x7326fe2dc1cb9036c9de7a07a662c86a339085597849016eadf061b70b7815ba, error: BlobProcessingError(AvailabilityCheck(Unexpected)), module
: network::network_beacon_processor:1011
```

The error we should be getting is that getBlobs is not supported, this PR adds a new error variant and returns that.
2025-02-05 01:14:41 +00:00
chonghe
3d06bc26d1 Add test to beacon node fallback feature (#6568) 2025-02-04 06:43:37 +00:00
Eitan Seri-Levi
56f201a257 Add check to Lockbud CI job (#6898) 2025-02-04 02:00:37 +00:00
Age Manning
d1061dcf59 UX Network Fixes (#6796)
There were two things I came across during some recent testing, that this PR addresses.

1 - The default port for IPv6 was set to 9090, which is confusing. I've set this to match its ipv4 counterpart (i.e 9000 and 9001). This makes more sense and is easier to firewall, for those firewalls that support both versions for a single rule.

2 - Watching the NAT status of lighthouse, I notice we only set the field to 1 once the NAT is passed. We don't give it a default 0 (false). So we only see results when its successful. On peer disconnects, i've piggy-backed a loop of the connected peers to also watch and check for NAT status updates.
2025-02-04 01:28:37 +00:00
kevaundray
df131b2a6c chore: update peerDAS KZG library to 0.5.3 (#6906)
This optimizes the time it takes to load the context, so that tests do not time out
2025-02-04 00:47:11 +00:00
Eitan Seri-Levi
7e4b27c922 Migrate validator client to clap derive (#6300)
Partially #5900


  Migrate the validator client cli to clap derive
2025-02-03 20:08:31 +00:00
Lion - dapplion
95cec45c38 Use data column batch verification consistently (#6851)
Resolve a `TODO(das)` to use KZG batch verification in `put_rpc_custody_columns`


  Uses `verify_kzg_for_data_column_list_with_scoring` in all paths that send more than one column. To use batch verification and have attributability of which peer is sending a bad column.

Needs to move `verify_kzg_for_data_column_list_with_scoring` into the type's module to convert to the KZG verified type.
2025-02-03 06:07:45 +00:00