Commit Graph

3366 Commits

Author SHA1 Message Date
Mac L
0e6da0fcaf Merge branch 'release-v7.0.0' into v7-backmerge 2025-04-04 13:32:58 +11:00
Mac L
82d1674455 Rust 1.86.0 lints (#7254)
Implement lints for the new Rust compiler version 1.86.0.
2025-04-04 02:30:22 +00:00
Age Manning
d6cd049a45 RPC RequestId Cleanup (#7238)
I've been working at updating another library to latest Lighthouse and got very confused with RPC request Ids.

There were types that had fields called `request_id` and `id`. And interchangeably could have types `PeerRequestId`, `rpc::RequestId`, `AppRequestId`, `api_types::RequestId` or even `Request.id`.

I couldn't keep track of which Id was linked to what and what each type meant.

So this PR mainly does a few things:
- Changes the field naming to match the actual type. So any field that has an  `AppRequestId` will be named `app_request_id` rather than `id` or `request_id` for example.
- I simplified the types. I removed the two different `RequestId` types (one in Lighthouse_network the other in the rpc) and grouped them into one. It has one downside tho. I had to add a few unreachable lines of code in the beacon processor, which the extra type would prevent, but I feel like it might be worth it. Happy to add an extra type to avoid those few lines.
- I also removed the concept of `PeerRequestId` which sometimes went alongside a `request_id`. There were times were had a `PeerRequest` and a `Request` being returned, both of which contain a `RequestId` so we had redundant information. I've simplified the logic by removing `PeerRequestId` and made a `ResponseId`. I think if you look at the code changes, it simplifies things a bit and removes the redundant extra info.

I think with this PR things are a little bit easier to reasonable about what is going on with all these RPC Ids.

NOTE: I did this with the help of AI, so probably should be checked
2025-04-03 10:10:15 +00:00
Jimmy Chen
80626e58d2 Attempt to fix flaky network tests (#7244) 2025-04-03 04:01:34 +00:00
Michael Sproul
578db67755 Merge remote-tracking branch 'origin/release-v7.0.0' into backmerge-apr-2 2025-04-02 09:57:42 +11:00
Michael Sproul
9bc0d5161e Disable LevelDB snappy feature (#7235)
Disable the `snappy` feature of LevelDB to prevent compilation issues with CMake 4.0, e.g.

https://github.com/sigp/lighthouse/actions/runs/14182783816/job/39732457274?pr=7231

We do not use Snappy compression in LevelDB, and do not need to compile this. This might also shave a few seconds off compilation!
2025-04-01 07:56:06 +00:00
Michael Sproul
bde0f1ef0b Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-03-29 13:01:58 +11:00
Pawan Dhananjay
54aef2d043 Admin add/remove peer (#7198)
N/A


  Adds endpoints to add and remove trusted peers from the http api. The added peers are trusted peers so they won't be disconnected for bad scores. We try to maintain a connection to the peer in case they disconnect from us by trying to dial it every heartbeat.
2025-03-28 12:59:09 +00:00
Eitan Seri-Levi
a5ea05ce2a Top-up pubkey cache on startup (#7217)
This is a workaround for #7216


  In the case of gaps between the in-memory pub key cache and its on-disk representation, use the head state on startup to "top-up" the cache/db w/ any missing validators
2025-03-28 08:29:19 +00:00
Michael Sproul
6d5a2be7f9 Release v7.0.0-beta.5 (#7210)
New release for Pectra-enabled networks.
2025-03-27 03:42:34 +00:00
Michael Sproul
7d792e615c Fix xdelta3 output buffer issue (#7174)
* Fix xdelta3 output buffer issue

* Fix buckets

* Update commit hash to `main`

* Tag TODO(hdiff)

* Update cargo lock
2025-03-27 13:25:50 +11:00
Lion - dapplion
6f31d44343 Remove CGC from data_availability checker (#7033)
- Part of https://github.com/sigp/lighthouse/issues/6767

Validator custody makes the CGC and set of sampling columns dynamic. Right now this information is stored twice:
- in the data availability checker
- in the network globals

If that state becomes dynamic we must make sure it is in sync updating it twice, or guarding it behind a mutex. However, I noted that we don't really have to keep the CGC inside the data availability checker. All consumers can actually read it from the network globals, and we can update `make_available` to read the expected count of data columns from the block.
2025-03-26 05:19:51 +00:00
Eitan Seri-Levi
2f37bf4de5 Fix more merge conflicts between unstable and release-v7.0.0 2025-03-24 09:13:28 +11:00
Eitan Seri-Levi
cbf1c04a14 resolve merge conflicts between untstable and release-v7.0.0 2025-03-23 11:09:02 -06:00
Lion - dapplion
ca237652f1 Track request IDs in RangeBlockComponentsRequest (#6998)
Part of
- https://github.com/sigp/lighthouse/issues/6258


  `RangeBlockComponentsRequest` handles a set of by_range requests. It's quite lose on these requests, not tracking them by ID. We want to implement individual request retries, so we must make `RangeBlockComponentsRequest` aware of its requests IDs. We don't want the result of a prior by_range request to affect the state of a future retry. Lookup sync uses this mechanism.

Now `RangeBlockComponentsRequest` tracks:

```rust
pub struct RangeBlockComponentsRequest<E: EthSpec> {
blocks_request: ByRangeRequest<BlocksByRangeRequestId, Vec<Arc<SignedBeaconBlock<E>>>>,
block_data_request: RangeBlockDataRequest<E>,
}

enum RangeBlockDataRequest<E: EthSpec> {
NoData,
Blobs(ByRangeRequest<BlobsByRangeRequestId, Vec<Arc<BlobSidecar<E>>>>),
DataColumns {
requests: HashMap<
DataColumnsByRangeRequestId,
ByRangeRequest<DataColumnsByRangeRequestId, DataColumnSidecarList<E>>,
>,
expected_custody_columns: Vec<ColumnIndex>,
},
}

enum ByRangeRequest<I: PartialEq + std::fmt::Display, T> {
Active(I),
Complete(T),
}
```

I have merged `is_finished` and `Into_responses` into the same function. Otherwise, we need to duplicate the logic to figure out if the requests are done.
2025-03-21 04:07:30 +00:00
Michael Sproul
04868027a6 Release v7.0.0-beta.4 (#7162)
New release for Hoodi testnet including clean versions of fixes from `holesky-rescue`.
2025-03-20 05:12:19 +00:00
Eitan Seri-Levi
e4c9805438 Reject attestations to blocks prior to the split (#7084) 2025-03-19 13:39:28 +00:00
Eitan Seri-Levi
ed1b7689ae Manual compaction endpoint backport (#7104)
Backports:
- https://github.com/sigp/lighthouse/pull/7072

To:
- https://github.com/sigp/lighthouse/issues/7039

#7103 should be merged first


  This PR introduces an endpoint that allows users to manually trigger background compaction.
2025-03-18 06:29:12 +00:00
Eitan Seri-Levi
27aabe8159 Pseudo finalization endpoint (#7103)
This is a backport of:
-  https://github.com/sigp/lighthouse/pull/7059
- https://github.com/sigp/lighthouse/pull/7071

For:
- https://github.com/sigp/lighthouse/issues/7039


  Introduce a new lighthouse endpoint that allows a user to force a pseudo finalization. This migrates data to the freezer db and prunes sidechains which may help reduce disk space issues on non finalized networks like Holesky

We also ban peers that send us blocks that conflict with the manually finalized checkpoint.

There were some CI fixes in https://github.com/sigp/lighthouse/pull/7071 that I tried including here

Co-authored with: @jimmygchen  @pawanjay176 @michaelsproul
2025-03-18 05:21:05 +00:00
Michael Sproul
4de062626b State cache tweaks (#7095)
Backport of:

- https://github.com/sigp/lighthouse/pull/7067

For:

- https://github.com/sigp/lighthouse/issues/7039


  - Prevent writing to state cache when migrating the database
- Add `state-cache-headroom` flag to control pruning
- Prune old epoch boundary states ahead of mid-epoch states
- Never prune head block's state
- Avoid caching ancestor states unless they are on an epoch boundary
- Log when states enter/exit the cache

Co-authored-by: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-03-18 02:10:21 +00:00
Eitan Seri-Levi
8ce9edc584 Add block ban flag --invalid-block-roots (#7042) 2025-03-17 13:18:22 +00:00
Jun Song
50b5a72c58 feat: implement new beacon APIs(accessors for pending_deposits/pending_partial_withdrawals) (#7006)
Resolves #7003


  Added two endpoints as https://github.com/ethereum/beacon-APIs/pull/500 proposed:
- `/eth/v1/beacon/states/{state_id}/pending_deposits`
- `/eth/v1/beacon/states/{state_id}/pending_partial_withdrawals`
2025-03-17 01:46:50 +00:00
João Oliveira
c095a0a58f update gossipsub to the latest upstream revision (#7130)
I feel it's preferable to do this explicitly by updating the revision on `Cargo.toml` rather than implicitly by letting `Cargo.lock` control the revision of the branch.
2025-03-17 00:22:19 +00:00
Daniel Knopik
574b204bdb decouple eth2 from store and lighthouse_network (#6680)
- #6452 (partially)


  Remove dependencies on `store` and `lighthouse_network`  from `eth2`. This was achieved as follows:

- depend on `enr` and `multiaddr` directly instead of using `lighthouse_network`'s reexports.
- make `lighthouse_network` responsible for converting between API and internal types.
- in two cases, remove complex internal types and use the generic `serde_json::Value` instead - this is not ideal, but should be fine for now, as this affects two internal non-spec endpoints which are meant for debugging, unstable, and subject to change without notice anyway. Inspired by #6679. The alternative is to move all relevant types to `eth2` or `types` instead - what do you think?
2025-03-14 16:44:48 +00:00
Lion - dapplion
a6bdc474db Log range sync download errors (#6991)
Currently range sync download log errors just say `error: rpc_error` which isn't helpful. The actual error is suppressed unless logged somewhere else.


  Log the actual error that caused the batch download to fail as part of the log that states that the batch download failed.
2025-03-13 13:26:43 +00:00
Eitan Seri-Levi
2c40f0b004 Set epochs-per-blob-prune default to 256 (#7113)
Partially #7100


  Set blob pruning to default to once per day
2025-03-13 02:43:07 +00:00
ThreeHrSleep
d60c24ef1c Integrate tracing (#6339)
Tracing Integration
- [reference](5bbf1859e9/projects/project-ideas.md (L297))


  - [x] replace slog & log with tracing throughout the codebase
- [x] implement custom crit log
- [x] make relevant changes in the formatter
- [x] replace sloggers
- [x] re-write SSE logging components

cc: @macladson @eserilev
2025-03-12 22:31:05 +00:00
João Oliveira
f23f984f85 switch to upstream gossipsub (#7057)
We forked `gossipsub` into the lighthouse repo sometime ago so that we could iterate quicker on implementing back pressure and IDONTWANT.
Meanwhile we have pushed all our changes upstream and we are now the main maintainers of `rust-libp2p` this allows us to use upstream `gossipsub` again.
Nonetheless we still use our forked repo to give us freedom to experiment with features before submitting them upstream
2025-03-11 12:59:16 +00:00
Michael Sproul
1a08e6f0a0 Remove duplicate sync_tolerance_epochs config (#7109)
Delete duplicate sync tolerance epoch config in the HTTP API which is unused.

We introduced the `sync-tolerance-epoch` flag in this PR:

- https://github.com/sigp/lighthouse/pull/7030

Then refined it in this PR:

- https://github.com/sigp/lighthouse/pull/7044

Somewhere in the merge of `release-v7.0.0` into `unstable`, the config from the original PR which had been deleted came back. I think I resolved these conflicts, so my bad.
2025-03-11 05:35:13 +00:00
Jimmy Chen
9c4fc6eac2 Change state cache size default to 32 (#7101)
Cherry-picking #7055 from `holesky-rescue` branch to the clean `release-v7.0.0` branch.
2025-03-11 01:21:50 +00:00
Paul Hauner
8d1abce26e Bump SSZ version for larger bitfield SmallVec (#6915)
NA


  Bumps the `ethereum_ssz` version, along with other crates that share the dep.

Primarily, this give us bitfields which can store 128 bytes on the stack before allocating, rather than 32 bytes (https://github.com/sigp/ethereum_ssz/pull/38). The validator count has increase massively since we set it at 32 bytes, so aggregation bitfields (et al) now require a heap allocation. This new value of 128 should get us to ~2m active validators.
2025-03-10 08:18:33 +00:00
Michael Sproul
7d598ed8a5 Optimise status processing (#7082)
This is a backport from `holesky-rescue`.

Part of:

- https://github.com/sigp/lighthouse/issues/7039

Original PR to `holesky-rescue`:

- https://github.com/sigp/lighthouse/pull/7054


  Avoid doing database lookups for slots that lie in the hot database when processing status messages. This avoids a DoS vector during non-finality, as loading hot states to iterate block roots is very expensive.
2025-03-10 04:51:49 +00:00
Michael Sproul
b4e79edf2a Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-03-10 15:21:24 +11:00
Jimmy Chen
09849e841b Use sync_tolerance_epochs flag to control the proposer prep routines (#7044)
Replace the `2 + 2 == 5` hacks from `holesky-rescue` and use the existing `sync_tolerance_epochs` flag to control the proposer prep routines.
2025-03-06 03:50:42 +00:00
Ryan Schneider
efa6ba37bb Make ExecutionBlock::total_difficulty Optional (#7050)
This change makes the `total_difficulty` field in `ExecutionBlock` an `Option<Uint256>` since newer clients are no longer including the `totalDifficulty` field.

I think this will fix https://github.com/sigp/lighthouse/issues/6937 but I was actually more focused on the builder registration case described below.

In our [builder-playground](https://github.com/flashbots/builder-playground) we setup a local devnet using lighthouse, reth, and mev-boost-relay.  After upgrading to reth 1.2.0 and lighthouse v7.0.0.beta.0 for Pectra, we noticed that the validator registration process was _sometimes_ failing with:

```
Feb 25 23:35:25.038 ERRO Unable to publish proposer preparation to all beacon nodes, error: Some endpoints failed, num_failed: 1 http://localhost:3500/ => RequestFailed(ServerMessage(ErrorMessage { code: 400, message: "BAD_REQUEST: error updating proposer preparations: ForkchoiceUpdate(EngineError(Api { error: Json(Error(\"missing field `totalDifficulty`\", line: 0, column: 0)) }))", stacktraces: [] })), service: preparation
Feb 25 23:35:25.099 WARN Unable to publish validator registrations to the builder network, error: Some endpoints failed, num_failed: 1 http://localhost:3500/ => RequestFailed(ServerMessage(ErrorMessage { code: 400, message: "BAD_REQUEST: error updating proposer preparations: ForkchoiceUpdate(EngineError(Api { error: Json(Error(\"missing field `totalDifficulty`\", line: 0, column: 0)) }))", stacktraces: [] })), service: preparation
```

What was even more confusing, was that it was sometimes working, which actually led to a wild goose chase thinking it was a networking issue.  However, when tracing through the LH code, I came across this comment:

70194dfc6a/beacon_node/beacon_chain/src/beacon_chain.rs (L6048-L6049)

This explained why it sometimes worked, in our playground we run lighthouse with `--prepare-payload-lookahead 8000` thus there was always a 4-second window where the call wasn't made.

But, if the call was made, then this code would 100% fail with updated reth:

https://github.com/sigp/lighthouse/blob/unstable/beacon_node/execution_layer/src/lib.rs#L1688-L1692

Which would then mapped to a `Error::ForkchoiceUpdate` in `update_execution_engine_forkchoice`.


  Anyways, the fix was to make `total_difficulty` Optional, and then to update any code paths where it was used.  In doing so, I assume that if the EL doesn't include total difficulty then the chain is already post-merge.
2025-03-05 01:53:00 +00:00
Jimmy Chen
fe0cf9cb67 Add test flag to override SYNC_TOLERANCE_EPOCHS for range sync testing (#7030)
Related to #6880, an issue that's usually observed on local devnets with small number of nodes.

When testing range sync, I usually shutdown a node for some period of time and restart it again. However, if it's within `SYNC_TOLERANCE_EPOCHS` (8), Lighthouse would consider the node as synced, and if it may attempt to produce a block if requested by a validator - on a local devnet, nodes frequently produce blocks - when this happens, the node ends up producing a block that would revert finality and would get disconnected from peers immediately.

NOTE: This is PR#7030 cherry-picked from `unstable` to `release-v7.0.0`.

Run Lighthouse BN with this flag to override:

```
--sync-tolerance--epoch 0
```
2025-02-26 18:29:49 +11:00
Michael Sproul
cf4104abe5 Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-02-25 08:42:16 +11:00
Jimmy Chen
54b4150a62 Add test flag to override SYNC_TOLERANCE_EPOCHS for range sync testing (#7030)
Related to #6880, an issue that's usually observed on local devnets with small number of nodes.

When testing range sync, I usually shutdown a node for some period of time and restart it again. However, if it's within `SYNC_TOLERANCE_EPOCHS` (8), Lighthouse would consider the node as synced, and if it may attempt to produce a block if requested by a validator - on a local devnet, nodes frequently produce blocks - when this happens, the node ends up producing a block that would revert finality and would get disconnected from peers immediately.

### Usage

Run Lighthouse BN with this flag to override:

```
--sync-tolerance--epoch 0
```
2025-02-24 08:30:11 +00:00
Michael Sproul
454c7d05c4 Remove LC server config from HTTP API (#7017)
Partly addresses

- https://github.com/sigp/lighthouse/issues/6959


  Use the `enable_light_client_server` field from the beacon chain config in the HTTP API. I think we can make this the single source of truth, as I think the network crate also has access to the beacon chain config.
2025-02-24 07:15:32 +00:00
Krishang Shah
6e11bddd4b feat: adds CLI flags to delay publishing for edge case testing on PeerDAS devnets (#6947)
Closes #6919
2025-02-24 06:03:17 +00:00
Lion - dapplion
3fab6a2c0b Block availability data enum (#6866)
PeerDAS has undergone multiple refactors + the blending with the get_blobs optimization has generated technical debt.

A function signature like this

f008b84079/beacon_node/beacon_chain/src/beacon_chain.rs (L7171-L7178)

Allows at least the following combination of states:
- blobs: Some / None
- data_columns: Some / None
- data_column_recv: Some / None
- Block has data? Yes / No
- Block post-PeerDAS? Yes / No

In reality, we don't have that many possible states, only:

- `NoData`: pre-deneb, pre-PeerDAS with 0 blobs or post-PeerDAS with 0 blobs
- `Blobs(BlobSidecarList<E>)`: post-Deneb pre-PeerDAS with > 0 blobs
- `DataColumns(DataColumnSidecarList<E>)`: post-PeerDAS with > 0 blobs
- `DataColumnsRecv(oneshot::Receiver<DataColumnSidecarList<E>>)`: post-PeerDAS with > 0 blobs, but we obtained the columns via reconstruction

^ this are the variants of the new `AvailableBlockData` enum

So we go from 2^5 states to 4 well-defined. Downstream code benefits nicely from this clarity and I think it makes the whole feature much more maintainable.

Currently `is_available` returns a bool, and then we construct the available block in `make_available`. In a way the availability condition is duplicated in both functions. Instead, this PR constructs `AvailableBlockData` in `is_available` so the availability conditions are written once

```rust
if let Some(block_data) = is_available(..) {
let available_block = make_available(block_data);
}
```
2025-02-24 04:47:09 +00:00
Pawan Dhananjay
522b3cbaab Fix builder API headers (#7009)
Resolves https://github.com/sigp/lighthouse/issues/7000


  Set the accept header on builder to the correct value when requesting ssz.

This PR also adds a flag to disable ssz over the builder api altogether. In the case that builders/relays have an ssz bug, we can react quickly by asking clients to restart their nodes with the `--disable-ssz-builder` flag to force json. I'm not fully convinced if this is useful so open to removing it or opening another PR for it.

Testing this currently.
2025-02-24 03:39:13 +00:00
Michael Sproul
e5e43ecd81 Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-02-24 13:59:40 +11:00
Pawan Dhananjay
b3b6aea1c5 Rust 1.85 lints (#7019)
N/A


  2 changes:
1. Replace Option::map_or(true, ...) with is_none_or(...)
2. Remove unnecessary `Into::into` blocks where the type conversion is apparent from the types
2025-02-24 02:36:13 +00:00
João Oliveira
193061ff73 Use RpcSend on RPC::self_limiter::ready_requests (#6634)
We don't need to store `BehaviourAction` for `ready_requests` and therefore avoid having an `unreachable!` on #6625.
Therefore this PR should be merged before it
2025-02-19 21:17:19 +00:00
Michael Sproul
6ab6eae40c Merge remote-tracking branch 'origin/release-v7.0.0-beta.0' into unstable 2025-02-14 10:26:36 +11:00
Michael Sproul
1888be554c Release v7.0.0-beta.0 (#6962)
New release for Electra on Holesky and Sepolia.

Includes PRs:

- https://github.com/sigp/lighthouse/pull/6808
- https://github.com/sigp/lighthouse/pull/6914
- https://github.com/sigp/lighthouse/pull/6949
- https://github.com/sigp/lighthouse/pull/6950
- https://github.com/sigp/lighthouse/pull/6958
2025-02-13 03:06:20 +00:00
Michael Sproul
25f804a111 Fix light client plumbing in beacon processor (#6993)
Our Holesky nodes running with the light client enabled were logging messages about full queues:

> Feb 12 22:09:28.949 ERRO Work queue is full                      queue: unknown_light_client_optimistic_update, queue_len: 128, msg: the system has insufficient resources for load, service: bproc

I thought this might be genuine overload, but it turns out this queue was never being read from!


  - [x] Rename light-client related queues in the beacon processor for clarity.
- [x] Ensure all light-client related queues are being popped from.
2025-02-13 00:50:17 +00:00
Eitan Seri-Levi
ed8086c897 Ensure GET v2/validator/aggregate_attestation is backwards compatible (#6984)
Closes #6983


  `GET v2/validator/aggregate_attestation` is not backwards compatible. It only works for post electra attestations. This PR adds backwards compatibility and additional test coverage. We should include this in the upcoming 7.0 beta release if possible
2025-02-12 00:13:05 +00:00
Lion - dapplion
0055af56b6 Unsubscribe blob topics at Fulu fork (#6932)
Addresses #6854.

PeerDAS requires unsubscribing a Gossip topic at a fork boundary. This is not possible with our current topic machinery.


  Instead of defining which topics have to be **added** at a given fork, we define the complete set of topics at a given fork. The new start of the show and key function is:

```rust
pub fn core_topics_to_subscribe<E: EthSpec>(
fork_name: ForkName,
opts: &TopicConfig,
spec: &ChainSpec,
) -> Vec<GossipKind> {
// ...
if fork_name.deneb_enabled() && !fork_name.fulu_enabled() {
// All of deneb blob topics are core topics
for i in 0..spec.blob_sidecar_subnet_count(fork_name) {
topics.push(GossipKind::BlobSidecar(i));
}
}
// ...
}
```

`core_topics_to_subscribe` only returns the blob topics if `fork < Fulu`. Then at the fork boundary, we subscribe with the new fork digest to `core_topics_to_subscribe(next_fork)`, which excludes the blob topics.

I added `is_fork_non_core_topic` to carry on to the next fork the aggregator topics for attestations and sync committee messages. This approach is future-proof if those topics ever become fork-dependent.

Closes https://github.com/sigp/lighthouse/issues/6854
2025-02-11 23:40:14 +00:00