Commit Graph

3393 Commits

Author SHA1 Message Date
Akihito Nakano
1324d3d3c4 Delayed RPC Send Using Tokens (#5923)
closes https://github.com/sigp/lighthouse/issues/5785


  The diagram below shows the differences in how the receiver (responder) behaves before and after this PR. The following sentences will detail the changes.

```mermaid
flowchart TD

subgraph "*** After ***"
Start2([START]) --> AA[Receive request]
AA --> COND1{Is there already an active request <br> with the same protocol?}
COND1 --> |Yes| CC[Send error response]
CC --> End2([END])
%% COND1 --> |No| COND2{Request is too large?}
%% COND2 --> |Yes| CC
COND1 --> |No| DD[Process request]
DD --> EE{Rate limit reached?}
EE --> |Yes| FF[Wait until tokens are regenerated]
FF --> EE
EE --> |No| GG[Send response]
GG --> End2
end

subgraph "*** Before ***"
Start([START]) --> A[Receive request]
A --> B{Rate limit reached <br> or <br> request is too large?}
B -->|Yes| C[Send error response]
C --> End([END])
B -->|No| E[Process request]
E --> F[Send response]
F --> End
end
```

### `Is there already an active request with the same protocol?`

This check is not performed in `Before`. This is taken from the PR in the consensus-spec, which proposes updates regarding rate limiting and response timeout.
https://github.com/ethereum/consensus-specs/pull/3767/files
> The requester MUST NOT make more than two concurrent requests with the same ID.

The PR mentions the requester side. In this PR, I introduced the `ActiveRequestsLimiter` for the `responder` side to restrict more than two requests from running simultaneously on the same protocol per peer. If the limiter disallows a request, the responder sends a rate-limited error and penalizes the requester.



### `Rate limit reached?` and `Wait until tokens are regenerated`

UPDATE: I moved the limiter logic to the behaviour side. https://github.com/sigp/lighthouse/pull/5923#issuecomment-2379535927

~~The rate limiter is shared between the behaviour and the handler.  (`Arc<Mutex<RateLimiter>>>`) The handler checks the rate limit and queues the response if the limit is reached. The behaviour handles pruning.~~

~~I considered not sharing the rate limiter between the behaviour and the handler, and performing all of these either within the behaviour or handler. However, I decided against this for the following reasons:~~

- ~~Regarding performing everything within the behaviour: The behaviour is unable to recognize the response protocol when `RPC::send_response()` is called, especially when the response is `RPCCodedResponse::Error`. Therefore, the behaviour can't rate limit responses based on the response protocol.~~
- ~~Regarding performing everything within the handler: When multiple connections are established with a peer, there could be multiple handlers interacting with that peer. Thus, we cannot enforce rate limiting per peer solely within the handler. (Any ideas? 🤔 )~~
2025-04-24 03:46:16 +00:00
chonghe
c13e069c9c Revise logging when queue is full (#7324) 2025-04-22 22:46:30 +00:00
Michael Sproul
e61e92b926 Merge remote-tracking branch 'origin/stable' into unstable 2025-04-22 18:55:06 +10:00
Michael Sproul
54f7bc5b2c Release v7.0.0 (#7288)
New v7.0.0 release for Electra on mainnet.
2025-04-22 09:21:03 +10:00
chonghe
80fe133d2c Update Lighthouse Book for Electra features (#7280)
* #7227
2025-04-17 09:31:26 +00:00
Mac L
c32569ab83 Restore HTTP API logging and add more metrics (#7225)
#7124


  - Restores previous HTTP logging with tracing compatible syntax
- Adds metrics for certain missing endpoints (and alphabetized the existing ones)
2025-04-17 08:18:45 +00:00
Michael Sproul
fd82ee2f81 Release v7.0.0-beta.7 (#7333) 2025-04-17 14:46:43 +10:00
Jean-Baptiste Pinalie
5352d5f78a Update proposer_slashings and attester_slashings amounts for electra. (#7316)
Did not find a specific issue beside https://github.com/sigp/lighthouse/issues/6821


  Leverage `whistleblower_reward_quotient_for_state` to have accurate post-electra `proposer_slashings` and `attester_slashings` fields returned by `/eth/v1/beacon/rewards/blocks/<id>`.
2025-04-17 00:58:36 +00:00
Michael Sproul
6fad6fba6a Release v7.0.0-beta.6 2025-04-16 08:54:53 +10:00
Jimmy Chen
476f3a593c Add MAX_BLOBS_PER_BLOCK_FULU config (#7161)
Add `MAX_BLOBS_PER_BLOCK_FULU` config.
2025-04-15 00:20:46 +00:00
Lion - dapplion
be68dd24d0 Fix wrong custody column count for lookup blocks (#7281)
Fixes
- https://github.com/sigp/lighthouse/issues/7278


  Don't assume 0 columns for `RpcBlockInner::Block`
2025-04-11 22:00:57 +00:00
Mac L
39eb8145f8 Merge branch 'release-v7.0.0' into unstable 2025-04-11 21:32:24 +10:00
Eitan Seri-Levi
af51d50b05 Ensure /eth/v2/beacon/pool/attestations honors committee_index (#7298)
#7294


  Fix the filtering logic so that we actually filter by committee index for both `Base` and `Electra` attestations.

Added a tiny optimization when calculating committee_index to prevent unneeded memory allocations

Added a regression test
2025-04-11 04:47:30 +00:00
Eitan Seri-Levi
ef8ec35ac5 Ensure light_client/updates endpoint returns spec compliant SSZ data (#7230)
Closes #7167


  - Ensure the fork digest is generated from ther light client updates attested header and not the signature slot
- Ensure the format of the SSZ response is spec compliant
2025-04-11 04:47:27 +00:00
Eitan Seri-Levi
aed562abef Downgrade light client errors (#7300)
Downgrade light client errors to debug

Error messages are alarming and usually indicate somethings wrong with the beacon node. The Light Client service is supposed to minimally impact users, and most will not care if the light client server is erroring. Furthermore, the only errors we've seen in the wild are during hard forks, for the first few epochs before the fork finalizes.
2025-04-10 02:17:07 +00:00
Mac L
7534f5752d Add pending_consolidations Beacon API endpoint (#7290)
#7282


  Adds the missing `beacon/states/{state_id}/pending_consolidations` Beacon API endpoint along with related tests.
2025-04-10 01:21:01 +00:00
SunnysidedJ
d96b73152e Fix for #6296: Deterministic RNG in peer DAS publish block tests (#7192)
#6296: Deterministic RNG in peer DAS publish block tests


  Made test functions to call publish-block APIs with true for the deterministic RNG boolean parameter while production code with false. This will deterministically shuffle columns for unit tests under broadcast_validation_tests.rs.
2025-04-09 15:35:15 +00:00
Michael Sproul
ec643843e0 Remove/document remaining Electra TODOs (#6982)
Not essential to merge this now, but I'm going through TODOs for Electra to make sure we haven't missed anything.

Targeting this at the release branch anyway so that auditors/readers don't get alarmed 😅
2025-04-09 04:14:50 +00:00
Pawan Dhananjay
076f3f0984 Clarify network limits (#7175)
Resolves #6811


  Rename `GOSSIP_MAX_SIZE` to `MAX_PAYLOAD_SIZE` and remove `MAX_CHUNK_SIZE` in accordance with the spec.

The spec also "clarifies"  the message size limits at different levels. The rpc limits are equivalent to what we had before imo.
The gossip limits have additional checks.

I have gotten rid of the `is_bellatrix_enabled`  checks that used a lower limit (1mb) pre-merge. Since all networks we run start from the merge, I don't think this will break any setups.
2025-04-09 02:50:45 +00:00
Jimmy Chen
759b0612b3 Offloading KZG Proof Computation from the beacon node (#7117)
Addresses #7108

- Add EL integration for `getPayloadV5` and `getBlobsV2`
- Offload proof computation and use proofs from EL RPC APIs
2025-04-08 07:37:16 +00:00
Jimmy Chen
e924264e17 Fullnodes to publish data columns from EL getBlobs (#7258)
Previously only supernode contributes to data column publishing in Lighthouse.

Recently we've [updated the spec](https://github.com/ethereum/consensus-specs/pull/4183) to have full nodes publishing data columns as well, to ensure all nodes contributes to propagation.

This also prevents already imported data columns from being imported again (because we don't "observe" them), and ensures columns that are observed in the [gossip seen cache](d60c24ef1c/beacon_node/beacon_chain/src/data_column_verification.rs (L492)) are forwarded to its peers, rather than being ignored.
2025-04-08 03:20:31 +00:00
Michael Sproul
47a85cd118 Bump version to v7.1.0-beta.0 (not a release) (#7269)
Having merged the drop-headtracker PR we now have a DB schema change in `unstable` compared to `release-v7.0.0`:

- https://github.com/sigp/lighthouse/pull/6744

There is a DB downgrade available, however this needs to be applied manually and it's usually a bit of a hassle.

This PR bumps the version on `unstable` to `v7.1.0-beta.0` _without_ actually cutting a `v7.1.0-beta.0` release, so that we can tell at a glance which schema version a node is using.
2025-04-07 06:01:20 +00:00
Lion - dapplion
70850fe58d Drop head tracker for summaries DAG (#6744)
The head tracker is a persisted piece of state that must be kept in sync with the fork-choice. It has been a source of pruning issues in the past, so we want to remove it
- see https://github.com/sigp/lighthouse/issues/1785

When implementing tree-states in the hot DB we have to change the pruning routine (more details below) so we want to do those changes first in isolation.
- see https://github.com/sigp/lighthouse/issues/6580
- If you want to see the full feature of tree-states hot https://github.com/dapplion/lighthouse/pull/39

Closes https://github.com/sigp/lighthouse/issues/1785


  **Current DB migration routine**

- Locate abandoned heads with head tracker
- Use a roots iterator to collect the ancestors of those heads can be pruned
- Delete those abandoned blocks / states
- Migrate the newly finalized chain to the freezer

In summary, it computes what it has to delete and keeps the rest. Then it migrates data to the freezer. If the abandoned forks routine has a bug it can break the freezer migration.

**Proposed migration routine (this PR)**

- Migrate the newly finalized chain to the freezer
- Load all state summaries from disk
- From those, just knowing the head and finalized block compute two sets: (1) descendants of finalized (2) newly finalized chain
- Iterate all summaries, if a summary does not belong to set (1) or (2), delete

This strategy is more sound as it just checks what's there in the hot DB, computes what it has to keep and deletes the rest. Because it does not rely and 3rd pieces of data we can drop the head tracker and pruning checkpoint. Since the DB migration happens **first** now, as long as the computation of the sets to keep is correct we won't have pruning issues.
2025-04-07 04:23:52 +00:00
Pawan Dhananjay
091e292c99 Return eth1_data early post transition (#7248)
N/A


  Return state.eth1_data() early if we have passed the transition period post electra. Even if we don't return early, the function would still return state.eth1_data() based on the current conditions. However, doing this explicitly here to match the spec. This covers setting the right eth1_data in our block.

The other thing we need to ensure is that the deposits returned by the eth1_chain is empty post transition.

The only way we get non-empty deposits post the transition is if `state.eth1_deposit_index` in the below code is less than `min(deposit_requests_start_index, state.eth1_data().deposit_count)`.
0850bcfb89/beacon_node/beacon_chain/src/eth1_chain.rs (L543-L579)

This can never happen because state.eth1_deposit_index will be equal to state.eth1_data.deposit count and cannot exceed the value.

@michaelsproul @ethDreamer please double check the logic for deposits being empty post transition. Following the logic in the spec makes my head hurt.
2025-04-07 03:16:48 +00:00
Lion - dapplion
d511ca0494 Compute roots for unfinalized by_range requests with fork-choice (#7098)
Includes PRs

- https://github.com/sigp/lighthouse/pull/7058
- https://github.com/sigp/lighthouse/pull/7066

Cleaner for the `release-v7.0.0` branch
2025-04-07 03:16:41 +00:00
Jimmy Chen
7cc64cab83 Add missing error log and remove redundant id field from lookup logs (#6990)
Partially #6989.

This PR adds the missing error log when a batch fails due to issues with converting the response into `RpcBlock`. See the above linked issue for more details.

Adding this log reveals that we're completing range requests with missing columns, hence causing the batch to fail. It looks like we've hit the case where we've received enough stream terminations, but not all columns are returned.

```
Feb 12 06:12:16.558 DEBG Failed to convert range block components into RpcBlock, error: No column for block 0xc5b6c7fa02f5ef603d45819c08c6519f1dba661fd5d44a2fc849d3e7028b6007 index 18, id: 3456/RangeSync/116/3432, service: sync, module: network::sync::network_context:488
```

I've also removed some redundant `id` logging, as the `id` debug representation is difficult to read, and is now being logged as part of `req_id` in a more succinct format (relevant PR: #6914)
2025-04-04 09:01:42 +00:00
Jimmy Chen
6a75f24ab1 Fix the getBlobs metric and ensure it is recorded promptly to prevent miscounts (#7188)
From testing conducted by Sunnyside Labs, they noticed that the "expected blobs" are quite low on bandwidth constrained nodes. This observation revealed that we don't record the `beacon_blobs_from_el_expected_total` metric at all if the EL doesn't return any response. The fetch blobs function returns without recording the metric.

To fix this, I've moved `BLOBS_FROM_EL_EXPECTED_TOTAL` and `BLOBS_FROM_EL_RECEIVED_TOTAL` to as early as possible, to make the metric more accurate.
2025-04-04 09:01:39 +00:00
Mac L
0e6da0fcaf Merge branch 'release-v7.0.0' into v7-backmerge 2025-04-04 13:32:58 +11:00
Mac L
82d1674455 Rust 1.86.0 lints (#7254)
Implement lints for the new Rust compiler version 1.86.0.
2025-04-04 02:30:22 +00:00
Age Manning
d6cd049a45 RPC RequestId Cleanup (#7238)
I've been working at updating another library to latest Lighthouse and got very confused with RPC request Ids.

There were types that had fields called `request_id` and `id`. And interchangeably could have types `PeerRequestId`, `rpc::RequestId`, `AppRequestId`, `api_types::RequestId` or even `Request.id`.

I couldn't keep track of which Id was linked to what and what each type meant.

So this PR mainly does a few things:
- Changes the field naming to match the actual type. So any field that has an  `AppRequestId` will be named `app_request_id` rather than `id` or `request_id` for example.
- I simplified the types. I removed the two different `RequestId` types (one in Lighthouse_network the other in the rpc) and grouped them into one. It has one downside tho. I had to add a few unreachable lines of code in the beacon processor, which the extra type would prevent, but I feel like it might be worth it. Happy to add an extra type to avoid those few lines.
- I also removed the concept of `PeerRequestId` which sometimes went alongside a `request_id`. There were times were had a `PeerRequest` and a `Request` being returned, both of which contain a `RequestId` so we had redundant information. I've simplified the logic by removing `PeerRequestId` and made a `ResponseId`. I think if you look at the code changes, it simplifies things a bit and removes the redundant extra info.

I think with this PR things are a little bit easier to reasonable about what is going on with all these RPC Ids.

NOTE: I did this with the help of AI, so probably should be checked
2025-04-03 10:10:15 +00:00
Jimmy Chen
80626e58d2 Attempt to fix flaky network tests (#7244) 2025-04-03 04:01:34 +00:00
Michael Sproul
578db67755 Merge remote-tracking branch 'origin/release-v7.0.0' into backmerge-apr-2 2025-04-02 09:57:42 +11:00
Michael Sproul
9bc0d5161e Disable LevelDB snappy feature (#7235)
Disable the `snappy` feature of LevelDB to prevent compilation issues with CMake 4.0, e.g.

https://github.com/sigp/lighthouse/actions/runs/14182783816/job/39732457274?pr=7231

We do not use Snappy compression in LevelDB, and do not need to compile this. This might also shave a few seconds off compilation!
2025-04-01 07:56:06 +00:00
Michael Sproul
bde0f1ef0b Merge remote-tracking branch 'origin/release-v7.0.0' into unstable 2025-03-29 13:01:58 +11:00
Pawan Dhananjay
54aef2d043 Admin add/remove peer (#7198)
N/A


  Adds endpoints to add and remove trusted peers from the http api. The added peers are trusted peers so they won't be disconnected for bad scores. We try to maintain a connection to the peer in case they disconnect from us by trying to dial it every heartbeat.
2025-03-28 12:59:09 +00:00
Eitan Seri-Levi
a5ea05ce2a Top-up pubkey cache on startup (#7217)
This is a workaround for #7216


  In the case of gaps between the in-memory pub key cache and its on-disk representation, use the head state on startup to "top-up" the cache/db w/ any missing validators
2025-03-28 08:29:19 +00:00
Michael Sproul
6d5a2be7f9 Release v7.0.0-beta.5 (#7210)
New release for Pectra-enabled networks.
2025-03-27 03:42:34 +00:00
Michael Sproul
7d792e615c Fix xdelta3 output buffer issue (#7174)
* Fix xdelta3 output buffer issue

* Fix buckets

* Update commit hash to `main`

* Tag TODO(hdiff)

* Update cargo lock
2025-03-27 13:25:50 +11:00
Lion - dapplion
6f31d44343 Remove CGC from data_availability checker (#7033)
- Part of https://github.com/sigp/lighthouse/issues/6767

Validator custody makes the CGC and set of sampling columns dynamic. Right now this information is stored twice:
- in the data availability checker
- in the network globals

If that state becomes dynamic we must make sure it is in sync updating it twice, or guarding it behind a mutex. However, I noted that we don't really have to keep the CGC inside the data availability checker. All consumers can actually read it from the network globals, and we can update `make_available` to read the expected count of data columns from the block.
2025-03-26 05:19:51 +00:00
Eitan Seri-Levi
2f37bf4de5 Fix more merge conflicts between unstable and release-v7.0.0 2025-03-24 09:13:28 +11:00
Eitan Seri-Levi
cbf1c04a14 resolve merge conflicts between untstable and release-v7.0.0 2025-03-23 11:09:02 -06:00
Lion - dapplion
ca237652f1 Track request IDs in RangeBlockComponentsRequest (#6998)
Part of
- https://github.com/sigp/lighthouse/issues/6258


  `RangeBlockComponentsRequest` handles a set of by_range requests. It's quite lose on these requests, not tracking them by ID. We want to implement individual request retries, so we must make `RangeBlockComponentsRequest` aware of its requests IDs. We don't want the result of a prior by_range request to affect the state of a future retry. Lookup sync uses this mechanism.

Now `RangeBlockComponentsRequest` tracks:

```rust
pub struct RangeBlockComponentsRequest<E: EthSpec> {
blocks_request: ByRangeRequest<BlocksByRangeRequestId, Vec<Arc<SignedBeaconBlock<E>>>>,
block_data_request: RangeBlockDataRequest<E>,
}

enum RangeBlockDataRequest<E: EthSpec> {
NoData,
Blobs(ByRangeRequest<BlobsByRangeRequestId, Vec<Arc<BlobSidecar<E>>>>),
DataColumns {
requests: HashMap<
DataColumnsByRangeRequestId,
ByRangeRequest<DataColumnsByRangeRequestId, DataColumnSidecarList<E>>,
>,
expected_custody_columns: Vec<ColumnIndex>,
},
}

enum ByRangeRequest<I: PartialEq + std::fmt::Display, T> {
Active(I),
Complete(T),
}
```

I have merged `is_finished` and `Into_responses` into the same function. Otherwise, we need to duplicate the logic to figure out if the requests are done.
2025-03-21 04:07:30 +00:00
Michael Sproul
04868027a6 Release v7.0.0-beta.4 (#7162)
New release for Hoodi testnet including clean versions of fixes from `holesky-rescue`.
2025-03-20 05:12:19 +00:00
Eitan Seri-Levi
e4c9805438 Reject attestations to blocks prior to the split (#7084) 2025-03-19 13:39:28 +00:00
Eitan Seri-Levi
ed1b7689ae Manual compaction endpoint backport (#7104)
Backports:
- https://github.com/sigp/lighthouse/pull/7072

To:
- https://github.com/sigp/lighthouse/issues/7039

#7103 should be merged first


  This PR introduces an endpoint that allows users to manually trigger background compaction.
2025-03-18 06:29:12 +00:00
Eitan Seri-Levi
27aabe8159 Pseudo finalization endpoint (#7103)
This is a backport of:
-  https://github.com/sigp/lighthouse/pull/7059
- https://github.com/sigp/lighthouse/pull/7071

For:
- https://github.com/sigp/lighthouse/issues/7039


  Introduce a new lighthouse endpoint that allows a user to force a pseudo finalization. This migrates data to the freezer db and prunes sidechains which may help reduce disk space issues on non finalized networks like Holesky

We also ban peers that send us blocks that conflict with the manually finalized checkpoint.

There were some CI fixes in https://github.com/sigp/lighthouse/pull/7071 that I tried including here

Co-authored with: @jimmygchen  @pawanjay176 @michaelsproul
2025-03-18 05:21:05 +00:00
Michael Sproul
4de062626b State cache tweaks (#7095)
Backport of:

- https://github.com/sigp/lighthouse/pull/7067

For:

- https://github.com/sigp/lighthouse/issues/7039


  - Prevent writing to state cache when migrating the database
- Add `state-cache-headroom` flag to control pruning
- Prune old epoch boundary states ahead of mid-epoch states
- Never prune head block's state
- Avoid caching ancestor states unless they are on an epoch boundary
- Log when states enter/exit the cache

Co-authored-by: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-03-18 02:10:21 +00:00
Eitan Seri-Levi
8ce9edc584 Add block ban flag --invalid-block-roots (#7042) 2025-03-17 13:18:22 +00:00
Jun Song
50b5a72c58 feat: implement new beacon APIs(accessors for pending_deposits/pending_partial_withdrawals) (#7006)
Resolves #7003


  Added two endpoints as https://github.com/ethereum/beacon-APIs/pull/500 proposed:
- `/eth/v1/beacon/states/{state_id}/pending_deposits`
- `/eth/v1/beacon/states/{state_id}/pending_partial_withdrawals`
2025-03-17 01:46:50 +00:00
João Oliveira
c095a0a58f update gossipsub to the latest upstream revision (#7130)
I feel it's preferable to do this explicitly by updating the revision on `Cargo.toml` rather than implicitly by letting `Cargo.lock` control the revision of the branch.
2025-03-17 00:22:19 +00:00