Closes#9545
`BeaconNodeHttpClient::eth_path` clones the configured server URL and pushes path segments onto it. When the base URL ends with a slash (such as a QuickNode-style endpoint that embeds the API key in the path, `https://<host>/<api-key>/`), the URL already carries an empty trailing segment, so pushing `eth` yields a double slash:
```
https://<host>/<api-key>//eth/v2/debug/beacon/states/finalized
```
The provider responds to the `//` form with a `307` redirect to a normalized path that drops the key segment, so the redirected request arrives unauthenticated and checkpoint sync fails with `Error loading checkpoint state from remote: StatusCode(401)`.
This PR calls `pop_if_empty()` before pushing segments, the standard `url`-crate idiom for trailing-slash-tolerant joining, so a trailing slash no longer changes the resulting path. The fix is applied in `eth_path` (the shared prefix builder that every `/eth/vX` endpoint derives from) and in `post_lighthouse_liveness`, the only other builder that extends the base URL directly.
Co-Authored-By: rahulbarman <itsrahulbarman1@gmail.com>
Range/backfill sync already de-prioritizes peers that failed a batch download (batch.failed_peers() -> peers_to_deprioritize). Block and payload lookups did not: SingleLookupRequestState only tracked a failure counter, not which peer failed, so a retry could re-select the same flaky peer.
Track failed_peers per request-state, record the peer on download failure, and feed it into the block_lookup_request / payload_lookup_request peer-selection sort key, mirroring range sync. Custody lookups already have their own per-peer-attempt mechanism and are left unchanged.
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
- Generalize the custody-by-root request to accept a `Vec<Hash256>` of block roots so a whole batch is fetched in one request; `DataColumnsByRootRequestParams { block_roots, indices }` completes at `block_roots.len() * indices.len()`.
- `custody_lookup_request` takes `block_roots: &[Hash256]` + `block_epoch`; `cached_data_column_indexes` takes `block_epoch`.
- `block_sidecar_coupling.rs`: couple a batch's blocks with the columns returned by the single custody-by-root request; drop the per-column request map and retry/faulty-peer machinery.
- Remove the now-unused columns-by-range range-sync path and the serialize-downloads gate.
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
- Bump `warp` to 0.4. This unifies `warp` and `axum` onto the same `http`, `hyper`, `h2`, `rustls`, etc versions.
- Create `axum_utils` which contain common functions and types
- Begins migration of all HTTP API servers from warp to axum
Co-Authored-By: Mac L <mjladson@pm.me>
`dequeue_attestations` released votes by splitting the queue at the first entry with `slot >= current_slot`, which assumes the queue is sorted by slot. It isn't: `on_attestation` pushes attestations in arrival order and never sorts. When a future-slot vote sits ahead of a vote that is already due, the split happens at the future-slot vote and the due vote stays stuck behind it and is never applied to fork choice, even after its slot is in the past.
The PR current uses a naive solution to solve the bug and also adds regression tests to exercise the bug. There are other competing solutions which can be used which also optimize this path at the same time.
https://github.com/sigp/lighthouse/pull/8378https://github.com/sigp/lighthouse/pull/8378#discussion_r2543322106
Co-Authored-By: hopinheimer <knmanas6@gmail.com>
Co-Authored-By: hopinheimer <48147533+hopinheimer@users.noreply.github.com>
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Remove some `is_gloas` checks that are unnecessary in the `gloas_reorg_tests.rs`.
I found myself wanting to make this change while tweaking these tests in another PR. Figured it makes sense as a simple standalone PR.
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Co-Authored-By: hopinheimer <48147533+hopinheimer@users.noreply.github.com>
I noticed that `beacon_node/http_api/src/beacon/execution_payload_envelope.rs` was recently removed but not added to the forbidden-files.txt.
Add the removed file to the forbidden list to ensure it isn't accidentally re-added by a merge or rebase.
Co-Authored-By: Mac L <mjladson@pm.me>
The proposer preferences service was attempting to publish preferences at the start of each epoch. This caused it to race with the validator duties service, it wouldn't calculate validator duties in time for the proposer preference service.
This PR first updates the validator duties service to calculate proposer duties for the current epoch and the next epoch. After Fulu we have the ability to look ahead one epoch for proposer duties, but we never updated the vc to leverage this feature.
This PR also updates the proposer preferences service to fire at every slot. We have an `(Epoch, DependentRoot)` map that prevents us from publishing the same preferences twice.
The changes here should prevent the race condition between the two services and make the proposer preferences service more robust in general.
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
This PR fixes two issues:
1. This condition is inverted: dfb259171a/beacon_node/network/src/network_beacon_processor/gossip_methods.rs (L1507-L1508)
We are supposed to filter out incomplete columns when we DON'T have local blobs yet!
2. When the EL returns no blobs, we never store a partial in the assembler, and this code fails to publish our need to the network, as no partials are returned: dfb259171a/beacon_node/network/src/network_beacon_processor/mod.rs (L1038-L1050)
The simple fix for 1 would be to invert the condition, but we can improve the flow here: Instead of not publishing anything, we can publish what we got, but not request anything. This ties into the fix for 2: After get blobs completes, we not only publish anything in the partial assembler, but also for every missing custody column in there, publish an empty column and a request for all cells.
In particular:
- When sending a partial message to `network`, allow specifying a request bitmap instead of hardcoding an all-ones bitmap.
- For clarity and to prepare for Gloas integration, add a `PubsubPartialMessage` enum with a `DataColumnFulu` variant.
- On republishing after merging a gossip column: always publish, but only request cells if local blobs are known or get blobs is disabled. This also prepares us to request only *some* cells, e.g. in cases where we are aware of the blobs that the EL is going to send us, e.g. via `engine_hasBlobs`.
- Move guards in `fetch_engine_blobs_and_publish` to ensure everything works fine if there are no blobs or if get_blobs is disabled.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
- Move info about pre-8.0 schema migrations to the historical migrations section.
- Include info about v8.1.x and v8.2.x.
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Refactors the payload attestation service
- Returns a `Result`. we need this for #9434 so we can keep track if we've succeeded with producing payload attestation earlier than the deadline
- Separates getting payload attestation data and signing + publishing. This mimics what we do in attestation service and also is needed for #9434 to surface the error while still keeping the same spawning mechanism. In #9434 we want to broadcast payload attestations early if we've already seen an avail envelope. If the SSE event fires, but for some reason getting the payload attestation data from the BN fails, we still want to retry at the deadline. If signing + publishing fails we wont retry at the deadline (similar to the attestation service).
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Fix a bug in fork choice whereby the unrealized justified and finalized checkpoints from a parent block would be incorrectly carried over to a child block in the same epoch. The documented optimisation in `fork_choice.rs` was wrong because it failed to account for slashings.
This bug is not considered to be sensitive due to the difficulty of triggering it, and the low payoff for doing so (fleeting divergence).
Keep the optimisation, updating it to correctly skip reusing the parent checkpoints when slashings are present.
A more minimal alternative would be to scrap the optimisation altogether (always compute the checkpoints), however this would come with a minor performance downside. I think the updated optimisation is simple enough to be worth retaining.
There are 3 regression tests added which confirm the correct behaviour. Temporarily setting `has_slashings` to `false` causes all 3 tests to fail.
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Handle payload-present attestations before the payload is seen (gloas)
A gloas beacon_attestation with index == 1 claims a past block's payload is already present. If we haven't seen that block's payload envelope yet, we shouldn't reject it the envelope may just be in flight.
So instead we IGNORE it (new AttnError::UnknownPayloadEnvelope), ask sync to fetch the envelope, and park the attestation in the reprocess queue. When the envelope is imported, the parked attestations are released and re-verified.
The envelope lookup itself is stubbed here and wired up in #9155 or a follow up PR
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Similar to https://github.com/sigp/lighthouse/pull/9476, partial columns is a feature expected to become the default soon, so instead of introducing a CLI option that will be removed again soon, consolidate into `--enable-partial-columns` which now takes a boolean argument.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
Single block lookups do not respect the `earliest_available_slot` peers sent. This causes us to potentially request columns from peers that do not custody columns yet (but will soon).
Pass down the block's slot and only consider peers where `earliest_available_slot <= block_slot` for custody column requests.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
Use the `LruCache` implementation provided by `hashlink` instead of the current `lru` one.
This is mostly a 1-to-1 swap with only slight API incompatibilities.
I have decided to leave some config files which previously used `NonZeroUsize` but they may not be required anymore and could potentially switch to `usize`.
Co-Authored-By: Mac L <mjladson@pm.me>
Bump `discv5` to the latest release. This removes the duplicated `hashlink` dependency and also removes `ahash`.
Co-Authored-By: Mac L <mjladson@pm.me>
There's a recent breaking change that deprecated dashboard-based snapshot enablement, and is breaking our CI:
https://www.warpbuild.com/docs/ci/changelog/2026-june#june-8-2026
> Legacy dashboard-based snapshot enablement has been removed. Runners previously configured with snapshots enabled from the dashboard will no longer use snapshots. To continue using snapshot runners, migrate to the label-based configuration by adding snapshot.enabled=true or snapshot.key=<alias> to your runs-on labels. See the [Snapshot Runners docs](https://www.warpbuild.com/docs/ci/features/snapshot-runners) for migration details.
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Small collection of improved diagnostics for partials.
- In the peer info struct (exposing peer data via `/lighthouse/peers`), add a new field to indicate on which subnets the peer supports partials.
- Fix the description of several metrics.
- Downgrade some noisy logging when sending partials to trace, and downgrade one warning that should not worry the user.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
N/A
Implement range sync in gloas.
Basically requests blocks and payloads post gloas from the same peer, couples them and sends it for processing.
Does not change sync much at all other than adding the machinery for payloads by range requests.
Main changes are:
`RangeSyncBlock` which used to be a struct is an enum to account for the Gloas case. This allows a clear separation between gloas and pre-gloas code.
`AvailableBlockData` now has a `BlockInEnvelope` variant. This is to clearly indicate the post gloas case. I feel this is simpler to follow compared to `NoData` variant.
Tries to extract post gloas logic into its own functions so that there is minimal logic change in mainnet range sync behaviour.
This is meant as a stable base on which we can iterate further to make range sync cleaner and for unblocking range sync support on devnet. Some ideas for later is removing the retry mechanism in favour of delegating column fetching to lookup sync which can be done post #9155 and batch signature verifying envelopes.
Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
The initial partial send after getBlobs only sends incomplete partial columns, causing issues as we do not propagate these columns.
Return complete and incomplete columns from `get_columns_and_mark_as_local_fetched`, and convert any complete columns to partials if necessary. Send all columns retrieved this way after getBlobs
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
Implementing gloas lookup sync is currently incompatible with the `GossipBlockProcessResult` mechanism.
Today it's implemented such that if we receive a sucessful `GossipBlockProcessResult` we directly mark the lookup as Complete and delete it. In Gloas we can't delete a lookup after block import, as we may still have FULL child awaiting the payload.
IMO this `GossipBlockProcessResult` brings a lot of headache and edge cases that we can just live without. Also the `reset_request` business is nasty and can easily leave the lookup in a bad state.
If we get rid of `GossipBlockProcessResult` we only pay the following performance penalty:
- Lookup is created exactly while the block's payload is being execution validated
- (new degradation) we download the block again
- send the block for processing but the duplicate cache prevents double execution
So in the worst case we spend a few KBs of extra download bandwidth. Remember each block is downloaded 8x times through gossip in the happy case.
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
No. But related to #9009 and #8996
- Change the `ForkContext::next_fork_digest()` to return `[u8; 4]` (returning `[0u8; 4]` for "no next fork").
- Update the initialization path and runtime fork transition path accordingly.
Added tests:
- [x] `test_next_fork_digest` — existing test passes with non-Option return type
- [x] `test_next_fork_digest_returns_zero_when_no_next_fork` — init at last BPO fork returns `[0u8; 4]`
- [x] `test_next_fork_digest_zero_after_runtime_transition_to_last_fork` — simulates `update_current_fork` to last fork, then verifies zero
Co-Authored-By: alleysira <1367108378@qq.com>
Co-Authored-By: Alleysira <56925051+Alleysira@users.noreply.github.com>
Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>
N/A
Currently, we have `EnvelopeError` having a `ImportError` wrapping a `BlockError`. I feel this is extremely unintuitive because most of the envelope processing functions can simply return an `EnvelopeError` that makes sense in the function's context. It revealed further ugliness when implementing range sync in #9362
This PR does 2 main things:
1. Removes `ImportError(BlockError)` variant
2. Adds `EnvelopeError(EnvelopeError)` variant to a `BlockError`.
I feel this is more natural as there can be envelope errors when we try importing a Block but envelope errors can be contained to just envelope related errors.
The main blocker to doing this was `PayloadVerificationHandle` returning a `BlockError`. It uses a very small subset of `BlockError` which I extracted to its own error type which can be converted into both a BlockError and EnvelopeError.
This allows us to keep most of the pure envelope processing functions to just return EnvelopeErrors while we convert it to a `BlockError` only in import paths where we need to return a consolidated `BlockError`.
Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>