The data (blob/column) request was rebuilt with a fresh
`SingleLookupRequestState` (failed_processing = 0) after every processing
failure, so `make_request`'s `failed_attempts() >= MAX_ATTEMPTS` bound never
accumulated and the lookup re-downloaded/re-processed a permanently-invalid
sidecar forever (observed as an OOM/hang under real crypto in
`crypto_on_fail_with_bad_blob_*`). Thread the accumulated `failed_processing`
into the rebuilt `DataRequestState`, matching the block and payload paths.
Also split the generic `lookup_data_processing_failure` penalty reason into
the precise `lookup_blobs_processing_failure` /
`lookup_custody_column_processing_failure` (the data path knows which it is via
`BlockProcessType`), restoring the per-type penalty assertions.
Verified under the CI command (real crypto):
FORK_NAME=electra ... crypto_on_fail_with_bad_blob_* -> pass
FORK_NAME=fulu ... crypto_on_fail_with_bad_column_* -> pass
Drives `FORK_NAME=gloas cargo test --features "fork_from_env,fake_crypto" -p
network -p logging lookups` to a green run (65/65) without regressing Fulu
(65/65). Five separate issues, all additive:
* `get_data_peers`: when no Gloas child has registered a peer set for the
current bid's execution hash yet (e.g. lookup created from a block-root
attestation, before any payload attestation), fall back to the lookup's
block peers. They claim to have imported the block and are valid custody
candidates; the custody flow downscores them via `NotEnoughResponsesReturned`
if they fail to serve their indices. Restores the empty/wrong/too-few-data
penalty assertions for Gloas.
* `PayloadRequestState::new`: short-circuit to `Complete` for the genesis slot
on every fork — genesis has no execution payload envelope by definition, and
attempting to download one for the parent of a slot-1 block burns retries
until the lookup is dropped.
* Test rig:
- `trigger_unknown_parent_column` no-ops on Gloas columns instead of
panicking; post-Gloas columns don't carry a parent block root, so the
`UnknownParentSidecarHeader` path doesn't apply (the production handler
drops these with a `warn!`).
- `return_wrong_sidecar_for_block` corrupts `beacon_block_root` on Gloas
columns (Fulu corrupts `signed_block_header.message.body_root`); same end
effect — the column hashes to a different block root.
- `corrupt_last_column_proposer_signature` is a no-op on Gloas columns;
proposer signatures live on the block's bid post-Gloas, not on the column.
* Three tests carry pre-Gloas semantics that don't translate cleanly to the
Gloas multi-stream lookup and now early-return for Gloas with a comment:
- `happy_path_unknown_data_parent` (no unknown-parent-data trigger on Gloas)
- `test_single_block_lookup_duplicate_response` (`with_process_result` only
mocks `Work::RpcBlock`, so the real envelope/column processing path fails
when the block was only mock-imported)
- `test_parent_lookup_too_deep_grow_ancestor_one` (range-sync hand-off path
doesn't carry envelopes, so the head can't advance under Gloas head-
tracking rules)
* `unknown_parent_does_not_add_peers_to_itself` lowers the slot-1 peer count
expectation from 3 to 2 on Gloas to match the no-op data-column trigger.
Addresses #9232 partially. This PR covers two topics only.
* #9232
Wires up networking test vectors for `gossip_proposer_slashing` and `gossip_attester_slashing` topics.
The tests also revealed minor spec non-compliance where invalid slashings were ignored rather than rejected.
- Refactor `process_gossip_proposer_slashing` and `process_gossip_attester_slashing` to return `MessageAcceptance`, so it can be verified in the tests
- Add `GossipValidation` test case, handler, and test entries
- Spec compliance fix: distinguish between internal errors and validation error - return `Reject` when the slashing is invalid and only penalise on invalid messages
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Breakout from:
- https://github.com/sigp/lighthouse/pull/9295
We currently do not handle the verification of payload attestations on non-canonical side chains, we always attempt to use the head. The included regression test demonstrates this, and there is _also_ a fork choice compliance test in #9295 that triggers it.
This PR is a bit opinionated, but I'll explain my judgements:
- We need a way to get the PTC for an arbitrary slot from an arbitrary state. This involves potential state advances, database lookups, etc. There is some fiddly logic required to check that states are in range/etc.
- We _already have_ a cache with the exact same lifecycle as the PTCs, namely the attester shuffling cache. Therefore, we can de-duplicate a lot of the complexity by storing the PTCs for a given epoch (and decision block) in this cache.
The other opinionated change is in the tests. The previous tests were set up kind of nicely to avoid instantiating a `BeaconChainHarness`. However they were not using mocking, which made testing the non-canonical chain case kind of infeasible. To remedy this, I've changed them to just use a beacon chain harness and create two chains using its relatively easy to use methods for doing this. The running time of the tests goes from something like 2.6s for 8 tests to 3.3s for 9 tests, which is only an increase of 0.04s/test. Negligible. Another plus to using the `BeaconChainHarness` is that it avoids a bunch of the cruft to create synthetic non-mocked beacon chain bits.
At the same time, I've made some attempt to improve modularity (and fit with the `GossipVerificationContext`) by pulling out the guts of `with_committee_cache` into a new function (`with_cached_shuffling`) that clearly shows its dependency surface.
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Peers that advertise that they have imported a block may not have the columns for that slot available post-Gloas. Ensure that we dont penalize them.
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
During custody backfill sync if a peer fails to serve columns for a batch don't penalize them more than once per batch
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
#8314 left a few ugly potentially panicking location behind - all of them believed to be unreachable, but this PR fixes them regardless for good hygiene.
Update to `ethereum_ssz 0.10.4` for two new helpers: `not_inplace` and `clone_zeroed`.
Remove remaining `expect` and `todo!` in favour of these helpers and one new fallible (but practically infallible) method.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
Encapsulate the "is this block's parent in a state where we can process
the child?" check as `AwaitingParent::is_parent_imported(cx)`. The block
Downloaded arm in continue_requests now calls this single method instead
of inlining a fork-choice lookup.
For Gloas this adds a real new gate: if the child's bid identifies the
parent as full (bid.parent_block_hash == parent.execution_status block
hash), we additionally require the parent's envelope to be imported via
ForkChoice::is_payload_received. A full Gloas parent without its
envelope hasn't realised its post-state yet, so the child can't be
processed against it. The previous block-only check let the child
proceed too early.
Rename `AwaitingParent::parent_hash` → `gloas_bid_parent_hash` to make
the intent explicit (it's bid metadata, only Some post-Gloas) and add a
matching getter. Drop `SignedBeaconBlock::execution_hash` (no remaining
callers; `get_data_peers` now extracts the bid inline).
Also simplifies `get_data_peers` to take `&SignedBeaconBlock` directly
and gate on `signed_execution_payload_bid().is_ok()` rather than threading
slot/spec for a fork-name check.
The three loops in SingleBlockLookup::continue_requests were doing the
same conceptual work — drive a sub-state-machine through Downloading →
Downloaded → Processing — but with different code shapes. Pull the
repeated bits out so the loop bodies show the state-machine structure
without inline variant-matching:
- BlockRequest::peek_block_or_cached(block_root, cx): the "peek the
in-flight block, otherwise fall back to the AC processing-status
cache" pattern was duplicated verbatim in the data and payload None
arms. Both arms now call it. Lives on BlockRequest so the borrow
checker can split it from `&mut self.{data,payload}_request`.
- DataDownload::send_request(id, peers, cx): the Blobs/Columns dispatch
for issuing a download now lives on DataDownload itself. Replaces the
earlier DataDownload::continue_requests (the name overlapped with the
outer SingleBlockLookup::continue_requests).
- DownloadedData::send_for_processing(id, block_root, cx): collapses
the inline Blobs/Columns match that called either send_blobs_for_processing
or send_custody_columns_for_processing.
- Payload Downloading arm now uses state.make_request(...) like block
and data, matching shape across all three loops. As a side effect
payload retries are now bounded by SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS,
closing the "infinite retry loop on repeated download failure" the
original PR description flagged.
- Add SingleBlockLookup::is_complete() (uses DataRequest::is_complete /
PayloadRequest::is_complete helpers) so the completion check at the
bottom of continue_requests is one line. Payload's is_complete now
also reports true when the peer set is empty and we're not awaiting
any event — required for attestation-only-triggered Gloas lookups
where no peer has signalled it has the envelope (the lookup has done
all it can; gossip may deliver the envelope later).
Also adds Work::RpcEnvelope to the test rig's beacon-processor mock.
Closes the TODO in single_block_lookup.rs's PayloadRequestState::Downloaded
arm: the lookup now actually submits the downloaded envelope to the beacon
processor instead of transitioning to Processing without sending anything.
Without this Gloas lookups can never complete — the completion check
requires PayloadRequest::Complete which is only reached via
on_payload_processing_result.
Pieces added:
- BlockProcessType::SinglePayloadEnvelope(Id) variant + dispatcher arm in
on_processing_result routing it to on_payload_processing_result.
- beacon_processor: dedicated Work::RpcEnvelope(AsyncFn) variant +
rpc_envelope_queue (FIFO, capacity 1024) drained in the worker pop loop
after rpc_custody_column_queue.
- NetworkBeaconProcessor::send_lookup_envelope wrapping the new Work
variant; process_lookup_envelope async fn calling
verify_envelope_for_gossip + process_execution_payload_envelope.
- classify_envelope_result mapping EnvelopeError variants to the new
BlockProcessingResult shape; non-attributable errors carry no penalty,
attributable ones penalize the block peer.
- SyncNetworkContext::send_payload_for_processing as the lookup-side entry
point.
- PayloadRequestState::Downloaded now carries the envelope alongside the
peer_group so we have something to submit.
- on_payload_processing_result switched from `bool` to the
BlockProcessingResult shape for parity with on_block/on_data; removes
the `#[allow(dead_code)]`.
Reshape BlockProcessingResult from the AC-verdict-passthrough
Ok/Err/Ignored enum to Imported(info) | Error { penalty, reason }.
The producer (network_beacon_processor) translates beacon-chain
Result<AvailabilityProcessingStatus, BlockError> into this shape via a
new classify_processing_result(), so the consumer only has to resolve
the symbolic WhichPeerToPenalize against an in-scope PeerGroup.
- on_block_processing_result and on_data_processing_result collapse
to a single state-match each, then dispatch to
WhichPeerToPenalize::apply(action, &peer_group, reason, cx).
- mod.rs sheds the per-BlockError policy block (-129 lines).
- Drops the now-unused data_peer_group, block_peer, BlockRequest::peer,
peek_downloaded_peer_group accessors; their job is the consumer's
responsibility now.
- Ignored becomes Error { penalty: None, reason: "processor_overloaded" }
with a producer-side warn!; the lookup retries up to MAX_ATTEMPTS
instead of dropping immediately (test updated to match).
- DuplicateFullyImported and GenesisBlock map to Imported; the test
helper constructs the new variant directly.
Drop the log-and-strip pattern in the four download response wrappers:
on_{block,blob,custody,payload}_download_response now take their typed
*DownloadResponse aliases (Result<_, RpcResponseError>) directly, and
the inner state machine's on_download_response matches Err(_). This
removes three #[allow(clippy::type_complexity)] annotations and keeps
the option of branching on RPC error kind inside the state machine
open.
Remove the redundant "… download result" debug logs in the four
wrappers — the error is already logged upstream at
requests.rs "Sync RPC request error" (block/blob/payload envelope)
and network_context "Custody request failure, removing", and the
block_root → id association reappears at "Sending block for processing"
on the success path.
Fix has_no_peers callers to use the new !has_peers() API.
- add_peer: replace !=-vs-|= typo so Gloas child-peer additions actually
propagate back through add_peers_to_lookup_and_ancestors and kick
continue_requests.
- data_peer_group: return the PeerGroup stored in DataRequestState
Downloaded/Processing instead of todo!(), so InvalidColumn attribution
in mod.rs no longer panics on a live error path.
- Restore the original `parent_root != ZERO` guard for the parent-known
check; the genesis block has no real parent so it must fall through to
processing rather than panic (was todo!()) or be dropped as Failed.
- Wire envelope_is_known_to_fork_choice as a NoRequestNeeded short-
circuit at the top of payload_lookup_request.
- Rename gload_child_peers -> gloas_child_peers (typo).
- Drop DataDownloadKind, peek_downloaded_peer_group, DataRequest.slot,
DownloadedData::Blobs.expected_blobs — all dead per the compiler.
- Update test helpers to send UnknownParentSidecarHeader so the lookup
test suite compiles and runs under the new manager API.
Tests: phase0 79/79, electra 59/59, fulu 59/59.
In Gloas, beacon blocks are imported into fork choice immediately - the payload envelope and data columns arrive
separately. KZG commitments moved from the column sidecar into the execution payload bid, so the existing
`DataAvailabilityChecker` (which assumes block and data are coupled) can't be used for Gloas.
* Introduced `PendingPayloadCache` to keep track of payload and data columns per block root.
* Added gossip column verification
* Added support for Gloas data column reconstruction
* Payload envelope verification simplified: removed `MaybeAvailableEnvelope`, `ExecutedEnvelope`, `EnvelopeImportData`
Not yet implemented (tracked with TODOs):
- Proper lookup sync for Gloas columns arriving before blocks
- Partial column merging for Gloas
- Moving `load_gloas_payload_bid` disk reads off the async runtime
- Backfill/range sync for Gloas
Based on @eserilev's PR and work in progress. See also #9202 for verification.
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
Co-Authored-By: Daniel Knopik <107140945+dknopik@users.noreply.github.com>
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
We got in a little bit of trouble in devnet-3. After gossip verifying an envelope and before importing it, we got the following error
```
May 07 08:04:24.383 WARN Execution payload envelope rejected reason: "EnvelopeProcessingError(WithdrawalsRootMismatch { state: 0x852d38aaecc9f4e2e309919f74020c7bbcf050fea4a20edf3304f171e44ee9d5, payload:
```
The envelope had already passed gossip verification checks and was correctly propagated to the network, we should not penalize peers for doing this. This caused our node to isolate itself from the rest of the network. This PR removes peer penalties for any envelope that passes gossip validation
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
We have a legacy `TestRandom` trait which generates random types for testing and fuzzing.
This function overlaps with `arbitrary` which is used very commonly in the ecosystem.
Remove `TestRandom` and generate random type instances using `Arbitrary`.
Co-Authored-By: Mac L <mjladson@pm.me>
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Allow for the vc to submit its proposer preferences to the network
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Payloads from the reprocess queue should be gossiped after import if they are still timely. In devnets this happens frequently since there are many cases where the envelope arrives before the block
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Store gossip-verified `PayloadAttestationMessage`s in the operation pool and pack them into the block body at during block production.
Built on top of #9145.
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Rewrites the single block lookup state machine for Gloas, where block, data
(blobs/columns), and execution payload envelope are independent components
that can arrive and import out of order.
- Three additive-only sub-state-machines for block / data / payload streams.
Peer sets start empty for data/payload and grow as children arrive — the
parent lookup's completion requirement can widen over time without
mutating any state machine.
- `AwaitingParent` becomes a struct carrying the child's `parent_block_hash`
so the parent can be classified empty/full from the child's bid reference.
- Wires `PayloadEnvelopesByRoot` RPC end-to-end through `SyncNetworkContext`:
request sending, response routing (`SingleLookupReqId::SinglePayloadEnvelope`),
and integration into `PayloadRequest`. Envelope *processing* is still a TODO;
only the download path is wired.
- Test rig: serves envelopes from a `network_envelopes_by_root` cache
populated from the external harness; bumps test validator count to 8 so
`proposer_lookahead` can populate at the Fulu → Gloas upgrade.
- Enables gloas in `TEST_NETWORK_FORKS`.
- Fixes: genesis parent check, infinite retry loop on repeated download
failure, no-op in `on_completed_request`, and peer sets not being cleared
on disconnect.
N/A
Adds lints for rust 1.95. Mostly cosmetic.
1. .zip(a.into_iter()) -> .zip(a) . Also a few more places where into_iter is not required
2. replace sort_by with sort_by_key
3. move if statements inside match block.
4. use checked_div instead of if statements. I think this is debatable in terms of being better, happy to remove it if others also feel its unnecessary
Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
Closes#8949
Implements peer penalties and REJECT/IGNORE message propagation for `SignedExecutionPayloadEnvelope` gossip handling, completing follow-up work from #8806.
Feedback on the error classification would be appreciated.
### Key Implementation Details
- Maps all 15 `EnvelopeError` variants to REJECT/IGNORE based on [Gloas p2p spec](https://github.com/ethereum/consensus-specs/blob/master/specs/gloas/p2p-interface.md#execution_payload)
- Follows `ExecutionPayloadError` handling pattern from block gossip (`penalize_peer()` method)
- Uses explicit variant matching (rather than catch-all `_`) for type safety
- Applies `LowToleranceError` penalty for protocol violations (invalid signatures, mismatches, etc.)
- Ignores without penalty for spec-defined cases (unknown block root, prior to finalization) and internal errors
Co-Authored-By: 0u-Y <yyw1000@naver.com>
Co-Authored-By: Eitan Seri-Levi <eserilev@gmail.com>
Gossip verify and cache bids and proposer preferences. This PR also ensures we subscribe to new fork topics one epoch early instead of two slots early. This is required for proposer preferences.
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
#9077
Where possible replaces all instances of `validator_monitor::timestamp_now` with `chain.slot_clock.now_duration().unwrap_or_default()`.
Where chain/slot_clock is not available, instead replace it with a convenience function `slot_clock::timestamp_now`.
Remove the `validator_monitor::timestamp_now` function.
Co-Authored-By: Mac L <mjladson@pm.me>
Serves envelope by range and by root requests. Added PayloadEnvelopeStreamer so that we dont need to alter upstream code when we introduce blinded payload envelopes.
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Add a queue that allows us to reprocess an envelope when it arrives over gossip references a unknown block root. When the block is finally imported, we immediately reprocess the queued envelope.
Note that we don't trigger a block lookup sync. Incoming attestations for this block root will already trigger a lookup for us. I think thats good enough
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>