- custody_lookup_request: under Gloas with no FULL-child peer to serve columns, park the
data request (Pending) so the lookup completes on block import instead of stranding
- has_no_peers: once the Gloas block request is complete, evaluate only the gloas child peers
- bad_peer_wrong_data_response: the wrong-sidecar penalty is only attributable under Gloas at
depth >= 2 (the tip has no FULL-child peer)
The test harness stores a payload envelope for every block it produces (via
StoreOp::PutPayloadEnvelope), so:
- by_root_unknown must use a root with no block, not a real block root (which
already has a persisted envelope) to test the empty case.
- by_range must not expect a genesis envelope: genesis has no canonical
execution payload, so the by-range handler filters it via block_has_canonical_payload.
Gloas carries data columns in the payload envelope, so the test harness produces
no block-level columns for a Gloas block. Two tests assumed columns are present
and failed under the newly-enabled FORK_NAME=gloas network test matrix:
accept_processed_gossip_data_columns_without_import and
data_column_reconstruction_at_deadline. Gate them on the same fork condition as
their siblings, extended to skip Gloas; fulu coverage is unchanged.
Implementing gloas lookup sync is currently incompatible with the `GossipBlockProcessResult` mechanism.
Today it's implemented such that if we receive a sucessful `GossipBlockProcessResult` we directly mark the lookup as Complete and delete it. In Gloas we can't delete a lookup after block import, as we may still have FULL child awaiting the payload.
IMO this `GossipBlockProcessResult` brings a lot of headache and edge cases that we can just live without. Also the `reset_request` business is nasty and can easily leave the lookup in a bad state.
If we get rid of `GossipBlockProcessResult` we only pay the following performance penalty:
- Lookup is created exactly while the block's payload is being execution validated
- (new degradation) we download the block again
- send the block for processing but the duplicate cache prevents double execution
So in the worst case we spend a few KBs of extra download bandwidth. Remember each block is downloaded 8x times through gossip in the happy case.
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
No. But related to #9009 and #8996
- Change the `ForkContext::next_fork_digest()` to return `[u8; 4]` (returning `[0u8; 4]` for "no next fork").
- Update the initialization path and runtime fork transition path accordingly.
Added tests:
- [x] `test_next_fork_digest` — existing test passes with non-Option return type
- [x] `test_next_fork_digest_returns_zero_when_no_next_fork` — init at last BPO fork returns `[0u8; 4]`
- [x] `test_next_fork_digest_zero_after_runtime_transition_to_last_fork` — simulates `update_current_fork` to last fork, then verifies zero
Co-Authored-By: alleysira <1367108378@qq.com>
Co-Authored-By: Alleysira <56925051+Alleysira@users.noreply.github.com>
Co-Authored-By: chonghe <44791194+chong-he@users.noreply.github.com>
- block_verification test: ParentUnknown pattern needs `..` (field restored).
- Count gloas leaf-block completions in completed_lookups (were removed silently).
- Retain a parent on payload-download TooManyAttempts while a FULL child awaits its
payload (don't cascade-drop); the payload may still arrive.
- on_external_processing_result: complete the lookup on gossip import (gloas-aware),
fixing the pre-gloas regression flagged by the TODO.
- Complete lookups that become available via the da_checker during continue_requests
(no Imported processing result is emitted): detect in on_lookup_result + the
block-imported branch of on_processing_result.
- Lint: debug_assert!(true) -> false; redundant if-let Some(_) -> is_some().
N/A
Currently, we have `EnvelopeError` having a `ImportError` wrapping a `BlockError`. I feel this is extremely unintuitive because most of the envelope processing functions can simply return an `EnvelopeError` that makes sense in the function's context. It revealed further ugliness when implementing range sync in #9362
This PR does 2 main things:
1. Removes `ImportError(BlockError)` variant
2. Adds `EnvelopeError(EnvelopeError)` variant to a `BlockError`.
I feel this is more natural as there can be envelope errors when we try importing a Block but envelope errors can be contained to just envelope related errors.
The main blocker to doing this was `PayloadVerificationHandle` returning a `BlockError`. It uses a very small subset of `BlockError` which I extracted to its own error type which can be converted into both a BlockError and EnvelopeError.
This allows us to keep most of the pure envelope processing functions to just return EnvelopeErrors while we convert it to a `BlockError` only in import paths where we need to return a consolidated `BlockError`.
Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
Harness/tests (foundation):
- make_gloas_block_with_status: produce a gloas block with explicit parent
payload status (builds FULL vs EMPTY children); returns its data columns.
- TestRig::build_full_empty_fork: G(full) -> A(full) -> B(FULL child), A -> C(EMPTY).
- SimulateConfig::return_no_envelope_for_block: withhold a block's payload envelope.
- Tests: gloas_build_full_empty_fork_shape (shape), gloas_full_empty_children_
retain_parent_for_payload (happy path), gloas_empty_child_continues_while_
parent_payload_withheld (red: C must complete, B+A retained while payload withheld).
Option B sketch (untested, mod.rs) -- to be implemented properly:
- continue_child_lookups on a SingleBlock Imported result (children re-evaluate
on parent block import, before its payload).
- retain a failed lookup while another lookup awaits it (is_awaited).
- PeerType::PreGloas/PostGloas -> Block/GloasChild (names describe how a peer
relates to the block, not the fork).
- Add PeerType::new(parent_block_hash) and use it; search_parent_of_child now
takes peer_type: &PeerType instead of the raw parent_block_hash.
- request_batches_should_not_loop_infinitely: drop the bogus gloas skip and use
8 validators (4 was too few for a Gloas genesis -> InvalidIndicesCount).
Remove SingleBlockLookup::awaiting_parent_bid_hash (duplicated awaiting_parent
state) and derive the bid parent_block_hash from the lookup's own downloaded
block. This removes the parent_block_hash field from BlockError::ParentUnknown /
BlockProcessingResult::ParentUnknown, re-aligning them with unstable.
Adopt #9382's canonical ParentImportStatus / get_parent_import_status (drops the
duplicate is_parent_imported_status from this branch), keeping ParentUnknown's
parent_block_hash field which the lookup-sync peer donation depends on.
- Gate payload-envelope processing on block_request.state.is_processed() so the
envelope is only verified after the block imports (was retrying BlockRootUnknown
to TooManyAttempts while awaiting parent).
- Penalize attributable peers withholding columns post-Gloas (drop !gloas_enabled
custody carve-out).
- Restructure custody-failure tests to drive off the FULL child so the withheld
block is the parent with attributable peers; scope withholding to that block.
- Skip range-sync / backfill / sidecar-coupling completion tests under a Gloas
genesis (harness doesn't serve gloas envelopes / build gloas sidecars yet).
Rebase the gloas lookup-sync work onto #9391's RequestState trait-removal
design: payload-envelope request reuses the generic SingleLookupRequestState,
concrete BlockRequest/DataRequest/PayloadRequest, parent-imported gate against
awaiting_parent: Option<Hash256>. (Some gloas custody-failure tests still fail —
known peer-attribution issue, pushed for visibility.)
When debugging ePBS with columns, we noticed that columns arriving before their block dont pass gossip verification checks and are dropped. This PR ensures that columns arriving before the block are sent to the reprocess queue. Once their block arrives, they are reprocessed.
This isn't an issue pre-gloas because we don't make block root checks for fulu data columns. This allows us to gossip verify the column and send it to the DA cache before the block arrives.
I think we also need to handle this edge case for partial data columns. Theres an existing TODO for that already.
Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
- Simplification from https://github.com/sigp/lighthouse/pull/9155
Lookup sync does not cache sidecars, so sending the full network object adds unnecessary complexity. Sync only needs to know: We have received a header that has an unknown parent.
Replace `UnknownParentDataColumn` and `UnknownParentPartialDataColumn` for `UnknownParentSidecarHeader`
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Eitan Seri-Levi <eserilev@gmail.com>
- https://github.com/sigp/lighthouse/pull/9155 remove the trait abstraction for processing block / blobs / columns / payloads
As a result we would have to duplicate x3 the big match on `BlockProcessingResult` we currently have in block lookups mod.rs
This PR moves the match of `BlockProcessingResult` to `sync_methods` to reduce the diff of https://github.com/sigp/lighthouse/pull/9155. There are some subtle changes that deserve dedicated attention, and may be drowned in the bigger diff of https://github.com/sigp/lighthouse/pull/9155 otherwise:
| Unstable | This PR / #9115 |
| - | - |
| Some error conditions immediately `Drop` the lookup (no retries). For example for "internal" errors like the BeaconChainError | Retries ALL errors 4 times. I believe assuming some errors are internal is risky as dropping a lookup drops all its children potentially forcing the node to resync a lot of blocks because of an internal timeout
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Reconciles unstable's #9383 (Deprecate blob lookup sync) with this PR's
rewritten lookup architecture by removing blob lookup from the new arch:
Deneb/Electra block lookups complete on the block alone (the merged
da_checker makes them available without blobs), and DataDownload::Blobs,
blob_lookup_request, SyncRequestId::SingleBlob, BlockProcessType::SingleBlob,
the process_rpc_blobs lookup cluster, and blob lookup tests are removed.
Range-sync blobs and blob serving are kept.