Single blob lookups (#4152)

* some blob reprocessing work

* remove ForceBlockLookup

* reorder enum match arms in sync manager

* a lot more reprocessing work

* impl logic for triggerng blob lookups along with block lookups

* deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker.

* make single block lookup generic

* more work

* add delayed processing logic and combine some requests

* start fixing some compile errors

* fix compilation in main block lookup mod

* much work

* get things compiling

* parent blob lookups

* fix compile

* revert red/stevie changes

* fix up sync manager delay message logic

* add peer usefulness enum

* should remove lookup refactor

* consolidate retry error handling

* improve peer scoring during certain failures in parent lookups

* improve retry code

* drop parent lookup if either req has a peer disconnect during download

* refactor single block processed method

* processing peer refactor

* smol bugfix

* fix some todos

* fix lints

* fix lints

* fix compile in lookup tests

* fix lints

* fix lints

* fix existing block lookup tests

* renamings

* fix after merge

* cargo fmt

* compilation fix in beacon chain tests

* fix

* refactor lookup tests to work with multiple forks and response types

* make tests into macros

* wrap availability check error

* fix compile after merge

* add random blobs

* start fixing up lookup verify error handling

* some bug fixes and the start of deneb only tests

* make tests work for all forks

* track information about peer source

* error refactoring

* improve peer scoring

* fix test compilation

* make sure blobs are sent for processing after stream termination, delete copied tests

* add some tests and fix a bug

* smol bugfixes and moar tests

* add tests and fix some things

* compile after merge

* lots of refactoring

* retry on invalid block/blob

* merge unknown parent messages before current slot lookup

* get tests compiling

* penalize blob peer on invalid blobs

* Check disk on in-memory cache miss

* Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs

* Update beacon_node/network/src/sync/network_context.rs

Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>

* fix bug in matching blocks and blobs in range sync

* pr feedback

* fix conflicts

* upgrade logs from warn to crit when we receive incorrect response in range

* synced_and_connected_within_tolerance -> should_search_for_block

* remove todo

* Fix Broken Overflow Tests

* fix merge conflicts

* checkpoint sync without alignment

* add import

* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)

* get state first and query by most recent block root

* simplify delay logic

* rename unknown parent sync message variants

* rename parameter, block_slot -> slot

* add some docs to the lookup module

* use interval instead of sleep

* drop request if blocks and blobs requests both return `None` for `Id`

* clean up `find_single_lookup` logic

* add lookup source enum

* clean up `find_single_lookup` logic

* add docs to find_single_lookup_request

* move LookupSource our of param where unnecessary

* remove unnecessary todo

* query for block by `state.latest_block_header.slot`

* fix lint

* fix test

* fix test

* fix observed  blob sidecars test

* PR updates

* use optional params instead of a closure

* create lookup and trigger request in separate method calls

* remove `LookupSource`

* make sure duplicate lookups are not dropped

---------

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
This commit is contained in:
realbigsean
2023-06-15 12:59:10 -04:00
committed by GitHub
parent 5428e68943
commit a62e52f319
47 changed files with 4981 additions and 1309 deletions

View File

@@ -419,23 +419,14 @@ where
let weak_subj_block_root = weak_subj_block.canonical_root();
let weak_subj_state_root = weak_subj_block.state_root();
// Check that the given block lies on an epoch boundary. Due to the database only storing
// Check that the given state lies on an epoch boundary. Due to the database only storing
// full states on epoch boundaries and at restore points it would be difficult to support
// starting from a mid-epoch state.
if weak_subj_slot % TEthSpec::slots_per_epoch() != 0 {
return Err(format!(
"Checkpoint block at slot {} is not aligned to epoch start. \
Please supply an aligned checkpoint with block.slot % 32 == 0",
weak_subj_block.slot(),
));
}
// Check that the block and state have consistent slots and state roots.
if weak_subj_state.slot() != weak_subj_block.slot() {
return Err(format!(
"Slot of snapshot block ({}) does not match snapshot state ({})",
weak_subj_block.slot(),
weak_subj_state.slot(),
"Checkpoint state at slot {} is not aligned to epoch start. \
Please supply an aligned checkpoint with state.slot % 32 == 0",
weak_subj_slot,
));
}
@@ -444,16 +435,21 @@ where
weak_subj_state
.build_all_caches(&self.spec)
.map_err(|e| format!("Error building caches on checkpoint state: {e:?}"))?;
let computed_state_root = weak_subj_state
weak_subj_state
.update_tree_hash_cache()
.map_err(|e| format!("Error computing checkpoint state root: {:?}", e))?;
if weak_subj_state_root != computed_state_root {
return Err(format!(
"Snapshot state root does not match block, expected: {:?}, got: {:?}",
weak_subj_state_root, computed_state_root
));
let latest_block_slot = weak_subj_state.latest_block_header().slot;
// We can only validate the block root if it exists in the state. We can't calculated it
// from the `latest_block_header` because the state root might be set to the zero hash.
if let Ok(state_slot_block_root) = weak_subj_state.get_block_root(latest_block_slot) {
if weak_subj_block_root != *state_slot_block_root {
return Err(format!(
"Snapshot state's most recent block root does not match block, expected: {:?}, got: {:?}",
weak_subj_block_root, state_slot_block_root
));
}
}
// Check that the checkpoint state is for the same network as the genesis state.
@@ -508,13 +504,12 @@ where
let fc_store = BeaconForkChoiceStore::get_forkchoice_store(store, &snapshot)
.map_err(|e| format!("Unable to initialize fork choice store: {e:?}"))?;
let current_slot = Some(snapshot.beacon_block.slot());
let fork_choice = ForkChoice::from_anchor(
fc_store,
snapshot.beacon_block_root,
&snapshot.beacon_block,
&snapshot.beacon_state,
current_slot,
Some(weak_subj_slot),
&self.spec,
)
.map_err(|e| format!("Unable to initialize ForkChoice: {:?}", e))?;
@@ -891,13 +886,10 @@ where
validator_monitor: RwLock::new(validator_monitor),
genesis_backfill_slot,
//TODO(sean) should we move kzg solely to the da checker?
data_availability_checker: DataAvailabilityChecker::new(
slot_clock,
kzg.clone(),
store,
self.spec,
)
.map_err(|e| format!("Error initializing DataAvailabiltyChecker: {:?}", e))?,
data_availability_checker: Arc::new(
DataAvailabilityChecker::new(slot_clock, kzg.clone(), store, self.spec)
.map_err(|e| format!("Error initializing DataAvailabiltyChecker: {:?}", e))?,
),
proposal_blob_cache: BlobCache::default(),
kzg,
};