* some blob reprocessing work

* remove ForceBlockLookup

* reorder enum match arms in sync manager

* a lot more reprocessing work

* impl logic for triggering blob lookups along with block lookups

* deal with rpc blobs in groups per block in the da checker; don't cache missing blob ids in the da checker (see the grouping sketch after this list)

* make single block lookup generic

* more work

* add delayed processing logic and combine some requests

* start fixing some compile errors

* fix compilation in main block lookup mod

* much work

* get things compiling

* parent blob lookups

* fix compile

* revert red/stevie changes

* fix up sync manager delay message logic

* add peer usefulness enum

* should remove lookup refactor

* consolidate retry error handling

* improve peer scoring during certain failures in parent lookups

* improve retry code

* drop parent lookup if either req has a peer disconnect during download

* refactor single block processed method

* processing peer refactor

* smol bugfix

* fix some todos

* fix lints

* fix lints

* fix compile in lookup tests

* fix lints

* fix lints

* fix existing block lookup tests

* renamings

* fix after merge

* cargo fmt

* compilation fix in beacon chain tests

* fix

* refactor lookup tests to work with multiple forks and response types

* make tests into macros

* wrap availability check error

* fix compile after merge

* add random blobs

* start fixing up lookup verify error handling

* some bug fixes and the start of deneb only tests

* make tests work for all forks

* track information about peer source

* error refactoring

* improve peer scoring

* fix test compilation

* make sure blobs are sent for processing after stream termination, delete copied tests

* add some tests and fix a bug

* smol bugfixes and moar tests

* add tests and fix some things

* compile after merge

* lots of refactoring

* retry on invalid block/blob

* merge unknown parent messages before current slot lookup

* get tests compiling

* penalize blob peer on invalid blobs

* Check disk on in-memory cache miss (see the lookup sketch after this list)

* Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs

* Update beacon_node/network/src/sync/network_context.rs

Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>

* fix bug in matching blocks and blobs in range sync

* pr feedback

* fix conflicts

* upgrade logs from warn to crit when we receive an incorrect response in range

* synced_and_connected_within_tolerance -> should_search_for_block

* remove todo

* add data gas used and update excess data gas to u64

* Fix Broken Overflow Tests

* payload verification with commitments

* fix merge conflicts

* restore payload file

* Restore payload file

* remove todo

* add max blob commitments per block

* c-kzg lib update

* Fix ef tests

* Abstract over minimal/mainnet spec in kzg crate

* Start integrating new KZG

* checkpoint sync without alignment

* checkpoint sync without alignment

* add import

* add import

* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)

* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)

* loosen check

* get state first and query by most recent block root

* Revert "loosen check"

This reverts commit 069d13dd63.

* get state first and query by most recent block root

* merge max blobs change

* simplify delay logic

* rename unknown parent sync message variants

* rename parameter, block_slot -> slot

* add some docs to the lookup module

* use interval instead of sleep (see the interval sketch after this list)

* drop request if blocks and blobs requests both return `None` for `Id` (see the drop-check sketch after this list)

* clean up `find_single_lookup` logic

* add lookup source enum

* clean up `find_single_lookup` logic

* add docs to find_single_lookup_request

* move LookupSource out of param where unnecessary

* remove unnecessary todo

* query for block by `state.latest_block_header.slot`

* fix lint

* fix merge transition ef tests

* fix test

* fix test

* fix observed blob sidecars test

* Add some metrics (#33)

* fix protocol limits for blobs by root

* Update Engine API for 1:1 Structure Method

* fix beacon chain tests for devnet 6 changes

* get ckzg working and fix some tests

* fix remaining tests

* fix lints

* Fix KZG linking issues

* remove unused dep

* lockfile

* test fixes

* remove dbgs

* remove unwrap

* cleanup tx generator

* small fixes

* fixing fixes

* more self review

* more self review

* refactor genesis header initialization

* refactor mock el instantiations

* fix compile

* fix network tests, make sure they run for each fork

* pr feedback

* fix last test (hopefully)

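A minimal sketch of the per-block grouping idea from the da checker bullet above. `BlobInfo` and the raw `[u8; 32]` root are hypothetical stand-ins for the real sidecar and `Hash256` types; the point is only that rpc blobs get bucketed by block root before the availability checker sees them:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a blob sidecar; only the block root matters here.
struct BlobInfo {
    block_root: [u8; 32],
}

// Bucket a batch of rpc blobs by block root so the availability checker
// can handle all blobs for one block as a group.
fn group_by_block(blobs: Vec<BlobInfo>) -> HashMap<[u8; 32], Vec<BlobInfo>> {
    let mut groups: HashMap<[u8; 32], Vec<BlobInfo>> = HashMap::new();
    for blob in blobs {
        groups.entry(blob.block_root).or_default().push(blob);
    }
    groups
}
```
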
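For "Check disk on in-memory cache miss": a sketch of the two-tier lookup pattern the overflow cache uses, memory first, then the on-disk store. `Tier` is an assumed abstraction for this example, not the cache's actual interface:

```rust
// Assumed abstraction over a storage tier; the real cache wraps an
// in-memory LRU map and the beacon node's database.
trait Tier<K, V> {
    fn get(&self, key: &K) -> Option<V>;
}

// Try the fast in-memory tier first and fall back to disk on a miss.
fn lookup<K, V>(mem: &dyn Tier<K, V>, disk: &dyn Tier<K, V>, key: &K) -> Option<V> {
    mem.get(key).or_else(|| disk.get(key))
}
```
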
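For "use interval instead of sleep": `tokio::time::interval` fires on a fixed cadence, so delayed-lookup polling doesn't drift by the loop body's execution time the way a trailing `sleep` does. A sketch assuming a tokio runtime (the 500 ms period is illustrative, not the value sync uses):

```rust
use std::time::Duration;

#[tokio::main]
async fn main() {
    // Ticks land every 500ms from the start of the loop, whereas a
    // `sleep(500ms)` at the end of each iteration would add the body's
    // runtime to every period.
    let mut interval = tokio::time::interval(Duration::from_millis(500));
    for _ in 0..3 {
        interval.tick().await; // the first tick completes immediately
        // ... check for delayed lookups ready for processing ...
    }
}
```
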
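And for the bullet about dropping a request when both ids come back `None`: if neither a block nor a blob request was actually issued, no response can ever complete the lookup, so it is dropped rather than leaked. A trivial sketch with a hypothetical `Id` alias:

```rust
// Hypothetical request-id type standing in for sync's real `Id`.
type Id = u32;

fn should_drop_lookup(block_req: Option<Id>, blob_req: Option<Id>) -> bool {
    // Nothing in flight on either stream means nothing can complete
    // this lookup, so keeping it around would only leak state.
    block_req.is_none() && blob_req.is_none()
}
```
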
---------

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Author: realbigsean
Date: 2023-06-29 15:35:43 -04:00
Committed by: GitHub
Parent: 4c9fcf1e83
Commit: adbb62f7f3
69 changed files with 2114 additions and 1338 deletions

View File

@@ -225,7 +225,7 @@ pub const GOSSIP_ATTESTATION_BATCH: &str = "gossip_attestation_batch";
pub const GOSSIP_AGGREGATE: &str = "gossip_aggregate";
pub const GOSSIP_AGGREGATE_BATCH: &str = "gossip_aggregate_batch";
pub const GOSSIP_BLOCK: &str = "gossip_block";
pub const GOSSIP_BLOCK_AND_BLOBS_SIDECAR: &str = "gossip_block_and_blobs_sidecar";
pub const GOSSIP_BLOBS_SIDECAR: &str = "gossip_blobs_sidecar";
pub const DELAYED_IMPORT_BLOCK: &str = "delayed_import_block";
pub const GOSSIP_VOLUNTARY_EXIT: &str = "gossip_voluntary_exit";
pub const GOSSIP_PROPOSER_SLASHING: &str = "gossip_proposer_slashing";
@@ -1002,7 +1002,7 @@ impl<T: BeaconChainTypes> Work<T> {
Work::GossipAggregate { .. } => GOSSIP_AGGREGATE,
Work::GossipAggregateBatch { .. } => GOSSIP_AGGREGATE_BATCH,
Work::GossipBlock { .. } => GOSSIP_BLOCK,
Work::GossipSignedBlobSidecar { .. } => GOSSIP_BLOCK_AND_BLOBS_SIDECAR,
Work::GossipSignedBlobSidecar { .. } => GOSSIP_BLOBS_SIDECAR,
Work::DelayedImportBlock { .. } => DELAYED_IMPORT_BLOCK,
Work::GossipVoluntaryExit { .. } => GOSSIP_VOLUNTARY_EXIT,
Work::GossipProposerSlashing { .. } => GOSSIP_PROPOSER_SLASHING,

View File

@@ -1,4 +1,3 @@
#![cfg(not(debug_assertions))] // Tests are too slow in debug.
#![cfg(test)]
use crate::beacon_processor::work_reprocessing_queue::{
@@ -7,14 +6,16 @@ use crate::beacon_processor::work_reprocessing_queue::{
use crate::beacon_processor::*;
use crate::{service::NetworkMessage, sync::SyncMessage};
use beacon_chain::test_utils::{
AttestationStrategy, BeaconChainHarness, BlockStrategy, EphemeralHarnessType,
test_spec, AttestationStrategy, BeaconChainHarness, BlockStrategy, EphemeralHarnessType,
};
use beacon_chain::{BeaconChain, ChainConfig, MAXIMUM_GOSSIP_CLOCK_DISPARITY};
use beacon_chain::{BeaconChain, ChainConfig, WhenSlotSkipped, MAXIMUM_GOSSIP_CLOCK_DISPARITY};
use lighthouse_network::discovery::ConnectionId;
use lighthouse_network::rpc::SubstreamId;
use lighthouse_network::{
discv5::enr::{CombinedKey, EnrBuilder},
rpc::methods::{MetaData, MetaDataV2},
types::{EnrAttestationBitfield, EnrSyncCommitteeBitfield},
MessageId, NetworkGlobals, PeerId,
MessageId, NetworkGlobals, PeerId, Response,
};
use slot_clock::SlotClock;
use std::cmp;
@@ -23,8 +24,8 @@ use std::sync::Arc;
use std::time::Duration;
use tokio::sync::mpsc;
use types::{
Attestation, AttesterSlashing, Epoch, EthSpec, MainnetEthSpec, ProposerSlashing,
SignedBeaconBlock, SignedVoluntaryExit, SubnetId,
Attestation, AttesterSlashing, Epoch, MainnetEthSpec, ProposerSlashing, SignedBeaconBlock,
SignedBlobSidecarList, SignedVoluntaryExit, Slot, SubnetId,
};
type E = MainnetEthSpec;
@@ -45,6 +46,7 @@ const STANDARD_TIMEOUT: Duration = Duration::from_secs(10);
struct TestRig {
chain: Arc<BeaconChain<T>>,
next_block: Arc<SignedBeaconBlock<E>>,
next_blobs: Option<SignedBlobSidecarList<E>>,
attestations: Vec<(Attestation<E>, SubnetId)>,
next_block_attestations: Vec<(Attestation<E>, SubnetId)>,
next_block_aggregate_attestations: Vec<SignedAggregateAndProof<E>>,
@@ -74,14 +76,15 @@ impl TestRig {
}
pub async fn new_with_chain_config(chain_length: u64, chain_config: ChainConfig) -> Self {
let mut spec = test_spec::<MainnetEthSpec>();
// This allows for testing voluntary exits without building out a massive chain.
let mut spec = E::default_spec();
spec.shard_committee_period = 2;
let harness = BeaconChainHarness::builder(MainnetEthSpec)
.spec(spec)
.deterministic_keypairs(VALIDATOR_COUNT)
.fresh_ephemeral_store()
.mock_execution_layer()
.chain_config(chain_config)
.build();
@@ -211,6 +214,7 @@ impl TestRig {
Self {
chain,
next_block: Arc::new(next_block_tuple.0),
next_blobs: next_block_tuple.1,
attestations,
next_block_attestations,
next_block_aggregate_attestations,
@@ -246,6 +250,22 @@ impl TestRig {
.unwrap();
}
pub fn enqueue_gossip_blob(&self, blob_index: usize) {
if let Some(blobs) = self.next_blobs.as_ref() {
let blob = blobs.get(blob_index).unwrap();
self.beacon_processor_tx
.try_send(WorkEvent::gossip_signed_blob_sidecar(
junk_message_id(),
junk_peer_id(),
Client::default(),
blob_index as u64,
blob.clone(),
Duration::from_secs(0),
))
.unwrap();
}
}
pub fn enqueue_rpc_block(&self) {
let event = WorkEvent::rpc_beacon_block(
self.next_block.canonical_root(),
@@ -268,6 +288,36 @@ impl TestRig {
self.beacon_processor_tx.try_send(event).unwrap();
}
pub fn enqueue_single_lookup_rpc_blobs(&self) {
if let Some(blobs) = self.next_blobs.clone() {
let blobs = FixedBlobSidecarList::from(
blobs
.into_iter()
.map(|b| Some(b.message))
.collect::<Vec<_>>(),
);
let event = WorkEvent::rpc_blobs(
self.next_block.canonical_root(),
blobs,
std::time::Duration::default(),
BlockProcessType::SingleBlock { id: 1 },
);
self.beacon_processor_tx.try_send(event).unwrap();
}
}
pub fn enqueue_blobs_by_range_request(&self, count: u64) {
let event = WorkEvent::blobs_by_range_request(
PeerId::random(),
(ConnectionId::new(42), SubstreamId::new(24)),
BlobsByRangeRequest {
start_slot: 0,
count,
},
);
self.beacon_processor_tx.try_send(event).unwrap();
}
pub fn enqueue_backfill_batch(&self) {
let event = WorkEvent::chain_segment(
ChainSegmentProcessId::BackSyncBatchId(Epoch::default()),
@@ -491,6 +541,13 @@ async fn import_gossip_block_acceptably_early() {
rig.assert_event_journal(&[GOSSIP_BLOCK, WORKER_FREED, NOTHING_TO_DO])
.await;
let num_blobs = rig.next_blobs.as_ref().map(|b| b.len()).unwrap_or(0);
for i in 0..num_blobs {
rig.enqueue_gossip_blob(i);
rig.assert_event_journal(&[GOSSIP_BLOBS_SIDECAR, WORKER_FREED, NOTHING_TO_DO])
.await;
}
// Note: this section of the code is a bit race-y. We're assuming that we can set the slot clock
// and check the head in the time between the block arrived early and when its due for
// processing.
@@ -499,6 +556,7 @@ async fn import_gossip_block_acceptably_early() {
// processing, instead of just ADDITIONAL_QUEUED_BLOCK_DELAY. Speak to @paulhauner if this test
// starts failing.
rig.chain.slot_clock.set_slot(rig.next_block.slot().into());
assert!(
rig.head_root() != rig.next_block.canonical_root(),
"block not yet imported"
@@ -566,6 +624,19 @@ async fn import_gossip_block_at_current_slot() {
rig.assert_event_journal(&[GOSSIP_BLOCK, WORKER_FREED, NOTHING_TO_DO])
.await;
let num_blobs = rig
.next_blobs
.as_ref()
.map(|blobs| blobs.len())
.unwrap_or(0);
for i in 0..num_blobs {
rig.enqueue_gossip_blob(i);
rig.assert_event_journal(&[GOSSIP_BLOBS_SIDECAR, WORKER_FREED, NOTHING_TO_DO])
.await;
}
assert_eq!(
rig.head_root(),
rig.next_block.canonical_root(),
@@ -618,20 +689,34 @@ async fn attestation_to_unknown_block_processed(import_method: BlockImportMethod
);
// Send the block and ensure that the attestation is received back and imported.
let block_event = match import_method {
let num_blobs = rig
.next_blobs
.as_ref()
.map(|blobs| blobs.len())
.unwrap_or(0);
let mut events = vec![];
match import_method {
BlockImportMethod::Gossip => {
rig.enqueue_gossip_block();
GOSSIP_BLOCK
events.push(GOSSIP_BLOCK);
for i in 0..num_blobs {
rig.enqueue_gossip_blob(i);
events.push(GOSSIP_BLOBS_SIDECAR);
}
}
BlockImportMethod::Rpc => {
rig.enqueue_rpc_block();
RPC_BLOCK
events.push(RPC_BLOCK);
rig.enqueue_single_lookup_rpc_blobs();
if num_blobs > 0 {
events.push(RPC_BLOB);
}
}
};
rig.assert_event_journal_contains_ordered(&[block_event, UNKNOWN_BLOCK_ATTESTATION])
.await;
events.push(UNKNOWN_BLOCK_ATTESTATION);
rig.assert_event_journal_contains_ordered(&events).await;
// Run fork choice, since it isn't run when processing an RPC block. At runtime it is the
// responsibility of the sync manager to do this.
@@ -687,20 +772,34 @@ async fn aggregate_attestation_to_unknown_block(import_method: BlockImportMethod
);
// Send the block and ensure that the attestation is received back and imported.
let block_event = match import_method {
let num_blobs = rig
.next_blobs
.as_ref()
.map(|blobs| blobs.len())
.unwrap_or(0);
let mut events = vec![];
match import_method {
BlockImportMethod::Gossip => {
rig.enqueue_gossip_block();
GOSSIP_BLOCK
events.push(GOSSIP_BLOCK);
for i in 0..num_blobs {
rig.enqueue_gossip_blob(i);
events.push(GOSSIP_BLOBS_SIDECAR);
}
}
BlockImportMethod::Rpc => {
rig.enqueue_rpc_block();
RPC_BLOCK
events.push(RPC_BLOCK);
rig.enqueue_single_lookup_rpc_blobs();
if num_blobs > 0 {
events.push(RPC_BLOB);
}
}
};
rig.assert_event_journal_contains_ordered(&[block_event, UNKNOWN_BLOCK_AGGREGATE])
.await;
events.push(UNKNOWN_BLOCK_AGGREGATE);
rig.assert_event_journal_contains_ordered(&events).await;
// Run fork choice, since it isn't run when processing an RPC block. At runtime it is the
// responsibility of the sync manager to do this.
@@ -868,9 +967,15 @@ async fn test_rpc_block_reprocessing() {
// Insert the next block into the duplicate cache manually
let handle = rig.duplicate_cache.check_and_insert(next_block_root);
rig.enqueue_single_lookup_rpc_block();
rig.assert_event_journal(&[RPC_BLOCK, WORKER_FREED, NOTHING_TO_DO])
.await;
rig.enqueue_single_lookup_rpc_blobs();
if rig.next_blobs.as_ref().map(|b| b.len()).unwrap_or(0) > 0 {
rig.assert_event_journal(&[RPC_BLOB, WORKER_FREED, NOTHING_TO_DO])
.await;
}
// next_block shouldn't be processed since it couldn't get the
// duplicate cache handle
assert_ne!(next_block_root, rig.head_root());
@@ -934,3 +1039,47 @@ async fn test_backfill_sync_processing_rate_limiting_disabled() {
)
.await;
}
#[tokio::test]
async fn test_blobs_by_range() {
if test_spec::<E>().deneb_fork_epoch.is_none() {
return;
};
let mut rig = TestRig::new(64).await;
let slot_count = 32;
rig.enqueue_blobs_by_range_request(slot_count);
let mut blob_count = 0;
for slot in 0..slot_count {
let root = rig
.chain
.block_root_at_slot(Slot::new(slot), WhenSlotSkipped::None)
.unwrap();
blob_count += root
.and_then(|root| {
rig.chain
.get_blobs(&root)
.unwrap_or_default()
.map(|blobs| blobs.len())
})
.unwrap_or(0);
}
let mut actual_count = 0;
while let Some(next) = rig._network_rx.recv().await {
if let NetworkMessage::SendResponse {
peer_id: _,
response: Response::BlobsByRange(blob),
id: _,
} = next
{
if blob.is_some() {
actual_count += 1;
} else {
break;
}
} else {
panic!("unexpected message {:?}", next);
}
}
assert_eq!(blob_count, actual_count);
}

View File

@@ -8,7 +8,7 @@ use beacon_chain::{
light_client_optimistic_update_verification::Error as LightClientOptimisticUpdateError,
observed_operations::ObservationOutcome,
sync_committee_verification::{self, Error as SyncCommitteeError},
validator_monitor::get_block_delay_ms,
validator_monitor::{get_block_delay_ms, get_slot_delay_ms},
AvailabilityProcessingStatus, BeaconChainError, BeaconChainTypes, BlockError, ForkChoiceError,
GossipVerifiedBlock, NotifyExecutionLayer,
};
@@ -659,11 +659,15 @@ impl<T: BeaconChainTypes> Worker<T> {
_peer_client: Client,
blob_index: u64,
signed_blob: SignedBlobSidecar<T::EthSpec>,
_seen_duration: Duration,
seen_duration: Duration,
) {
let slot = signed_blob.message.slot;
let root = signed_blob.message.block_root;
let index = signed_blob.message.index;
let delay = get_slot_delay_ms(seen_duration, slot, &self.chain.slot_clock);
// Log metrics to track delay from other nodes on the network.
metrics::observe_duration(&metrics::BEACON_BLOB_GOSSIP_SLOT_START_DELAY_TIME, delay);
metrics::set_gauge(&metrics::BEACON_BLOB_LAST_DELAY, delay.as_millis() as i64);
match self
.chain
.verify_blob_sidecar_for_gossip(signed_blob, blob_index)
@@ -676,8 +680,9 @@ impl<T: BeaconChainTypes> Worker<T> {
"root" => %root,
"index" => %index
);
metrics::inc_counter(&metrics::BEACON_PROCESSOR_GOSSIP_BLOB_VERIFIED_TOTAL);
self.propagate_validation_result(message_id, peer_id, MessageAcceptance::Accept);
self.process_gossip_verified_blob(peer_id, gossip_verified_blob, _seen_duration)
self.process_gossip_verified_blob(peer_id, gossip_verified_blob, seen_duration)
.await
}
Err(err) => {

View File

@@ -10,7 +10,7 @@ use lighthouse_network::rpc::methods::{
use lighthouse_network::rpc::StatusMessage;
use lighthouse_network::rpc::*;
use lighthouse_network::{PeerId, PeerRequestId, ReportSource, Response, SyncInfo};
use slog::{debug, error, warn};
use slog::{debug, error, trace, warn};
use slot_clock::SlotClock;
use std::collections::{hash_map::Entry, HashMap};
use task_executor::TaskExecutor;
@@ -778,7 +778,7 @@ impl<T: BeaconChainTypes> Worker<T> {
}
}
Ok(None) => {
debug!(
trace!(
self.log,
"No blobs in the store for block root";
"request" => ?req,