Sync cleanups (#8230)

N/A


1. The batch retry logic did not set the batch state to `AwaitingDownload` before attempting a retry. This PR sets the state to `AwaitingDownload` before the retry and sets it back to `Downloading` if the retry succeeded in sending out a request.
2. Removes all peer scoring logic from retrying and relies solely on deprioritizing the failed peer. I finally concede the point to @dapplion 😄
3. Changes `block_components_by_range_request` to accept `block_peers` and `column_peers`. This ensures that we use the full synced peer set when requesting columns, which avoids splitting the column peers among multiple head chains. During forward sync, the block peers should be the peers from the syncing chain, and the column peers should be all synced peers from the peerdb.

Also fixes a typo and calls `attempt_send_awaiting_download_batches` from more places.
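
The state handling described in item 1 can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `BatchState` variants shown, the `Batch` struct, the `retry_batch` function, and the `send_request` closure are all hypothetical stand-ins for the real batch type and network call.

```rust
/// Hypothetical, simplified batch states; the real type carries more data.
#[derive(Debug, Clone, PartialEq, Eq)]
enum BatchState {
    AwaitingDownload,
    Downloading { request_id: u32 },
}

/// Hypothetical stand-in for a sync batch.
struct Batch {
    state: BatchState,
}

/// Retry a batch. `send_request` stands in for the network call and
/// returns a request id when a request was actually sent out.
fn retry_batch<F>(batch: &mut Batch, send_request: F) -> Result<u32, String>
where
    F: FnOnce() -> Result<u32, String>,
{
    // Reset the state first, so that a failed send leaves the batch
    // in `AwaitingDownload` and eligible for a later attempt.
    batch.state = BatchState::AwaitingDownload;

    // If sending fails, return early with the state still reset.
    let request_id = send_request()?;

    // Only move back to `Downloading` once a request is in flight.
    batch.state = BatchState::Downloading { request_id };
    Ok(request_id)
}
```

With this ordering, a retry that cannot send a request leaves the batch in `AwaitingDownload`, where a later call such as `attempt_send_awaiting_download_batches` can presumably pick it up again.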

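Items 2 and 3 can be illustrated with a minimal sketch, assuming peers are plain integer ids. The `PeerId` alias and the functions `forward_sync_peer_sets` and `pick_peer` are hypothetical and are not part of the actual `block_components_by_range_request` API; they only show the idea of separate block and column peer sets, and of deprioritizing failed peers rather than penalizing them.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the real peer identifier type.
type PeerId = u32;

/// During forward sync, blocks come from the peers of the syncing chain,
/// while columns may come from every synced peer.
/// Returns `(block_peers, column_peers)`.
fn forward_sync_peer_sets(
    chain_peers: &HashSet<PeerId>,
    synced_peers: &HashSet<PeerId>,
) -> (HashSet<PeerId>, HashSet<PeerId>) {
    (chain_peers.clone(), synced_peers.clone())
}

/// Pick a peer, preferring ones that have not failed before.
/// Failed peers are only deprioritized: they remain usable as a fallback.
/// Ties are broken by the smallest id to keep the choice deterministic.
fn pick_peer(candidates: &HashSet<PeerId>, failed_peers: &HashSet<PeerId>) -> Option<PeerId> {
    candidates
        .iter()
        .copied()
        // `false < true`, so peers that have not failed sort first.
        .min_by_key(|peer| (failed_peers.contains(peer), *peer))
}
```

Because failed peers stay in the candidate set, a retry can still go to a previously failed peer when no other peer is available.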

Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
Pawan Dhananjay
2025-10-20 04:50:00 -07:00
committed by GitHub
parent c012f46cb9
commit 092aaae961
6 changed files with 77 additions and 44 deletions

@@ -210,7 +210,7 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
.network_globals
.peers
.read()
-.synced_peers_for_epoch(self.to_be_downloaded, None)
+.synced_peers_for_epoch(self.to_be_downloaded)
.next()
.is_some()
// backfill can't progress if we do not have peers in the required subnets post peerdas.
@@ -313,7 +313,6 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
CouplingError::DataColumnPeerFailure {
error,
faulty_peers,
-action,
exceeded_retries,
} => {
debug!(?batch_id, error, "Block components coupling error");
@@ -325,11 +324,8 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
failed_columns.insert(*column);
failed_peers.insert(*peer);
}
-for peer in failed_peers.iter() {
-network.report_peer(*peer, *action, "failed to return columns");
-}
-// Only retry if peer failure **and** retries have been exceeded
+// Only retry if peer failure **and** retries haven't been exceeded
if !*exceeded_retries {
return self.retry_partial_batch(
network,
@@ -888,7 +884,7 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
.network_globals
.peers
.read()
-.synced_peers_for_epoch(batch_id, None)
+.synced_peers_for_epoch(batch_id)
.cloned()
.collect::<HashSet<_>>();
@@ -899,6 +895,7 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
request,
RangeRequestId::BackfillSync { batch_id },
&synced_peers,
+&synced_peers, // All synced peers have imported up to the finalized slot so they must have their custody columns available
&failed_peers,
) {
Ok(request_id) => {
@@ -964,7 +961,7 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
.network_globals()
.peers
.read()
-.synced_peers_for_epoch(batch_id, None)
+.synced_peers_for_epoch(batch_id)
.cloned()
.collect::<HashSet<_>>();