Sync peer attribution (#7733)

Which issue # does this PR address?

Closes #7604


Improvements to range sync, including:

1. Restrict column requests to peers that are part of the SyncingChain
2. Attribute the fault to the correct peer and downscore that peer if it doesn't return the data columns for the request
3. Improve sync performance by retrying only the failed columns from other peers instead of failing the entire batch
4. Use the earliest_available_slot to make requests only to peers that claim to have the epoch. Note: if no earliest_available_slot info is available, fall back to the previous logic, i.e. assume the peer has everything backfilled up to the WS checkpoint/DA boundary
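
The peer filter in point 4 and the partial retry in point 3 can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `PeerId` as a plain integer, the function names, the fixed `SLOTS_PER_EPOCH`, and the `da_boundary_slot` parameter are all assumptions made for the example, and custody column assignment is ignored.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the real types (assumptions for this sketch).
type PeerId = u32;
type Slot = u64;
type ColumnIndex = u64;

const SLOTS_PER_EPOCH: u64 = 32;

/// Returns true if the peer is assumed to be able to serve `epoch`.
/// With no `earliest_available_slot`, fall back to assuming the peer has
/// everything from the WS checkpoint/DA boundary onwards.
fn peer_covers_epoch(
    earliest_available_slot: Option<Slot>,
    epoch: u64,
    da_boundary_slot: Slot,
) -> bool {
    let epoch_start = epoch * SLOTS_PER_EPOCH;
    match earliest_available_slot {
        Some(earliest) => earliest <= epoch_start,
        None => da_boundary_slot <= epoch_start,
    }
}

/// Columns that were requested but not returned, in ascending order.
/// Only these are retried; the rest of the batch is kept.
fn columns_to_retry(
    requested: &HashSet<ColumnIndex>,
    received: &HashSet<ColumnIndex>,
) -> Vec<ColumnIndex> {
    let mut missing: Vec<ColumnIndex> = requested.difference(received).copied().collect();
    missing.sort_unstable();
    missing
}

/// Picks another peer from the syncing chain for the retry, skipping the
/// peer that failed and any peer that does not claim to have the epoch.
/// The lowest id is chosen only to keep the sketch deterministic.
fn pick_retry_peer(
    chain_peers: &HashMap<PeerId, Option<Slot>>,
    failed_peer: PeerId,
    epoch: u64,
    da_boundary_slot: Slot,
) -> Option<PeerId> {
    chain_peers
        .iter()
        .filter(|(id, _)| **id != failed_peer)
        .filter(|(_, earliest)| peer_covers_epoch(**earliest, epoch, da_boundary_slot))
        .map(|(id, _)| *id)
        .min()
}
```

In this sketch the failed peer is the one to downscore (point 2), and the retry candidates come only from `chain_peers`, mirroring point 1.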

Tested this on fusaka-devnet-2 with a full node and a supernode, and the recovery logic seems to work well.
Also tested this a little on mainnet.

Need to do more testing and possibly add some unit tests.
Pawan Dhananjay
2025-07-11 17:02:30 -07:00
committed by GitHub
parent b43e0b446c
commit 90ff64381e
9 changed files with 437 additions and 99 deletions


@@ -3,19 +3,20 @@ participants:
   - cl_type: lighthouse
     cl_image: lighthouse:local
     el_type: geth
-    el_image: ethpandaops/geth:fusaka-devnet-1
+    el_image: ethpandaops/geth:fusaka-devnet-2
     supernode: true
     count: 2
   # nodes without validators, used for testing sync.
   - cl_type: lighthouse
     cl_image: lighthouse:local
     el_type: geth
-    el_image: ethpandaops/geth:fusaka-devnet-1
+    el_image: ethpandaops/geth:fusaka-devnet-2
     supernode: true
     validator_count: 0
   - cl_type: lighthouse
     cl_image: lighthouse:local
     el_type: geth
-    el_image: ethpandaops/geth:fusaka-devnet-1
+    el_image: ethpandaops/geth:fusaka-devnet-2
     supernode: false
     validator_count: 0
 network_params: