Improved peer management (#2993)

## Issue Addressed

I noticed in some logs some excess and unecessary discovery queries. What was happening was we were pruning our peers down to our outbound target and having some disconnect. When we are below this threshold we try to find more peers (even if we are at our peer limit). The request becomes futile because we have no more peer slots. 

This PR corrects this issue and advances the pruning mechanism to favour subnet peers. 

An overview the new logic added is:
- We prune peers down to a target outbound peer count which is higher than the minimum outbound peer count.
- We only search for more peers if there is room to do so, and we are below the minimum outbound peer count not the target. So this gives us some buffer for peers to disconnect. The buffer is currently 10%

The modified pruning logic is documented in the code but for reference it should do the following:
- Prune peers with bad scores first
- If we need to prune more peers, then prune peers that are subscribed to a long-lived subnet
- If we still need to prune peers, the prune peers that we have a higher density of on any given subnet which should drive for uniform peers across all subnets.

This will need a bit of testing as it modifies some significant peer management behaviours in lighthouse.
This commit is contained in:
Age Manning
2022-02-18 02:36:43 +00:00
parent da4ca024f1
commit 3ebb8b0244
7 changed files with 876 additions and 81 deletions

View File

@@ -63,7 +63,7 @@ const MAX_SUBNETS_IN_QUERY: usize = 3;
///
/// We could reduce this constant to speed up queries however at the cost of security. It will
/// make it easier to peers to eclipse this node. Kademlia suggests a value of 16.
const FIND_NODE_QUERY_CLOSEST_PEERS: usize = 16;
pub const FIND_NODE_QUERY_CLOSEST_PEERS: usize = 16;
/// The threshold for updating `min_ttl` on a connected peer.
const DURATION_DIFFERENCE: Duration = Duration::from_millis(1);
@@ -317,17 +317,18 @@ impl<TSpec: EthSpec> Discovery<TSpec> {
}
/// This adds a new `FindPeers` query to the queue if one doesn't already exist.
pub fn discover_peers(&mut self) {
/// The `target_peers` parameter informs discovery to end the query once the target is found.
/// The maximum this can be is 16.
pub fn discover_peers(&mut self, target_peers: usize) {
// If the discv5 service isn't running or we are in the process of a query, don't bother queuing a new one.
if !self.started || self.find_peer_active {
return;
}
// Immediately start a FindNode query
debug!(self.log, "Starting a peer discovery request");
let target_peers = std::cmp::min(FIND_NODE_QUERY_CLOSEST_PEERS, target_peers);
debug!(self.log, "Starting a peer discovery request"; "target_peers" => target_peers );
self.find_peer_active = true;
self.start_query(QueryType::FindPeers, FIND_NODE_QUERY_CLOSEST_PEERS, |_| {
true
});
self.start_query(QueryType::FindPeers, target_peers, |_| true);
}
/// Processes a request to search for more peers on a subnet.