Commit Graph

15 Commits

Author SHA1 Message Date
Divma
36fc887a40 Gossip cache timeout adjustments (#2997)
## Proposed Changes

- Do not retry to publish sync committee messages.
- Give a more lenient timeout to slashings and exits
2022-02-07 23:25:06 +00:00
Age Manning
675c7b7e26 Correct a dial race condition (#2992)
## Issue Addressed

On a network with few nodes, it is possible that the same node can be found from a subnet discovery and a normal peer discovery at the same time.

The network behaviour loads these peers into events and processes them when it has the chance. It can happen that the same peer can enter the event queue more than once and then attempt to be dialed twice. 

This PR shifts the registration of nodes in the peerdb as being dialed before they enter the NetworkBehaviour queue, preventing multiple attempts of the same peer being entered into the queue and avoiding the race condition.
2022-02-07 23:25:05 +00:00
Divma
615695776e Retry gossipsub messages when insufficient peers (#2964)
## Issue Addressed
#2947 

## Proposed Changes

Store messages that fail to be published due to insufficient peers for retry later. Messages expire after half an epoch and are retried if gossipsub informs us that an useful peer has connected. Currently running in Atlanta

## Additional Info
If on retry sending the messages fails they will not be tried again
2022-02-03 01:12:30 +00:00
Age Manning
ca29b580a2 Increase target subnet peers (#2948)
In the latest release we decreased the target number of subnet peers. 

It appears this could be causing issues in some cases and so reverting it back to the previous number it wise. A larger PR that follows this will address some other related discovery issues and peer management around subnet peer discovery.
2022-01-24 12:08:00 +00:00
Age Manning
6f4102aab6 Network performance tuning (#2608)
There is a pretty significant tradeoff between bandwidth and speed of gossipsub messages. 

We can reduce our bandwidth usage considerably at the cost of minimally delaying gossipsub messages. The impact of delaying messages has not been analyzed thoroughly yet, however this PR in conjunction with some gossipsub updates show considerable bandwidth reduction. 

This PR allows the user to set a CLI value (`network-load`) which is an integer in the range of 1 of 5 depending on their bandwidth appetite. 1 represents the least bandwidth but slowest message recieving and 5 represents the most bandwidth and fastest received message time. 

For low-bandwidth users it is likely to be more efficient to use a lower value. The default is set to 3, which currently represents a reduced bandwidth usage compared to previous version of this PR. The previous lighthouse versions are equivalent to setting the `network-load` CLI to 4.

This PR is awaiting a few gossipsub updates before we can get it into lighthouse.
2022-01-14 05:42:47 +00:00
Paul Hauner
aaa5344eab Add peer score adjustment msgs (#2901)
## Issue Addressed

N/A

## Proposed Changes

This PR adds the `msg` field to `Peer score adjusted` log messages. These `msg` fields help identify *why* a peer was banned.

Example:

```
Jan 11 04:18:48.096 DEBG Peer score adjusted                     score: -100.00, peer_id: 16Uiu2HAmQskxKWWGYfginwZ51n5uDbhvjHYnvASK7PZ5gBdLmzWj, msg: attn_unknown_head, service: libp2p
Jan 11 04:18:48.096 DEBG Peer score adjusted                     score: -27.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p
Jan 11 04:18:48.096 DEBG Peer score adjusted                     score: -100.00, peer_id: 16Uiu2HAmQskxKWWGYfginwZ51n5uDbhvjHYnvASK7PZ5gBdLmzWj, msg: attn_unknown_head, service: libp2p
Jan 11 04:18:48.096 DEBG Peer score adjusted                     score: -28.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p
Jan 11 04:18:48.096 DEBG Peer score adjusted                     score: -29.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p
```

There is also a `libp2p_report_peer_msgs_total` metrics which allows us to see count of reports per `msg` tag. 

## Additional Info

NA
2022-01-12 05:32:14 +00:00
Age Manning
81c667b58e Additional networking metrics (#2549)
Adds additional metrics for network monitoring and evaluation.


Co-authored-by: Mark Mackey <mark@sigmaprime.io>
2021-12-22 06:17:14 +00:00
Michael Sproul
2c07a72980 Revert peer DB changes from #2724 (#2828)
## Proposed Changes

This reverts commit 53562010ec from PR #2724

Hopefully this will restore the reliability of the sync simulator.
2021-11-25 03:45:52 +00:00
Age Manning
0b319d4926 Inform dialing via the behaviour (#2814)
I had this change but it seems to have been lost in chaos of network upgrades.

The swarm dialing event seems to miss some cases where we dial via the behaviour. This causes an error to be logged as the peer manager doesn't know about some dialing events. 

This shifts the logic to the behaviour to inform the peer manager.
2021-11-19 04:42:33 +00:00
Divma
53562010ec Move peer db writes to eth2 libp2p (#2724)
## Issue Addressed
Part of a bigger effort to make the network globals read only. This moves all writes to the `PeerDB` to the `eth2_libp2p` crate. Limiting writes to the peer manager is a slightly more complicated issue for a next PR, to keep things reviewable.

## Proposed Changes
- Make the peers field in the globals a private field.
- Allow mutable access to the peers field to `eth2_libp2p` for now.
- Add a new network message to update the sync state.

Co-authored-by: Age Manning <Age@AgeManning.com>
2021-11-19 04:42:31 +00:00
Age Manning
e519af9012 Update Lighthouse Dependencies (#2818)
## Issue Addressed

Updates lighthouse dependencies to resolve audit issues in out-dated deps.
2021-11-18 05:08:42 +00:00
Divma
fbafe416d1 Move the peer manager to be a behaviour (#2773)
This simply moves some functions that were "swarm notifications" to a network behaviour implementation.

Notes
------
- We could disconnect from the peer manager but we would lose the rpc shutdown message
- We still notify from the swarm since this is the most reliable way to get some events. Ugly but best for now
- Events need to be pushed with "add event" to wake the waker

Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
2021-11-08 00:01:10 +00:00
Divma
a683e0296a Peer manager cfg (#2766)
## Issue Addressed
I've done this change in a couple of WIPs already so I might as well submit it on its own. This changes no functionality but reduces coupling in a 0.0001%. It also helps new people who need to work in the peer manager to better understand what it actually needs from the outside

## Proposed Changes

Add a config to the peer manager
2021-11-03 23:44:44 +00:00
Age Manning
1790010260 Upgrade to latest libp2p (#2605)
This is a pre-cursor to the next libp2p upgrade. 

It is currently being used for staging a number of PR upgrades which are contingent on the latest libp2p.
2021-10-29 01:59:29 +00:00
Age Manning
df40700ddd Rename eth2_libp2p to lighthouse_network (#2702)
## Description

The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crates purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration. 

As of this writing it currently houses the following high-level lighthouse-specific logic:
- Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings
- Integration and handling of ENRs with respect to libp2p and eth2
- Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers. 
- Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc.
- Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api.
- Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analyssed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks.
- Lighthouse specific types for managing gossipsub topics, sync status and ENR fields
- Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring
- Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p. 

Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. This PR renames this crate to `lighthouse_network`

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-10-19 00:30:39 +00:00