As @AgeManning mentioned the newest libp2p version had some problems and got downgraded again on lighthouse master. This is an intermediate version that makes no problems and only adds a small change of allowing only one topic per message.
## Description
This downgrades the recent libp2p upgrade.
There were issues with the RPC which prevented syncing of the chain and this upgrade needs to be further investigated.
## Description
Updates to the latest libp2p and includes gossipsub updates.
Of particular note is the limitation of a single topic per gossipsub message.
Co-authored-by: blacktemplar <blacktemplar@a1.net>
## Issue Addressed
Potentially resolves#1647 and sync stalls.
## Proposed Changes
The handling of the state of banned peers was inadequate for the complex peerdb data structure. We store a limited number of disconnected and banned peers in the db. We were not tracking intermediate "disconnecting" states and the in some circumstances we were updating the peer state without informing the peerdb. This lead to a number of inconsistencies in the peer state.
Further, the peer manager could ban a peer changing a peer's state from being connected to banned. In this circumstance, if the peer then disconnected, we didn't inform the application layer, which lead to applications like sync not being informed of a peers disconnection. This could lead to sync stalling and having to require a lighthouse restart.
Improved handling for peer states and interactions with the peerdb is made in this PR.
## Proposed Changes
Adds a gossipsub topic filter that only allows subscribing and incoming subscriptions from valid ETH2 topics.
## Additional Info
Currently the preparation of the valid topic hashes uses only the current fork id but in the future it must also use all possible future fork ids for planned forks. This has to get added when hard coded forks get implemented.
DO NOT MERGE: We first need to merge the libp2p changes (see https://github.com/sigp/rust-libp2p/pull/70) so that we can refer from here to a commit hash inside the lighthouse branch.
## Proposed Changes
Implement the new message id function (see https://github.com/ethereum/eth2.0-specs/pull/2089) using an additional fast message id function for better performance + caching decompressed data.
Squashed commit of the following:
commit f99373cbae
Author: Age Manning <Age@AgeManning.com>
Date: Mon Oct 5 18:44:09 2020 +1100
Clean up obsolute TODOs
## Issue Addressed
#629
## Proposed Changes
This removes banned peers from the DHT and informs discovery to block the node_id and the known source IP's associated with this node. It has the capabilities of un banning this peer after a period of time.
This also corrects the logic about banning specific IP addresses. We now use seen_ip addresses from libp2p rather than those sent to us via identify (which also include local addresses).
## Issue Addressed
N/A
## Proposed Changes
Shifts the local `metadata` to `network_globals` making it accessible to the HTTP API and other areas of lighthouse.
## Additional Info
N/A
## Issue Addressed
N/A
## Proposed Changes
Adds extended metrics to get a better idea of what is happening at the gossipsub layer of lighthouse. This provides information about mesh statistics per topics, subscriptions and peer scores.
## Additional Info
## Issue Addressed
#1172
## Proposed Changes
* updates the libp2p dependency
* small adaptions based on changes in libp2p
* report not just valid messages but also invalid and distinguish between `IGNORE`d messages and `REJECT`ed messages
Co-authored-by: Age Manning <Age@AgeManning.com>
## Issue Addressed
N/A
## Proposed Changes
Refactor attestation service to send out requests to find peers for subnets as soon as we get attestation duties.
Earlier, we had much more involved logic to send the discovery requests to the discovery service only 6 slots before the attestation slot. Now that discovery is much smarter with grouped queries, the complexity in attestation service can be reduced considerably.
Co-authored-by: Age Manning <Age@AgeManning.com>
## Issue Addressed
Resolves#1489
## Proposed Changes
- Change starting metadata seq num to 0 according to the [spec](https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#metadata).
- Remove metadata field from `NetworkGlobals`
- Persist metadata to disk on every update
- Load metadata seq number from disk on restart
- Persist enr to disk on update to ensure enr sequence number increments are persisted as well.
## Additional info
Since we modified starting metadata seq num to 0 from 1, we might still see `Invalid Sequence number provided` like in #1489 from prysm nodes if they have our metadata cached.
## Issue Addressed
#1384
Only catch, as currently implemented, when dialing the multiaddr nodes, there is no way to ask the peer manager if they are already connected or dialing
limit simultaneous outgoing connections attempts to a reasonable top as an extra layer of protection
also shift the keep alive logic of the rpc handler to avoid needing to update it by hand. I think In rare cases this could make shutting down a connection a bit faster.
## Issue Addressed
Peers that connected after the peer limit may remain connected in some circumstances.
This ensures peers not in the peer manager's list get disconnected. Further logging is also added to track this behaviour.
## Issue Addressed
NA
## Proposed Changes
- Moves the git-based versioning we were doing into the `lighthouse_version` crate in `common`.
- Removes the `beacon_node/version` crate, replacing it with `lighthouse_version`.
- Bumps the version to `v0.2.0`.
## Additional Info
There are now two types of version string:
1. `const VERSION: &str = Lighthouse/v0.2.0-1419501f2+`
1. `version_with_platform() = Lighthouse/v0.2.0-1419501f2+/x86_64-linux`
(1) is handy cause it's a `const` and shorter. (2) has platform info so it's more useful. Note that the plus-sign (`+`) indicates the the git commit is dirty (it used to be `(modified)` but I had to shorten it to fit into graffiti).
These version strings are now included on:
- `lighthouse --version`
- `lcli --version`
- `curl localhost:5052/node/version`
- p2p messages when we communicate our version
You can update the version by changing this constant (version is not related to a `Cargo.toml`):
b9ad7102d5/common/lighthouse_version/src/lib.rs (L4-L15)
## Issue Addressed
Sync was breaking occasionally. The root cause appears to be identify crashing as events we being sent to the protocol after nodes were banned. Have not been able to reproduce sync issues since this update.
## Proposed Changes
Only send messages to sub-behaviour protocols if the peer manager thinks the peer is connected. All other messages are dropped.
## Issue Addressed
N/A
## Proposed Changes
This provides a number of corrections and improvements to gossipsub. Specifically
- Enables options for greater privacy around the message author
- Provides greater flexibility on message validation
- Prevents unvalidated messages from being gossiped
- Shifts the duplicate cache to a time-based cache inside gossipsub
- Updates the message-id to handle bytes
- Bug fixes related to mesh maintenance and topic subscription. This should improve our attestation inclusion rate.
## Issue Addressed
#1112
The logic is slightly different but still valid wrt to error handling.
- Inbound state is either Busy with a future that return the subtream (and info about the processing)
- The state machine works as follows:
- `Idle` with pending responses => `Busy`
- `Busy` => finished ? if so and there are new pending responses then `Busy`, if not then `Idle`
=> not finished remains `Busy`
- Add an `InboundInfo` for readability
- Other stuff:
- Close inbound substreams when all expected responses are sent
- Remove the error variants from `RPCCodedResponse` and use the codes instead
- Fix various spelling mistakes because I got sloppy last time
Sorry for the delay
Co-authored-by: Age Manning <Age@AgeManning.com>
Downgrades libp2p and the gossipsub updates.
This looks to resolve the CPU usage issue we have been seeing.
The root cause is likely inside the latest gossipsub updates, which will be addressed in a later PR