Stable futures (#879)

* Port eth1 lib to use stable futures

* Port eth1_test_rig to stable futures

* Port eth1 tests to stable futures

* Port genesis service to stable futures

* Port genesis tests to stable futures

* Port beacon_chain to stable futures

* Port lcli to stable futures

* Fix eth1_test_rig (#1014)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Update hashmap hashset to stable futures

* Adds panic test to hashset delay

* Port remote_beacon_node to stable futures

* Fix lcli merge conflicts

* Non rpc stuff compiles

* protocol.rs compiles

* Port websockets, timer and notifier to stable futures (#1035)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Port remote_beacon_node to stable futures

* Partial eth2-libp2p stable future upgrade

* Finished first round of fighting RPC types

* Further progress towards porting eth2-libp2p adds caching to discovery

* Update behaviour

* RPC handler to stable futures

* Update RPC to master libp2p

* Network service additions

* Fix the fallback transport construction (#1102)

* Correct warning

* Remove hashmap delay

* Compiling version of eth2-libp2p

* Update all crates versions

* Fix conversion function and add tests (#1113)

* Port validator_client to stable futures (#1114)

* Add PH & MS slot clock changes

* Account for genesis time

* Add progress on duties refactor

* Add simple is_aggregator bool to val subscription

* Start work on attestation_verification.rs

* Add progress on ObservedAttestations

* Progress with ObservedAttestations

* Fix tests

* Add observed attestations to the beacon chain

* Add attestation observation to processing code

* Add progress on attestation verification

* Add first draft of ObservedAttesters

* Add more tests

* Add observed attesters to beacon chain

* Add observers to attestation processing

* Add more attestation verification

* Create ObservedAggregators map

* Remove commented-out code

* Add observed aggregators into chain

* Add progress

* Finish adding features to attestation verification

* Ensure beacon chain compiles

* Link attn verification into chain

* Integrate new attn verification in chain

* Remove old attestation processing code

* Start trying to fix beacon_chain tests

* Split adding into pools into two functions

* Add aggregation to harness

* Get test harness working again

* Adjust the number of aggregators for test harness

* Fix edge-case in harness

* Integrate new attn processing in network

* Fix compile bug in validator_client

* Update validator API endpoints

* Fix aggreagation in test harness

* Fix enum thing

* Fix attestation observation bug:

* Patch failing API tests

* Start adding comments to attestation verification

* Remove unused attestation field

* Unify "is block known" logic

* Update comments

* Supress fork choice errors for network processing

* Add todos

* Tidy

* Add gossip attn tests

* Disallow test harness to produce old attns

* Comment out in-progress tests

* Partially address pruning tests

* Fix failing store test

* Add aggregate tests

* Add comments about which spec conditions we check

* Dont re-aggregate

* Split apart test harness attn production

* Fix compile error in network

* Make progress on commented-out test

* Fix skipping attestation test

* Add fork choice verification tests

* Tidy attn tests, remove dead code

* Remove some accidentally added code

* Fix clippy lint

* Rename test file

* Add block tests, add cheap block proposer check

* Rename block testing file

* Add observed_block_producers

* Tidy

* Switch around block signature verification

* Finish block testing

* Remove gossip from signature tests

* First pass of self review

* Fix deviation in spec

* Update test spec tags

* Start moving over to hashset

* Finish moving observed attesters to hashmap

* Move aggregation pool over to hashmap

* Make fc attn borrow again

* Fix rest_api compile error

* Fix missing comments

* Fix monster test

* Uncomment increasing slots test

* Address remaining comments

* Remove unsafe, use cfg test

* Remove cfg test flag

* Fix dodgy comment

* Revert "Update hashmap hashset to stable futures"

This reverts commit d432378a3c.

* Revert "Adds panic test to hashset delay"

This reverts commit 281502396f.

* Ported attestation_service

* Ported duties_service

* Ported fork_service

* More ports

* Port block_service

* Minor fixes

* VC compiles

* Update TODOS

* Borrow self where possible

* Ignore aggregates that are already known.

* Unify aggregator modulo logic

* Fix typo in logs

* Refactor validator subscription logic

* Avoid reproducing selection proof

* Skip HTTP call if no subscriptions

* Rename DutyAndState -> DutyAndProof

* Tidy logs

* Print root as dbg

* Fix compile errors in tests

* Fix compile error in test

* Re-Fix attestation and duties service

* Minor fixes

Co-authored-by: Paul Hauner <paul@paulhauner.com>

* Network crate update to stable futures

* Port account_manager to stable futures (#1121)

* Port account_manager to stable futures

* Run async fns in tokio environment

* Port rest_api crate to stable futures (#1118)

* Port rest_api lib to stable futures

* Reduce tokio features

* Update notifier to stable futures

* Builder update

* Further updates

* Convert self referential async functions

* stable futures fixes (#1124)

* Fix eth1 update functions

* Fix genesis and client

* Fix beacon node lib

* Return appropriate runtimes from environment

* Fix test rig

* Refactor eth1 service update

* Upgrade simulator to stable futures

* Lighthouse compiles on stable futures

* Remove println debugging statement

* Update libp2p service, start rpc test upgrade

* Update network crate for new libp2p

* Update tokio::codec to futures_codec (#1128)

* Further work towards RPC corrections

* Correct http timeout and network service select

* Use tokio runtime for libp2p

* Revert "Update tokio::codec to futures_codec (#1128)"

This reverts commit e57aea924a.

* Upgrade RPC libp2p tests

* Upgrade secio fallback test

* Upgrade gossipsub examples

* Clean up RPC protocol

* Test fixes (#1133)

* Correct websocket timeout and run on os thread

* Fix network test

* Clean up PR

* Correct tokio tcp move attestation service tests

* Upgrade attestation service tests

* Correct network test

* Correct genesis test

* Test corrections

* Log info when block is received

* Modify logs and update attester service events

* Stable futures: fixes to vc, eth1 and account manager (#1142)

* Add local testnet scripts

* Remove whiteblock script

* Rename local testnet script

* Move spawns onto handle

* Fix VC panic

* Initial fix to block production issue

* Tidy block producer fix

* Tidy further

* Add local testnet clean script

* Run cargo fmt

* Tidy duties service

* Tidy fork service

* Tidy ForkService

* Tidy AttestationService

* Tidy notifier

* Ensure await is not suppressed in eth1

* Ensure await is not suppressed in account_manager

* Use .ok() instead of .unwrap_or(())

* RPC decoding test for proto

* Update discv5 and eth2-libp2p deps

* Fix lcli double runtime issue (#1144)

* Handle stream termination and dialing peer errors

* Correct peer_info variant types

* Remove unnecessary warnings

* Handle subnet unsubscription removal and improve logigng

* Add logs around ping

* Upgrade discv5 and improve logging

* Handle peer connection status for multiple connections

* Improve network service logging

* Improve logging around peer manager

* Upgrade swarm poll centralise peer management

* Identify clients on error

* Fix `remove_peer` in sync (#1150)

* remove_peer removes from all chains

* Remove logs

* Fix early return from loop

* Improved logging, fix panic

* Partially correct tests

* Stable futures: Vc sync (#1149)

* Improve syncing heuristic

* Add comments

* Use safer method for tolerance

* Fix tests

* Stable futures: Fix VC bug, update agg pool, add more metrics (#1151)

* Expose epoch processing summary

* Expose participation metrics to prometheus

* Switch to f64

* Reduce precision

* Change precision

* Expose observed attesters metrics

* Add metrics for agg/unagg attn counts

* Add metrics for gossip rx

* Add metrics for gossip tx

* Adds ignored attns to prom

* Add attestation timing

* Add timer for aggregation pool sig agg

* Add write lock timer for agg pool

* Add more metrics to agg pool

* Change map lock code

* Add extra metric to agg pool

* Change lock handling in agg pool

* Change .write() to .read()

* Add another agg pool timer

* Fix for is_aggregator

* Fix pruning bug

Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
This commit is contained in:
Age Manning
2020-05-17 21:16:48 +10:00
committed by GitHub
parent 21901b1615
commit b6408805a2
165 changed files with 7924 additions and 7733 deletions

View File

@@ -2,45 +2,76 @@ use crate::behaviour::{Behaviour, BehaviourEvent};
use crate::discovery::enr;
use crate::multiaddr::Protocol;
use crate::types::{error, GossipKind};
use crate::EnrExt;
use crate::{NetworkConfig, NetworkGlobals};
use futures::prelude::*;
use futures::Stream;
use libp2p::core::{
identity::Keypair,
multiaddr::Multiaddr,
muxing::StreamMuxerBox,
nodes::Substream,
transport::boxed::Boxed,
upgrade::{InboundUpgradeExt, OutboundUpgradeExt},
ConnectedPoint,
};
use libp2p::{core, noise, secio, swarm::NetworkBehaviour, PeerId, Swarm, Transport};
use slog::{crit, debug, error, info, trace, warn};
use libp2p::{
core, noise, secio,
swarm::{NetworkBehaviour, SwarmBuilder, SwarmEvent},
PeerId, Swarm, Transport,
};
use slog::{crit, debug, error, info, o, trace, warn};
use std::fs::File;
use std::io::prelude::*;
use std::io::{Error, ErrorKind};
use std::pin::Pin;
use std::sync::Arc;
use std::time::Duration;
use tokio::timer::DelayQueue;
use tokio::time::DelayQueue;
use types::{EnrForkId, EthSpec};
type Libp2pStream = Boxed<(PeerId, StreamMuxerBox), Error>;
type Libp2pBehaviour<TSpec> = Behaviour<Substream<StreamMuxerBox>, TSpec>;
pub const NETWORK_KEY_FILENAME: &str = "key";
/// The time in milliseconds to wait before banning a peer. This allows for any Goodbye messages to be
/// flushed and protocols to be negotiated.
const BAN_PEER_WAIT_TIMEOUT: u64 = 200;
/// The maximum simultaneous libp2p connections per peer.
const MAX_CONNECTIONS_PER_PEER: usize = 1;
/// The types of events than can be obtained from polling the libp2p service.
///
/// This is a subset of the events that a libp2p swarm emits.
#[derive(Debug)]
pub enum Libp2pEvent<TSpec: EthSpec> {
/// A behaviour event
Behaviour(BehaviourEvent<TSpec>),
/// A new listening address has been established.
NewListenAddr(Multiaddr),
/// A peer has established at least one connection.
PeerConnected {
/// The peer that connected.
peer_id: PeerId,
/// Whether the peer was a dialer or listener.
endpoint: ConnectedPoint,
},
/// A peer no longer has any connections, i.e is disconnected.
PeerDisconnected {
/// The peer the disconnected.
peer_id: PeerId,
/// Whether the peer was a dialer or a listener.
endpoint: ConnectedPoint,
},
}
/// The configuration and state of the libp2p components for the beacon node.
pub struct Service<TSpec: EthSpec> {
/// The libp2p Swarm handler.
//TODO: Make this private
pub swarm: Swarm<Libp2pStream, Libp2pBehaviour<TSpec>>,
pub swarm: Swarm<Behaviour<TSpec>>,
/// This node's PeerId.
pub local_peer_id: PeerId,
/// Used for managing the state of peers.
network_globals: Arc<NetworkGlobals<TSpec>>,
/// A current list of peers to ban after a given timeout.
peers_to_ban: DelayQueue<PeerId>,
@@ -55,8 +86,9 @@ impl<TSpec: EthSpec> Service<TSpec> {
pub fn new(
config: &NetworkConfig,
enr_fork_id: EnrForkId,
log: slog::Logger,
log: &slog::Logger,
) -> error::Result<(Arc<NetworkGlobals<TSpec>>, Self)> {
let log = log.new(o!("service"=> "libp2p"));
trace!(log, "Libp2p Service starting");
// initialise the node's ID
@@ -84,10 +116,22 @@ impl<TSpec: EthSpec> Service<TSpec> {
let mut swarm = {
// Set up the transport - tcp/ws with noise/secio and mplex/yamux
let transport = build_transport(local_keypair.clone());
let transport = build_transport(local_keypair.clone())
.map_err(|e| format!("Failed to build transport: {:?}", e))?;
// Lighthouse network behaviour
let behaviour = Behaviour::new(&local_keypair, config, network_globals.clone(), &log)?;
Swarm::new(transport, behaviour, local_peer_id.clone())
// use the executor for libp2p
struct Executor(tokio::runtime::Handle);
impl libp2p::core::Executor for Executor {
fn exec(&self, f: Pin<Box<dyn Future<Output = ()> + Send>>) {
self.0.spawn(f);
}
}
SwarmBuilder::new(transport, behaviour, local_peer_id.clone())
.peer_connection_limit(MAX_CONNECTIONS_PER_PEER)
.executor(Box::new(Executor(tokio::runtime::Handle::current())))
.build()
};
// listen on the specified address
@@ -131,19 +175,24 @@ impl<TSpec: EthSpec> Service<TSpec> {
}
// attempt to connect to any specified boot-nodes
for bootnode_enr in &config.boot_nodes {
let mut boot_nodes = config.boot_nodes.clone();
boot_nodes.dedup();
for bootnode_enr in boot_nodes {
for multiaddr in &bootnode_enr.multiaddr() {
// ignore udp multiaddr if it exists
let components = multiaddr.iter().collect::<Vec<_>>();
if let Protocol::Udp(_) = components[1] {
continue;
}
// inform the peer manager that we are currently dialing this peer
network_globals
if !network_globals
.peers
.write()
.dialing_peer(&bootnode_enr.peer_id());
dial_addr(multiaddr);
.read()
.is_connected_or_dialing(&bootnode_enr.peer_id())
{
dial_addr(multiaddr);
}
}
}
@@ -160,6 +209,7 @@ impl<TSpec: EthSpec> Service<TSpec> {
let service = Service {
local_peer_id,
swarm,
network_globals: network_globals.clone(),
peers_to_ban: DelayQueue::new(),
peer_ban_timeout: DelayQueue::new(),
log,
@@ -177,76 +227,132 @@ impl<TSpec: EthSpec> Service<TSpec> {
);
self.peer_ban_timeout.insert(peer_id, timeout);
}
}
impl<TSpec: EthSpec> Stream for Service<TSpec> {
type Item = BehaviourEvent<TSpec>;
type Error = error::Error;
fn poll(&mut self) -> Poll<Option<Self::Item>, Self::Error> {
pub async fn next_event(&mut self) -> Libp2pEvent<TSpec> {
loop {
match self.swarm.poll() {
Ok(Async::Ready(Some(event))) => {
return Ok(Async::Ready(Some(event)));
}
Ok(Async::Ready(None)) => unreachable!("Swarm stream shouldn't end"),
Ok(Async::NotReady) => break,
_ => break,
}
}
tokio::select! {
event = self.swarm.next_event() => {
match event {
SwarmEvent::Behaviour(behaviour) => {
return Libp2pEvent::Behaviour(behaviour)
}
SwarmEvent::ConnectionEstablished {
peer_id,
endpoint,
num_established,
} => {
debug!(self.log, "Connection established"; "peer_id"=> peer_id.to_string(), "connections" => num_established.get());
// if this is the first connection inform the network layer a new connection
// has been established and update the db
if num_established.get() == 1 {
// update the peerdb
match endpoint {
ConnectedPoint::Listener { .. } => {
self.swarm.peer_manager().connect_ingoing(&peer_id);
}
ConnectedPoint::Dialer { .. } => self
.network_globals
.peers
.write()
.connect_outgoing(&peer_id),
}
return Libp2pEvent::PeerConnected { peer_id, endpoint };
}
}
SwarmEvent::ConnectionClosed {
peer_id,
cause,
endpoint,
num_established,
} => {
debug!(self.log, "Connection closed"; "peer_id"=> peer_id.to_string(), "cause" => cause.to_string(), "connections" => num_established);
if num_established == 0 {
// update the peer_db
self.swarm.peer_manager().notify_disconnect(&peer_id);
// the peer has disconnected
return Libp2pEvent::PeerDisconnected {
peer_id,
endpoint,
};
}
}
SwarmEvent::NewListenAddr(multiaddr) => {
return Libp2pEvent::NewListenAddr(multiaddr)
}
// check if peers need to be banned
loop {
match self.peers_to_ban.poll() {
Ok(Async::Ready(Some(peer_id))) => {
let peer_id = peer_id.into_inner();
Swarm::ban_peer_id(&mut self.swarm, peer_id.clone());
// TODO: Correctly notify protocols of the disconnect
// TODO: Also remove peer from the DHT: https://github.com/sigp/lighthouse/issues/629
let dummy_connected_point = ConnectedPoint::Dialer {
address: "/ip4/0.0.0.0"
.parse::<Multiaddr>()
.expect("valid multiaddr"),
};
self.swarm
.inject_disconnected(&peer_id, dummy_connected_point);
// inform the behaviour that the peer has been banned
self.swarm.peer_banned(peer_id);
}
Ok(Async::NotReady) | Ok(Async::Ready(None)) => break,
Err(e) => {
warn!(self.log, "Peer banning queue failed"; "error" => format!("{:?}", e));
SwarmEvent::IncomingConnection {
local_addr,
send_back_addr,
} => {
debug!(self.log, "Incoming connection"; "our_addr" => local_addr.to_string(), "from" => send_back_addr.to_string())
}
SwarmEvent::IncomingConnectionError {
local_addr,
send_back_addr,
error,
} => {
debug!(self.log, "Failed incoming connection"; "our_addr" => local_addr.to_string(), "from" => send_back_addr.to_string(), "error" => error.to_string())
}
SwarmEvent::BannedPeer {
peer_id,
endpoint: _,
} => {
debug!(self.log, "Attempted to dial a banned peer"; "peer_id" => peer_id.to_string())
}
SwarmEvent::UnreachableAddr {
peer_id,
address,
error,
attempts_remaining,
} => {
debug!(self.log, "Failed to dial address"; "peer_id" => peer_id.to_string(), "address" => address.to_string(), "error" => error.to_string(), "attempts_remaining" => attempts_remaining);
self.swarm.peer_manager().notify_disconnect(&peer_id);
}
SwarmEvent::UnknownPeerUnreachableAddr { address, error } => {
debug!(self.log, "Peer not known at dialed address"; "address" => address.to_string(), "error" => error.to_string());
}
SwarmEvent::ExpiredListenAddr(multiaddr) => {
debug!(self.log, "Listen address expired"; "multiaddr" => multiaddr.to_string())
}
SwarmEvent::ListenerClosed { addresses, reason } => {
debug!(self.log, "Listener closed"; "addresses" => format!("{:?}", addresses), "reason" => format!("{:?}", reason))
}
SwarmEvent::ListenerError { error } => {
debug!(self.log, "Listener error"; "error" => format!("{:?}", error.to_string()))
}
SwarmEvent::Dialing(peer_id) => {
debug!(self.log, "Dialing peer"; "peer" => peer_id.to_string());
self.swarm.peer_manager().dialing_peer(&peer_id);
}
}
}
}
// un-ban peer if it's timeout has expired
loop {
match self.peer_ban_timeout.poll() {
Ok(Async::Ready(Some(peer_id))) => {
let peer_id = peer_id.into_inner();
debug!(self.log, "Peer has been unbanned"; "peer" => format!("{:?}", peer_id));
self.swarm.peer_unbanned(&peer_id);
Swarm::unban_peer_id(&mut self.swarm, peer_id);
}
Ok(Async::NotReady) | Ok(Async::Ready(None)) => break,
Err(e) => {
warn!(self.log, "Peer banning timeout queue failed"; "error" => format!("{:?}", e));
}
Some(Ok(peer_to_ban)) = self.peers_to_ban.next() => {
let peer_id = peer_to_ban.into_inner();
Swarm::ban_peer_id(&mut self.swarm, peer_id.clone());
// TODO: Correctly notify protocols of the disconnect
// TODO: Also remove peer from the DHT: https://github.com/sigp/lighthouse/issues/629
self.swarm.inject_disconnected(&peer_id);
// inform the behaviour that the peer has been banned
self.swarm.peer_banned(peer_id);
}
Some(Ok(peer_to_unban)) = self.peer_ban_timeout.next() => {
debug!(self.log, "Peer has been unbanned"; "peer" => format!("{:?}", peer_to_unban));
let unban_peer = peer_to_unban.into_inner();
self.swarm.peer_unbanned(&unban_peer);
Swarm::unban_peer_id(&mut self.swarm, unban_peer);
}
}
}
Ok(Async::NotReady)
}
}
/// The implementation supports TCP/IP, WebSockets over TCP/IP, noise/secio as the encryption layer, and
/// mplex or yamux as the multiplexing layer.
fn build_transport(local_private_key: Keypair) -> Boxed<(PeerId, StreamMuxerBox), Error> {
// TODO: The Wire protocol currently doesn't specify encryption and this will need to be customised
// in the future.
let transport = libp2p::tcp::TcpConfig::new().nodelay(true);
let transport = libp2p::dns::DnsConfig::new(transport);
fn build_transport(
local_private_key: Keypair,
) -> Result<Boxed<(PeerId, StreamMuxerBox), Error>, Error> {
let transport = libp2p_tcp::TokioTcpConfig::new().nodelay(true);
let transport = libp2p::dns::DnsConfig::new(transport)?;
#[cfg(feature = "libp2p-websocket")]
let transport = {
let trans_clone = transport.clone();
@@ -260,7 +366,7 @@ fn build_transport(local_private_key: Keypair) -> Boxed<(PeerId, StreamMuxerBox)
secio::SecioConfig::new(local_private_key),
);
core::upgrade::apply(stream, upgrade, endpoint, core::upgrade::Version::V1).and_then(
move |out| {
|out| async move {
match out {
// Noise was negotiated
core::either::EitherOutput::First((remote_id, out)) => {
@@ -288,12 +394,12 @@ fn build_transport(local_private_key: Keypair) -> Boxed<(PeerId, StreamMuxerBox)
.map_outbound(move |muxer| (peer_id2, muxer));
core::upgrade::apply(stream, upgrade, endpoint, core::upgrade::Version::V1)
.map(|(id, muxer)| (id, core::muxing::StreamMuxerBox::new(muxer)))
.map_ok(|(id, muxer)| (id, core::muxing::StreamMuxerBox::new(muxer)))
})
.timeout(Duration::from_secs(20))
.map_err(|err| Error::new(ErrorKind::Other, err))
.boxed();
transport
Ok(transport)
}
fn keypair_from_hex(hex_bytes: &str) -> error::Result<Keypair> {