Stable futures (#879)

* Port eth1 lib to use stable futures

* Port eth1_test_rig to stable futures

* Port eth1 tests to stable futures

* Port genesis service to stable futures

* Port genesis tests to stable futures

* Port beacon_chain to stable futures

* Port lcli to stable futures

* Fix eth1_test_rig (#1014)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Update hashmap hashset to stable futures

* Adds panic test to hashset delay

* Port remote_beacon_node to stable futures

* Fix lcli merge conflicts

* Non rpc stuff compiles

* protocol.rs compiles

* Port websockets, timer and notifier to stable futures (#1035)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Port remote_beacon_node to stable futures

* Partial eth2-libp2p stable future upgrade

* Finished first round of fighting RPC types

* Further progress towards porting eth2-libp2p adds caching to discovery

* Update behaviour

* RPC handler to stable futures

* Update RPC to master libp2p

* Network service additions

* Fix the fallback transport construction (#1102)

* Correct warning

* Remove hashmap delay

* Compiling version of eth2-libp2p

* Update all crates versions

* Fix conversion function and add tests (#1113)

* Port validator_client to stable futures (#1114)

* Add PH & MS slot clock changes

* Account for genesis time

* Add progress on duties refactor

* Add simple is_aggregator bool to val subscription

* Start work on attestation_verification.rs

* Add progress on ObservedAttestations

* Progress with ObservedAttestations

* Fix tests

* Add observed attestations to the beacon chain

* Add attestation observation to processing code

* Add progress on attestation verification

* Add first draft of ObservedAttesters

* Add more tests

* Add observed attesters to beacon chain

* Add observers to attestation processing

* Add more attestation verification

* Create ObservedAggregators map

* Remove commented-out code

* Add observed aggregators into chain

* Add progress

* Finish adding features to attestation verification

* Ensure beacon chain compiles

* Link attn verification into chain

* Integrate new attn verification in chain

* Remove old attestation processing code

* Start trying to fix beacon_chain tests

* Split adding into pools into two functions

* Add aggregation to harness

* Get test harness working again

* Adjust the number of aggregators for test harness

* Fix edge-case in harness

* Integrate new attn processing in network

* Fix compile bug in validator_client

* Update validator API endpoints

* Fix aggreagation in test harness

* Fix enum thing

* Fix attestation observation bug:

* Patch failing API tests

* Start adding comments to attestation verification

* Remove unused attestation field

* Unify "is block known" logic

* Update comments

* Supress fork choice errors for network processing

* Add todos

* Tidy

* Add gossip attn tests

* Disallow test harness to produce old attns

* Comment out in-progress tests

* Partially address pruning tests

* Fix failing store test

* Add aggregate tests

* Add comments about which spec conditions we check

* Dont re-aggregate

* Split apart test harness attn production

* Fix compile error in network

* Make progress on commented-out test

* Fix skipping attestation test

* Add fork choice verification tests

* Tidy attn tests, remove dead code

* Remove some accidentally added code

* Fix clippy lint

* Rename test file

* Add block tests, add cheap block proposer check

* Rename block testing file

* Add observed_block_producers

* Tidy

* Switch around block signature verification

* Finish block testing

* Remove gossip from signature tests

* First pass of self review

* Fix deviation in spec

* Update test spec tags

* Start moving over to hashset

* Finish moving observed attesters to hashmap

* Move aggregation pool over to hashmap

* Make fc attn borrow again

* Fix rest_api compile error

* Fix missing comments

* Fix monster test

* Uncomment increasing slots test

* Address remaining comments

* Remove unsafe, use cfg test

* Remove cfg test flag

* Fix dodgy comment

* Revert "Update hashmap hashset to stable futures"

This reverts commit d432378a3c.

* Revert "Adds panic test to hashset delay"

This reverts commit 281502396f.

* Ported attestation_service

* Ported duties_service

* Ported fork_service

* More ports

* Port block_service

* Minor fixes

* VC compiles

* Update TODOS

* Borrow self where possible

* Ignore aggregates that are already known.

* Unify aggregator modulo logic

* Fix typo in logs

* Refactor validator subscription logic

* Avoid reproducing selection proof

* Skip HTTP call if no subscriptions

* Rename DutyAndState -> DutyAndProof

* Tidy logs

* Print root as dbg

* Fix compile errors in tests

* Fix compile error in test

* Re-Fix attestation and duties service

* Minor fixes

Co-authored-by: Paul Hauner <paul@paulhauner.com>

* Network crate update to stable futures

* Port account_manager to stable futures (#1121)

* Port account_manager to stable futures

* Run async fns in tokio environment

* Port rest_api crate to stable futures (#1118)

* Port rest_api lib to stable futures

* Reduce tokio features

* Update notifier to stable futures

* Builder update

* Further updates

* Convert self referential async functions

* stable futures fixes (#1124)

* Fix eth1 update functions

* Fix genesis and client

* Fix beacon node lib

* Return appropriate runtimes from environment

* Fix test rig

* Refactor eth1 service update

* Upgrade simulator to stable futures

* Lighthouse compiles on stable futures

* Remove println debugging statement

* Update libp2p service, start rpc test upgrade

* Update network crate for new libp2p

* Update tokio::codec to futures_codec (#1128)

* Further work towards RPC corrections

* Correct http timeout and network service select

* Use tokio runtime for libp2p

* Revert "Update tokio::codec to futures_codec (#1128)"

This reverts commit e57aea924a.

* Upgrade RPC libp2p tests

* Upgrade secio fallback test

* Upgrade gossipsub examples

* Clean up RPC protocol

* Test fixes (#1133)

* Correct websocket timeout and run on os thread

* Fix network test

* Clean up PR

* Correct tokio tcp move attestation service tests

* Upgrade attestation service tests

* Correct network test

* Correct genesis test

* Test corrections

* Log info when block is received

* Modify logs and update attester service events

* Stable futures: fixes to vc, eth1 and account manager (#1142)

* Add local testnet scripts

* Remove whiteblock script

* Rename local testnet script

* Move spawns onto handle

* Fix VC panic

* Initial fix to block production issue

* Tidy block producer fix

* Tidy further

* Add local testnet clean script

* Run cargo fmt

* Tidy duties service

* Tidy fork service

* Tidy ForkService

* Tidy AttestationService

* Tidy notifier

* Ensure await is not suppressed in eth1

* Ensure await is not suppressed in account_manager

* Use .ok() instead of .unwrap_or(())

* RPC decoding test for proto

* Update discv5 and eth2-libp2p deps

* Fix lcli double runtime issue (#1144)

* Handle stream termination and dialing peer errors

* Correct peer_info variant types

* Remove unnecessary warnings

* Handle subnet unsubscription removal and improve logigng

* Add logs around ping

* Upgrade discv5 and improve logging

* Handle peer connection status for multiple connections

* Improve network service logging

* Improve logging around peer manager

* Upgrade swarm poll centralise peer management

* Identify clients on error

* Fix `remove_peer` in sync (#1150)

* remove_peer removes from all chains

* Remove logs

* Fix early return from loop

* Improved logging, fix panic

* Partially correct tests

* Stable futures: Vc sync (#1149)

* Improve syncing heuristic

* Add comments

* Use safer method for tolerance

* Fix tests

* Stable futures: Fix VC bug, update agg pool, add more metrics (#1151)

* Expose epoch processing summary

* Expose participation metrics to prometheus

* Switch to f64

* Reduce precision

* Change precision

* Expose observed attesters metrics

* Add metrics for agg/unagg attn counts

* Add metrics for gossip rx

* Add metrics for gossip tx

* Adds ignored attns to prom

* Add attestation timing

* Add timer for aggregation pool sig agg

* Add write lock timer for agg pool

* Add more metrics to agg pool

* Change map lock code

* Add extra metric to agg pool

* Change lock handling in agg pool

* Change .write() to .read()

* Add another agg pool timer

* Fix for is_aggregator

* Fix pruning bug

Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
This commit is contained in:
Age Manning
2020-05-17 21:16:48 +10:00
committed by GitHub
parent 21901b1615
commit b6408805a2
165 changed files with 7924 additions and 7733 deletions

View File

@@ -34,26 +34,38 @@ pub fn spawn_block_processor<T: BeaconChainTypes>(
chain: Weak<BeaconChain<T>>,
process_id: ProcessId,
downloaded_blocks: Vec<SignedBeaconBlock<T::EthSpec>>,
mut sync_send: mpsc::UnboundedSender<SyncMessage<T::EthSpec>>,
sync_send: mpsc::UnboundedSender<SyncMessage<T::EthSpec>>,
log: slog::Logger,
) {
std::thread::spawn(move || {
match process_id {
// this a request from the range sync
ProcessId::RangeBatchId(chain_id, batch_id) => {
debug!(log, "Processing batch"; "id" => *batch_id, "blocks" => downloaded_blocks.len());
let len = downloaded_blocks.len();
let start_slot = if len > 0 {
downloaded_blocks[0].message.slot.as_u64()
} else {
0
};
let end_slot = if len > 0 {
downloaded_blocks[len - 1].message.slot.as_u64()
} else {
0
};
debug!(log, "Processing batch"; "id" => *batch_id, "blocks" => downloaded_blocks.len(), "start_slot" => start_slot, "end_slot" => end_slot);
let result = match process_blocks(chain, downloaded_blocks.iter(), &log) {
(_, Ok(_)) => {
debug!(log, "Batch processed"; "id" => *batch_id );
debug!(log, "Batch processed"; "id" => *batch_id , "start_slot" => start_slot, "end_slot" => end_slot);
BatchProcessResult::Success
}
(imported_blocks, Err(e)) if imported_blocks > 0 => {
debug!(log, "Batch processing failed but imported some blocks";
warn!(log, "Batch processing failed but imported some blocks";
"id" => *batch_id, "error" => e, "imported_blocks"=> imported_blocks);
BatchProcessResult::Partial
}
(_, Err(e)) => {
debug!(log, "Batch processing failed"; "id" => *batch_id, "error" => e);
warn!(log, "Batch processing failed"; "id" => *batch_id, "error" => e);
BatchProcessResult::Failed
}
};
@@ -64,7 +76,7 @@ pub fn spawn_block_processor<T: BeaconChainTypes>(
downloaded_blocks,
result,
};
sync_send.try_send(msg).unwrap_or_else(|_| {
sync_send.send(msg).unwrap_or_else(|_| {
debug!(
log,
"Block processor could not inform range sync result. Likely shutting down."
@@ -84,7 +96,7 @@ pub fn spawn_block_processor<T: BeaconChainTypes>(
(_, Err(e)) => {
warn!(log, "Parent lookup failed"; "last_peer_id" => format!("{}", peer_id), "error" => e);
sync_send
.try_send(SyncMessage::ParentLookupFailed(peer_id))
.send(SyncMessage::ParentLookupFailed(peer_id))
.unwrap_or_else(|_| {
// on failure, inform to downvote the peer
debug!(

View File

@@ -43,7 +43,6 @@ use eth2_libp2p::rpc::{methods::*, RequestId};
use eth2_libp2p::types::NetworkGlobals;
use eth2_libp2p::PeerId;
use fnv::FnvHashMap;
use futures::prelude::*;
use slog::{crit, debug, error, info, trace, warn, Logger};
use smallvec::SmallVec;
use std::boxed::Box;
@@ -182,7 +181,7 @@ impl SingleBlockRequest {
/// chain. This allows the chain to be
/// dropped during the syncing process which will gracefully end the `SyncManager`.
pub fn spawn<T: BeaconChainTypes>(
executor: &tokio::runtime::TaskExecutor,
runtime_handle: &tokio::runtime::Handle,
beacon_chain: Arc<BeaconChain<T>>,
network_globals: Arc<NetworkGlobals<T::EthSpec>>,
network_send: mpsc::UnboundedSender<NetworkMessage<T::EthSpec>>,
@@ -197,14 +196,14 @@ pub fn spawn<T: BeaconChainTypes>(
let (sync_send, sync_recv) = mpsc::unbounded_channel::<SyncMessage<T::EthSpec>>();
// create an instance of the SyncManager
let sync_manager = SyncManager {
let mut sync_manager = SyncManager {
range_sync: RangeSync::new(
beacon_chain.clone(),
network_globals.clone(),
sync_send.clone(),
log.clone(),
),
network: SyncNetworkContext::new(network_send, log.clone()),
network: SyncNetworkContext::new(network_send, network_globals.clone(), log.clone()),
chain: beacon_chain,
network_globals,
input_channel: sync_recv,
@@ -216,14 +215,10 @@ pub fn spawn<T: BeaconChainTypes>(
// spawn the sync manager thread
debug!(log, "Sync Manager started");
executor.spawn(
sync_manager
.select(exit_rx.then(|_| Ok(())))
.then(move |_| {
info!(log.clone(), "Sync Manager shutdown");
Ok(())
}),
);
runtime_handle.spawn(async move {
futures::future::select(Box::pin(sync_manager.main()), exit_rx).await;
info!(log.clone(), "Sync Manager shutdown");
});
(sync_send, sync_exit)
}
@@ -470,6 +465,8 @@ impl<T: BeaconChainTypes> SyncManager<T> {
}
}
debug!(self.log, "Unknown block received. Starting a parent lookup"; "block_slot" => block.message.slot, "block_hash" => format!("{}", block.canonical_root()));
let parent_request = ParentRequests {
downloaded_blocks: vec![block],
failed_attempts: 0,
@@ -730,17 +727,13 @@ impl<T: BeaconChainTypes> SyncManager<T> {
self.parent_queue.push(parent_request);
}
}
}
impl<T: BeaconChainTypes> Future for SyncManager<T> {
type Item = ();
type Error = String;
fn poll(&mut self) -> Result<Async<Self::Item>, Self::Error> {
/// The main driving future for the sync manager.
async fn main(&mut self) {
// process any inbound messages
loop {
match self.input_channel.poll() {
Ok(Async::Ready(Some(message))) => match message {
if let Some(sync_message) = self.input_channel.recv().await {
match sync_message {
SyncMessage::AddPeer(peer_id, info) => {
self.add_peer(peer_id, info);
}
@@ -792,17 +785,8 @@ impl<T: BeaconChainTypes> Future for SyncManager<T> {
SyncMessage::ParentLookupFailed(peer_id) => {
self.network.downvote_peer(peer_id);
}
},
Ok(Async::NotReady) => break,
Ok(Async::Ready(None)) => {
return Err("Sync manager channel closed".into());
}
Err(e) => {
return Err(format!("Sync Manager channel error: {:?}", e));
}
}
}
Ok(Async::NotReady)
}
}

View File

@@ -6,7 +6,7 @@ use crate::service::NetworkMessage;
use beacon_chain::{BeaconChain, BeaconChainTypes};
use eth2_libp2p::rpc::methods::*;
use eth2_libp2p::rpc::{RPCEvent, RPCRequest, RequestId};
use eth2_libp2p::PeerId;
use eth2_libp2p::{Client, NetworkGlobals, PeerId};
use slog::{debug, trace, warn};
use std::sync::Arc;
use tokio::sync::mpsc;
@@ -18,20 +18,39 @@ pub struct SyncNetworkContext<T: EthSpec> {
/// The network channel to relay messages to the Network service.
network_send: mpsc::UnboundedSender<NetworkMessage<T>>,
/// Access to the network global vars.
network_globals: Arc<NetworkGlobals<T>>,
/// A sequential ID for all RPC requests.
request_id: RequestId,
/// Logger for the `SyncNetworkContext`.
log: slog::Logger,
}
impl<T: EthSpec> SyncNetworkContext<T> {
pub fn new(network_send: mpsc::UnboundedSender<NetworkMessage<T>>, log: slog::Logger) -> Self {
pub fn new(
network_send: mpsc::UnboundedSender<NetworkMessage<T>>,
network_globals: Arc<NetworkGlobals<T>>,
log: slog::Logger,
) -> Self {
Self {
network_send,
network_globals,
request_id: 1,
log,
}
}
/// Returns the Client type of the peer if known
pub fn client_type(&self, peer_id: &PeerId) -> Client {
self.network_globals
.peers
.read()
.peer_info(peer_id)
.map(|info| info.client.clone())
.unwrap_or_default()
}
pub fn status_peer<U: BeaconChainTypes>(
&mut self,
chain: Arc<BeaconChain<U>>,
@@ -104,7 +123,7 @@ impl<T: EthSpec> SyncNetworkContext<T> {
// ignore the error if the channel send fails
let _ = self.send_rpc_request(peer_id.clone(), RPCRequest::Goodbye(reason));
self.network_send
.try_send(NetworkMessage::Disconnect { peer_id })
.send(NetworkMessage::Disconnect { peer_id })
.unwrap_or_else(|_| {
warn!(
self.log,
@@ -130,7 +149,7 @@ impl<T: EthSpec> SyncNetworkContext<T> {
rpc_event: RPCEvent<T>,
) -> Result<(), &'static str> {
self.network_send
.try_send(NetworkMessage::RPC(peer_id, rpc_event))
.send(NetworkMessage::RPC(peer_id, rpc_event))
.map_err(|_| {
debug!(
self.log,

View File

@@ -31,6 +31,7 @@ const BATCH_BUFFER_SIZE: u8 = 5;
/// be downvoted.
const INVALID_BATCH_LOOKUP_ATTEMPTS: u8 = 3;
#[derive(PartialEq)]
/// A return type for functions that act on a `Chain` which informs the caller whether the chain
/// has been completed and should be removed or to be kept if further processing is
/// required.
@@ -380,8 +381,8 @@ impl<T: BeaconChainTypes> SyncingChain<T> {
}
}
BatchProcessResult::Failed => {
warn!(self.log, "Batch processing failed";
"chain_id" => self.id,"id" => *batch.id, "peer" => format!("{}", batch.current_peer));
debug!(self.log, "Batch processing failed";
"chain_id" => self.id,"id" => *batch.id, "peer" => batch.current_peer.to_string(), "client" => network.client_type(&batch.current_peer).to_string());
// The batch processing failed
// This could be because this batch is invalid, or a previous invalidated batch
// is invalid. We need to find out which and downvote the peer that has sent us

View File

@@ -369,7 +369,19 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
.find_map(|(index, chain)| Some((index, func(chain)?)))
}
/// Runs a function on all finalized chains.
/// Given a chain iterator, runs a given function on each chain and return all `Some` results.
fn request_function_all<'a, F, I, U>(chain: I, mut func: F) -> Vec<(usize, U)>
where
I: Iterator<Item = &'a mut SyncingChain<T>>,
F: FnMut(&'a mut SyncingChain<T>) -> Option<U>,
{
chain
.enumerate()
.filter_map(|(index, chain)| Some((index, func(chain)?)))
.collect()
}
/// Runs a function on finalized chains until we get the first `Some` result from `F`.
pub fn finalized_request<F, U>(&mut self, func: F) -> Option<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
@@ -377,7 +389,7 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
ChainCollection::request_function(self.finalized_chains.iter_mut(), func)
}
/// Runs a function on all head chains.
/// Runs a function on head chains until we get the first `Some` result from `F`.
pub fn head_request<F, U>(&mut self, func: F) -> Option<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
@@ -385,7 +397,7 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
ChainCollection::request_function(self.head_chains.iter_mut(), func)
}
/// Runs a function on all finalized and head chains.
/// Runs a function on finalized and head chains until we get the first `Some` result from `F`.
pub fn head_finalized_request<F, U>(&mut self, func: F) -> Option<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
@@ -398,6 +410,19 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
)
}
/// Runs a function on all finalized and head chains and collects all `Some` results from `F`.
pub fn head_finalized_request_all<F, U>(&mut self, func: F) -> Vec<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
{
ChainCollection::request_function_all(
self.finalized_chains
.iter_mut()
.chain(self.head_chains.iter_mut()),
func,
)
}
/// Removes any outdated finalized or head chains.
///
/// This removes chains with no peers, or chains whose start block slot is less than our current

View File

@@ -355,7 +355,7 @@ impl<T: BeaconChainTypes> RangeSync<T> {
peer_id: &PeerId,
) {
// if the peer is in the awaiting head mapping, remove it
self.awaiting_head_peers.remove(&peer_id);
self.awaiting_head_peers.remove(peer_id);
// remove the peer from any peer pool
self.remove_peer(network, peer_id);
@@ -370,26 +370,26 @@ impl<T: BeaconChainTypes> RangeSync<T> {
/// for this peer. If so we mark the batch as failed. The batch may then hit it's maximum
/// retries. In this case, we need to remove the chain and re-status all the peers.
fn remove_peer(&mut self, network: &mut SyncNetworkContext<T::EthSpec>, peer_id: &PeerId) {
if let Some((index, ProcessingResult::RemoveChain)) =
self.chains.head_finalized_request(|chain| {
if chain.peer_pool.remove(peer_id) {
// this chain contained the peer
while let Some(batch) = chain.pending_batches.remove_batch_by_peer(peer_id) {
if let ProcessingResult::RemoveChain = chain.failed_batch(network, batch) {
// a single batch failed, remove the chain
return Some(ProcessingResult::RemoveChain);
}
for (index, result) in self.chains.head_finalized_request_all(|chain| {
if chain.peer_pool.remove(peer_id) {
// this chain contained the peer
while let Some(batch) = chain.pending_batches.remove_batch_by_peer(peer_id) {
if let ProcessingResult::RemoveChain = chain.failed_batch(network, batch) {
// a single batch failed, remove the chain
return Some(ProcessingResult::RemoveChain);
}
// peer removed from chain, no batch failed
Some(ProcessingResult::KeepChain)
} else {
None
}
})
{
// the chain needed to be removed
debug!(self.log, "Chain being removed due to failed batch");
self.chains.remove_chain(network, index);
// peer removed from chain, no batch failed
Some(ProcessingResult::KeepChain)
} else {
None
}
}) {
if result == ProcessingResult::RemoveChain {
// the chain needed to be removed
debug!(self.log, "Chain being removed due to failed batch");
self.chains.remove_chain(network, index);
}
}
}