Stable futures (#879)

* Port eth1 lib to use stable futures

* Port eth1_test_rig to stable futures

* Port eth1 tests to stable futures

* Port genesis service to stable futures

* Port genesis tests to stable futures

* Port beacon_chain to stable futures

* Port lcli to stable futures

* Fix eth1_test_rig (#1014)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Update hashmap hashset to stable futures

* Adds panic test to hashset delay

* Port remote_beacon_node to stable futures

* Fix lcli merge conflicts

* Non rpc stuff compiles

* protocol.rs compiles

* Port websockets, timer and notifier to stable futures (#1035)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Port remote_beacon_node to stable futures

* Partial eth2-libp2p stable future upgrade

* Finished first round of fighting RPC types

* Further progress towards porting eth2-libp2p adds caching to discovery

* Update behaviour

* RPC handler to stable futures

* Update RPC to master libp2p

* Network service additions

* Fix the fallback transport construction (#1102)

* Correct warning

* Remove hashmap delay

* Compiling version of eth2-libp2p

* Update all crates versions

* Fix conversion function and add tests (#1113)

* Port validator_client to stable futures (#1114)

* Add PH & MS slot clock changes

* Account for genesis time

* Add progress on duties refactor

* Add simple is_aggregator bool to val subscription

* Start work on attestation_verification.rs

* Add progress on ObservedAttestations

* Progress with ObservedAttestations

* Fix tests

* Add observed attestations to the beacon chain

* Add attestation observation to processing code

* Add progress on attestation verification

* Add first draft of ObservedAttesters

* Add more tests

* Add observed attesters to beacon chain

* Add observers to attestation processing

* Add more attestation verification

* Create ObservedAggregators map

* Remove commented-out code

* Add observed aggregators into chain

* Add progress

* Finish adding features to attestation verification

* Ensure beacon chain compiles

* Link attn verification into chain

* Integrate new attn verification in chain

* Remove old attestation processing code

* Start trying to fix beacon_chain tests

* Split adding into pools into two functions

* Add aggregation to harness

* Get test harness working again

* Adjust the number of aggregators for test harness

* Fix edge-case in harness

* Integrate new attn processing in network

* Fix compile bug in validator_client

* Update validator API endpoints

* Fix aggreagation in test harness

* Fix enum thing

* Fix attestation observation bug:

* Patch failing API tests

* Start adding comments to attestation verification

* Remove unused attestation field

* Unify "is block known" logic

* Update comments

* Supress fork choice errors for network processing

* Add todos

* Tidy

* Add gossip attn tests

* Disallow test harness to produce old attns

* Comment out in-progress tests

* Partially address pruning tests

* Fix failing store test

* Add aggregate tests

* Add comments about which spec conditions we check

* Dont re-aggregate

* Split apart test harness attn production

* Fix compile error in network

* Make progress on commented-out test

* Fix skipping attestation test

* Add fork choice verification tests

* Tidy attn tests, remove dead code

* Remove some accidentally added code

* Fix clippy lint

* Rename test file

* Add block tests, add cheap block proposer check

* Rename block testing file

* Add observed_block_producers

* Tidy

* Switch around block signature verification

* Finish block testing

* Remove gossip from signature tests

* First pass of self review

* Fix deviation in spec

* Update test spec tags

* Start moving over to hashset

* Finish moving observed attesters to hashmap

* Move aggregation pool over to hashmap

* Make fc attn borrow again

* Fix rest_api compile error

* Fix missing comments

* Fix monster test

* Uncomment increasing slots test

* Address remaining comments

* Remove unsafe, use cfg test

* Remove cfg test flag

* Fix dodgy comment

* Revert "Update hashmap hashset to stable futures"

This reverts commit d432378a3c.

* Revert "Adds panic test to hashset delay"

This reverts commit 281502396f.

* Ported attestation_service

* Ported duties_service

* Ported fork_service

* More ports

* Port block_service

* Minor fixes

* VC compiles

* Update TODOS

* Borrow self where possible

* Ignore aggregates that are already known.

* Unify aggregator modulo logic

* Fix typo in logs

* Refactor validator subscription logic

* Avoid reproducing selection proof

* Skip HTTP call if no subscriptions

* Rename DutyAndState -> DutyAndProof

* Tidy logs

* Print root as dbg

* Fix compile errors in tests

* Fix compile error in test

* Re-Fix attestation and duties service

* Minor fixes

Co-authored-by: Paul Hauner <paul@paulhauner.com>

* Network crate update to stable futures

* Port account_manager to stable futures (#1121)

* Port account_manager to stable futures

* Run async fns in tokio environment

* Port rest_api crate to stable futures (#1118)

* Port rest_api lib to stable futures

* Reduce tokio features

* Update notifier to stable futures

* Builder update

* Further updates

* Convert self referential async functions

* stable futures fixes (#1124)

* Fix eth1 update functions

* Fix genesis and client

* Fix beacon node lib

* Return appropriate runtimes from environment

* Fix test rig

* Refactor eth1 service update

* Upgrade simulator to stable futures

* Lighthouse compiles on stable futures

* Remove println debugging statement

* Update libp2p service, start rpc test upgrade

* Update network crate for new libp2p

* Update tokio::codec to futures_codec (#1128)

* Further work towards RPC corrections

* Correct http timeout and network service select

* Use tokio runtime for libp2p

* Revert "Update tokio::codec to futures_codec (#1128)"

This reverts commit e57aea924a.

* Upgrade RPC libp2p tests

* Upgrade secio fallback test

* Upgrade gossipsub examples

* Clean up RPC protocol

* Test fixes (#1133)

* Correct websocket timeout and run on os thread

* Fix network test

* Clean up PR

* Correct tokio tcp move attestation service tests

* Upgrade attestation service tests

* Correct network test

* Correct genesis test

* Test corrections

* Log info when block is received

* Modify logs and update attester service events

* Stable futures: fixes to vc, eth1 and account manager (#1142)

* Add local testnet scripts

* Remove whiteblock script

* Rename local testnet script

* Move spawns onto handle

* Fix VC panic

* Initial fix to block production issue

* Tidy block producer fix

* Tidy further

* Add local testnet clean script

* Run cargo fmt

* Tidy duties service

* Tidy fork service

* Tidy ForkService

* Tidy AttestationService

* Tidy notifier

* Ensure await is not suppressed in eth1

* Ensure await is not suppressed in account_manager

* Use .ok() instead of .unwrap_or(())

* RPC decoding test for proto

* Update discv5 and eth2-libp2p deps

* Fix lcli double runtime issue (#1144)

* Handle stream termination and dialing peer errors

* Correct peer_info variant types

* Remove unnecessary warnings

* Handle subnet unsubscription removal and improve logigng

* Add logs around ping

* Upgrade discv5 and improve logging

* Handle peer connection status for multiple connections

* Improve network service logging

* Improve logging around peer manager

* Upgrade swarm poll centralise peer management

* Identify clients on error

* Fix `remove_peer` in sync (#1150)

* remove_peer removes from all chains

* Remove logs

* Fix early return from loop

* Improved logging, fix panic

* Partially correct tests

* Stable futures: Vc sync (#1149)

* Improve syncing heuristic

* Add comments

* Use safer method for tolerance

* Fix tests

* Stable futures: Fix VC bug, update agg pool, add more metrics (#1151)

* Expose epoch processing summary

* Expose participation metrics to prometheus

* Switch to f64

* Reduce precision

* Change precision

* Expose observed attesters metrics

* Add metrics for agg/unagg attn counts

* Add metrics for gossip rx

* Add metrics for gossip tx

* Adds ignored attns to prom

* Add attestation timing

* Add timer for aggregation pool sig agg

* Add write lock timer for agg pool

* Add more metrics to agg pool

* Change map lock code

* Add extra metric to agg pool

* Change lock handling in agg pool

* Change .write() to .read()

* Add another agg pool timer

* Fix for is_aggregator

* Fix pruning bug

Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
This commit is contained in:
Age Manning
2020-05-17 21:16:48 +10:00
committed by GitHub
parent 21901b1615
commit b6408805a2
165 changed files with 7924 additions and 7733 deletions

View File

@@ -1,14 +1,13 @@
use crate::checks::{epoch_delay, verify_all_finalized_at};
use crate::local_network::LocalNetwork;
use clap::ArgMatches;
use futures::{future, stream, Future, IntoFuture, Stream};
use futures::prelude::*;
use node_test_rig::ClientConfig;
use node_test_rig::{
environment::EnvironmentBuilder, testing_client_config, ClientGenesis, ValidatorConfig,
};
use std::net::{IpAddr, Ipv4Addr};
use std::time::{Duration, SystemTime, UNIX_EPOCH};
use tokio::timer::Interval;
use types::{Epoch, EthSpec};
pub fn run_syncing_sim(matches: &ArgMatches) -> Result<(), String> {
@@ -78,110 +77,118 @@ fn syncing_sim(
beacon_config.network.enr_address = Some(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)));
let future = LocalNetwork::new(context, beacon_config.clone())
let main_future = async {
/*
* Create a new `LocalNetwork` with one beacon node.
*/
let network = LocalNetwork::new(context, beacon_config.clone()).await?;
/*
* Add a validator client which handles all validators from the genesis state.
*/
.and_then(move |network| {
network
.add_validator_client(ValidatorConfig::default(), 0, (0..num_validators).collect())
.map(|_| network)
})
/*
* Start the processes that will run checks on the network as it runs.
*/
.and_then(move |network| {
// The `final_future` either completes immediately or never completes, depending on the value
// of `end_after_checks`.
let final_future: Box<dyn Future<Item = (), Error = String> + Send> =
if end_after_checks {
Box::new(future::ok(()).map_err(|()| "".to_string()))
} else {
Box::new(future::empty().map_err(|()| "".to_string()))
};
network
.add_validator_client(ValidatorConfig::default(), 0, (0..num_validators).collect())
.await?;
// Check all syncing strategies one after other.
pick_strategy(
&strategy,
network.clone(),
beacon_config.clone(),
slot_duration,
initial_delay,
sync_timeout,
)
.await?;
// The `final_future` either completes immediately or never completes, depending on the value
// of `end_after_checks`.
if !end_after_checks {
future::pending::<()>().await;
}
future::ok(())
// Check all syncing strategies one after other.
.join(pick_strategy(
&strategy,
network.clone(),
beacon_config.clone(),
slot_duration,
initial_delay,
sync_timeout,
))
.join(final_future)
.map(|_| network)
})
/*
* End the simulation by dropping the network. This will kill all running beacon nodes and
* validator clients.
*/
.map(|network| {
println!(
"Simulation complete. Finished with {} beacon nodes and {} validator clients",
network.beacon_node_count(),
network.validator_client_count()
);
println!(
"Simulation complete. Finished with {} beacon nodes and {} validator clients",
network.beacon_node_count(),
network.validator_client_count()
);
// Be explicit about dropping the network, as this kills all the nodes. This ensures
// all the checks have adequate time to pass.
drop(network)
});
// Be explicit about dropping the network, as this kills all the nodes. This ensures
// all the checks have adequate time to pass.
drop(network);
Ok::<(), String>(())
};
env.runtime().block_on(future)
env.runtime().block_on(main_future)
}
pub fn pick_strategy<E: EthSpec>(
pub async fn pick_strategy<E: EthSpec>(
strategy: &str,
network: LocalNetwork<E>,
beacon_config: ClientConfig,
slot_duration: Duration,
initial_delay: u64,
sync_timeout: u64,
) -> Box<dyn Future<Item = (), Error = String> + Send + 'static> {
) -> Result<(), String> {
match strategy {
"one-node" => Box::new(verify_one_node_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)),
"two-nodes" => Box::new(verify_two_nodes_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)),
"mixed" => Box::new(verify_in_between_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)),
"all" => Box::new(verify_syncing(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)),
_ => Box::new(Err("Invalid strategy".into()).into_future()),
"one-node" => {
verify_one_node_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)
.await
}
"two-nodes" => {
verify_two_nodes_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)
.await
}
"mixed" => {
verify_in_between_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)
.await
}
"all" => {
verify_syncing(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)
.await
}
_ => Err("Invalid strategy".into()),
}
}
/// Verify one node added after `initial_delay` epochs is in sync
/// after `sync_timeout` epochs.
pub fn verify_one_node_sync<E: EthSpec>(
pub async fn verify_one_node_sync<E: EthSpec>(
network: LocalNetwork<E>,
beacon_config: ClientConfig,
slot_duration: Duration,
initial_delay: u64,
sync_timeout: u64,
) -> impl Future<Item = (), Error = String> {
) -> Result<(), String> {
let epoch_duration = slot_duration * (E::slots_per_epoch() as u32);
let network_c = network.clone();
// Delay for `initial_delay` epochs before adding another node to start syncing
@@ -190,35 +197,34 @@ pub fn verify_one_node_sync<E: EthSpec>(
slot_duration,
E::slots_per_epoch(),
)
.and_then(move |_| {
// Add a beacon node
network.add_beacon_node(beacon_config).map(|_| network)
})
.and_then(move |network| {
// Check every `epoch_duration` if nodes are synced
// limited to at most `sync_timeout` epochs
Interval::new_interval(epoch_duration)
.take(sync_timeout)
.map_err(|_| "Failed to create interval".to_string())
.take_while(move |_| check_still_syncing(&network_c))
.for_each(|_| Ok(())) // consume the stream
.map(|_| network)
})
.and_then(move |network| network.bootnode_epoch().map(|e| (e, network)))
.and_then(move |(epoch, network)| {
verify_all_finalized_at(network, epoch).map_err(|e| format!("One node sync error: {}", e))
})
.await;
// Add a beacon node
network.add_beacon_node(beacon_config).await?;
// Check every `epoch_duration` if nodes are synced
// limited to at most `sync_timeout` epochs
let mut interval = tokio::time::interval(epoch_duration);
let mut count = 0;
while let Some(_) = interval.next().await {
if count >= sync_timeout || !check_still_syncing(&network_c).await? {
break;
}
count += 1;
}
let epoch = network.bootnode_epoch().await?;
verify_all_finalized_at(network, epoch)
.map_err(|e| format!("One node sync error: {}", e))
.await
}
/// Verify two nodes added after `initial_delay` epochs are in sync
/// after `sync_timeout` epochs.
pub fn verify_two_nodes_sync<E: EthSpec>(
pub async fn verify_two_nodes_sync<E: EthSpec>(
network: LocalNetwork<E>,
beacon_config: ClientConfig,
slot_duration: Duration,
initial_delay: u64,
sync_timeout: u64,
) -> impl Future<Item = (), Error = String> {
) -> Result<(), String> {
let epoch_duration = slot_duration * (E::slots_per_epoch() as u32);
let network_c = network.clone();
// Delay for `initial_delay` epochs before adding another node to start syncing
@@ -227,41 +233,36 @@ pub fn verify_two_nodes_sync<E: EthSpec>(
slot_duration,
E::slots_per_epoch(),
)
.and_then(move |_| {
// Add beacon nodes
network
.add_beacon_node(beacon_config.clone())
.map(|_| (network, beacon_config))
.and_then(|(network, beacon_config)| {
network.add_beacon_node(beacon_config).map(|_| network)
})
})
.and_then(move |network| {
// Check every `epoch_duration` if nodes are synced
// limited to at most `sync_timeout` epochs
Interval::new_interval(epoch_duration)
.take(sync_timeout)
.map_err(|_| "Failed to create interval".to_string())
.take_while(move |_| check_still_syncing(&network_c))
.for_each(|_| Ok(())) // consume the stream
.map(|_| network)
})
.and_then(move |network| network.bootnode_epoch().map(|e| (e, network)))
.and_then(move |(epoch, network)| {
verify_all_finalized_at(network, epoch).map_err(|e| format!("Two node sync error: {}", e))
})
.await;
// Add beacon nodes
network.add_beacon_node(beacon_config.clone()).await?;
network.add_beacon_node(beacon_config).await?;
// Check every `epoch_duration` if nodes are synced
// limited to at most `sync_timeout` epochs
let mut interval = tokio::time::interval(epoch_duration);
let mut count = 0;
while let Some(_) = interval.next().await {
if count >= sync_timeout || !check_still_syncing(&network_c).await? {
break;
}
count += 1;
}
let epoch = network.bootnode_epoch().await?;
verify_all_finalized_at(network, epoch)
.map_err(|e| format!("One node sync error: {}", e))
.await
}
/// Add 2 syncing nodes after `initial_delay` epochs,
/// Add another node after `sync_timeout - 5` epochs and verify all are
/// in sync after `sync_timeout + 5` epochs.
pub fn verify_in_between_sync<E: EthSpec>(
pub async fn verify_in_between_sync<E: EthSpec>(
network: LocalNetwork<E>,
beacon_config: ClientConfig,
slot_duration: Duration,
initial_delay: u64,
sync_timeout: u64,
) -> impl Future<Item = (), Error = String> {
) -> Result<(), String> {
let epoch_duration = slot_duration * (E::slots_per_epoch() as u32);
let network_c = network.clone();
// Delay for `initial_delay` epochs before adding another node to start syncing
@@ -271,52 +272,43 @@ pub fn verify_in_between_sync<E: EthSpec>(
slot_duration,
E::slots_per_epoch(),
)
.and_then(move |_| {
// Add a beacon node
network
.add_beacon_node(beacon_config.clone())
.map(|_| (network, beacon_config))
.and_then(|(network, beacon_config)| {
network.add_beacon_node(beacon_config).map(|_| network)
})
})
.and_then(move |network| {
// Delay before adding additional syncing nodes.
epoch_delay(
Epoch::new(sync_timeout - 5),
slot_duration,
E::slots_per_epoch(),
)
.map(|_| network)
})
.and_then(move |network| {
// Add a beacon node
network.add_beacon_node(config1.clone()).map(|_| network)
})
.and_then(move |network| {
// Check every `epoch_duration` if nodes are synced
// limited to at most `sync_timeout` epochs
Interval::new_interval(epoch_duration)
.take(sync_timeout + 5)
.map_err(|_| "Failed to create interval".to_string())
.take_while(move |_| check_still_syncing(&network_c))
.for_each(|_| Ok(())) // consume the stream
.map(|_| network)
})
.and_then(move |network| network.bootnode_epoch().map(|e| (e, network)))
.and_then(move |(epoch, network)| {
verify_all_finalized_at(network, epoch).map_err(|e| format!("In between sync error: {}", e))
})
.await;
// Add two beacon nodes
network.add_beacon_node(beacon_config.clone()).await?;
network.add_beacon_node(beacon_config).await?;
// Delay before adding additional syncing nodes.
epoch_delay(
Epoch::new(sync_timeout - 5),
slot_duration,
E::slots_per_epoch(),
)
.await;
// Add a beacon node
network.add_beacon_node(config1.clone()).await?;
// Check every `epoch_duration` if nodes are synced
// limited to at most `sync_timeout` epochs
let mut interval = tokio::time::interval(epoch_duration);
let mut count = 0;
while let Some(_) = interval.next().await {
if count >= sync_timeout || !check_still_syncing(&network_c).await? {
break;
}
count += 1;
}
let epoch = network.bootnode_epoch().await?;
verify_all_finalized_at(network, epoch)
.map_err(|e| format!("One node sync error: {}", e))
.await
}
/// Run syncing strategies one after other.
pub fn verify_syncing<E: EthSpec>(
pub async fn verify_syncing<E: EthSpec>(
network: LocalNetwork<E>,
beacon_config: ClientConfig,
slot_duration: Duration,
initial_delay: u64,
sync_timeout: u64,
) -> impl Future<Item = (), Error = String> {
) -> Result<(), String> {
verify_one_node_sync(
network.clone(),
beacon_config.clone(),
@@ -324,53 +316,42 @@ pub fn verify_syncing<E: EthSpec>(
initial_delay,
sync_timeout,
)
.map(|_| println!("Completed one node sync"))
.and_then(move |_| {
verify_two_nodes_sync(
network.clone(),
beacon_config.clone(),
slot_duration,
initial_delay,
sync_timeout,
)
.map(|_| {
println!("Completed two node sync");
(network, beacon_config)
})
})
.and_then(move |(network, beacon_config)| {
verify_in_between_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)
.map(|_| println!("Completed in between sync"))
})
.await?;
println!("Completed one node sync");
verify_two_nodes_sync(
network.clone(),
beacon_config.clone(),
slot_duration,
initial_delay,
sync_timeout,
)
.await?;
println!("Completed two node sync");
verify_in_between_sync(
network,
beacon_config,
slot_duration,
initial_delay,
sync_timeout,
)
.await?;
println!("Completed in between sync");
Ok(())
}
pub fn check_still_syncing<E: EthSpec>(
network: &LocalNetwork<E>,
) -> impl Future<Item = bool, Error = String> {
network
.remote_nodes()
.into_future()
// get syncing status of nodes
.and_then(|remote_nodes| {
stream::unfold(remote_nodes.into_iter(), |mut iter| {
iter.next().map(|remote_node| {
remote_node
.http
.node()
.syncing_status()
.map(|status| status.is_syncing)
.map(|status| (status, iter))
.map_err(|e| format!("Get syncing status via http failed: {:?}", e))
})
})
.collect()
})
.and_then(move |status| Ok(status.iter().any(|is_syncing| *is_syncing)))
.map_err(|e| format!("Failed syncing check: {:?}", e))
pub async fn check_still_syncing<E: EthSpec>(network: &LocalNetwork<E>) -> Result<bool, String> {
// get syncing status of nodes
let mut status = Vec::new();
for remote_node in network.remote_nodes()? {
status.push(
remote_node
.http
.node()
.syncing_status()
.await
.map(|status| status.is_syncing)
.map_err(|e| format!("Get syncing status via http failed: {:?}", e))?,
)
}
Ok(status.iter().any(|is_syncing| *is_syncing))
}