lighthouse/tests/simulator/src/sync_sim.rs
Age Manning b6408805a2 Stable futures (#879)
* Port eth1 lib to use stable futures

* Port eth1_test_rig to stable futures

* Port eth1 tests to stable futures

* Port genesis service to stable futures

* Port genesis tests to stable futures

* Port beacon_chain to stable futures

* Port lcli to stable futures

* Fix eth1_test_rig (#1014)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Update hashmap hashset to stable futures

* Adds panic test to hashset delay

* Port remote_beacon_node to stable futures

* Fix lcli merge conflicts

* Non rpc stuff compiles

* protocol.rs compiles

* Port websockets, timer and notifier to stable futures (#1035)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Port remote_beacon_node to stable futures

* Partial eth2-libp2p stable future upgrade

* Finished first round of fighting RPC types

* Further progress towards porting eth2-libp2p adds caching to discovery

* Update behaviour

* RPC handler to stable futures

* Update RPC to master libp2p

* Network service additions

* Fix the fallback transport construction (#1102)

* Correct warning

* Remove hashmap delay

* Compiling version of eth2-libp2p

* Update all crates versions

* Fix conversion function and add tests (#1113)

* Port validator_client to stable futures (#1114)

* Add PH & MS slot clock changes

* Account for genesis time

* Add progress on duties refactor

* Add simple is_aggregator bool to val subscription

* Start work on attestation_verification.rs

* Add progress on ObservedAttestations

* Progress with ObservedAttestations

* Fix tests

* Add observed attestations to the beacon chain

* Add attestation observation to processing code

* Add progress on attestation verification

* Add first draft of ObservedAttesters

* Add more tests

* Add observed attesters to beacon chain

* Add observers to attestation processing

* Add more attestation verification

* Create ObservedAggregators map

* Remove commented-out code

* Add observed aggregators into chain

* Add progress

* Finish adding features to attestation verification

* Ensure beacon chain compiles

* Link attn verification into chain

* Integrate new attn verification in chain

* Remove old attestation processing code

* Start trying to fix beacon_chain tests

* Split adding into pools into two functions

* Add aggregation to harness

* Get test harness working again

* Adjust the number of aggregators for test harness

* Fix edge-case in harness

* Integrate new attn processing in network

* Fix compile bug in validator_client

* Update validator API endpoints

* Fix aggregation in test harness

* Fix enum thing

* Fix attestation observation bug:

* Patch failing API tests

* Start adding comments to attestation verification

* Remove unused attestation field

* Unify "is block known" logic

* Update comments

* Suppress fork choice errors for network processing

* Add todos

* Tidy

* Add gossip attn tests

* Disallow test harness to produce old attns

* Comment out in-progress tests

* Partially address pruning tests

* Fix failing store test

* Add aggregate tests

* Add comments about which spec conditions we check

* Dont re-aggregate

* Split apart test harness attn production

* Fix compile error in network

* Make progress on commented-out test

* Fix skipping attestation test

* Add fork choice verification tests

* Tidy attn tests, remove dead code

* Remove some accidentally added code

* Fix clippy lint

* Rename test file

* Add block tests, add cheap block proposer check

* Rename block testing file

* Add observed_block_producers

* Tidy

* Switch around block signature verification

* Finish block testing

* Remove gossip from signature tests

* First pass of self review

* Fix deviation in spec

* Update test spec tags

* Start moving over to hashset

* Finish moving observed attesters to hashmap

* Move aggregation pool over to hashmap

* Make fc attn borrow again

* Fix rest_api compile error

* Fix missing comments

* Fix monster test

* Uncomment increasing slots test

* Address remaining comments

* Remove unsafe, use cfg test

* Remove cfg test flag

* Fix dodgy comment

* Revert "Update hashmap hashset to stable futures"

This reverts commit d432378a3c.

* Revert "Adds panic test to hashset delay"

This reverts commit 281502396f.

* Ported attestation_service

* Ported duties_service

* Ported fork_service

* More ports

* Port block_service

* Minor fixes

* VC compiles

* Update TODOS

* Borrow self where possible

* Ignore aggregates that are already known.

* Unify aggregator modulo logic

* Fix typo in logs

* Refactor validator subscription logic

* Avoid reproducing selection proof

* Skip HTTP call if no subscriptions

* Rename DutyAndState -> DutyAndProof

* Tidy logs

* Print root as dbg

* Fix compile errors in tests

* Fix compile error in test

* Re-Fix attestation and duties service

* Minor fixes

Co-authored-by: Paul Hauner <paul@paulhauner.com>

* Network crate update to stable futures

* Port account_manager to stable futures (#1121)

* Port account_manager to stable futures

* Run async fns in tokio environment

* Port rest_api crate to stable futures (#1118)

* Port rest_api lib to stable futures

* Reduce tokio features

* Update notifier to stable futures

* Builder update

* Further updates

* Convert self referential async functions

* stable futures fixes (#1124)

* Fix eth1 update functions

* Fix genesis and client

* Fix beacon node lib

* Return appropriate runtimes from environment

* Fix test rig

* Refactor eth1 service update

* Upgrade simulator to stable futures

* Lighthouse compiles on stable futures

* Remove println debugging statement

* Update libp2p service, start rpc test upgrade

* Update network crate for new libp2p

* Update tokio::codec to futures_codec (#1128)

* Further work towards RPC corrections

* Correct http timeout and network service select

* Use tokio runtime for libp2p

* Revert "Update tokio::codec to futures_codec (#1128)"

This reverts commit e57aea924a.

* Upgrade RPC libp2p tests

* Upgrade secio fallback test

* Upgrade gossipsub examples

* Clean up RPC protocol

* Test fixes (#1133)

* Correct websocket timeout and run on os thread

* Fix network test

* Clean up PR

* Correct tokio tcp move attestation service tests

* Upgrade attestation service tests

* Correct network test

* Correct genesis test

* Test corrections

* Log info when block is received

* Modify logs and update attester service events

* Stable futures: fixes to vc, eth1 and account manager (#1142)

* Add local testnet scripts

* Remove whiteblock script

* Rename local testnet script

* Move spawns onto handle

* Fix VC panic

* Initial fix to block production issue

* Tidy block producer fix

* Tidy further

* Add local testnet clean script

* Run cargo fmt

* Tidy duties service

* Tidy fork service

* Tidy ForkService

* Tidy AttestationService

* Tidy notifier

* Ensure await is not suppressed in eth1

* Ensure await is not suppressed in account_manager

* Use .ok() instead of .unwrap_or(())

* RPC decoding test for proto

* Update discv5 and eth2-libp2p deps

* Fix lcli double runtime issue (#1144)

* Handle stream termination and dialing peer errors

* Correct peer_info variant types

* Remove unnecessary warnings

* Handle subnet unsubscription removal and improve logging

* Add logs around ping

* Upgrade discv5 and improve logging

* Handle peer connection status for multiple connections

* Improve network service logging

* Improve logging around peer manager

* Upgrade swarm poll centralise peer management

* Identify clients on error

* Fix `remove_peer` in sync (#1150)

* remove_peer removes from all chains

* Remove logs

* Fix early return from loop

* Improved logging, fix panic

* Partially correct tests

* Stable futures: Vc sync (#1149)

* Improve syncing heuristic

* Add comments

* Use safer method for tolerance

* Fix tests

* Stable futures: Fix VC bug, update agg pool, add more metrics (#1151)

* Expose epoch processing summary

* Expose participation metrics to prometheus

* Switch to f64

* Reduce precision

* Change precision

* Expose observed attesters metrics

* Add metrics for agg/unagg attn counts

* Add metrics for gossip rx

* Add metrics for gossip tx

* Adds ignored attns to prom

* Add attestation timing

* Add timer for aggregation pool sig agg

* Add write lock timer for agg pool

* Add more metrics to agg pool

* Change map lock code

* Add extra metric to agg pool

* Change lock handling in agg pool

* Change .write() to .read()

* Add another agg pool timer

* Fix for is_aggregator

* Fix pruning bug

Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00


use crate::checks::{epoch_delay, verify_all_finalized_at};
use crate::local_network::LocalNetwork;
use clap::ArgMatches;
use futures::prelude::*;
use node_test_rig::ClientConfig;
use node_test_rig::{
    environment::EnvironmentBuilder, testing_client_config, ClientGenesis, ValidatorConfig,
};
use std::net::{IpAddr, Ipv4Addr};
use std::time::{Duration, SystemTime, UNIX_EPOCH};
use types::{Epoch, EthSpec};

/// Entry point for the syncing simulator: reads the CLI arguments and runs
/// the simulation with debug logging.
pub fn run_syncing_sim(matches: &ArgMatches) -> Result<(), String> {
    // These unwraps panic if an argument is missing or is not a valid value.
    let initial_delay = value_t!(matches, "initial_delay", u64).unwrap();
    let sync_timeout = value_t!(matches, "sync_timeout", u64).unwrap();
    let speed_up_factor = value_t!(matches, "speedup", u64).unwrap();
    let strategy = value_t!(matches, "strategy", String).unwrap();

    println!("Syncing Simulator:");
    println!(" initial_delay: {}", initial_delay);
    println!(" sync timeout: {}", sync_timeout);
    println!(" speed up factor: {}", speed_up_factor);
    println!(" strategy: {}", strategy);

    let log_level = "debug";
    let log_format = None;

    syncing_sim(
        speed_up_factor,
        initial_delay,
        sync_timeout,
        strategy,
        log_level,
        log_format,
    )
}

fn syncing_sim(
    speed_up_factor: u64,
    initial_delay: u64,
    sync_timeout: u64,
    strategy: String,
    log_level: &str,
    log_format: Option<&str>,
) -> Result<(), String> {
    // The timing values below are divided by `speed_up_factor`, so zero would panic.
    if speed_up_factor == 0 {
        return Err("speedup must be greater than zero".into());
    }

    let mut env = EnvironmentBuilder::minimal()
        .async_logger(log_level, log_format)?
        .multi_threaded_tokio_runtime()?
        .build()?;

    let spec = &mut env.eth2_config.spec;
    let end_after_checks = true;
    let eth1_block_time = Duration::from_millis(15_000 / speed_up_factor);

    spec.milliseconds_per_slot /= speed_up_factor;
    spec.eth1_follow_distance = 16;
    // Note: `as_secs` truncates, so sub-second block times count as zero here.
    spec.min_genesis_delay = eth1_block_time.as_secs() * spec.eth1_follow_distance * 2;
    spec.min_genesis_time = 0;
    spec.min_genesis_active_validator_count = 64;
    spec.seconds_per_eth1_block = 1;

    let num_validators = 8;
    let slot_duration = Duration::from_millis(spec.milliseconds_per_slot);
    let context = env.core_context();
    let mut beacon_config = testing_client_config();

    // Schedule genesis five seconds from now.
    let genesis_time = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_err(|_| "should get system time")?
        + Duration::from_secs(5);
    beacon_config.genesis = ClientGenesis::Interop {
        validator_count: num_validators,
        genesis_time: genesis_time.as_secs(),
    };
    beacon_config.dummy_eth1_backend = true;
    beacon_config.sync_eth1_chain = true;

    beacon_config.network.enr_address = Some(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)));

    let main_future = async {
        /*
         * Create a new `LocalNetwork` with one beacon node.
         */
        let network = LocalNetwork::new(context, beacon_config.clone()).await?;

        /*
         * Add a validator client which handles all validators from the genesis state.
         */
        network
            .add_validator_client(ValidatorConfig::default(), 0, (0..num_validators).collect())
            .await?;

        // Run the syncing strategy selected by `strategy`.
        pick_strategy(
            &strategy,
            network.clone(),
            beacon_config.clone(),
            slot_duration,
            initial_delay,
            sync_timeout,
        )
        .await?;

        // Either continue immediately or wait forever, depending on the value
        // of `end_after_checks`.
        if !end_after_checks {
            future::pending::<()>().await;
        }

        /*
         * End the simulation by dropping the network. This will kill all running beacon nodes and
         * validator clients.
         */
        println!(
            "Simulation complete. Finished with {} beacon nodes and {} validator clients",
            network.beacon_node_count(),
            network.validator_client_count()
        );

        // Be explicit about dropping the network, as this kills all the nodes. This ensures
        // all the checks have adequate time to pass.
        drop(network);
        Ok::<(), String>(())
    };

    env.runtime().block_on(main_future)
}
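
The timing setup in `syncing_sim` divides a base eth1 block time of 15,000 ms by the speed-up factor and then derives `min_genesis_delay` from the whole-second part of the result. A minimal sketch of that arithmetic follows; `scaled_timing` is a hypothetical helper written for illustration and is not part of the simulator.

```rust
use std::time::Duration;

/// Hypothetical helper mirroring the arithmetic in `syncing_sim`.
/// Returns `(eth1_block_time, min_genesis_delay_secs)`, or `None` when
/// `speed_up_factor` is zero (a plain division would panic in that case).
fn scaled_timing(speed_up_factor: u64, eth1_follow_distance: u64) -> Option<(Duration, u64)> {
    // 15_000 ms is the base eth1 block time used by the simulator.
    let millis = 15_000u64.checked_div(speed_up_factor)?;
    let eth1_block_time = Duration::from_millis(millis);
    // `as_secs` drops the fractional part, so 3_750 ms counts as 3 s here.
    let min_genesis_delay = eth1_block_time.as_secs() * eth1_follow_distance * 2;
    Some((eth1_block_time, min_genesis_delay))
}
```

With a follow distance of 16, a speed-up factor of 1 gives a delay of 480 seconds and a factor of 4 gives 96 seconds. Any factor above 15 truncates the block time to zero whole seconds, which makes the derived delay zero.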

/// Run the syncing check named by `strategy`: one of `"one-node"`,
/// `"two-nodes"`, `"mixed"` or `"all"`.
pub async fn pick_strategy<E: EthSpec>(
    strategy: &str,
    network: LocalNetwork<E>,
    beacon_config: ClientConfig,
    slot_duration: Duration,
    initial_delay: u64,
    sync_timeout: u64,
) -> Result<(), String> {
    match strategy {
        "one-node" => {
            verify_one_node_sync(
                network,
                beacon_config,
                slot_duration,
                initial_delay,
                sync_timeout,
            )
            .await
        }
        "two-nodes" => {
            verify_two_nodes_sync(
                network,
                beacon_config,
                slot_duration,
                initial_delay,
                sync_timeout,
            )
            .await
        }
        "mixed" => {
            verify_in_between_sync(
                network,
                beacon_config,
                slot_duration,
                initial_delay,
                sync_timeout,
            )
            .await
        }
        "all" => {
            verify_syncing(
                network,
                beacon_config,
                slot_duration,
                initial_delay,
                sync_timeout,
            )
            .await
        }
        other => Err(format!("Invalid strategy: {}", other)),
    }
}
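
`pick_strategy` matches on the raw strategy string each time it is called. One alternative is to parse the string once into an enum so that the set of valid names lives in a single place. The sketch below assumes a hypothetical `Strategy` type that does not exist in the simulator; only the four string names are taken from the code above.

```rust
use std::str::FromStr;

/// Hypothetical typed form of the `strategy` string.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    OneNode,
    TwoNodes,
    Mixed,
    All,
}

impl FromStr for Strategy {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // The accepted names match the arms of `pick_strategy`.
        match s {
            "one-node" => Ok(Strategy::OneNode),
            "two-nodes" => Ok(Strategy::TwoNodes),
            "mixed" => Ok(Strategy::Mixed),
            "all" => Ok(Strategy::All),
            other => Err(format!("Invalid strategy: {}", other)),
        }
    }
}
```

A caller could then write `"mixed".parse::<Strategy>()` and reject an unknown name before any node is started.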

/// Verify that one node added after `initial_delay` epochs is in sync
/// within `sync_timeout` epochs.
pub async fn verify_one_node_sync<E: EthSpec>(
    network: LocalNetwork<E>,
    beacon_config: ClientConfig,
    slot_duration: Duration,
    initial_delay: u64,
    sync_timeout: u64,
) -> Result<(), String> {
    let epoch_duration = slot_duration * (E::slots_per_epoch() as u32);
    let network_c = network.clone();

    // Delay for `initial_delay` epochs before adding another node to start syncing
    epoch_delay(
        Epoch::new(initial_delay),
        slot_duration,
        E::slots_per_epoch(),
    )
    .await;

    // Add a beacon node
    network.add_beacon_node(beacon_config).await?;

    // Check once per `epoch_duration` whether the nodes are still syncing,
    // for at most `sync_timeout` epochs
    let mut interval = tokio::time::interval(epoch_duration);
    let mut count = 0;
    while let Some(_) = interval.next().await {
        if count >= sync_timeout || !check_still_syncing(&network_c).await? {
            break;
        }
        count += 1;
    }

    let epoch = network.bootnode_epoch().await?;
    verify_all_finalized_at(network, epoch)
        .map_err(|e| format!("One node sync error: {}", e))
        .await
}
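
The polling loop used by the `verify_*` functions stops on whichever comes first: the check budget is spent, or no node reports that it is still syncing. Below is a synchronous sketch of the same control flow with the timer and the network call factored out; `poll_until_synced` and its closure parameter are hypothetical names introduced for illustration only.

```rust
/// Hypothetical synchronous model of the polling loop: call `still_syncing`
/// up to `max_checks` times and stop early once it reports `false`.
/// Returns the number of checks that reported "still syncing".
fn poll_until_synced<F>(max_checks: u64, mut still_syncing: F) -> Result<u64, String>
where
    F: FnMut() -> Result<bool, String>,
{
    let mut count = 0;
    loop {
        // Same short-circuit order as the simulator: the budget is tested
        // first, so the check is not run once the budget is spent.
        if count >= max_checks || !still_syncing()? {
            break;
        }
        count += 1;
    }
    Ok(count)
}
```

Note that running out of budget is not an error in this loop. The pass or fail decision is left to the step that follows it, which in the simulator is the `verify_all_finalized_at` call.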

/// Verify that two nodes added after `initial_delay` epochs are in sync
/// within `sync_timeout` epochs.
pub async fn verify_two_nodes_sync<E: EthSpec>(
    network: LocalNetwork<E>,
    beacon_config: ClientConfig,
    slot_duration: Duration,
    initial_delay: u64,
    sync_timeout: u64,
) -> Result<(), String> {
    let epoch_duration = slot_duration * (E::slots_per_epoch() as u32);
    let network_c = network.clone();

    // Delay for `initial_delay` epochs before adding the nodes to start syncing
    epoch_delay(
        Epoch::new(initial_delay),
        slot_duration,
        E::slots_per_epoch(),
    )
    .await;

    // Add beacon nodes
    network.add_beacon_node(beacon_config.clone()).await?;
    network.add_beacon_node(beacon_config).await?;

    // Check once per `epoch_duration` whether the nodes are still syncing,
    // for at most `sync_timeout` epochs
    let mut interval = tokio::time::interval(epoch_duration);
    let mut count = 0;
    while let Some(_) = interval.next().await {
        if count >= sync_timeout || !check_still_syncing(&network_c).await? {
            break;
        }
        count += 1;
    }

    let epoch = network.bootnode_epoch().await?;
    verify_all_finalized_at(network, epoch)
        .map_err(|e| format!("Two nodes sync error: {}", e))
        .await
}

/// Add two syncing nodes after `initial_delay` epochs, add a third node
/// `sync_timeout - 5` epochs later, then wait at most `sync_timeout` further
/// epochs and verify that all nodes are in sync.
pub async fn verify_in_between_sync<E: EthSpec>(
    network: LocalNetwork<E>,
    beacon_config: ClientConfig,
    slot_duration: Duration,
    initial_delay: u64,
    sync_timeout: u64,
) -> Result<(), String> {
    let epoch_duration = slot_duration * (E::slots_per_epoch() as u32);
    let network_c = network.clone();
    // Keep a copy of the config for the node that joins later.
    let config1 = beacon_config.clone();

    // Delay for `initial_delay` epochs before adding the first nodes to start syncing
    epoch_delay(
        Epoch::new(initial_delay),
        slot_duration,
        E::slots_per_epoch(),
    )
    .await;

    // Add two beacon nodes
    network.add_beacon_node(beacon_config.clone()).await?;
    network.add_beacon_node(beacon_config).await?;

    // Delay before adding the additional syncing node. `saturating_sub` avoids
    // an integer underflow when `sync_timeout` is less than 5.
    epoch_delay(
        Epoch::new(sync_timeout.saturating_sub(5)),
        slot_duration,
        E::slots_per_epoch(),
    )
    .await;

    // Add a beacon node
    network.add_beacon_node(config1).await?;

    // Check once per `epoch_duration` whether the nodes are still syncing,
    // for at most `sync_timeout` epochs
    let mut interval = tokio::time::interval(epoch_duration);
    let mut count = 0;
    while let Some(_) = interval.next().await {
        if count >= sync_timeout || !check_still_syncing(&network_c).await? {
            break;
        }
        count += 1;
    }

    let epoch = network.bootnode_epoch().await?;
    verify_all_finalized_at(network, epoch)
        .map_err(|e| format!("In between sync error: {}", e))
        .await
}
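
The delay before the third node joins is five epochs short of `sync_timeout`. Because `sync_timeout` is a `u64`, values below 5 need care: with default build settings a plain `-` panics in debug builds and wraps around in release builds. The sketch below shows two ways to handle that case; both helper names are hypothetical and exist only for this illustration.

```rust
/// Hypothetical helper: epochs to wait before adding the late node.
/// `saturating_sub` clamps at zero instead of underflowing when
/// `sync_timeout` is smaller than `lead`.
fn late_join_delay(sync_timeout: u64, lead: u64) -> u64 {
    sync_timeout.saturating_sub(lead)
}

/// Alternative that rejects the configuration instead of clamping.
fn late_join_delay_checked(sync_timeout: u64, lead: u64) -> Result<u64, String> {
    sync_timeout
        .checked_sub(lead)
        .ok_or_else(|| format!("sync_timeout must be at least {} epochs", lead))
}
```

Clamping keeps the simulation running with the late node joining immediately, while the checked variant surfaces the misconfiguration to the caller. Which of the two is preferable depends on how strictly the CLI arguments should be validated.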

/// Run all syncing strategies one after another on the same network.
pub async fn verify_syncing<E: EthSpec>(
    network: LocalNetwork<E>,
    beacon_config: ClientConfig,
    slot_duration: Duration,
    initial_delay: u64,
    sync_timeout: u64,
) -> Result<(), String> {
    verify_one_node_sync(
        network.clone(),
        beacon_config.clone(),
        slot_duration,
        initial_delay,
        sync_timeout,
    )
    .await?;
    println!("Completed one node sync");

    verify_two_nodes_sync(
        network.clone(),
        beacon_config.clone(),
        slot_duration,
        initial_delay,
        sync_timeout,
    )
    .await?;
    println!("Completed two node sync");

    verify_in_between_sync(
        network,
        beacon_config,
        slot_duration,
        initial_delay,
        sync_timeout,
    )
    .await?;
    println!("Completed in between sync");

    Ok(())
}

/// Returns `true` if at least one node in the network reports that it is
/// still syncing. Fails on the first node whose status request fails.
pub async fn check_still_syncing<E: EthSpec>(network: &LocalNetwork<E>) -> Result<bool, String> {
    // Get the syncing status of every node
    let mut status = Vec::new();
    for remote_node in network.remote_nodes()? {
        status.push(
            remote_node
                .http
                .node()
                .syncing_status()
                .await
                .map(|status| status.is_syncing)
                .map_err(|e| format!("Get syncing status via http failed: {:?}", e))?,
        )
    }
    Ok(status.iter().any(|is_syncing| *is_syncing))
}
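
`check_still_syncing` gathers one boolean per node and then asks whether any of them is `true`, returning early if a request fails. The same aggregation can be sketched without the HTTP layer by collecting an iterator of results; `NodeStatus` and `any_still_syncing` are hypothetical names introduced for this example.

```rust
/// Hypothetical stand-in for one node's sync query: `Ok(true)` means
/// "still syncing", `Err` models a failed HTTP request.
type NodeStatus = Result<bool, String>;

/// Returns `Ok(true)` if any node is still syncing. All statuses are
/// gathered before the `any` check, so an error from a later node is
/// reported even when an earlier node already answered `true`.
fn any_still_syncing<I>(statuses: I) -> Result<bool, String>
where
    I: IntoIterator<Item = NodeStatus>,
{
    // Collecting into `Result<Vec<_>, _>` stops at the first `Err`.
    let collected: Vec<bool> = statuses.into_iter().collect::<Result<_, _>>()?;
    Ok(collected.iter().any(|is_syncing| *is_syncing))
}
```

An empty network yields `Ok(false)`, since `any` over an empty list is `false`.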