Stable futures (#879)

* Port eth1 lib to use stable futures

* Port eth1_test_rig to stable futures

* Port eth1 tests to stable futures

* Port genesis service to stable futures

* Port genesis tests to stable futures

* Port beacon_chain to stable futures

* Port lcli to stable futures

* Fix eth1_test_rig (#1014)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Update hashmap hashset to stable futures

* Adds panic test to hashset delay

* Port remote_beacon_node to stable futures

* Fix lcli merge conflicts

* Non rpc stuff compiles

* protocol.rs compiles

* Port websockets, timer and notifier to stable futures (#1035)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Port remote_beacon_node to stable futures

* Partial eth2-libp2p stable future upgrade

* Finished first round of fighting RPC types

* Further progress towards porting eth2-libp2p adds caching to discovery

* Update behaviour

* RPC handler to stable futures

* Update RPC to master libp2p

* Network service additions

* Fix the fallback transport construction (#1102)

* Correct warning

* Remove hashmap delay

* Compiling version of eth2-libp2p

* Update all crates versions

* Fix conversion function and add tests (#1113)

* Port validator_client to stable futures (#1114)

* Add PH & MS slot clock changes

* Account for genesis time

* Add progress on duties refactor

* Add simple is_aggregator bool to val subscription

* Start work on attestation_verification.rs

* Add progress on ObservedAttestations

* Progress with ObservedAttestations

* Fix tests

* Add observed attestations to the beacon chain

* Add attestation observation to processing code

* Add progress on attestation verification

* Add first draft of ObservedAttesters

* Add more tests

* Add observed attesters to beacon chain

* Add observers to attestation processing

* Add more attestation verification

* Create ObservedAggregators map

* Remove commented-out code

* Add observed aggregators into chain

* Add progress

* Finish adding features to attestation verification

* Ensure beacon chain compiles

* Link attn verification into chain

* Integrate new attn verification in chain

* Remove old attestation processing code

* Start trying to fix beacon_chain tests

* Split adding into pools into two functions

* Add aggregation to harness

* Get test harness working again

* Adjust the number of aggregators for test harness

* Fix edge-case in harness

* Integrate new attn processing in network

* Fix compile bug in validator_client

* Update validator API endpoints

* Fix aggreagation in test harness

* Fix enum thing

* Fix attestation observation bug:

* Patch failing API tests

* Start adding comments to attestation verification

* Remove unused attestation field

* Unify "is block known" logic

* Update comments

* Supress fork choice errors for network processing

* Add todos

* Tidy

* Add gossip attn tests

* Disallow test harness to produce old attns

* Comment out in-progress tests

* Partially address pruning tests

* Fix failing store test

* Add aggregate tests

* Add comments about which spec conditions we check

* Dont re-aggregate

* Split apart test harness attn production

* Fix compile error in network

* Make progress on commented-out test

* Fix skipping attestation test

* Add fork choice verification tests

* Tidy attn tests, remove dead code

* Remove some accidentally added code

* Fix clippy lint

* Rename test file

* Add block tests, add cheap block proposer check

* Rename block testing file

* Add observed_block_producers

* Tidy

* Switch around block signature verification

* Finish block testing

* Remove gossip from signature tests

* First pass of self review

* Fix deviation in spec

* Update test spec tags

* Start moving over to hashset

* Finish moving observed attesters to hashmap

* Move aggregation pool over to hashmap

* Make fc attn borrow again

* Fix rest_api compile error

* Fix missing comments

* Fix monster test

* Uncomment increasing slots test

* Address remaining comments

* Remove unsafe, use cfg test

* Remove cfg test flag

* Fix dodgy comment

* Revert "Update hashmap hashset to stable futures"

This reverts commit d432378a3c.

* Revert "Adds panic test to hashset delay"

This reverts commit 281502396f.

* Ported attestation_service

* Ported duties_service

* Ported fork_service

* More ports

* Port block_service

* Minor fixes

* VC compiles

* Update TODOS

* Borrow self where possible

* Ignore aggregates that are already known.

* Unify aggregator modulo logic

* Fix typo in logs

* Refactor validator subscription logic

* Avoid reproducing selection proof

* Skip HTTP call if no subscriptions

* Rename DutyAndState -> DutyAndProof

* Tidy logs

* Print root as dbg

* Fix compile errors in tests

* Fix compile error in test

* Re-Fix attestation and duties service

* Minor fixes

Co-authored-by: Paul Hauner <paul@paulhauner.com>

* Network crate update to stable futures

* Port account_manager to stable futures (#1121)

* Port account_manager to stable futures

* Run async fns in tokio environment

* Port rest_api crate to stable futures (#1118)

* Port rest_api lib to stable futures

* Reduce tokio features

* Update notifier to stable futures

* Builder update

* Further updates

* Convert self referential async functions

* stable futures fixes (#1124)

* Fix eth1 update functions

* Fix genesis and client

* Fix beacon node lib

* Return appropriate runtimes from environment

* Fix test rig

* Refactor eth1 service update

* Upgrade simulator to stable futures

* Lighthouse compiles on stable futures

* Remove println debugging statement

* Update libp2p service, start rpc test upgrade

* Update network crate for new libp2p

* Update tokio::codec to futures_codec (#1128)

* Further work towards RPC corrections

* Correct http timeout and network service select

* Use tokio runtime for libp2p

* Revert "Update tokio::codec to futures_codec (#1128)"

This reverts commit e57aea924a.

* Upgrade RPC libp2p tests

* Upgrade secio fallback test

* Upgrade gossipsub examples

* Clean up RPC protocol

* Test fixes (#1133)

* Correct websocket timeout and run on os thread

* Fix network test

* Clean up PR

* Correct tokio tcp move attestation service tests

* Upgrade attestation service tests

* Correct network test

* Correct genesis test

* Test corrections

* Log info when block is received

* Modify logs and update attester service events

* Stable futures: fixes to vc, eth1 and account manager (#1142)

* Add local testnet scripts

* Remove whiteblock script

* Rename local testnet script

* Move spawns onto handle

* Fix VC panic

* Initial fix to block production issue

* Tidy block producer fix

* Tidy further

* Add local testnet clean script

* Run cargo fmt

* Tidy duties service

* Tidy fork service

* Tidy ForkService

* Tidy AttestationService

* Tidy notifier

* Ensure await is not suppressed in eth1

* Ensure await is not suppressed in account_manager

* Use .ok() instead of .unwrap_or(())

* RPC decoding test for proto

* Update discv5 and eth2-libp2p deps

* Fix lcli double runtime issue (#1144)

* Handle stream termination and dialing peer errors

* Correct peer_info variant types

* Remove unnecessary warnings

* Handle subnet unsubscription removal and improve logigng

* Add logs around ping

* Upgrade discv5 and improve logging

* Handle peer connection status for multiple connections

* Improve network service logging

* Improve logging around peer manager

* Upgrade swarm poll centralise peer management

* Identify clients on error

* Fix `remove_peer` in sync (#1150)

* remove_peer removes from all chains

* Remove logs

* Fix early return from loop

* Improved logging, fix panic

* Partially correct tests

* Stable futures: Vc sync (#1149)

* Improve syncing heuristic

* Add comments

* Use safer method for tolerance

* Fix tests

* Stable futures: Fix VC bug, update agg pool, add more metrics (#1151)

* Expose epoch processing summary

* Expose participation metrics to prometheus

* Switch to f64

* Reduce precision

* Change precision

* Expose observed attesters metrics

* Add metrics for agg/unagg attn counts

* Add metrics for gossip rx

* Add metrics for gossip tx

* Adds ignored attns to prom

* Add attestation timing

* Add timer for aggregation pool sig agg

* Add write lock timer for agg pool

* Add more metrics to agg pool

* Change map lock code

* Add extra metric to agg pool

* Change lock handling in agg pool

* Change .write() to .read()

* Add another agg pool timer

* Fix for is_aggregator

* Fix pruning bug

Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
This commit is contained in:
Age Manning
2020-05-17 21:16:48 +10:00
committed by GitHub
parent 21901b1615
commit b6408805a2
165 changed files with 7924 additions and 7733 deletions

View File

@@ -4,6 +4,7 @@ mod cli;
mod config;
mod duties_service;
mod fork_service;
mod is_synced;
mod notifier;
mod validator_store;
@@ -19,18 +20,13 @@ use duties_service::{DutiesService, DutiesServiceBuilder};
use environment::RuntimeContext;
use exit_future::Signal;
use fork_service::{ForkService, ForkServiceBuilder};
use futures::{
future::{self, loop_fn, Loop},
Future, IntoFuture,
};
use notifier::spawn_notifier;
use remote_beacon_node::RemoteBeaconNode;
use slog::{error, info, Logger};
use slot_clock::SlotClock;
use slot_clock::SystemTimeSlotClock;
use std::time::{Duration, Instant};
use std::time::{SystemTime, UNIX_EPOCH};
use tokio::timer::Delay;
use tokio::time::{delay_for, Duration};
use types::EthSpec;
use validator_store::ValidatorStore;
@@ -47,27 +43,24 @@ pub struct ProductionValidatorClient<T: EthSpec> {
block_service: BlockService<SystemTimeSlotClock, T>,
attestation_service: AttestationService<SystemTimeSlotClock, T>,
exit_signals: Vec<Signal>,
config: Config,
}
impl<T: EthSpec> ProductionValidatorClient<T> {
/// Instantiates the validator client, _without_ starting the timers to trigger block
/// and attestation production.
pub fn new_from_cli(
pub async fn new_from_cli(
context: RuntimeContext<T>,
cli_args: &ArgMatches,
) -> impl Future<Item = Self, Error = String> {
Config::from_cli(&cli_args)
.into_future()
.map_err(|e| format!("Unable to initialize config: {}", e))
.and_then(|config| Self::new(context, config))
cli_args: &ArgMatches<'_>,
) -> Result<Self, String> {
let config = Config::from_cli(&cli_args)
.map_err(|e| format!("Unable to initialize config: {}", e))?;
Self::new(context, config).await
}
/// Instantiates the validator client, _without_ starting the timers to trigger block
/// and attestation production.
pub fn new(
mut context: RuntimeContext<T>,
config: Config,
) -> impl Future<Item = Self, Error = String> {
pub async fn new(mut context: RuntimeContext<T>, config: Config) -> Result<Self, String> {
let log_1 = context.log.clone();
let log_2 = context.log.clone();
let log_3 = context.log.clone();
@@ -80,204 +73,178 @@ impl<T: EthSpec> ProductionValidatorClient<T> {
"datadir" => format!("{:?}", config.data_dir),
);
RemoteBeaconNode::new_with_timeout(config.http_server.clone(), HTTP_TIMEOUT)
.map_err(|e| format!("Unable to init beacon node http client: {}", e))
.into_future()
.and_then(move |beacon_node| wait_for_node(beacon_node, log_2))
.and_then(|beacon_node| {
beacon_node
.http
.spec()
.get_eth2_config()
.map(|eth2_config| (beacon_node, eth2_config))
.map_err(|e| format!("Unable to read eth2 config from beacon node: {:?}", e))
})
.and_then(|(beacon_node, eth2_config)| {
beacon_node
.http
.beacon()
.get_genesis_time()
.map(|genesis_time| (beacon_node, eth2_config, genesis_time))
.map_err(|e| format!("Unable to read genesis time from beacon node: {:?}", e))
})
.and_then(move |(beacon_node, remote_eth2_config, genesis_time)| {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.into_future()
.map_err(|e| format!("Unable to read system time: {:?}", e))
.and_then::<_, Box<dyn Future<Item = _, Error = _> + Send>>(move |now| {
let log = log_3.clone();
let genesis = Duration::from_secs(genesis_time);
let beacon_node =
RemoteBeaconNode::new_with_timeout(config.http_server.clone(), HTTP_TIMEOUT)
.map_err(|e| format!("Unable to init beacon node http client: {}", e))?;
// If the time now is less than (prior to) genesis, then delay until the
// genesis instant.
//
// If the validator client starts before genesis, it will get errors from
// the slot clock.
if now < genesis {
info!(
log,
"Starting node prior to genesis";
"seconds_to_wait" => (genesis - now).as_secs()
);
// TODO: check if all logs in wait_for_node are produed while awaiting
let beacon_node = wait_for_node(beacon_node, log_2).await?;
let eth2_config = beacon_node
.http
.spec()
.get_eth2_config()
.await
.map_err(|e| format!("Unable to read eth2 config from beacon node: {:?}", e))?;
let genesis_time = beacon_node
.http
.beacon()
.get_genesis_time()
.await
.map_err(|e| format!("Unable to read genesis time from beacon node: {:?}", e))?;
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.map_err(|e| format!("Unable to read system time: {:?}", e))?;
let log = log_3.clone();
let genesis = Duration::from_secs(genesis_time);
Box::new(
Delay::new(Instant::now() + (genesis - now))
.map_err(|e| {
format!("Unable to create genesis wait delay: {:?}", e)
})
.map(move |_| (beacon_node, remote_eth2_config, genesis_time)),
)
} else {
info!(
log,
"Genesis has already occurred";
"seconds_ago" => (now - genesis).as_secs()
);
// If the time now is less than (prior to) genesis, then delay until the
// genesis instant.
//
// If the validator client starts before genesis, it will get errors from
// the slot clock.
if now < genesis {
info!(
log,
"Starting node prior to genesis";
"seconds_to_wait" => (genesis - now).as_secs()
);
Box::new(future::ok((beacon_node, remote_eth2_config, genesis_time)))
}
})
})
.and_then(|(beacon_node, eth2_config, genesis_time)| {
beacon_node
.http
.beacon()
.get_genesis_validators_root()
.map(move |genesis_validators_root| {
(
beacon_node,
eth2_config,
genesis_time,
genesis_validators_root,
)
})
.map_err(|e| {
format!(
"Unable to read genesis validators root from beacon node: {:?}",
e
)
})
})
.and_then(
move |(beacon_node, remote_eth2_config, genesis_time, genesis_validators_root)| {
let log = log_4.clone();
delay_for(genesis - now).await
} else {
info!(
log,
"Genesis has already occurred";
"seconds_ago" => (now - genesis).as_secs()
);
}
let genesis_validators_root = beacon_node
.http
.beacon()
.get_genesis_validators_root()
.await
.map_err(|e| {
format!(
"Unable to read genesis validators root from beacon node: {:?}",
e
)
})?;
let log = log_4.clone();
// Do not permit a connection to a beacon node using different spec constants.
if context.eth2_config.spec_constants != remote_eth2_config.spec_constants {
return Err(format!(
"Beacon node is using an incompatible spec. Got {}, expected {}",
remote_eth2_config.spec_constants, context.eth2_config.spec_constants
));
}
// Do not permit a connection to a beacon node using different spec constants.
if context.eth2_config.spec_constants != eth2_config.spec_constants {
return Err(format!(
"Beacon node is using an incompatible spec. Got {}, expected {}",
eth2_config.spec_constants, context.eth2_config.spec_constants
));
}
// Note: here we just assume the spec variables of the remote node. This is very useful
// for testnets, but perhaps a security issue when it comes to mainnet.
//
// A damaging attack would be for a beacon node to convince the validator client of a
// different `SLOTS_PER_EPOCH` variable. This could result in slashable messages being
// produced. We are safe from this because `SLOTS_PER_EPOCH` is a type-level constant
// for Lighthouse.
context.eth2_config = remote_eth2_config;
// Note: here we just assume the spec variables of the remote node. This is very useful
// for testnets, but perhaps a security issue when it comes to mainnet.
//
// A damaging attack would be for a beacon node to convince the validator client of a
// different `SLOTS_PER_EPOCH` variable. This could result in slashable messages being
// produced. We are safe from this because `SLOTS_PER_EPOCH` is a type-level constant
// for Lighthouse.
context.eth2_config = eth2_config;
let slot_clock = SystemTimeSlotClock::new(
context.eth2_config.spec.genesis_slot,
Duration::from_secs(genesis_time),
Duration::from_millis(context.eth2_config.spec.milliseconds_per_slot),
);
let slot_clock = SystemTimeSlotClock::new(
context.eth2_config.spec.genesis_slot,
Duration::from_secs(genesis_time),
Duration::from_millis(context.eth2_config.spec.milliseconds_per_slot),
);
let fork_service = ForkServiceBuilder::new()
.slot_clock(slot_clock.clone())
.beacon_node(beacon_node.clone())
.runtime_context(context.service_context("fork".into()))
.build()?;
let fork_service = ForkServiceBuilder::new()
.slot_clock(slot_clock.clone())
.beacon_node(beacon_node.clone())
.runtime_context(context.service_context("fork".into()))
.build()?;
let validator_store: ValidatorStore<SystemTimeSlotClock, T> =
match &config.key_source {
// Load pre-existing validators from the data dir.
//
// Use the `account_manager` to generate these files.
KeySource::Disk => ValidatorStore::load_from_disk(
config.data_dir.clone(),
genesis_validators_root,
context.eth2_config.spec.clone(),
fork_service.clone(),
log.clone(),
)?,
// Generate ephemeral insecure keypairs for testing purposes.
//
// Do not use in production.
KeySource::InsecureKeypairs(indices) => {
ValidatorStore::insecure_ephemeral_validators(
&indices,
genesis_validators_root,
context.eth2_config.spec.clone(),
fork_service.clone(),
log.clone(),
)?
}
};
let validator_store: ValidatorStore<SystemTimeSlotClock, T> = match &config.key_source {
// Load pre-existing validators from the data dir.
//
// Use the `account_manager` to generate these files.
KeySource::Disk => ValidatorStore::load_from_disk(
config.data_dir.clone(),
genesis_validators_root,
context.eth2_config.spec.clone(),
fork_service.clone(),
log.clone(),
)?,
// Generate ephemeral insecure keypairs for testing purposes.
//
// Do not use in production.
KeySource::InsecureKeypairs(indices) => ValidatorStore::insecure_ephemeral_validators(
&indices,
genesis_validators_root,
context.eth2_config.spec.clone(),
fork_service.clone(),
log.clone(),
)?,
};
info!(
log,
"Loaded validator keypair store";
"voting_validators" => validator_store.num_voting_validators()
);
info!(
log,
"Loaded validator keypair store";
"voting_validators" => validator_store.num_voting_validators()
);
let duties_service = DutiesServiceBuilder::new()
.slot_clock(slot_clock.clone())
.validator_store(validator_store.clone())
.beacon_node(beacon_node.clone())
.runtime_context(context.service_context("duties".into()))
.allow_unsynced_beacon_node(config.allow_unsynced_beacon_node)
.build()?;
let duties_service = DutiesServiceBuilder::new()
.slot_clock(slot_clock.clone())
.validator_store(validator_store.clone())
.beacon_node(beacon_node.clone())
.runtime_context(context.service_context("duties".into()))
.allow_unsynced_beacon_node(config.allow_unsynced_beacon_node)
.build()?;
let block_service = BlockServiceBuilder::new()
.duties_service(duties_service.clone())
.slot_clock(slot_clock.clone())
.validator_store(validator_store.clone())
.beacon_node(beacon_node.clone())
.runtime_context(context.service_context("block".into()))
.build()?;
let block_service = BlockServiceBuilder::new()
.duties_service(duties_service.clone())
.slot_clock(slot_clock.clone())
.validator_store(validator_store.clone())
.beacon_node(beacon_node.clone())
.runtime_context(context.service_context("block".into()))
.build()?;
let attestation_service = AttestationServiceBuilder::new()
.duties_service(duties_service.clone())
.slot_clock(slot_clock)
.validator_store(validator_store)
.beacon_node(beacon_node)
.runtime_context(context.service_context("attestation".into()))
.build()?;
let attestation_service = AttestationServiceBuilder::new()
.duties_service(duties_service.clone())
.slot_clock(slot_clock)
.validator_store(validator_store)
.beacon_node(beacon_node)
.runtime_context(context.service_context("attestation".into()))
.build()?;
Ok(Self {
context,
duties_service,
fork_service,
block_service,
attestation_service,
exit_signals: vec![],
})
},
)
Ok(Self {
context,
duties_service,
fork_service,
block_service,
attestation_service,
exit_signals: vec![],
config,
})
}
pub fn start_service(&mut self) -> Result<(), String> {
let duties_exit = self
.duties_service
.clone()
.start_update_service(&self.context.eth2_config.spec)
.map_err(|e| format!("Unable to start duties service: {}", e))?;
let fork_exit = self
.fork_service
.clone()
.start_update_service(&self.context.eth2_config.spec)
.map_err(|e| format!("Unable to start fork service: {}", e))?;
let block_exit = self
.block_service
.clone()
.start_update_service(&self.context.eth2_config.spec)
.map_err(|e| format!("Unable to start block service: {}", e))?;
let attestation_exit = self
.attestation_service
.clone()
.start_update_service(&self.context.eth2_config.spec)
.map_err(|e| format!("Unable to start attestation service: {}", e))?;
@@ -298,48 +265,39 @@ impl<T: EthSpec> ProductionValidatorClient<T> {
/// Request the version from the node, looping back and trying again on failure. Exit once the node
/// has been contacted.
fn wait_for_node<E: EthSpec>(
async fn wait_for_node<E: EthSpec>(
beacon_node: RemoteBeaconNode<E>,
log: Logger,
) -> impl Future<Item = RemoteBeaconNode<E>, Error = String> {
) -> Result<RemoteBeaconNode<E>, String> {
// Try to get the version string from the node, looping until success is returned.
loop_fn(beacon_node.clone(), move |beacon_node| {
loop {
let log = log.clone();
beacon_node
let result = beacon_node
.clone()
.http
.node()
.get_version()
.map_err(|e| format!("{:?}", e))
.then(move |result| {
let future: Box<dyn Future<Item = Loop<_, _>, Error = String> + Send> = match result
{
Ok(version) => {
info!(
log,
"Connected to beacon node";
"version" => version,
);
.await
.map_err(|e| format!("{:?}", e));
Box::new(future::ok(Loop::Break(beacon_node)))
}
Err(e) => {
error!(
log,
"Unable to connect to beacon node";
"error" => format!("{:?}", e),
);
match result {
Ok(version) => {
info!(
log,
"Connected to beacon node";
"version" => version,
);
Box::new(
Delay::new(Instant::now() + RETRY_DELAY)
.map_err(|e| format!("Failed to trigger delay: {:?}", e))
.and_then(|_| future::ok(Loop::Continue(beacon_node))),
)
}
};
future
})
})
.map(|_| beacon_node)
return Ok(beacon_node);
}
Err(e) => {
error!(
log,
"Unable to connect to beacon node";
"error" => format!("{:?}", e),
);
delay_for(RETRY_DELAY).await;
}
}
}
}