Stable futures (#879)

* Port eth1 lib to use stable futures

* Port eth1_test_rig to stable futures

* Port eth1 tests to stable futures

* Port genesis service to stable futures

* Port genesis tests to stable futures

* Port beacon_chain to stable futures

* Port lcli to stable futures

* Fix eth1_test_rig (#1014)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Update hashmap hashset to stable futures

* Adds panic test to hashset delay

* Port remote_beacon_node to stable futures

* Fix lcli merge conflicts

* Non rpc stuff compiles

* protocol.rs compiles

* Port websockets, timer and notifier to stable futures (#1035)

* Fix lcli

* Port timer to stable futures

* Fix timer

* Port websocket_server to stable futures

* Port notifier to stable futures

* Add TODOS

* Port remote_beacon_node to stable futures

* Partial eth2-libp2p stable future upgrade

* Finished first round of fighting RPC types

* Further progress towards porting eth2-libp2p adds caching to discovery

* Update behaviour

* RPC handler to stable futures

* Update RPC to master libp2p

* Network service additions

* Fix the fallback transport construction (#1102)

* Correct warning

* Remove hashmap delay

* Compiling version of eth2-libp2p

* Update all crates versions

* Fix conversion function and add tests (#1113)

* Port validator_client to stable futures (#1114)

* Add PH & MS slot clock changes

* Account for genesis time

* Add progress on duties refactor

* Add simple is_aggregator bool to val subscription

* Start work on attestation_verification.rs

* Add progress on ObservedAttestations

* Progress with ObservedAttestations

* Fix tests

* Add observed attestations to the beacon chain

* Add attestation observation to processing code

* Add progress on attestation verification

* Add first draft of ObservedAttesters

* Add more tests

* Add observed attesters to beacon chain

* Add observers to attestation processing

* Add more attestation verification

* Create ObservedAggregators map

* Remove commented-out code

* Add observed aggregators into chain

* Add progress

* Finish adding features to attestation verification

* Ensure beacon chain compiles

* Link attn verification into chain

* Integrate new attn verification in chain

* Remove old attestation processing code

* Start trying to fix beacon_chain tests

* Split adding into pools into two functions

* Add aggregation to harness

* Get test harness working again

* Adjust the number of aggregators for test harness

* Fix edge-case in harness

* Integrate new attn processing in network

* Fix compile bug in validator_client

* Update validator API endpoints

* Fix aggreagation in test harness

* Fix enum thing

* Fix attestation observation bug:

* Patch failing API tests

* Start adding comments to attestation verification

* Remove unused attestation field

* Unify "is block known" logic

* Update comments

* Supress fork choice errors for network processing

* Add todos

* Tidy

* Add gossip attn tests

* Disallow test harness to produce old attns

* Comment out in-progress tests

* Partially address pruning tests

* Fix failing store test

* Add aggregate tests

* Add comments about which spec conditions we check

* Dont re-aggregate

* Split apart test harness attn production

* Fix compile error in network

* Make progress on commented-out test

* Fix skipping attestation test

* Add fork choice verification tests

* Tidy attn tests, remove dead code

* Remove some accidentally added code

* Fix clippy lint

* Rename test file

* Add block tests, add cheap block proposer check

* Rename block testing file

* Add observed_block_producers

* Tidy

* Switch around block signature verification

* Finish block testing

* Remove gossip from signature tests

* First pass of self review

* Fix deviation in spec

* Update test spec tags

* Start moving over to hashset

* Finish moving observed attesters to hashmap

* Move aggregation pool over to hashmap

* Make fc attn borrow again

* Fix rest_api compile error

* Fix missing comments

* Fix monster test

* Uncomment increasing slots test

* Address remaining comments

* Remove unsafe, use cfg test

* Remove cfg test flag

* Fix dodgy comment

* Revert "Update hashmap hashset to stable futures"

This reverts commit d432378a3c.

* Revert "Adds panic test to hashset delay"

This reverts commit 281502396f.

* Ported attestation_service

* Ported duties_service

* Ported fork_service

* More ports

* Port block_service

* Minor fixes

* VC compiles

* Update TODOS

* Borrow self where possible

* Ignore aggregates that are already known.

* Unify aggregator modulo logic

* Fix typo in logs

* Refactor validator subscription logic

* Avoid reproducing selection proof

* Skip HTTP call if no subscriptions

* Rename DutyAndState -> DutyAndProof

* Tidy logs

* Print root as dbg

* Fix compile errors in tests

* Fix compile error in test

* Re-Fix attestation and duties service

* Minor fixes

Co-authored-by: Paul Hauner <paul@paulhauner.com>

* Network crate update to stable futures

* Port account_manager to stable futures (#1121)

* Port account_manager to stable futures

* Run async fns in tokio environment

* Port rest_api crate to stable futures (#1118)

* Port rest_api lib to stable futures

* Reduce tokio features

* Update notifier to stable futures

* Builder update

* Further updates

* Convert self referential async functions

* stable futures fixes (#1124)

* Fix eth1 update functions

* Fix genesis and client

* Fix beacon node lib

* Return appropriate runtimes from environment

* Fix test rig

* Refactor eth1 service update

* Upgrade simulator to stable futures

* Lighthouse compiles on stable futures

* Remove println debugging statement

* Update libp2p service, start rpc test upgrade

* Update network crate for new libp2p

* Update tokio::codec to futures_codec (#1128)

* Further work towards RPC corrections

* Correct http timeout and network service select

* Use tokio runtime for libp2p

* Revert "Update tokio::codec to futures_codec (#1128)"

This reverts commit e57aea924a.

* Upgrade RPC libp2p tests

* Upgrade secio fallback test

* Upgrade gossipsub examples

* Clean up RPC protocol

* Test fixes (#1133)

* Correct websocket timeout and run on os thread

* Fix network test

* Clean up PR

* Correct tokio tcp move attestation service tests

* Upgrade attestation service tests

* Correct network test

* Correct genesis test

* Test corrections

* Log info when block is received

* Modify logs and update attester service events

* Stable futures: fixes to vc, eth1 and account manager (#1142)

* Add local testnet scripts

* Remove whiteblock script

* Rename local testnet script

* Move spawns onto handle

* Fix VC panic

* Initial fix to block production issue

* Tidy block producer fix

* Tidy further

* Add local testnet clean script

* Run cargo fmt

* Tidy duties service

* Tidy fork service

* Tidy ForkService

* Tidy AttestationService

* Tidy notifier

* Ensure await is not suppressed in eth1

* Ensure await is not suppressed in account_manager

* Use .ok() instead of .unwrap_or(())

* RPC decoding test for proto

* Update discv5 and eth2-libp2p deps

* Fix lcli double runtime issue (#1144)

* Handle stream termination and dialing peer errors

* Correct peer_info variant types

* Remove unnecessary warnings

* Handle subnet unsubscription removal and improve logigng

* Add logs around ping

* Upgrade discv5 and improve logging

* Handle peer connection status for multiple connections

* Improve network service logging

* Improve logging around peer manager

* Upgrade swarm poll centralise peer management

* Identify clients on error

* Fix `remove_peer` in sync (#1150)

* remove_peer removes from all chains

* Remove logs

* Fix early return from loop

* Improved logging, fix panic

* Partially correct tests

* Stable futures: Vc sync (#1149)

* Improve syncing heuristic

* Add comments

* Use safer method for tolerance

* Fix tests

* Stable futures: Fix VC bug, update agg pool, add more metrics (#1151)

* Expose epoch processing summary

* Expose participation metrics to prometheus

* Switch to f64

* Reduce precision

* Change precision

* Expose observed attesters metrics

* Add metrics for agg/unagg attn counts

* Add metrics for gossip rx

* Add metrics for gossip tx

* Adds ignored attns to prom

* Add attestation timing

* Add timer for aggregation pool sig agg

* Add write lock timer for agg pool

* Add more metrics to agg pool

* Change map lock code

* Add extra metric to agg pool

* Change lock handling in agg pool

* Change .write() to .read()

* Add another agg pool timer

* Fix for is_aggregator

* Fix pruning bug

Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
This commit is contained in:
Age Manning
2020-05-17 21:16:48 +10:00
committed by GitHub
parent 21901b1615
commit b6408805a2
165 changed files with 7924 additions and 7733 deletions

View File

@@ -1,15 +1,14 @@
use crate::{duties_service::DutiesService, validator_store::ValidatorStore};
use environment::RuntimeContext;
use exit_future::Signal;
use futures::{stream, Future, IntoFuture, Stream};
use futures::{FutureExt, StreamExt, TryFutureExt};
use remote_beacon_node::{PublishStatus, RemoteBeaconNode};
use slog::{crit, error, info, trace};
use slot_clock::SlotClock;
use std::ops::Deref;
use std::sync::Arc;
use std::time::{Duration, Instant};
use tokio::timer::Interval;
use types::{ChainSpec, EthSpec};
use tokio::time::{interval_at, Duration, Instant};
use types::{ChainSpec, EthSpec, PublicKey, Slot};
/// Delay this period of time after the slot starts. This allows the node to process the new slot.
const TIME_DELAY_FROM_SLOT: Duration = Duration::from_millis(100);
@@ -114,7 +113,7 @@ impl<T, E: EthSpec> Deref for BlockService<T, E> {
impl<T: SlotClock + 'static, E: EthSpec> BlockService<T, E> {
/// Starts the service that periodically attempts to produce blocks.
pub fn start_update_service(&self, spec: &ChainSpec) -> Result<Signal, String> {
pub fn start_update_service(self, spec: &ChainSpec) -> Result<Signal, String> {
let log = self.context.log.clone();
let duration_to_next_slot = self
@@ -122,145 +121,138 @@ impl<T: SlotClock + 'static, E: EthSpec> BlockService<T, E> {
.duration_to_next_slot()
.ok_or_else(|| "Unable to determine duration to next slot".to_string())?;
let interval = {
info!(
log,
"Block production service started";
"next_update_millis" => duration_to_next_slot.as_millis()
);
let mut interval = {
let slot_duration = Duration::from_millis(spec.milliseconds_per_slot);
Interval::new(
// Note: interval_at panics if slot_duration = 0
interval_at(
Instant::now() + duration_to_next_slot + TIME_DELAY_FROM_SLOT,
slot_duration,
)
};
let (exit_signal, exit_fut) = exit_future::signal();
let service = self.clone();
let log_1 = log.clone();
let log_2 = log.clone();
let runtime_handle = self.inner.context.runtime_handle.clone();
self.context.executor.spawn(
exit_fut
.until(
interval
.map_err(move |e| {
crit! {
log_1,
"Timer thread failed";
"error" => format!("{}", e)
}
})
.for_each(move |_| service.clone().do_update().then(|_| Ok(()))),
)
.map(move |_| info!(log_2, "Shutdown complete")),
let interval_fut = async move {
while interval.next().await.is_some() {
self.do_update().await.ok();
}
};
let (exit_signal, exit_fut) = exit_future::signal();
let future = futures::future::select(
Box::pin(interval_fut),
exit_fut.map(move |_| info!(log, "Shutdown complete")),
);
runtime_handle.spawn(future);
Ok(exit_signal)
}
/// Attempt to produce a block for any block producers in the `ValidatorStore`.
fn do_update(self) -> impl Future<Item = (), Error = ()> {
let service = self.clone();
let log_1 = self.context.log.clone();
let log_2 = self.context.log.clone();
async fn do_update(&self) -> Result<(), ()> {
let log = &self.context.log;
self.slot_clock
.now()
.ok_or_else(move || {
crit!(log_1, "Duties manager failed to read slot clock");
})
.into_future()
.and_then(move |slot| {
let iter = service.duties_service.block_producers(slot).into_iter();
let slot = self.slot_clock.now().ok_or_else(move || {
crit!(log, "Duties manager failed to read slot clock");
})?;
if iter.len() == 0 {
trace!(
log_2,
"No local block proposers for this slot";
"slot" => slot.as_u64()
)
} else if iter.len() > 1 {
error!(
log_2,
"Multiple block proposers for this slot";
"action" => "producing blocks for all proposers",
"num_proposers" => iter.len(),
"slot" => slot.as_u64(),
)
}
trace!(
log,
"Block service update started";
"slot" => slot.as_u64()
);
stream::unfold(iter, move |mut block_producers| {
let log_1 = service.context.log.clone();
let log_2 = service.context.log.clone();
let service_1 = service.clone();
let service_2 = service.clone();
let service_3 = service.clone();
let iter = self.duties_service.block_producers(slot).into_iter();
block_producers.next().map(move |validator_pubkey| {
service_1
.validator_store
.randao_reveal(&validator_pubkey, slot.epoch(E::slots_per_epoch()))
.ok_or_else(|| "Unable to produce randao reveal".to_string())
.into_future()
.and_then(move |randao_reveal| {
service_1
.beacon_node
.http
.validator()
.produce_block(slot, randao_reveal)
.map_err(|e| {
format!(
"Error from beacon node when producing block: {:?}",
e
)
})
})
.and_then(move |block| {
service_2
.validator_store
.sign_block(&validator_pubkey, block)
.ok_or_else(|| "Unable to sign block".to_string())
})
.and_then(move |block| {
service_3
.beacon_node
.http
.validator()
.publish_block(block.clone())
.map(|publish_status| (block, publish_status))
.map_err(|e| {
format!(
"Error from beacon node when publishing block: {:?}",
e
)
})
})
.map(move |(block, publish_status)| match publish_status {
PublishStatus::Valid => info!(
log_1,
"Successfully published block";
"deposits" => block.message.body.deposits.len(),
"attestations" => block.message.body.attestations.len(),
"slot" => block.slot().as_u64(),
),
PublishStatus::Invalid(msg) => crit!(
log_1,
"Published block was invalid";
"message" => msg,
"slot" => block.slot().as_u64(),
),
PublishStatus::Unknown => {
crit!(log_1, "Unknown condition when publishing block")
}
})
.map_err(move |e| {
crit!(
log_2,
"Error whilst producing block";
"message" => e
)
})
.then(|_| Ok(((), block_producers)))
})
})
.collect()
.map(|_| ())
})
if iter.len() == 0 {
trace!(
log,
"No local block proposers for this slot";
"slot" => slot.as_u64()
)
} else if iter.len() > 1 {
error!(
log,
"Multiple block proposers for this slot";
"action" => "producing blocks for all proposers",
"num_proposers" => iter.len(),
"slot" => slot.as_u64(),
)
}
iter.for_each(|validator_pubkey| {
let service = self.clone();
let log = log.clone();
self.inner.context.runtime_handle.spawn(
service
.publish_block(slot, validator_pubkey)
.map_err(move |e| {
crit!(
log,
"Error whilst producing block";
"message" => e
)
}),
);
});
Ok(())
}
/// Produce a block at the given slot for validator_pubkey
async fn publish_block(self, slot: Slot, validator_pubkey: PublicKey) -> Result<(), String> {
let log = &self.context.log;
let randao_reveal = self
.validator_store
.randao_reveal(&validator_pubkey, slot.epoch(E::slots_per_epoch()))
.ok_or_else(|| "Unable to produce randao reveal".to_string())?;
let block = self
.beacon_node
.http
.validator()
.produce_block(slot, randao_reveal)
.await
.map_err(|e| format!("Error from beacon node when producing block: {:?}", e))?;
let signed_block = self
.validator_store
.sign_block(&validator_pubkey, block)
.ok_or_else(|| "Unable to sign block".to_string())?;
let publish_status = self
.beacon_node
.http
.validator()
.publish_block(signed_block.clone())
.await
.map_err(|e| format!("Error from beacon node when publishing block: {:?}", e))?;
match publish_status {
PublishStatus::Valid => info!(
log,
"Successfully published block";
"deposits" => signed_block.message.body.deposits.len(),
"attestations" => signed_block.message.body.attestations.len(),
"slot" => signed_block.slot().as_u64(),
),
PublishStatus::Invalid(msg) => crit!(
log,
"Published block was invalid";
"message" => msg,
"slot" => signed_block.slot().as_u64(),
),
PublishStatus::Unknown => crit!(log, "Unknown condition when publishing block"),
}
Ok(())
}
}