Optimize validator duties (#2243)

## Issue Addressed

Closes #2052

## Proposed Changes

- Refactor the attester/proposer duties endpoints in the BN
    - Performance improvements
    - Fixes some potential inconsistencies with the dependent root fields.
    - Removes `http_api::beacon_proposer_cache` and just uses the one on the `BeaconChain` instead.
    - Move the code for the proposer/attester duties endpoints into separate files, for readability.
- Refactor the `DutiesService` in the VC
    - Required to reduce the delay on broadcasting new blocks.
    - Gets rid of the `ValidatorDuty` shim struct that came about when we adopted the standard API.
    - Separate block/attestation duty tasks so that they don't block each other when one is slow.
- In the VC, use `PublicKeyBytes` to represent validators instead of `PublicKey`. `PublicKey` is a legit crypto object whilst `PublicKeyBytes` is just a byte-array, it's much faster to clone/hash `PublicKeyBytes` and this change has had a significant impact on runtimes.
    - Unfortunately this has created lots of dust changes.
 - In the BN, store `PublicKeyBytes` in the `beacon_proposer_cache` and allow access to them. The HTTP API always sends `PublicKeyBytes` over the wire and the conversion from `PublicKey` -> `PublickeyBytes` is non-trivial, especially when queries have 100s/1000s of validators (like Pyrmont).
 - Add the `state_processing::state_advance` mod which dedups a lot of the "apply `n` skip slots to the state" code.
    - This also fixes a bug with some functions which were failing to include a state root as per [this comment](072695284f/consensus/state_processing/src/state_advance.rs (L69-L74)). I couldn't find any instance of this bug that resulted in anything more severe than keying a shuffling cache by the wrong block root.
 - Swap the VC block service to use `mpsc` from `tokio` instead of `futures`. This is consistent with the rest of the code base.
    
~~This PR *reduces* the size of the codebase 🎉~~ It *used* to reduce the size of the code base before I added more comments. 

## Observations on Prymont

- Proposer duties times down from peaks of 450ms to consistent <1ms.
- Current epoch attester duties times down from >1s peaks to a consistent 20-30ms.
- Block production down from +600ms to 100-200ms.

## Additional Info

- ~~Blocked on #2241~~
- ~~Blocked on #2234~~

## TODO

- [x] ~~Refactor this into some smaller PRs?~~ Leaving this as-is for now.
- [x] Address `per_slot_processing` roots.
- [x] Investigate slow next epoch times. Not getting added to cache on block processing?
- [x] Consider [this](072695284f/beacon_node/store/src/hot_cold_store.rs (L811-L812)) in the scenario of replacing the state roots


Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
This commit is contained in:
Paul Hauner
2021-03-17 05:09:57 +00:00
parent 6a69b20be1
commit 015ab7d0a7
49 changed files with 2201 additions and 1833 deletions

View File

@@ -60,7 +60,9 @@ use state_processing::{
block_signature_verifier::{BlockSignatureVerifier, Error as BlockSignatureVerifierError},
per_block_processing,
per_epoch_processing::EpochProcessingSummary,
per_slot_processing, BlockProcessingError, BlockSignatureStrategy, SlotProcessingError,
per_slot_processing,
state_advance::partial_state_advance,
BlockProcessingError, BlockSignatureStrategy, SlotProcessingError,
};
use std::borrow::Cow;
use std::convert::TryFrom;
@@ -351,8 +353,12 @@ pub fn signature_verify_chain_segment<T: BeaconChainTypes>(
.map(|(_, block)| block.slot())
.unwrap_or_else(|| slot);
let state =
cheap_state_advance_to_obtain_committees(&mut parent.pre_state, highest_slot, &chain.spec)?;
let state = cheap_state_advance_to_obtain_committees(
&mut parent.pre_state,
parent.beacon_state_root,
highest_slot,
&chain.spec,
)?;
let pubkey_cache = get_validator_pubkey_cache(chain)?;
let mut signature_verifier = get_signature_verifier(&state, &pubkey_cache, &chain.spec);
@@ -564,7 +570,7 @@ impl<T: BeaconChainTypes> GossipVerifiedBlock<T> {
let proposer_opt = chain
.beacon_proposer_cache
.lock()
.get::<T::EthSpec>(proposer_shuffling_decision_block, block.slot());
.get_slot::<T::EthSpec>(proposer_shuffling_decision_block, block.slot());
let (expected_proposer, fork, parent, block) = if let Some(proposer) = proposer_opt {
// The proposer index was cached and we can return it without needing to load the
// parent.
@@ -586,6 +592,7 @@ impl<T: BeaconChainTypes> GossipVerifiedBlock<T> {
// The state produced is only valid for determining proposer/attester shuffling indices.
let state = cheap_state_advance_to_obtain_committees(
&mut parent.pre_state,
parent.beacon_state_root,
block.slot(),
&chain.spec,
)?;
@@ -694,6 +701,7 @@ impl<T: BeaconChainTypes> SignatureVerifiedBlock<T> {
let state = cheap_state_advance_to_obtain_committees(
&mut parent.pre_state,
parent.beacon_state_root,
block.slot(),
&chain.spec,
)?;
@@ -738,6 +746,7 @@ impl<T: BeaconChainTypes> SignatureVerifiedBlock<T> {
let state = cheap_state_advance_to_obtain_committees(
&mut parent.pre_state,
parent.beacon_state_root,
block.slot(),
&chain.spec,
)?;
@@ -1280,6 +1289,7 @@ fn load_parent<T: BeaconChainTypes>(
beacon_block: parent_block,
beacon_block_root: root,
pre_state: parent_state,
beacon_state_root: Some(parent_state_root),
},
block,
))
@@ -1303,6 +1313,7 @@ fn load_parent<T: BeaconChainTypes>(
/// mutated to be invalid (in fact, it is never changed beyond a simple committee cache build).
fn cheap_state_advance_to_obtain_committees<'a, E: EthSpec>(
state: &'a mut BeaconState<E>,
state_root_opt: Option<Hash256>,
block_slot: Slot,
spec: &ChainSpec,
) -> Result<Cow<'a, BeaconState<E>>, BlockError<E>> {
@@ -1319,14 +1330,12 @@ fn cheap_state_advance_to_obtain_committees<'a, E: EthSpec>(
})
} else {
let mut state = state.clone_with(CloneConfig::committee_caches_only());
let target_slot = block_epoch.start_slot(E::slots_per_epoch());
while state.current_epoch() < block_epoch {
// Don't calculate state roots since they aren't required for calculating
// shuffling (achieved by providing Hash256::zero()).
per_slot_processing(&mut state, Some(Hash256::zero()), spec).map_err(|e| {
BlockError::BeaconChainError(BeaconChainError::SlotProcessingError(e))
})?;
}
// Advance the state into the same epoch as the block. Use the "partial" method since state
// roots are not important for proposer/attester shuffling.
partial_state_advance(&mut state, state_root_opt, target_slot, spec)
.map_err(|e| BlockError::BeaconChainError(BeaconChainError::from(e)))?;
state.build_committee_cache(RelativeEpoch::Current, spec)?;