mirror of
https://github.com/sigp/lighthouse.git
synced 2026-03-14 10:22:38 +00:00
Hierarchical state diffs (#5978)
* Start extracting freezer changes for tree-states
* Remove unused config args
* Add comments
* Remove unwraps
* Subjective more clear implementation
* Clean up hdiff
* Update xdelta3
* Tree states archive metrics (#6040)
* Add store cache size metrics
* Add compress timer metrics
* Add diff apply compute timer metrics
* Add diff buffer cache hit metrics
* Add hdiff buffer load times
* Add blocks replayed metric
* Move metrics to store
* Future proof some metrics
--------
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* Port and clean up forwards iterator changes
* Add and polish hierarchy-config flag
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Cleaner errors
* Fix beacon_chain test compilation
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Patch a few more freezer block roots
* Fix genesis block root bug
* Fix test failing due to pending updates
* Beacon chain tests passing
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Fix doc lint
* Implement DB schema upgrade for hierarchical state diffs (#6193)
* DB upgrade
* Add flag
* Delete RestorePointHash
* Update docs
* Update docs
* Implement hierarchical state diffs config migration (#6245)
* Implement hierarchical state diffs config migration
* Review PR
* Remove TODO
* Set CURRENT_SCHEMA_VERSION correctly
* Fix genesis state loading
* Re-delete some PartialBeaconState stuff
--------
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Fix test compilation
* Update schema downgrade test
* Fix tests
* Fix null anchor migration
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Fix tree states upgrade migration (#6328)
* Towards crash safety
* Fix compilation
* Move cold summaries and state roots to new columns
* Rename StateRoots chunked field
* Update prune states
* Clean hdiff CLI flag and metrics
* Fix "staged reconstruction"
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Fix alloy issues
* Fix staged reconstruction logic
* Prevent weird slot drift
* Remove "allow" flag
* Update CLI help
* Remove FIXME about downgrade
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Remove some unnecessary error variants
* Fix new test
* Tree states archive - review comments and metrics (#6386)
* Review PR comments and metrics
* Comments
* Add anchor metrics
* drop prev comment
* Update metadata.rs
* Apply suggestions from code review
--------
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Update beacon_node/store/src/hot_cold_store.rs
Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Clarify comment and remove anchor_slot garbage
* Simplify database anchor (#6397)
* Simplify database anchor
* Update beacon_node/store/src/reconstruct.rs
* Add migration for anchor
* Fix and simplify light_client store tests
* Fix incompatible config test
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* More metrics
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* New historic state cache (#6475)
* New historic state cache
* Add more metrics
* State cache hit rate metrics
* Fix store metrics
* More logs and metrics
* Fix logger
* Ensure cached states have built caches :O
* Replay blocks in preference to diffing
* Two separate caches
* Distribute cache build time to next slot
* Re-plumb historic-state-cache flag
* Clean up metrics
* Update book
* Update beacon_node/store/src/hdiff.rs
Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
* Update beacon_node/store/src/historic_state_cache.rs
Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
--------
Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
* Update database docs
* Update diagram
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Update lockbud to work with bindgen/etc
* Correct pkg name for Debian
* Remove vestigial epochs_per_state_diff
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Markdown lint
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Address Jimmy's review comments
* Simplify ReplayFrom case
* Fix and document genesis_state_root
* Typo
Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>
* Merge branch 'unstable' into tree-states-archive
* Compute diff of validators list manually (#6556)
* Split hdiff computation
* Dedicated logic for historical roots and summaries
* Benchmark against real states
* Mutated source?
* Version the hdiff
* Add lighthouse DB config for hierarchy exponents
* Tidy up hierarchy exponents flag
* Apply suggestions from code review
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Address PR review
* Remove hardcoded paths in benchmarks
* Delete unused function in benches
* lint
--------
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* Test hdiff binary format stability (#6585)
* Merge remote-tracking branch 'origin/unstable' into tree-states-archive
* Add deprecation warning for SPRP
* Update xdelta to get rid of duplicate deps
* Document test
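The commit message above centres on replacing per-slot restore points with a hierarchy of state diffs, where configurable "hierarchy exponents" decide which layers store snapshots and which store diffs. As a rough illustration of that idea, here is a toy sketch in Rust. It is not Lighthouse's actual `hdiff` code, and the function name and exponent values are invented for the example:

```rust
/// Toy sketch of a diff-hierarchy lookup (NOT Lighthouse's actual hdiff
/// implementation; `diff_path` and the exponent values are illustrative).
///
/// `exponents` is ordered from coarsest to finest layer. States at slots
/// that are multiples of 2^e form layer `e`: the coarsest layer holds full
/// snapshots, finer layers hold diffs against the layer above. To load the
/// state at `slot`, start from the snapshot and apply one diff per layer.
fn diff_path(slot: u64, exponents: &[u32]) -> Vec<u64> {
    let mut path = Vec::new();
    for &e in exponents {
        // Round `slot` down to a multiple of 2^e: the anchor slot of the
        // snapshot or diff in this layer.
        let anchor = (slot >> e) << e;
        // Skip a layer whose anchor repeats the previous layer's anchor.
        if path.last() != Some(&anchor) {
            path.push(anchor);
        }
    }
    path
}

fn main() {
    // Snapshot at slot 16, then diffs anchored at slots 20 and 21.
    assert_eq!(diff_path(21, &[4, 2, 0]), vec![16, 20, 21]);
    // A slot on the coarsest layer needs only the snapshot itself.
    assert_eq!(diff_path(16, &[4, 2, 0]), vec![16]);
    println!("ok");
}
```

The point of the hierarchy is the trade-off the PR's flags expose: more layers mean smaller diffs but more diff applications per state load.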
beacon_node/store/src/reconstruct.rs

@@ -1,14 +1,16 @@
 //! Implementation of historic state reconstruction (given complete block history).
 use crate::hot_cold_store::{HotColdDB, HotColdDBError};
+use crate::metadata::ANCHOR_FOR_ARCHIVE_NODE;
 use crate::metrics;
 use crate::{Error, ItemStore};
 use itertools::{process_results, Itertools};
-use slog::info;
+use slog::{debug, info};
 use state_processing::{
     per_block_processing, per_slot_processing, BlockSignatureStrategy, ConsensusContext,
     VerifyBlockRoot,
 };
 use std::sync::Arc;
-use types::{EthSpec, Hash256};
+use types::EthSpec;
 
 impl<E, Hot, Cold> HotColdDB<E, Hot, Cold>
 where
@@ -16,11 +18,16 @@ where
     Hot: ItemStore<E>,
     Cold: ItemStore<E>,
 {
-    pub fn reconstruct_historic_states(self: &Arc<Self>) -> Result<(), Error> {
-        let Some(mut anchor) = self.get_anchor_info() else {
-            // Nothing to do, history is complete.
+    pub fn reconstruct_historic_states(
+        self: &Arc<Self>,
+        num_blocks: Option<usize>,
+    ) -> Result<(), Error> {
+        let mut anchor = self.get_anchor_info();
+
+        // Nothing to do, history is complete.
+        if anchor.all_historic_states_stored() {
             return Ok(());
-        };
+        }
 
         // Check that all historic blocks are known.
         if anchor.oldest_block_slot != 0 {
@@ -29,37 +36,30 @@ where
             });
         }
 
-        info!(
+        debug!(
             self.log,
-            "Beginning historic state reconstruction";
+            "Starting state reconstruction batch";
             "start_slot" => anchor.state_lower_limit,
         );
 
-        let slots_per_restore_point = self.config.slots_per_restore_point;
+        let _t = metrics::start_timer(&metrics::STORE_BEACON_RECONSTRUCTION_TIME);
 
         // Iterate blocks from the state lower limit to the upper limit.
-        let lower_limit_slot = anchor.state_lower_limit;
         let split = self.get_split_info();
-        let upper_limit_state = self.get_restore_point(
-            anchor.state_upper_limit.as_u64() / slots_per_restore_point,
-            &split,
-        )?;
-        let upper_limit_slot = upper_limit_state.slot();
+        let lower_limit_slot = anchor.state_lower_limit;
+        let upper_limit_slot = std::cmp::min(split.slot, anchor.state_upper_limit);
 
-        // Use a dummy root, as we never read the block for the upper limit state.
-        let upper_limit_block_root = Hash256::repeat_byte(0xff);
-
-        let block_root_iter = self.forwards_block_roots_iterator(
-            lower_limit_slot,
-            upper_limit_state,
-            upper_limit_block_root,
-            &self.spec,
-        )?;
+        // If `num_blocks` is not specified iterate all blocks. Add 1 so that we end on an epoch
+        // boundary when `num_blocks` is a multiple of an epoch boundary. We want to be *inclusive*
+        // of the state at slot `lower_limit_slot + num_blocks`.
+        let block_root_iter = self
+            .forwards_block_roots_iterator_until(lower_limit_slot, upper_limit_slot - 1, || {
+                Err(Error::StateShouldNotBeRequired(upper_limit_slot - 1))
+            })?
+            .take(num_blocks.map_or(usize::MAX, |n| n + 1));
 
         // The state to be advanced.
-        let mut state = self
-            .load_cold_state_by_slot(lower_limit_slot)?
-            .ok_or(HotColdDBError::MissingLowerLimitState(lower_limit_slot))?;
+        let mut state = self.load_cold_state_by_slot(lower_limit_slot)?;
 
         state.build_caches(&self.spec)?;
 
@@ -110,8 +110,19 @@ where
                 // Stage state for storage in freezer DB.
                 self.store_cold_state(&state_root, &state, &mut io_batch)?;
 
-                // If the slot lies on an epoch boundary, commit the batch and update the anchor.
-                if slot % slots_per_restore_point == 0 || slot + 1 == upper_limit_slot {
+                let batch_complete =
+                    num_blocks.map_or(false, |n_blocks| slot == lower_limit_slot + n_blocks as u64);
+                let reconstruction_complete = slot + 1 == upper_limit_slot;
+
+                // Commit the I/O batch if:
+                //
+                // - The diff/snapshot for this slot is required for future slots, or
+                // - The reconstruction batch is complete (we are about to return), or
+                // - Reconstruction is complete.
+                if self.hierarchy.should_commit_immediately(slot)?
+                    || batch_complete
+                    || reconstruction_complete
+                {
                     info!(
                         self.log,
                         "State reconstruction in progress";
@@ -122,9 +133,9 @@ where
                     self.cold_db.do_atomically(std::mem::take(&mut io_batch))?;
 
                     // Update anchor.
-                    let old_anchor = Some(anchor.clone());
+                    let old_anchor = anchor.clone();
 
-                    if slot + 1 == upper_limit_slot {
+                    if reconstruction_complete {
                         // The two limits have met in the middle! We're done!
                         // Perform one last integrity check on the state reached.
                         let computed_state_root = state.update_tree_hash_cache()?;
@@ -136,23 +147,36 @@ where
                            });
                        }
 
-                        self.compare_and_set_anchor_info_with_write(old_anchor, None)?;
+                        self.compare_and_set_anchor_info_with_write(
+                            old_anchor,
+                            ANCHOR_FOR_ARCHIVE_NODE,
+                        )?;
 
                         return Ok(());
                     } else {
                         // The lower limit has been raised, store it.
                         anchor.state_lower_limit = slot;
 
-                        self.compare_and_set_anchor_info_with_write(
-                            old_anchor,
-                            Some(anchor.clone()),
-                        )?;
+                        self.compare_and_set_anchor_info_with_write(old_anchor, anchor.clone())?;
                     }
+
+                    // If this is the end of the batch, return Ok. The caller will run another
+                    // batch when there is idle capacity.
+                    if batch_complete {
+                        debug!(
+                            self.log,
+                            "Finished state reconstruction batch";
+                            "start_slot" => lower_limit_slot,
+                            "end_slot" => slot,
+                        );
+                        return Ok(());
+                    }
                }
            }
 
-            // Should always reach the `upper_limit_slot` and return early above.
-            Err(Error::StateReconstructionDidNotComplete)
+            // Should always reach the `upper_limit_slot` or the end of the batch and return early
+            // above.
+            Err(Error::StateReconstructionLogicError)
        })??;
 
        // Check that the split point wasn't mutated during the state reconstruction process.
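The batching logic in the diff above takes `num_blocks + 1` block roots so that a batch is inclusive of the state at `lower_limit_slot + num_blocks`, and flags the batch as complete when that slot is reached. A condensed, self-contained sketch of those two rules (the helper names here are ours, not Lighthouse's):

```rust
/// How many block roots a reconstruction batch should take. `None` means
/// "reconstruct everything in one go". With `Some(n)` we take n + 1 roots
/// so the batch is *inclusive* of the state at `lower_limit_slot + n`
/// (ending on an epoch boundary when n is a multiple of the epoch length).
fn roots_to_take(num_blocks: Option<usize>) -> usize {
    num_blocks.map_or(usize::MAX, |n| n + 1)
}

/// A batch is complete once the current slot reaches the batch's last slot;
/// with no batch limit this never fires and only the upper limit stops us.
fn batch_complete(slot: u64, lower_limit_slot: u64, num_blocks: Option<usize>) -> bool {
    num_blocks.map_or(false, |n| slot == lower_limit_slot + n as u64)
}

fn main() {
    // A batch of 32 blocks starting at slot 100 processes 33 roots,
    // covering slots 100..=132 and finishing exactly at slot 132.
    assert_eq!(roots_to_take(Some(32)), 33);
    assert!(batch_complete(132, 100, Some(32)));
    assert!(!batch_complete(131, 100, Some(32)));
    assert_eq!(roots_to_take(None), usize::MAX);
    println!("ok");
}
```

This is why the caller can run `reconstruct_historic_states` repeatedly during idle capacity: each batch commits its diffs, advances `state_lower_limit`, and returns, leaving the next batch to resume from the new anchor.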