Hierarchical state diffs (#5978)

* Start extracting freezer changes for tree-states

* Remove unused config args

* Add comments

* Remove unwraps

* Subjectively clearer implementation

* Clean up hdiff

* Update xdelta3

* Tree states archive metrics (#6040)

* Add store cache size metrics

* Add compress timer metrics

* Add diff apply compute timer metrics

* Add diff buffer cache hit metrics

* Add hdiff buffer load times

* Add blocks replayed metric

* Move metrics to store

* Future proof some metrics

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Port and clean up forwards iterator changes

* Add and polish hierarchy-config flag

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Cleaner errors

* Fix beacon_chain test compilation

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Patch a few more freezer block roots

* Fix genesis block root bug

* Fix test failing due to pending updates

* Beacon chain tests passing

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix doc lint

* Implement DB schema upgrade for hierarchical state diffs (#6193)

* DB upgrade

* Add flag

* Delete RestorePointHash

* Update docs

* Update docs

* Implement hierarchical state diffs config migration (#6245)

* Implement hierarchical state diffs config migration

* Review PR

* Remove TODO

* Set CURRENT_SCHEMA_VERSION correctly

* Fix genesis state loading

* Re-delete some PartialBeaconState stuff

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix test compilation

* Update schema downgrade test

* Fix tests

* Fix null anchor migration

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix tree states upgrade migration (#6328)

* Towards crash safety

* Fix compilation

* Move cold summaries and state roots to new columns

* Rename StateRoots chunked field

* Update prune states

* Clean hdiff CLI flag and metrics

* Fix "staged reconstruction"

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix alloy issues

* Fix staged reconstruction logic

* Prevent weird slot drift

* Remove "allow" flag

* Update CLI help

* Remove FIXME about downgrade

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Remove some unnecessary error variants

* Fix new test

* Tree states archive - review comments and metrics (#6386)

* Review PR comments and metrics

* Comments

* Add anchor metrics

* drop prev comment

* Update metadata.rs

* Apply suggestions from code review

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/store/src/hot_cold_store.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Clarify comment and remove anchor_slot garbage

* Simplify database anchor (#6397)

* Simplify database anchor

* Update beacon_node/store/src/reconstruct.rs

* Add migration for anchor

* Fix and simplify light_client store tests

* Fix incompatible config test

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* More metrics

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* New historic state cache (#6475)

* New historic state cache

* Add more metrics

* State cache hit rate metrics

* Fix store metrics

* More logs and metrics

* Fix logger

* Ensure cached states have built caches :O

* Replay blocks in preference to diffing

* Two separate caches

* Distribute cache build time to next slot

* Re-plumb historic-state-cache flag

* Clean up metrics

* Update book

* Update beacon_node/store/src/hdiff.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Update beacon_node/store/src/historic_state_cache.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

---------

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Update database docs

* Update diagram

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Update lockbud to work with bindgen/etc

* Correct pkg name for Debian

* Remove vestigial epochs_per_state_diff

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Markdown lint

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Address Jimmy's review comments

* Simplify ReplayFrom case

* Fix and document genesis_state_root

* Typo

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

* Merge branch 'unstable' into tree-states-archive

* Compute diff of validators list manually (#6556)

* Split hdiff computation

* Dedicated logic for historical roots and summaries

* Benchmark against real states

* Mutated source?

* Version the hdiff

* Add lighthouse DB config for hierarchy exponents

* Tidy up hierarchy exponents flag

* Apply suggestions from code review

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Address PR review

* Remove hardcoded paths in benchmarks

* Delete unused function in benches

* lint

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Test hdiff binary format stability (#6585)

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Add deprecation warning for SPRP

* Update xdelta to get rid of duplicate deps

* Document test

Author: Michael Sproul
Date: 2024-11-18 12:51:44 +11:00
Committed by GitHub
Parent: 654fc6acdc
Commit: 9fdd53df56
57 changed files with 3360 additions and 1691 deletions


@@ -158,26 +158,20 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
         log: slog::Logger,
     ) -> Self {
         // Determine if backfill is enabled or not.
-        // Get the anchor info, if this returns None, then backfill is not required for this
-        // running instance.
         // If, for some reason a backfill has already been completed (or we've used a trusted
         // genesis root) then backfill has been completed.
-        let (state, current_start) = match beacon_chain.store.get_anchor_info() {
-            Some(anchor_info) => {
-                if anchor_info.block_backfill_complete(beacon_chain.genesis_backfill_slot) {
-                    (BackFillState::Completed, Epoch::new(0))
-                } else {
-                    (
-                        BackFillState::Paused,
-                        anchor_info
-                            .oldest_block_slot
-                            .epoch(T::EthSpec::slots_per_epoch()),
-                    )
-                }
-            }
-            None => (BackFillState::NotRequired, Epoch::new(0)),
-        };
+        let anchor_info = beacon_chain.store.get_anchor_info();
+        let (state, current_start) =
+            if anchor_info.block_backfill_complete(beacon_chain.genesis_backfill_slot) {
+                (BackFillState::Completed, Epoch::new(0))
+            } else {
+                (
+                    BackFillState::Paused,
+                    anchor_info
+                        .oldest_block_slot
+                        .epoch(T::EthSpec::slots_per_epoch()),
+                )
+            };
         let bfs = BackFillSync {
             batches: BTreeMap::new(),
@@ -253,25 +247,15 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
                 self.set_state(BackFillState::Syncing);
                 // Obtain a new start slot, from the beacon chain and handle possible errors.
-                match self.reset_start_epoch() {
-                    Err(ResetEpochError::SyncCompleted) => {
-                        error!(self.log, "Backfill sync completed whilst in failed status");
-                        self.set_state(BackFillState::Completed);
-                        return Err(BackFillError::InvalidSyncState(String::from(
-                            "chain completed",
-                        )));
-                    }
-                    Err(ResetEpochError::NotRequired) => {
-                        error!(
-                            self.log,
-                            "Backfill sync not required whilst in failed status"
-                        );
-                        self.set_state(BackFillState::NotRequired);
-                        return Err(BackFillError::InvalidSyncState(String::from(
-                            "backfill not required",
-                        )));
-                    }
-                    Ok(_) => {}
+                if let Err(e) = self.reset_start_epoch() {
+                    // This infallible match exists to force us to update this code if a future
+                    // refactor of `ResetEpochError` adds a variant.
+                    let ResetEpochError::SyncCompleted = e;
+                    error!(self.log, "Backfill sync completed whilst in failed status");
+                    self.set_state(BackFillState::Completed);
+                    return Err(BackFillError::InvalidSyncState(String::from(
+                        "chain completed",
+                    )));
                 }
                 debug!(self.log, "Resuming a failed backfill sync"; "start_epoch" => self.current_start);
@@ -279,9 +263,7 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
                 // begin requesting blocks from the peer pool, until all peers are exhausted.
                 self.request_batches(network)?;
             }
-            BackFillState::Completed | BackFillState::NotRequired => {
-                return Ok(SyncStart::NotSyncing)
-            }
+            BackFillState::Completed => return Ok(SyncStart::NotSyncing),
         }
         Ok(SyncStart::Syncing {
@@ -313,10 +295,7 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
         peer_id: &PeerId,
         network: &mut SyncNetworkContext<T>,
     ) -> Result<(), BackFillError> {
-        if matches!(
-            self.state(),
-            BackFillState::Failed | BackFillState::NotRequired
-        ) {
+        if matches!(self.state(), BackFillState::Failed) {
             return Ok(());
         }
@@ -1142,17 +1121,14 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
     /// This errors if the beacon chain indicates that backfill sync has already completed or is
     /// not required.
     fn reset_start_epoch(&mut self) -> Result<(), ResetEpochError> {
-        if let Some(anchor_info) = self.beacon_chain.store.get_anchor_info() {
-            if anchor_info.block_backfill_complete(self.beacon_chain.genesis_backfill_slot) {
-                Err(ResetEpochError::SyncCompleted)
-            } else {
-                self.current_start = anchor_info
-                    .oldest_block_slot
-                    .epoch(T::EthSpec::slots_per_epoch());
-                Ok(())
-            }
+        let anchor_info = self.beacon_chain.store.get_anchor_info();
+        if anchor_info.block_backfill_complete(self.beacon_chain.genesis_backfill_slot) {
+            Err(ResetEpochError::SyncCompleted)
         } else {
-            Err(ResetEpochError::NotRequired)
+            self.current_start = anchor_info
+                .oldest_block_slot
+                .epoch(T::EthSpec::slots_per_epoch());
+            Ok(())
         }
     }
@@ -1160,13 +1136,12 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
     fn check_completed(&mut self) -> bool {
         if self.would_complete(self.current_start) {
             // Check that the beacon chain agrees
-            if let Some(anchor_info) = self.beacon_chain.store.get_anchor_info() {
-                // Conditions that we have completed a backfill sync
-                if anchor_info.block_backfill_complete(self.beacon_chain.genesis_backfill_slot) {
-                    return true;
-                } else {
-                    error!(self.log, "Backfill out of sync with beacon chain");
-                }
+            let anchor_info = self.beacon_chain.store.get_anchor_info();
+            // Conditions that we have completed a backfill sync
+            if anchor_info.block_backfill_complete(self.beacon_chain.genesis_backfill_slot) {
+                return true;
+            } else {
+                error!(self.log, "Backfill out of sync with beacon chain");
             }
         }
         false
@@ -1195,6 +1170,4 @@ impl<T: BeaconChainTypes> BackFillSync<T> {
 enum ResetEpochError {
     /// The chain has already completed.
     SyncCompleted,
-    /// Backfill is not required.
-    NotRequired,
 }
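
The "infallible match" comment in the `-253,25 +247,15` hunk relies on Rust accepting a plain `let` with an enum pattern only while that pattern is irrefutable, which holds here because `ResetEpochError` is left with a single variant. The following is a minimal standalone sketch of that idea, not the Lighthouse code itself: the `backfill_complete` flag, the free-function signatures, and `resume` are hypothetical stand-ins introduced for illustration.

```rust
/// Simplified stand-in for the single-variant error enum in the diff.
#[derive(Debug)]
enum ResetEpochError {
    /// The chain has already completed.
    SyncCompleted,
}

/// Hypothetical stand-in for the method in the diff: errors when the
/// backfill is already complete, otherwise succeeds.
fn reset_start_epoch(backfill_complete: bool) -> Result<(), ResetEpochError> {
    if backfill_complete {
        Err(ResetEpochError::SyncCompleted)
    } else {
        Ok(())
    }
}

/// Hypothetical caller mirroring the shape of the `if let Err(e)` branch.
fn resume(backfill_complete: bool) -> Result<&'static str, String> {
    if let Err(e) = reset_start_epoch(backfill_complete) {
        // Irrefutable while the enum has exactly one variant. If a second
        // variant is added, this `let` stops compiling (refutable pattern
        // in a local binding), forcing this branch to be revisited.
        let ResetEpochError::SyncCompleted = e;
        return Err(String::from("chain completed"));
    }
    Ok("resumed")
}

fn main() {
    println!("{:?}", resume(false));
    println!("{:?}", resume(true));
}
```

Compared with a `match` that has a wildcard arm, this form turns a future enum change into a compile error at the call site rather than a silently ignored case.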