Hierarchical state diffs in hot DB (#6750)

This PR applies the hierarchical state diffs from https://github.com/sigp/lighthouse/pull/5978 (tree-states) to the hot DB. It allows Lighthouse to massively reduce its disk footprint during non-finality, and to reduce its overall I/O in all cases.
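
As a rough illustration of what "hierarchical state diffs" means, here is a minimal sketch of how a slot could be mapped to a storage strategy. It is not the code from this PR: the enum, the function name, and the layer exponents passed in by the caller are all assumptions made for the example. The idea shown is that slots aligned to the coarsest layer are stored as full snapshots, slots aligned to a finer layer are stored as a diff against the enclosing coarser layer, and all other slots are rebuilt by replaying blocks.

```rust
/// How a state at a given slot is stored (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum StorageStrategy {
    /// Store the full state.
    Snapshot,
    /// Store a diff against the state at this earlier slot.
    DiffFrom(u64),
    /// Store nothing; rebuild by replaying blocks from this earlier slot.
    ReplayFrom(u64),
}

/// `exponents` is ascending and non-empty. Layer `i` contains every slot
/// that is a multiple of `2^exponents[i]`; the last layer holds snapshots.
fn storage_strategy(slot: u64, exponents: &[u32]) -> StorageStrategy {
    assert!(!exponents.is_empty(), "at least one layer is required");
    let last = exponents.len() - 1;

    // Coarsest layer: full snapshot.
    if slot % (1u64 << exponents[last]) == 0 {
        return StorageStrategy::Snapshot;
    }

    // Walk from coarse to fine and stop at the first layer the slot aligns to.
    for i in (0..last).rev() {
        if slot % (1u64 << exponents[i]) == 0 {
            // The diff base is the latest slot of the next coarser layer.
            let parent = 1u64 << exponents[i + 1];
            return StorageStrategy::DiffFrom(slot / parent * parent);
        }
    }

    // Not aligned to any layer: replay from the latest finest-layer slot.
    let finest = 1u64 << exponents[0];
    StorageStrategy::ReplayFrom(slot / finest * finest)
}
```

With the assumed exponents `[5, 9, 11, 13, 16, 18, 21]`, slot 544 would be stored as a diff on top of slot 512, which is itself a diff on top of slot 0.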

Closes https://github.com/sigp/lighthouse/issues/6580

Conga into https://github.com/sigp/lighthouse/pull/6744

### TODOs

- [x] Fix OOM in CI https://github.com/sigp/lighthouse/pull/7176
- [x] Optimise `store_hot_state` to avoid storing a duplicate state if the summary already exists (should be safe from races now that pruning is cleaner)
- [x] Fix misspelled name: `get_ancenstor_state_root`
- [x] `get_ancestor_state_root` should use state summaries
- [x] Prevent split from changing during ancestor calc
- [x] Use same hierarchy for hot and cold

### TODO Good optimization for future PRs

- [ ] On the migration, if the latest hot snapshot is aligned with the cold snapshot, migrate the diffs instead of the full states.
```
align slot  time
10485760    Nov-26-2024
12582912    Sep-14-2025
14680064    Jul-02-2026
```
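
The dates in the table above can be reproduced from the slot numbers. The sketch below assumes mainnet parameters (genesis time 1606824023, 12-second slots) and assumes the snapshot period is 2^21 slots, which is the common factor of the three listed slots. The function names are hypothetical and not taken from the PR.

```rust
/// Mainnet genesis time in Unix seconds (assumption: the table refers to mainnet).
const GENESIS_TIME: u64 = 1_606_824_023;
/// Mainnet slot duration in seconds.
const SECONDS_PER_SLOT: u64 = 12;
/// Assumed snapshot period: every slot in the table is a multiple of 2^21.
const SNAPSHOT_PERIOD: u64 = 1 << 21;

/// Unix timestamp at which `slot` starts.
fn slot_start_unix(slot: u64) -> u64 {
    GENESIS_TIME + slot * SECONDS_PER_SLOT
}

/// First snapshot-aligned slot strictly after `slot`.
fn next_aligned_slot(slot: u64) -> u64 {
    (slot / SNAPSHOT_PERIOD + 1) * SNAPSHOT_PERIOD
}

/// Convert a Unix timestamp to a UTC (year, month, day) triple using the
/// standard days-to-civil-date algorithm for the proleptic Gregorian calendar.
fn utc_date(unix_secs: u64) -> (u64, u64, u64) {
    let z = unix_secs / 86_400 + 719_468; // days since 0000-03-01
    let era = z / 146_097; // 400-year cycles
    let doe = z % 146_097; // day within the cycle
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}
```

Under these assumptions, `utc_date(slot_start_unix(10_485_760))` gives the first row of the table, and `next_aligned_slot` steps from one row to the next.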

### TODO Maybe nice to have

- [ ] Rename anchor_slot https://github.com/sigp/lighthouse/compare/tree-states-hot-rebase-oom...dapplion:lighthouse:tree-states-hot-anchor-slot-rename?expand=1
- [ ] Make the anchor fields non-public so that they can only be mutated through a method, to prevent unwanted changes to `anchor_slot`

### NOTTODO

- [ ] Use fork-choice and a new method [`descendants_of_checkpoint`](ca2388e196 (diff-046fbdb517ca16b80e4464c2c824cf001a74a0a94ac0065e635768ac391062a8)) to filter only the state summaries that descend from the finalized checkpoint
This commit is contained in:
Lion - dapplion
2025-06-19 04:43:25 +02:00
committed by GitHub
parent 6786b9d12a
commit dd98534158
33 changed files with 2695 additions and 812 deletions


@@ -1,6 +1,6 @@
 use crate::chunked_vector::ChunkError;
 use crate::config::StoreConfigError;
-use crate::hot_cold_store::HotColdDBError;
+use crate::hot_cold_store::{HotColdDBError, StateSummaryIteratorError};
 use crate::{hdiff, DBColumn};
 #[cfg(feature = "leveldb")]
 use leveldb::error::Error as LevelDBError;
@@ -26,6 +26,9 @@ pub enum Error {
     SplitPointModified(Slot, Slot),
     ConfigError(StoreConfigError),
     MigrationError(String),
+    /// The store's `anchor_info` is still the default uninitialized value when attempting a state
+    /// write
+    AnchorUninitialized,
     /// The store's `anchor_info` was mutated concurrently, the latest modification wasn't applied.
     AnchorInfoConcurrentMutation,
     /// The store's `blob_info` was mutated concurrently, the latest modification wasn't applied.
@@ -47,11 +50,16 @@ pub enum Error {
         expected: Hash256,
         computed: Hash256,
     },
+    MissingState(Hash256),
+    MissingHotStateSummary(Hash256),
+    MissingHotStateSnapshot(Hash256, Slot),
     MissingGenesisState,
     MissingSnapshot(Slot),
+    LoadingHotHdiffBufferError(String, Hash256, Box<Error>),
+    LoadingHotStateError(String, Hash256, Box<Error>),
     BlockReplayError(BlockReplayError),
     AddPayloadLogicError,
-    InvalidKey,
+    InvalidKey(String),
     InvalidBytes,
     InconsistentFork(InconsistentFork),
     #[cfg(feature = "leveldb")]
@@ -75,6 +83,26 @@ pub enum Error {
     MissingBlock(Hash256),
     GenesisStateUnknown,
     ArithError(safe_arith::ArithError),
+    MismatchedDiffBaseState {
+        expected_slot: Slot,
+        stored_slot: Slot,
+    },
+    SnapshotDiffBaseState {
+        slot: Slot,
+    },
+    LoadAnchorInfo(Box<Error>),
+    LoadSplit(Box<Error>),
+    LoadBlobInfo(Box<Error>),
+    LoadDataColumnInfo(Box<Error>),
+    LoadConfig(Box<Error>),
+    LoadHotStateSummary(Hash256, Box<Error>),
+    LoadHotStateSummaryForSplit(Box<Error>),
+    StateSummaryIteratorError {
+        error: StateSummaryIteratorError,
+        from_state_root: Hash256,
+        from_state_slot: Slot,
+        target_slot: Slot,
+    },
 }
 pub trait HandleUnavailable<T> {
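
Several of the variants in this diff, such as `LoadHotStateSummary(Hash256, Box<Error>)`, wrap an inner `Error` in a `Box` so that the failing operation is recorded alongside its underlying cause. The following is a minimal sketch of that pattern only. The trimmed-down enum, the `Hash256` alias, and both functions are stand-ins invented for the example; they are not the store's real types or API.

```rust
/// Stand-in for the real `Hash256` type (assumption for this sketch).
type Hash256 = [u8; 32];

/// Trimmed-down error enum mirroring three variants from the diff.
#[derive(Debug, PartialEq)]
enum Error {
    InvalidBytes,
    MissingHotStateSummary(Hash256),
    /// Context wrapper: which state root was being loaded, plus the cause.
    /// The `Box` is required because the enum refers to itself.
    LoadHotStateSummary(Hash256, Box<Error>),
}

/// Hypothetical decoder: a "summary" here is just a little-endian u64 slot.
fn decode_summary(bytes: &[u8]) -> Result<u64, Error> {
    if bytes.len() != 8 {
        return Err(Error::InvalidBytes);
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Ok(u64::from_le_bytes(buf))
}

/// Hypothetical loader: `stored` stands in for the result of a DB lookup.
fn load_hot_state_summary(state_root: Hash256, stored: Option<&[u8]>) -> Result<u64, Error> {
    // A missing entry is reported directly.
    let bytes = stored.ok_or(Error::MissingHotStateSummary(state_root))?;
    // A decoding failure is wrapped with the state root for context.
    decode_summary(bytes).map_err(|e| Error::LoadHotStateSummary(state_root, Box::new(e)))
}
```

A caller can then match on the outer variant to learn which load failed and inspect the boxed error for the root cause.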