Hierarchical state diffs in hot DB (#6750)

This PR applies the hierarchical state diffs from https://github.com/sigp/lighthouse/pull/5978 (tree-states) to the hot DB. This lets Lighthouse massively reduce its disk footprint during non-finality, and reduces overall I/O in all cases.
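
As a rough illustration of what "hierarchical state diffs" means, here is a minimal sketch, not the code in this PR. It assumes each layer `i` has a period of `2^exponents[i]` slots, that the coarsest layer stores full snapshots, that each finer layer stores a diff against the enclosing coarser layer, and that every other slot is rebuilt by replaying blocks. The `Strategy` enum, the `strategy` function and the toy exponents `[2, 4, 6]` are all hypothetical names and values chosen for the example.

```rust
/// How a state at a given slot is stored in a layered-diff scheme (illustrative only).
#[derive(Debug, PartialEq)]
enum Strategy {
    /// A full state is written.
    Snapshot,
    /// A diff is written against the state stored at this earlier slot.
    DiffFrom(u64),
    /// Nothing is written; the state is rebuilt by replaying blocks from this slot.
    ReplayFrom(u64),
}

/// `exponents` must be non-empty and strictly ascending.
/// Layer `i` has a period of `2^exponents[i]` slots.
fn strategy(slot: u64, exponents: &[u32]) -> Strategy {
    let moduli: Vec<u64> = exponents.iter().map(|e| 1u64 << e).collect();
    let coarsest = *moduli.last().expect("at least one layer");
    if slot % coarsest == 0 {
        return Strategy::Snapshot;
    }
    // Walk the layer pairs from coarsest to finest. The first finer period that
    // divides the slot decides which coarser boundary the diff is based on.
    for pair in moduli.windows(2).rev() {
        let (fine, coarse) = (pair[0], pair[1]);
        if slot % fine == 0 {
            return Strategy::DiffFrom(slot / coarse * coarse);
        }
    }
    // Not on any layer boundary: replay from the nearest finest-layer boundary.
    Strategy::ReplayFrom(slot / moduli[0] * moduli[0])
}

fn main() {
    // Toy hierarchy with periods 4, 16 and 64 slots.
    let exponents = [2, 4, 6];
    for slot in [0u64, 16, 20, 21, 64, 80] {
        println!("slot {slot}: {:?}", strategy(slot, &exponents));
    }
}
```

With these toy layers, slot 20 is stored as a diff from slot 16, which is itself a diff from the snapshot at slot 0, so loading it takes one snapshot plus two diffs rather than a full state per slot.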

Closes https://github.com/sigp/lighthouse/issues/6580

Conga into https://github.com/sigp/lighthouse/pull/6744

### TODOs

- [x] Fix OOM in CI https://github.com/sigp/lighthouse/pull/7176
- [x] Optimise `store_hot_state` to avoid storing a duplicate state if the summary already exists (should be safe from races now that pruning is cleaner)
- [x] Fix misspelled name: `get_ancenstor_state_root`
- [x] get_ancestor_state_root should use state summaries
- [x] Prevent split from changing during ancestor calc
- [x] Use same hierarchy for hot and cold
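
The duplicate-state check from the list above can be pictured with a toy in-memory store. This is only a sketch of the idea (skip the write when a summary for that state root already exists). `ToyHotStore`, its fields and the simplified `store_hot_state` signature are hypothetical and do not mirror the real `HotColdDB` API.

```rust
use std::collections::HashMap;

/// Stand-in for a 32-byte state root.
type Root = [u8; 32];

/// Hypothetical in-memory stand-in for the hot DB.
#[derive(Default)]
struct ToyHotStore {
    /// state root -> slot, standing in for a hot state summary.
    summaries: HashMap<Root, u64>,
    /// Number of state writes that were actually performed.
    state_writes: usize,
}

impl ToyHotStore {
    /// Returns `true` if the state was written, `false` if the write was skipped
    /// because a summary for `state_root` already exists.
    fn store_hot_state(&mut self, state_root: Root, slot: u64) -> bool {
        if self.summaries.contains_key(&state_root) {
            return false;
        }
        self.summaries.insert(state_root, slot);
        self.state_writes += 1;
        true
    }
}

fn main() {
    let mut store = ToyHotStore::default();
    let root = [1u8; 32];
    println!("first write stored: {}", store.store_hot_state(root, 10));
    println!("second write stored: {}", store.store_hot_state(root, 10));
    println!("total writes: {}", store.state_writes);
}
```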

### TODO: Good optimizations for future PRs

- [ ] On migration, if the latest hot snapshot is aligned with the cold snapshot, migrate the diffs instead of the full states.
```
align slot  time
10485760    Nov-26-2024
12582912    Sep-14-2025
14680064    Jul-02-2026
```
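
The slots in the table are consecutive multiples of 2^21 (5, 6 and 7 times 2,097,152). The sketch below reproduces the dates under two assumptions that are not stated above: the mainnet genesis time of 1606824023 (Unix seconds) and 12-second slots. Reading 2^21 as the snapshot period is likewise an assumption, and `civil_from_days` and `slot_date` are helper names made up for this example.

```rust
/// Assumed mainnet genesis time in Unix seconds (not stated in the text above).
const GENESIS_TIME: u64 = 1_606_824_023;
/// Assumed slot duration in seconds.
const SECONDS_PER_SLOT: u64 = 12;
/// Assumed spacing between aligned snapshots: 2^21 slots.
const SNAPSHOT_PERIOD: u64 = 1 << 21;

/// Convert days since 1970-01-01 into a (year, month, day) Gregorian date.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// UTC calendar date on which `slot` starts.
fn slot_date(slot: u64) -> (i64, i64, i64) {
    let unix = GENESIS_TIME + slot * SECONDS_PER_SLOT;
    civil_from_days((unix / 86_400) as i64)
}

fn main() {
    for k in 5..=7u64 {
        let slot = k * SNAPSHOT_PERIOD;
        let (y, m, d) = slot_date(slot);
        println!("{slot}  {y:04}-{m:02}-{d:02}");
    }
}
```

Under those assumptions the three rows come out as 2024-11-26, 2025-09-14 and 2026-07-02 (UTC), matching the table.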

### TODO: Maybe good to have

- [ ] Rename anchor_slot https://github.com/sigp/lighthouse/compare/tree-states-hot-rebase-oom...dapplion:lighthouse:tree-states-hot-anchor-slot-rename?expand=1
- [ ] Make the anchor fields non-public so that they can only be mutated through a method, to prevent unwanted changes to `anchor_slot`

### NOTTODO

- [ ] Use fork-choice and a new method [`descendants_of_checkpoint`](ca2388e196 (diff-046fbdb517ca16b80e4464c2c824cf001a74a0a94ac0065e635768ac391062a8)) to keep only the state summaries that descend from the finalized checkpoint
This commit is contained in:
Lion - dapplion
2025-06-19 04:43:25 +02:00
committed by GitHub
parent 6786b9d12a
commit dd98534158
33 changed files with 2695 additions and 812 deletions


@@ -384,9 +384,9 @@ fn slot_of_prev_restore_point<E: EthSpec>(current_slot: Slot) -> Slot {
 #[cfg(test)]
 mod test {
     use super::*;
-    use crate::StoreConfig as Config;
+    use crate::{MemoryStore, StoreConfig as Config};
     use beacon_chain::test_utils::BeaconChainHarness;
-    use beacon_chain::types::{ChainSpec, MainnetEthSpec};
+    use beacon_chain::types::MainnetEthSpec;
     use std::sync::Arc;
     use types::FixedBytesExtended;
@@ -400,10 +400,31 @@ mod test {
         harness.get_current_state()
     }
+    fn get_store<E: EthSpec>() -> HotColdDB<E, MemoryStore<E>, MemoryStore<E>> {
+        let store =
+            HotColdDB::open_ephemeral(Config::default(), Arc::new(E::default_spec())).unwrap();
+        // Init anchor info so anchor slot is set. Use a random block as it is only used for the
+        // parent_root
+        let _ = store
+            .init_anchor_info(Hash256::ZERO, Slot::new(0), Slot::new(0), false)
+            .unwrap();
+        // Write a state with state root 0 which is the base `put_state` below tries to diff from
+        {
+            let harness = BeaconChainHarness::builder(E::default())
+                .default_spec()
+                .deterministic_keypairs(1)
+                .fresh_ephemeral_store()
+                .build();
+            let genesis_state = harness.get_current_state();
+            store.put_state(&Hash256::ZERO, &genesis_state).unwrap();
+        }
+        store
+    }
     #[test]
     fn block_root_iter() {
-        let store =
-            HotColdDB::open_ephemeral(Config::default(), Arc::new(ChainSpec::minimal())).unwrap();
+        let store = get_store::<MainnetEthSpec>();
         let slots_per_historical_root = MainnetEthSpec::slots_per_historical_root();
         let mut state_a: BeaconState<MainnetEthSpec> = get_state();
@@ -449,8 +470,8 @@ mod test {
     #[test]
     fn state_root_iter() {
-        let store =
-            HotColdDB::open_ephemeral(Config::default(), Arc::new(ChainSpec::minimal())).unwrap();
+        let store = get_store::<MainnetEthSpec>();
         let slots_per_historical_root = MainnetEthSpec::slots_per_historical_root();
         let mut state_a: BeaconState<MainnetEthSpec> = get_state();