Hierarchical state diffs (#5978)

* Start extracting freezer changes for tree-states

* Remove unused config args

* Add comments

* Remove unwraps

* Subjective more clear implementation

* Clean up hdiff

* Update xdelta3

* Tree states archive metrics (#6040)

* Add store cache size metrics

* Add compress timer metrics

* Add diff apply compute timer metrics

* Add diff buffer cache hit metrics

* Add hdiff buffer load times

* Add blocks replayed metric

* Move metrics to store

* Future proof some metrics

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Port and clean up forwards iterator changes

* Add and polish hierarchy-config flag

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Cleaner errors

* Fix beacon_chain test compilation

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Patch a few more freezer block roots

* Fix genesis block root bug

* Fix test failing due to pending updates

* Beacon chain tests passing

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix doc lint

* Implement DB schema upgrade for hierarchical state diffs (#6193)

* DB upgrade

* Add flag

* Delete RestorePointHash

* Update docs

* Update docs

* Implement hierarchical state diffs config migration (#6245)

* Implement hierarchical state diffs config migration

* Review PR

* Remove TODO

* Set CURRENT_SCHEMA_VERSION correctly

* Fix genesis state loading

* Re-delete some PartialBeaconState stuff

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix test compilation

* Update schema downgrade test

* Fix tests

* Fix null anchor migration

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix tree states upgrade migration (#6328)

* Towards crash safety

* Fix compilation

* Move cold summaries and state roots to new columns

* Rename StateRoots chunked field

* Update prune states

* Clean hdiff CLI flag and metrics

* Fix "staged reconstruction"

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix alloy issues

* Fix staged reconstruction logic

* Prevent weird slot drift

* Remove "allow" flag

* Update CLI help

* Remove FIXME about downgrade

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Remove some unnecessary error variants

* Fix new test

* Tree states archive - review comments and metrics (#6386)

* Review PR comments and metrics

* Comments

* Add anchor metrics

* drop prev comment

* Update metadata.rs

* Apply suggestions from code review

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/store/src/hot_cold_store.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Clarify comment and remove anchor_slot garbage

* Simplify database anchor (#6397)

* Simplify database anchor

* Update beacon_node/store/src/reconstruct.rs

* Add migration for anchor

* Fix and simplify light_client store tests

* Fix incompatible config test

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* More metrics

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* New historic state cache (#6475)

* New historic state cache

* Add more metrics

* State cache hit rate metrics

* Fix store metrics

* More logs and metrics

* Fix logger

* Ensure cached states have built caches :O

* Replay blocks in preference to diffing

* Two separate caches

* Distribute cache build time to next slot

* Re-plumb historic-state-cache flag

* Clean up metrics

* Update book

* Update beacon_node/store/src/hdiff.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Update beacon_node/store/src/historic_state_cache.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

---------

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Update database docs

* Update diagram

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Update lockbud to work with bindgen/etc

* Correct pkg name for Debian

* Remove vestigial epochs_per_state_diff

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Markdown lint

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Address Jimmy's review comments

* Simplify ReplayFrom case

* Fix and document genesis_state_root

* Typo

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

* Merge branch 'unstable' into tree-states-archive

* Compute diff of validators list manually (#6556)

* Split hdiff computation

* Dedicated logic for historical roots and summaries

* Benchmark against real states

* Mutated source?

* Version the hdiff

* Add lighthouse DB config for hierarchy exponents

* Tidy up hierarchy exponents flag

* Apply suggestions from code review

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Address PR review

* Remove hardcoded paths in benchmarks

* Delete unused function in benches

* lint

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Test hdiff binary format stability (#6585)

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Add deprecation warning for SPRP

* Update xdelta to get rid of duplicate deps

* Document test
This commit is contained in:
Michael Sproul
2024-11-18 12:51:44 +11:00
committed by GitHub
parent 654fc6acdc
commit 9fdd53df56
57 changed files with 3360 additions and 1691 deletions

View File

@@ -73,6 +73,27 @@ pub static DISK_DB_DELETE_COUNT: LazyLock<Result<IntCounterVec>> = LazyLock::new
&["col"],
)
});
/*
* Anchor Info
*/
pub static STORE_BEACON_ANCHOR_SLOT: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_anchor_slot",
"Current anchor info anchor_slot value",
)
});
pub static STORE_BEACON_OLDEST_BLOCK_SLOT: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_oldest_block_slot",
"Current anchor info oldest_block_slot value",
)
});
pub static STORE_BEACON_STATE_LOWER_LIMIT: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_state_lower_limit",
"Current anchor info state_lower_limit value",
)
});
/*
* Beacon State
*/
@@ -130,6 +151,24 @@ pub static BEACON_STATE_WRITE_BYTES: LazyLock<Result<IntCounter>> = LazyLock::ne
"Total number of beacon state bytes written to the DB",
)
});
pub static BEACON_HDIFF_READ_TIMES: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_hdiff_read_seconds",
"Time required to read the hierarchical diff bytes from the database",
)
});
pub static BEACON_HDIFF_DECODE_TIMES: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_hdiff_decode_seconds",
"Time required to decode hierarchical diff bytes",
)
});
pub static BEACON_HDIFF_BUFFER_CLONE_TIMES: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_hdiff_buffer_clone_seconds",
"Time required to clone hierarchical diff buffer bytes",
)
});
/*
* Beacon Block
*/
@@ -145,12 +184,181 @@ pub static BEACON_BLOCK_CACHE_HIT_COUNT: LazyLock<Result<IntCounter>> = LazyLock
"Number of hits to the store's block cache",
)
});
/*
* Caches
*/
pub static BEACON_BLOBS_CACHE_HIT_COUNT: LazyLock<Result<IntCounter>> = LazyLock::new(|| {
try_create_int_counter(
"store_beacon_blobs_cache_hit_total",
"Number of hits to the store's blob cache",
)
});
pub static STORE_BEACON_BLOCK_CACHE_SIZE: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_block_cache_size",
"Current count of items in beacon store block cache",
)
});
pub static STORE_BEACON_BLOB_CACHE_SIZE: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_blob_cache_size",
"Current count of items in beacon store blob cache",
)
});
pub static STORE_BEACON_STATE_CACHE_SIZE: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_state_cache_size",
"Current count of items in beacon store state cache",
)
});
pub static STORE_BEACON_HISTORIC_STATE_CACHE_SIZE: LazyLock<Result<IntGauge>> =
LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_historic_state_cache_size",
"Current count of states in the historic state cache",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_CACHE_SIZE: LazyLock<Result<IntGauge>> = LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_hdiff_buffer_cache_size",
"Current count of hdiff buffers in the historic state cache",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_CACHE_BYTE_SIZE: LazyLock<Result<IntGauge>> =
LazyLock::new(|| {
try_create_int_gauge(
"store_beacon_hdiff_buffer_cache_byte_size",
"Memory consumed by hdiff buffers in the historic state cache",
)
});
pub static STORE_BEACON_STATE_FREEZER_COMPRESS_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_state_compress_seconds",
"Time taken to compress a state snapshot for the freezer DB",
)
});
pub static STORE_BEACON_STATE_FREEZER_DECOMPRESS_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_state_decompress_seconds",
"Time taken to decompress a state snapshot for the freezer DB",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_APPLY_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_hdiff_buffer_apply_seconds",
"Time taken to apply hdiff buffer to a state buffer",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_COMPUTE_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_hdiff_buffer_compute_seconds",
"Time taken to compute hdiff buffer to a state buffer",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_LOAD_TIME: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_beacon_hdiff_buffer_load_seconds",
"Time taken to load an hdiff buffer",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_LOAD_FOR_STORE_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_hdiff_buffer_load_for_store_seconds",
"Time taken to load an hdiff buffer to store another hdiff",
)
});
pub static STORE_BEACON_HISTORIC_STATE_CACHE_HIT: LazyLock<Result<IntCounter>> =
LazyLock::new(|| {
try_create_int_counter(
"store_beacon_historic_state_cache_hit_total",
"Total count of historic state cache hits for full states",
)
});
pub static STORE_BEACON_HISTORIC_STATE_CACHE_MISS: LazyLock<Result<IntCounter>> =
LazyLock::new(|| {
try_create_int_counter(
"store_beacon_historic_state_cache_miss_total",
"Total count of historic state cache misses for full states",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_CACHE_HIT: LazyLock<Result<IntCounter>> =
LazyLock::new(|| {
try_create_int_counter(
"store_beacon_hdiff_buffer_cache_hit_total",
"Total count of hdiff buffer cache hits",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_CACHE_MISS: LazyLock<Result<IntCounter>> =
LazyLock::new(|| {
try_create_int_counter(
"store_beacon_hdiff_buffer_cache_miss_total",
"Total count of hdiff buffer cache miss",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_INTO_STATE_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_hdiff_buffer_into_state_seconds",
"Time taken to recreate a BeaconState from an hdiff buffer",
)
});
pub static STORE_BEACON_HDIFF_BUFFER_FROM_STATE_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_hdiff_buffer_from_state_seconds",
"Time taken to create an hdiff buffer from a BeaconState",
)
});
pub static STORE_BEACON_REPLAYED_BLOCKS: LazyLock<Result<IntCounter>> = LazyLock::new(|| {
try_create_int_counter(
"store_beacon_replayed_blocks_total",
"Total count of replayed blocks",
)
});
pub static STORE_BEACON_LOAD_COLD_BLOCKS_TIME: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_beacon_load_cold_blocks_time",
"Time spent loading blocks to replay for historic states",
)
});
pub static STORE_BEACON_LOAD_HOT_BLOCKS_TIME: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_beacon_load_hot_blocks_time",
"Time spent loading blocks to replay for hot states",
)
});
pub static STORE_BEACON_REPLAY_COLD_BLOCKS_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_replay_cold_blocks_time",
"Time spent replaying blocks for historic states",
)
});
pub static STORE_BEACON_COLD_BUILD_BEACON_CACHES_TIME: LazyLock<Result<Histogram>> =
LazyLock::new(|| {
try_create_histogram(
"store_beacon_cold_build_beacon_caches_time",
"Time spent building caches on historic states",
)
});
pub static STORE_BEACON_REPLAY_HOT_BLOCKS_TIME: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_beacon_replay_hot_blocks_time",
"Time spent replaying blocks for hot states",
)
});
pub static STORE_BEACON_RECONSTRUCTION_TIME: LazyLock<Result<Histogram>> = LazyLock::new(|| {
try_create_histogram(
"store_beacon_reconstruction_time_seconds",
"Time taken to run a reconstruct historic states batch",
)
});
pub static BEACON_DATA_COLUMNS_CACHE_HIT_COUNT: LazyLock<Result<IntCounter>> =
LazyLock::new(|| {
try_create_int_counter(