Commit Graph

80 Commits

Author SHA1 Message Date
Michael Sproul
9fdd53df56 Hierarchical state diffs (#5978)
* Start extracting freezer changes for tree-states

* Remove unused config args

* Add comments

* Remove unwraps

* Subjective more clear implementation

* Clean up hdiff

* Update xdelta3

* Tree states archive metrics (#6040)

* Add store cache size metrics

* Add compress timer metrics

* Add diff apply compute timer metrics

* Add diff buffer cache hit metrics

* Add hdiff buffer load times

* Add blocks replayed metric

* Move metrics to store

* Future proof some metrics

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Port and clean up forwards iterator changes

* Add and polish hierarchy-config flag

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Cleaner errors

* Fix beacon_chain test compilation

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Patch a few more freezer block roots

* Fix genesis block root bug

* Fix test failing due to pending updates

* Beacon chain tests passing

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix doc lint

* Implement DB schema upgrade for hierarchical state diffs (#6193)

* DB upgrade

* Add flag

* Delete RestorePointHash

* Update docs

* Update docs

* Implement hierarchical state diffs config migration (#6245)

* Implement hierarchical state diffs config migration

* Review PR

* Remove TODO

* Set CURRENT_SCHEMA_VERSION correctly

* Fix genesis state loading

* Re-delete some PartialBeaconState stuff

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix test compilation

* Update schema downgrade test

* Fix tests

* Fix null anchor migration

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix tree states upgrade migration (#6328)

* Towards crash safety

* Fix compilation

* Move cold summaries and state roots to new columns

* Rename StateRoots chunked field

* Update prune states

* Clean hdiff CLI flag and metrics

* Fix "staged reconstruction"

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix alloy issues

* Fix staged reconstruction logic

* Prevent weird slot drift

* Remove "allow" flag

* Update CLI help

* Remove FIXME about downgrade

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Remove some unnecessary error variants

* Fix new test

* Tree states archive - review comments and metrics (#6386)

* Review PR comments and metrics

* Comments

* Add anchor metrics

* drop prev comment

* Update metadata.rs

* Apply suggestions from code review

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/store/src/hot_cold_store.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Clarify comment and remove anchor_slot garbage

* Simplify database anchor (#6397)

* Simplify database anchor

* Update beacon_node/store/src/reconstruct.rs

* Add migration for anchor

* Fix and simplify light_client store tests

* Fix incompatible config test

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* More metrics

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* New historic state cache (#6475)

* New historic state cache

* Add more metrics

* State cache hit rate metrics

* Fix store metrics

* More logs and metrics

* Fix logger

* Ensure cached states have built caches :O

* Replay blocks in preference to diffing

* Two separate caches

* Distribute cache build time to next slot

* Re-plumb historic-state-cache flag

* Clean up metrics

* Update book

* Update beacon_node/store/src/hdiff.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Update beacon_node/store/src/historic_state_cache.rs

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

---------

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>

* Update database docs

* Update diagram

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Update lockbud to work with bindgen/etc

* Correct pkg name for Debian

* Remove vestigial epochs_per_state_diff

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Markdown lint

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Address Jimmy's review comments

* Simplify ReplayFrom case

* Fix and document genesis_state_root

* Typo

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

* Merge branch 'unstable' into tree-states-archive

* Compute diff of validators list manually (#6556)

* Split hdiff computation

* Dedicated logic for historical roots and summaries

* Benchmark against real states

* Mutated source?

* Version the hdiff

* Add lighthouse DB config for hierarchy exponents

* Tidy up hierarchy exponents flag

* Apply suggestions from code review

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Address PR review

* Remove hardcoded paths in benchmarks

* Delete unused function in benches

* lint

---------

Co-authored-by: Michael Sproul <michael@sigmaprime.io>

* Test hdiff binary format stability (#6585)

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Add deprecation warning for SPRP

* Update xdelta to get rid of duplicate deps

* Document test
2024-11-18 01:51:44 +00:00
Eitan Seri-Levi
a94b12b4d5 Persist light client bootstrap (#5915)
* persist light client updates

* update beacon chain to serve light client updates

* resolve todos

* cache best update

* extend cache parts

* is better light client update

* resolve merge conflict

* initial api changes

* add lc update db column

* fmt

* added tests

* add sim

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-updates

* fix some weird issues with the simulator

* tests

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-updates

* test changes

* merge conflict

* testing

* started work on ef tests and some code clean up

* update tests

* linting

* noop pre altair, were still failing on electra though

* allow for zeroed light client header

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-updates

* merge unstable

* remove unwraps

* remove unwraps

* fetch bootstrap without always querying for state

* storing bootstrap parts in db

* mroe code cleanup

* test

* prune sync committee branches from dropped chains

* Update light_client_update.rs

* merge unstable

* move functionality to helper methods

* refactor is best update fn

* refactor is best update fn

* improve organization of light client server cache logic

* fork diget calc, and only spawn as many blcoks as we need for the lc update test

* resovle merge conflict

* add electra bootstrap logic, add logic to cache current sync committee

* add latest sync committe branch cache

* fetch lc update from the cache if it exists

* fmt

* Fix beacon_chain tests

* Add debug code to update ranking_order ef test

* Fix compare code

* merge conflicts

* merge conflict

* add better error messaging

* resolve merge conflicts

* remove lc update from basicsim

* rename sync comittte variable and fix persist condition

* refactor get_light_client_update logic

* add better comments, return helpful error messages over http and rpc

* pruning canonical non checkpoint slots

* fix test

* rerun test

* update pruning logic, add tests

* fix tests

* fix imports

* fmt

* refactor db code

* Refactor db method

* Refactor db method

* add additional comments

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-bootstrap

* fix merge

* linting

* merge conflict

* prevent overflow

* enable lc server for http api tests

* fix tests

* remove prints

* remove warning

* revert change
2024-09-10 00:27:49 +00:00
Eitan Seri-Levi
99e53b88c3 Migrate from ethereum-types to alloy-primitives (#6078)
* Remove use of ethers_core::RlpStream

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove_use_of_ethers_core

* Remove old code

* Simplify keccak call

* Remove unused package

* Merge branch 'unstable' of https://github.com/ethDreamer/lighthouse into remove_use_of_ethers_core

* Merge branch 'unstable' into remove_use_of_ethers_core

* Run clippy

* Merge branch 'remove_use_of_ethers_core' of https://github.com/dospore/lighthouse into remove_use_of_ethers_core

* Check all cargo fmt

* migrate to alloy primitives init

* fix deps

* integrate alloy-primitives

* resolve dep issues

* more changes based on dep changes

* add TODOs

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove_use_of_ethers_core

* Revert lock

* Add BeaconBlocksByRange v3

* continue migration

* Revert "Add BeaconBlocksByRange v3"

This reverts commit e3ce7fc5ea.

* impl hash256 extended trait

* revert some uneeded diffs

* merge conflict resolved

* fix subnet id rshift calc

* rename to FixedBytesExtended

* debugging

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* fix failed test

* fixing more tests

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove_use_of_ethers_core

* introduce a shim to convert between the two u256 types

* move alloy to wrokspace

* align alloy versions

* update

* update web3signer test certs

* refactor

* resolve failing tests

* linting

* fix graffiti string test

* fmt

* fix ef test

* resolve merge conflicts

* remove udep and revert cert

* cargo patch

* cyclic dep

* fix build error

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* resolve conflicts, update deps

* merge unstable

* fmt

* fix deps

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* resolve merge conflicts

* resolve conflicts, make necessary changes

* Remove patch

* fmt

* remove file

* merge conflicts

* sneaking in a smol change

* bump versions

* Merge remote-tracking branch 'origin/unstable' into migrate-to-alloy-primitives

* Updates for peerDAS

* Update ethereum_hashing to prevent dupe

* updated alloy-consensus, removed TODOs

* cargo update

* endianess fix

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* fmt

* fix merge

* fix test

* fixed_bytes crate

* minor fixes

* convert u256 to i64

* panic free mixin to_low_u64_le

* from_str_radix

* computbe_subnet api and ensuring we use big-endian

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* fix test

* Simplify subnet_id test

* Simplify some more tests

* Add tests to fixed_bytes crate

* Merge branch 'unstable' into migrate-to-alloy-primitives
2024-09-02 08:03:24 +00:00
Eitan Seri-Levi
3913ea44c6 Persist light client updates (#5545)
* persist light client updates

* update beacon chain to serve light client updates

* resolve todos

* cache best update

* extend cache parts

* is better light client update

* resolve merge conflict

* initial api changes

* add lc update db column

* fmt

* added tests

* add sim

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-updates

* fix some weird issues with the simulator

* tests

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-updates

* test changes

* merge conflict

* testing

* started work on ef tests and some code clean up

* update tests

* linting

* noop pre altair, were still failing on electra though

* allow for zeroed light client header

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into persist-light-client-updates

* merge unstable

* remove unwraps

* remove unwraps

* Update light_client_update.rs

* merge unstable

* move functionality to helper methods

* refactor is best update fn

* refactor is best update fn

* improve organization of light client server cache logic

* fork diget calc, and only spawn as many blcoks as we need for the lc update test

* fetch lc update from the cache if it exists

* fmt

* Fix beacon_chain tests

* Add debug code to update ranking_order ef test

* Fix compare code

* merge conflicts

* fix test

* Merge branch 'persist-light-client-updates' of https://github.com/eserilev/lighthouse into persist-light-client-updates

* Use blinded blocks for light client proofs

* fix ef test

* merge conflicts

* fix lc update check

* Lint

* resolve merge conflict

* Merge branch 'persist-light-client-updates' of https://github.com/eserilev/lighthouse into persist-light-client-updates

* revert basic sim

* small fix

* revert sim

* Review PR

* resolve merge conflicts

* Merge branch 'unstable' into persist-light-client-updates
2024-08-09 07:36:20 +00:00
Jimmy Chen
0e96d4f105 Store changes to persist data columns (#6073)
* Store changes to persist data columns.

Co-authored-by: dapplion <35266934+dapplion@users.noreply.github.com>

* Update to use `eip7594_fork_epoch` for data column slot in Store.

* Fix formatting.

* Merge branch 'unstable' into data-columns-store

# Conflicts:
#	beacon_node/store/src/lib.rs
#	consensus/types/src/chain_spec.rs

* Minor refactor.

* Merge branch 'unstable' into data-columns-store

# Conflicts:
#	beacon_node/store/src/metrics.rs

* Init data colum info at PeerDAS epoch instead of Deneb fork epoch. Address review comments.

* Remove Deneb-related comments
2024-08-02 06:58:37 +00:00
Lion - dapplion
b0b142c9b7 Track store metrics by DB column (#6041)
* Track store metrics by DB column
2024-07-18 01:45:06 +00:00
Lion - dapplion
880523d8d7 Drop overflow cache (#5891)
* Drop overflow cache

* Update docs

* Update beacon_node/store/src/lib.rs

* Update data_availability_checker.rs

* Lint
2024-07-11 06:10:24 +00:00
Michael Sproul
21f3a191c5 Remove extern crate (#5922)
* Remove extern crate
2024-06-17 15:05:21 +00:00
Michael Sproul
61962898e2 In-memory tree states (#5533)
* Consensus changes

* EF tests

* lcli

* common and watch

* account manager

* cargo

* fork choice

* promise cache

* beacon chain

* interop genesis

* http api

* lighthouse

* op pool

* beacon chain misc

* parallel state cache

* store

* fix issues in store

* IT COMPILES

* Remove some unnecessary module qualification

* Revert Arced pubkey optimization (#5536)

* Merge remote-tracking branch 'origin/unstable' into tree-states-memory

* Fix caching, rebasing and some tests

* Remove unused deps

* Merge remote-tracking branch 'origin/unstable' into tree-states-memory

* Small cleanups

* Revert shuffling cache/promise cache changes

* Fix state advance bugs

* Fix shuffling tests

* Remove some resolved FIXMEs

* Remove StateProcessingStrategy

* Optimise withdrawals calculation

* Don't reorg if state cache is missed

* Remove inconsistent state func

* Fix beta compiler

* Rebase early, rebase often

* Fix state caching behaviour

* Update to milhouse release

* Fix on-disk consensus context format

* Merge remote-tracking branch 'origin/unstable' into tree-states-memory

* Squashed commit of the following:

commit 3a16649023
Author: Michael Sproul <michael@sigmaprime.io>
Date:   Thu Apr 18 14:26:09 2024 +1000

    Fix on-disk consensus context format

* Keep indexed attestations, thanks Sean

* Merge branch 'on-disk-consensus-context' into tree-states-memory

* Merge branch 'unstable' into tree-states-memory

* Address half of Sean's review

* More simplifications from Sean's review

* Cache state after get_advanced_hot_state
2024-04-24 01:22:36 +00:00
Michael Sproul
5a9e973f04 Fix on-disk consensus context format (#5598)
* Fix on-disk consensus context format

* Keep indexed attestations, thanks Sean
2024-04-19 08:18:30 +00:00
Michael Sproul
6f442f2bb8 Improve database compaction and prune-states (#5142)
* Fix no-op state prune check

* Compact freezer DB after pruning

* Refine DB compaction

* Add blobs-db options to inspect/compact

* Better key size

* Fix compaction end key
2024-02-08 10:05:08 +00:00
Jimmy Chen
36d8849813 Add commmand for pruning states (#4835)
## Issue Addressed

Closes #4481. 

(Continuation of #4648)

## Proposed Changes

- [x] Add `lighthouse db prune-states`
- [x] Make it work
- [x] Ensure block roots are handled correctly (to be addressed in 4735)
- [x] Check perf on mainnet/Goerli/Gnosis (takes a few seconds max)
- [x] Run block root healing logic (#4875 ) at the beginning
- [x] Add some tests
- [x] Update docs
- [x] Add `--freezer` flag and other improvements to `lighthouse db inspect`

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-11-03 00:12:19 +00:00
Michael Sproul
c574f8136e Fix block backfill with genesis skip slots (#4820)
## Issue Addressed

Closes #4817.

## Proposed Changes

- Fill in the linear block roots array between 0 and the slot of the first block (e.g. slots 0 and 1 on Holesky).
- Backport the `--freezer`, `--skip` and `--limit` options for `lighthouse db inspect` from tree-states. This allows us to easily view the database corruption of 4817 using `lighthouse db inspect --network holesky --freezer --column bbr --output values --limit 2`.
- Backport the `iter_column_from` change and `MemoryStore` overhaul from tree-states. These are required to enable `lighthouse db inspect`.
- Rework `freezer_upper_limit` to allow state lookups for slots below the `state_lower_limit`. Currently state lookups will fail until state reconstruction completes entirely.

There is a new regression test for the main bug, but no test for the `freezer_upper_limit` fix because we don't currently support running state reconstruction partially (see #3026). This will be fixed once we merge `tree-states`! In lieu of an automated test, I've tested manually on a Holesky node while it was reconstructing.

## Additional Info

Users who backfilled Holesky to slot 0 (e.g. using `--reconstruct-historic-states`) need to either:

- Re-sync from genesis.
- Re-sync using checkpoint sync and the changes from this PR.

Due to the recency of the Holesky genesis, writing a custom pass to fix up broken databases (which would require its own thorough testing) was deemed unnecessary. This is the primary reason for this PR being marked `backwards-incompat`.

This will create few conflicts with Deneb, which I've already resolved on `tree-states-deneb` and will be happy to backport to Deneb once this PR is merged to unstable.
2023-10-27 05:08:49 +00:00
Michael Sproul
9244f7f7bc Improvements to Deneb store upon review (#4693)
* Start testing blob pruning

* Get rid of unnecessary orphaned blob column

* Make random blob tests deterministic

* Test for pruning being blocked by finality

* Fix bugs and test fork boundary

* A few more tweaks to pruning conditions

* Tweak oldest_blob_slot semantics

* Test margin pruning

* Clean up some terminology and lints

* Schema migrations for v18

* Remove FIXME

* Prune blobs on finalization not every slot

* Fix more bugs + tests

* Address review comments
2023-09-25 14:21:54 -04:00
realbigsean
7d468cb487 More deneb cleanup (#4640)
* remove protoc and token from network tests github action

* delete unused beacon chain methods

* downgrade writing blobs to store log

* reduce diff in block import logic

* remove some todo's and deneb built in network

* remove unnecessary error, actually use some added metrics

* remove some metrics, fix missing components on publish funcitonality

* fix status tests

* rename sidecar by root to blobs by root

* clean up some metrics

* remove unnecessary feature gate from attestation subnet tests, clean up blobs by range response code

* pawan's suggestion in `protocol_info`, peer score in matching up batch sync block and blobs

* fix range tests for deneb

* pub block and blob db cache behind the same mutex

* remove unused errs and an empty file

* move sidecar trait to new file

* move types from payload to eth2 crate

* update comment and add flag value name

* make function private again, remove allow unused

* use reth rlp for tx decoding

* fix compile after merge

* rename kzg commitments

* cargo fmt

* remove unused dep

* Update beacon_node/execution_layer/src/lib.rs

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>

* Update beacon_node/beacon_processor/src/lib.rs

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>

* pawan's suggestiong for vec capacity

* cargo fmt

* Revert "use reth rlp for tx decoding"

This reverts commit 5181837d81.

* remove reth rlp

---------

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
2023-08-20 21:17:17 -04:00
Pawan Dhananjay
18760822fe Fix beta compiler warnings 2023-07-14 13:16:48 -07:00
ethDreamer
46db30416d Implement Overflow LRU Cache for Pending Blobs (#4203)
* All Necessary Objects Implement Encode/Decode

* Major Components for LRUOverflowCache Implemented

* Finish Database Code

* Add Maintenance Methods

* Added Maintenance Service

* Persist Blobs on Shutdown / Reload on Startup

* Address Clippy Complaints

* Add (emum_behaviour = "tag") to ssz_derive

* Convert Encode/Decode Implementations to "tag"

* Started Adding Tests

* Added a ton of tests

* 1 character fix

* Feature Guard Minimal Spec Tests

* Update beacon_node/beacon_chain/src/data_availability_checker.rs

Co-authored-by: realbigsean <seananderson33@GMAIL.com>

* Address Sean's Comments

* Add iter_raw_keys method

* Remove TODOs

---------

Co-authored-by: realbigsean <seananderson33@GMAIL.com>
2023-05-12 10:08:24 -04:00
ethDreamer
d84117c0d0 Removed TODO (#4128) 2023-03-24 17:33:49 -04:00
Pawan Dhananjay
b276af98b7 Rework block processing (#4092)
* introduce availability pending block

* add intoavailableblock trait

* small fixes

* add 'gossip blob cache' and start to clean up processing and transition types

* shard memory blob cache

* Initial commit

* Fix after rebase

* Add gossip verification conditions

* cache cleanup

* general chaos

* extended chaos

* cargo fmt

* more progress

* more progress

* tons of changes, just tryna compile

* everything, everywhere, all at once

* Reprocess an ExecutedBlock on unavailable blobs

* Add sus gossip verification for blobs

* Merge stuff

* Remove reprocessing cache stuff

* lint

* Add a wrapper to allow construction of only valid `AvailableBlock`s

* rename blob arc list to blob list

* merge cleanuo

* Revert "merge cleanuo"

This reverts commit 5e98326878.

* Revert "Revert "merge cleanuo""

This reverts commit 3a4009443a.

* fix rpc methods

* move beacon block and blob to eth2/types

* rename gossip blob cache to data availability checker

* lots of changes

* fix some compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* cargo fmt

* use a common data structure for block import types

* fix availability check on proposal import

* refactor the blob cache and split the block wrapper into two types

* add type conversion for signed block and block wrapper

* fix beacon chain tests and do some renaming, add some comments

* Partial processing (#4)

* move beacon block and blob to eth2/types

* rename gossip blob cache to data availability checker

* lots of changes

* fix some compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* cargo fmt

* use a common data structure for block import types

* fix availability check on proposal import

* refactor the blob cache and split the block wrapper into two types

* add type conversion for signed block and block wrapper

* fix beacon chain tests and do some renaming, add some comments

* cargo update (#6)

---------

Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: realbigsean <seananderson33@gmail.com>
2023-03-24 17:30:41 -04:00
ethDreamer
d1e653cfdb Update Blob Storage Structure (#4104)
* Initial Changes to Blob Storage

* Add Arc to SignedBlobSidecar Definition
2023-03-21 15:33:06 -04:00
Emilia Hane
89cccfc397 Fix rebase conflicts 2023-02-09 07:46:25 +01:00
Emilia Hane
f971f3a3a2 Fix rebase conflicts 2023-02-09 07:41:38 +01:00
Michael Sproul
ac4b5b580c Fix regression in DB write atomicity 2023-02-08 11:44:45 +01:00
Emilia Hane
6dff69bde9 Atomically update blob info with pruned blobs 2023-02-08 11:44:42 +01:00
Emilia Hane
756c881857 Keep uniform size small keys
Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-02-08 11:44:40 +01:00
Emilia Hane
d58a30b3de fixup! Store orphan block roots 2023-02-08 11:44:36 +01:00
Emilia Hane
8752deeced Store orphan block roots 2023-02-08 11:44:35 +01:00
Emilia Hane
7bf88c2336 Prune blobs before data availability breakpoint 2023-02-08 11:44:30 +01:00
Emilia Hane
e2a6da4274 Boiler plate code for blobs pruning 2023-02-08 11:44:20 +01:00
realbigsean
438126f19a merge upstream, fix compile errors 2023-01-11 13:52:58 -05:00
realbigsean
98b11bbd3f add historical summaries (#3865)
* add historical summaries

* fix tree hash caching, disable the sanity slots test with fake crypto

* add ssz static HistoricalSummary

* only store historical summaries after capella

* Teach `UpdatePattern` about Capella

* Tidy EF tests

* Clippy

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-01-11 12:40:21 +11:00
Divma
240854750c cleanup: remove unused imports, unusued fields (#3834) 2022-12-23 17:16:10 -05:00
Pawan Dhananjay
29f2ec46d3 Couple blocks and blobs in gossip (#3670)
* Revert "Add more gossip verification conditions"

This reverts commit 1430b561c3.

* Revert "Add todos"

This reverts commit 91efb9d4c7.

* Revert "Reprocess blob sidecar messages"

This reverts commit 21bf3d37cd.

* Add the coupled topic

* Decode SignedBeaconBlockAndBlobsSidecar correctly

* Process Block and Blobs in beacon processor

* Remove extra blob publishing logic from vc

* Remove blob signing in vc

* Ugly hack to compile
2022-11-01 10:28:21 -04:00
realbigsean
ba16a037a3 cleanup 2022-10-04 09:34:05 -04:00
realbigsean
e81dbbfea4 compile 2022-10-03 21:48:02 -04:00
Daniel Knopik
750c594f5f forgor something 2022-09-17 21:38:57 +02:00
Daniel Knopik
76572db9d5 add network config 2022-09-17 20:55:21 +02:00
Daniel Knopik
d4d40be870 storable blobs 2022-09-17 15:58:52 +02:00
ethDreamer
034260bd99 Initial Commit of Retrospective OTB Verification (#3372)
## Issue Addressed

* #2983 

## Proposed Changes

Basically followed the [instructions laid out here](https://github.com/sigp/lighthouse/issues/2983#issuecomment-1062494947)


Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
2022-07-30 00:22:38 +00:00
Paul Hauner
be4e261e74 Use async code when interacting with EL (#3244)
## Overview

This rather extensive PR achieves two primary goals:

1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state.
2. Refactors fork choice, block production and block processing to `async` functions.

Additionally, it achieves:

- Concurrent forkchoice updates to the EL and cache pruning after a new head is selected.
- Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production.
- Concurrent per-block-processing and execution payload verification during block processing.
- The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?):
    - I had to do this to deal with sending blocks into spawned tasks.
    - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones.
    - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap.
    - Avoids cloning *all the blocks* in *every chain segment* during sync.
    - It also has the potential to clean up our code where we need to pass an *owned* block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅)
- The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs.

For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273

## Changes to `canonical_head` and `fork_choice`

Previously, the `BeaconChain` had two separate fields:

```
canonical_head: RwLock<Snapshot>,
fork_choice: RwLock<BeaconForkChoice>
```

Now, we have grouped these values under a single struct:

```
canonical_head: CanonicalHead {
  cached_head: RwLock<Arc<Snapshot>>,
  fork_choice: RwLock<BeaconForkChoice>
} 
```

Apart from ergonomics, the only *actual* change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously.

## Breaking Changes

### The `state` (root) field in the `finalized_checkpoint` SSE event

Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event:

1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`.
4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots.

Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)) it uses [`getStateRootFromBlockRoot`](de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)) which uses (1).

I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku.

## Notes for Reviewers

I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct.

I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking".

I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run *at least once* means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows *when* it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it.

I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around.

Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2.

You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests:
- Changing tests to be `tokio::async` tests.
- Adding `.await` to fork choice, block processing and block production functions.
- Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`.
- Wrapping `SignedBeaconBlock` in an `Arc`.
- In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant.

I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic.

Co-authored-by: Mac L <mjladson@pm.me>
2022-07-03 05:36:50 +00:00
Michael Sproul
bcdd960ab1 Separate execution payloads in the DB (#3157)
## Proposed Changes

Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database.

⚠️ **This is achieved in a backwards-incompatible way for networks that have already merged** ⚠️. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins.

The main changes are:

- New column in the database called `ExecPayload`, keyed by beacon block root.
- The `BeaconBlock` column now stores blinded blocks only.
- Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc.
- On finalization:
    - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks.
    - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states.
- Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134.
- The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call.
   - I've tested manually that it works on Kiln, using Geth and Nethermind.
   - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146.
   - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134.
- Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed.
- Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated).

## Additional Info

- [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller.
- [x] We should measure the latency of blocks-by-root and blocks-by-range responses.
- [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159)
- [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks.

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
Michael Sproul
41e7a07c51 Add lighthouse db command (#3129)
## Proposed Changes

Add a `lighthouse db` command with three initial subcommands:

- `lighthouse db version`: print the database schema version.
- `lighthouse db migrate --to N`: manually upgrade (or downgrade!) the database to a different version.
- `lighthouse db inspect --column C`: log the key and size in bytes of every value in a given `DBColumn`.

This PR lays the groundwork for other changes, namely:

- Mark's fast-deposit sync (https://github.com/sigp/lighthouse/pull/2915), for which I think we should implement a database downgrade (from v9 to v8).
- My `tree-states` work, which already implements a downgrade (v10 to v8).
- Standalone purge commands like `lighthouse db purge-dht` per https://github.com/sigp/lighthouse/issues/2824.

## Additional Info

I updated the `strum` crate to 0.24.0, which necessitated some changes in the network code to remove calls to deprecated methods.

Thanks to @winksaville for the motivation, and implementation work that I used as a source of inspiration (https://github.com/sigp/lighthouse/pull/2685).
2022-04-01 00:58:59 +00:00
Michael Sproul
a290a3c537 Add configurable block replayer (#2863)
## Issue Addressed

Successor to #2431

## Proposed Changes

* Add a `BlockReplayer` struct to abstract over the intricacies of calling `per_slot_processing` and `per_block_processing` while avoiding unnecessary tree hashing.
* Add a variant of the forwards state root iterator that does not require an `end_state`.
* Use the `BlockReplayer` when reconstructing states in the database. Use the efficient forwards iterator for frozen states.
* Refactor the iterators to remove `Arc<HotColdDB>` (this seems to be neater than making _everything_ an `Arc<HotColdDB>` as I did in #2431).

Supplying the state roots allow us to avoid building a tree hash cache at all when reconstructing historic states, which saves around 1 second flat (regardless of `slots-per-restore-point`). This is a small percentage of worst-case state load times with 200K validators and SPRP=2048 (~15s vs ~16s) but a significant speed-up for more frequent restore points: state loads with SPRP=32 should be now consistently <500ms instead of 1.5s (a ~3x speedup).

## Additional Info

Required by https://github.com/sigp/lighthouse/pull/2628
2021-12-21 06:30:52 +00:00
Michael Sproul
ed1fc7cca6 Fix I/O atomicity issues with checkpoint sync (#2671)
## Issue Addressed

This PR addresses an issue found by @YorickDowne during testing of v2.0.0-rc.0.

Due to a lack of atomic database writes on checkpoint sync start-up, it was possible for the database to get into an inconsistent state from which it couldn't recover without `--purge-db`. The core of the issue was that the store's anchor info was being stored _before_ the `PersistedBeaconChain`. If a crash occured so that anchor info was stored but _not_ the `PersistedBeaconChain`, then on restart Lighthouse would think the database was unitialized and attempt to compare-and-swap a `None` value, but would actually find the stale info from the previous run.

## Proposed Changes

The issue is fixed by writing the anchor info, the split point, and the `PersistedBeaconChain` atomically on start-up. Some type-hinting ugliness was required, which could possibly be cleaned up in future refactors.
2021-10-05 03:53:17 +00:00
Michael Sproul
9667dc2f03 Implement checkpoint sync (#2244)
## Issue Addressed

Closes #1891
Closes #1784

## Proposed Changes

Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint.

## Additional Info

- [x] Return unavailable status for out-of-range blocks requested by peers (#2561)
- [x] Implement sync daemon for fetching historical blocks (#2561)
- [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module)
- [x] Consistency check for initial block + state
- [x] Fetch the initial state and block from a beacon node HTTP endpoint
- [x] Don't crash fetching beacon states by slot from the API
- [x] Background service for state reconstruction, triggered by CLI flag or API call.

Considered out of scope for this PR:

- Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification)


Co-authored-by: Diva M <divma@protonmail.com>
2021-09-22 00:37:28 +00:00
realbigsean
303deb9969 Rust 1.54.0 lints (#2483)
## Issue Addressed

N/A

## Proposed Changes

- Removing a bunch of unnecessary references
- Updated `Error::VariantError` to `Error::Variant`
- There were additional enum variant lints that I ignored, because I thought our variant names were fine
- removed `MonitoredValidator`'s `pubkey` field, because I couldn't find it used anywhere. It looks like we just use the string version of the pubkey (the `id` field) if there is no index

## Additional Info



Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-30 01:11:47 +00:00
realbigsean
b84ff9f793 rust 1.53.0 updates (#2411)
## Issue Addressed

`make lint` failing on rust 1.53.0.

## Proposed Changes

1.53.0 updates

## Additional Info

I haven't figure out why yet, we were now hitting the recursion limit in a few crates. So I had to add `#![recursion_limit = "256"]` in a few places


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-06-18 05:58:01 +00:00
Pawan Dhananjay
fdaeec631b Monitoring service api (#2251)
## Issue Addressed

N/A

## Proposed Changes

Adds a client side api for collecting system and process metrics and pushing it to a monitoring service.
2021-05-26 05:58:41 +00:00
Michael Sproul
363f15f362 Use the database to persist the pubkey cache (#2234)
## Issue Addressed

Closes #1787

## Proposed Changes

* Abstract the `ValidatorPubkeyCache` over a "backing" which is either a file (legacy), or the database.
* Implement a migration from schema v2 to schema v3, whereby the contents of the cache file are copied to the DB, and then the file is deleted. The next release to include this change must be a minor version bump, and we will need to warn users of the inability to downgrade (this is our first DB schema change since mainnet genesis).
* Move the schema migration code from the `store` crate into the `beacon_chain` crate so that it can access the datadir and the `ValidatorPubkeyCache`, etc. It gets injected back into the `store` via a closure (similar to what we do in fork choice).
2021-03-04 01:25:12 +00:00
Michael Sproul
556190ff46 Compact database on finalization (#1871)
## Issue Addressed

Closes #1866

## Proposed Changes

* Compact the database on finalization. This removes the deleted states from disk completely. Because it happens in the background migrator, it doesn't block other database operations while it runs. On my Medalla node it took about 1 minute and shrank the database from 90GB to 9GB.
* Fix an inefficiency in the pruning algorithm where it would always use the genesis checkpoint as the `old_finalized_checkpoint` when running for the first time after start-up. This would result in loading lots of states one-at-a-time back to genesis, and storing a lot of block roots in memory. The new code stores the old finalized checkpoint on disk and only uses genesis if no checkpoint is already stored. This makes it both backwards compatible _and_ forwards compatible -- no schema change required!
* Introduce two new `INFO` logs to indicate when pruning has started and completed. Users seem to want to know this information without enabling debug logs!
2020-11-09 07:02:21 +00:00