Commit Graph

81 Commits

Author SHA1 Message Date
Michael Sproul
38fdaf791c Fix proposer shuffling decision slot at boundary (#8128)
Follow-up to the bug fixed in:

- https://github.com/sigp/lighthouse/pull/8121

This fixes the root cause of that bug, which was introduced by me in:

- https://github.com/sigp/lighthouse/pull/8101

Lion identified the issue here:

- https://github.com/sigp/lighthouse/pull/8101#discussion_r2382710356


  In the methods that compute the proposer shuffling decision root, ensure we don't use lookahead for the Fulu fork epoch itself. This is accomplished by checking if Fulu is enabled at `epoch - 1`, i.e. if `epoch > fulu_fork_epoch`.

I haven't updated the methods that _compute_ shufflings to use these new corrected bounds (e.g. `BeaconState::compute_proposer_indices`), although we could make this change in future. The `get_beacon_proposer_indices` method already gracefully handles the Fulu boundary case by using the `proposer_lookahead` field (if initialised).


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-29 01:13:33 +00:00
Michael Sproul
c754234b2c Fix bugs in proposer calculation post-Fulu (#8101)
As identified by a researcher during the Fusaka security competition, we were computing the proposer index incorrectly in some places by computing without lookahead.


  - [x] Add "low level" checks to computation functions in `consensus/types` to ensure they error cleanly
- [x] Re-work the determination of proposer shuffling decision roots, which are now fork aware.
- [x] Re-work and simplify the beacon proposer cache to be fork-aware.
- [x] Optimise `with_proposer_cache` to use `OnceCell`.
- [x] All tests passing.
- [x] Resolve all remaining `FIXME(sproul)`s.
- [x] Unit tests for `ProtoBlock::proposer_shuffling_root_for_child_block`.
- [x] End-to-end regression test.
- [x] Test on pre-Fulu network.
- [x] Test on post-Fulu network.


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-26 14:44:50 +00:00
Michael Sproul
836c39efaa Shrink persisted fork choice data (#7805)
Closes:

- https://github.com/sigp/lighthouse/issues/7760


  - [x] Remove `balances_cache` from `PersistedForkChoiceStore` (~65 MB saving on mainnet)
- [x] Remove `justified_balances` from `PersistedForkChoiceStore` (~16 MB saving on mainnet)
- [x] Remove `balances` from `ProtoArray`/`SszContainer`.
- [x] Implement zstd compression for votes
- [x] Fix bug in justified state usage
- [x] Bump schema version to V28 and implement migration.
2025-08-18 06:03:28 +00:00
chonghe
522bd9e9c6 Update Rust Edition to 2024 (#7766)
* #7749

Thanks @dknopik and @michaelsproul for your help!
2025-08-13 03:04:31 +00:00
chonghe
8e3c5d1524 Rust 1.89 compiler lint fix (#7644)
Fix lints for Rust 1.89 beta compiler
2025-06-25 05:33:17 +00:00
Mac L
39eb8145f8 Merge branch 'release-v7.0.0' into unstable 2025-04-11 21:32:24 +10:00
Lion - dapplion
70850fe58d Drop head tracker for summaries DAG (#6744)
The head tracker is a persisted piece of state that must be kept in sync with the fork-choice. It has been a source of pruning issues in the past, so we want to remove it
- see https://github.com/sigp/lighthouse/issues/1785

When implementing tree-states in the hot DB we have to change the pruning routine (more details below) so we want to do those changes first in isolation.
- see https://github.com/sigp/lighthouse/issues/6580
- If you want to see the full feature of tree-states hot https://github.com/dapplion/lighthouse/pull/39

Closes https://github.com/sigp/lighthouse/issues/1785


  **Current DB migration routine**

- Locate abandoned heads with head tracker
- Use a roots iterator to collect the ancestors of those heads can be pruned
- Delete those abandoned blocks / states
- Migrate the newly finalized chain to the freezer

In summary, it computes what it has to delete and keeps the rest. Then it migrates data to the freezer. If the abandoned forks routine has a bug it can break the freezer migration.

**Proposed migration routine (this PR)**

- Migrate the newly finalized chain to the freezer
- Load all state summaries from disk
- From those, just knowing the head and finalized block compute two sets: (1) descendants of finalized (2) newly finalized chain
- Iterate all summaries, if a summary does not belong to set (1) or (2), delete

This strategy is more sound as it just checks what's there in the hot DB, computes what it has to keep and deletes the rest. Because it does not rely and 3rd pieces of data we can drop the head tracker and pruning checkpoint. Since the DB migration happens **first** now, as long as the computation of the sets to keep is correct we won't have pruning issues.
2025-04-07 04:23:52 +00:00
Lion - dapplion
d511ca0494 Compute roots for unfinalized by_range requests with fork-choice (#7098)
Includes PRs

- https://github.com/sigp/lighthouse/pull/7058
- https://github.com/sigp/lighthouse/pull/7066

Cleaner for the `release-v7.0.0` branch
2025-04-07 03:16:41 +00:00
Mac L
82d1674455 Rust 1.86.0 lints (#7254)
Implement lints for the new Rust compiler version 1.86.0.
2025-04-04 02:30:22 +00:00
Pawan Dhananjay
1f6850fae2 Rust 1.84 lints (#6781)
* Fix few lints

* Fix remaining lints

* Use fully qualified syntax
2025-01-10 01:13:29 +00:00
Mac L
b2b1faad4e Enforce alphabetically ordered cargo deps (#6678)
* Enforce alphabetically ordered cargo deps

* Fix test-suite

* Another CI fix

* Merge branch 'unstable' into cargo-sort

* Fix conflicts

* Merge remote-tracking branch 'origin/unstable' into cargo-sort
2024-12-19 05:46:03 +00:00
Eitan Seri-Levi
99e53b88c3 Migrate from ethereum-types to alloy-primitives (#6078)
* Remove use of ethers_core::RlpStream

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove_use_of_ethers_core

* Remove old code

* Simplify keccak call

* Remove unused package

* Merge branch 'unstable' of https://github.com/ethDreamer/lighthouse into remove_use_of_ethers_core

* Merge branch 'unstable' into remove_use_of_ethers_core

* Run clippy

* Merge branch 'remove_use_of_ethers_core' of https://github.com/dospore/lighthouse into remove_use_of_ethers_core

* Check all cargo fmt

* migrate to alloy primitives init

* fix deps

* integrate alloy-primitives

* resolve dep issues

* more changes based on dep changes

* add TODOs

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove_use_of_ethers_core

* Revert lock

* Add BeaconBlocksByRange v3

* continue migration

* Revert "Add BeaconBlocksByRange v3"

This reverts commit e3ce7fc5ea.

* impl hash256 extended trait

* revert some uneeded diffs

* merge conflict resolved

* fix subnet id rshift calc

* rename to FixedBytesExtended

* debugging

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* fix failed test

* fixing more tests

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove_use_of_ethers_core

* introduce a shim to convert between the two u256 types

* move alloy to wrokspace

* align alloy versions

* update

* update web3signer test certs

* refactor

* resolve failing tests

* linting

* fix graffiti string test

* fmt

* fix ef test

* resolve merge conflicts

* remove udep and revert cert

* cargo patch

* cyclic dep

* fix build error

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* resolve conflicts, update deps

* merge unstable

* fmt

* fix deps

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* resolve merge conflicts

* resolve conflicts, make necessary changes

* Remove patch

* fmt

* remove file

* merge conflicts

* sneaking in a smol change

* bump versions

* Merge remote-tracking branch 'origin/unstable' into migrate-to-alloy-primitives

* Updates for peerDAS

* Update ethereum_hashing to prevent dupe

* updated alloy-consensus, removed TODOs

* cargo update

* endianess fix

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* fmt

* fix merge

* fix test

* fixed_bytes crate

* minor fixes

* convert u256 to i64

* panic free mixin to_low_u64_le

* from_str_radix

* computbe_subnet api and ensuring we use big-endian

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into migrate-to-alloy-primitives

* fix test

* Simplify subnet_id test

* Simplify some more tests

* Add tests to fixed_bytes crate

* Merge branch 'unstable' into migrate-to-alloy-primitives
2024-09-02 08:03:24 +00:00
Mac L
cc55e610b9 Rust 1.80.0 lints (#6183)
* Fix lints
2024-07-25 15:56:22 +00:00
Michael Sproul
9942c18c11 Delete old database schemas (#6051)
* Delete old database schemas

* Fix docs (thanks CK)

* Fix beacon-chain tests
2024-07-09 00:24:49 +00:00
kevaundray
a87f19d801 chore: change impl Into<T> for U to impl From<U> for T (#5948)
* chore: Change Into trait impl for KzgProof to From trait impl

* chore: change `impl Into <T> for U` to `impl From<U> for T`

* chore: remove `from-over-into` clippy lint exception
2024-06-19 05:02:26 +00:00
realbigsean
a74098044a Rust 1.79 lints (#5927)
* max_value -> MAX

* remove unnecesary closures

* a couple more max_value -> MAX

* a couple more max_value -> MAX

* Revert "a couple more max_value -> MAX"

This reverts commit 807fe7cae9.

* unused spec field -> phantom data

* ignore some dead code warnings

* update kurtosis repo location
2024-06-13 23:04:30 +00:00
Eitan Seri-Levi
ee69e14db9 Add is_parent_strong proposer re-org check (#5417)
* initial fork choice additions

* add helper fns

* add is_parent_strong

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into add_is_parent_strong_check

* disabling proposer reorg should set parent_threshold to u64 max

* add new flag, is_parent_strong check in override fcu params

* cherry-pick changes

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into add_is_parent_strong_check

* cleanup

* fmt

* Minor review tweaks
2024-04-04 19:38:06 +00:00
Mac L
969d12dc6f Use E for EthSpec globally (#5264)
* Use `E` for `EthSpec` globally

* Fix tests

* Merge branch 'unstable' into e-ethspec

* Merge branch 'unstable' into e-ethspec

# Conflicts:
#	beacon_node/execution_layer/src/engine_api.rs
#	beacon_node/execution_layer/src/engine_api/http.rs
#	beacon_node/execution_layer/src/engine_api/json_structures.rs
#	beacon_node/execution_layer/src/test_utils/handle_rpc.rs
#	beacon_node/store/src/partial_beacon_state.rs
#	consensus/types/src/beacon_block.rs
#	consensus/types/src/beacon_block_body.rs
#	consensus/types/src/beacon_state.rs
#	consensus/types/src/config_and_preset.rs
#	consensus/types/src/execution_payload.rs
#	consensus/types/src/execution_payload_header.rs
#	consensus/types/src/light_client_optimistic_update.rs
#	consensus/types/src/payload.rs
#	lcli/src/parse_ssz.rs
2024-04-02 15:12:25 +00:00
Eitan Seri-Levi
01ec42e75a Fix Rust beta compiler errors 1.78.0-beta.1 (#5439)
* remove redundant imports

* fix test

* contains key

* fmt

* Merge branch 'unstable' into fix-beta-compiler
2024-03-20 05:17:02 +00:00
realbigsean
4172d9f75c Update to consensus spec v1.4.0-beta.6 (#5094)
* get latest ef tests passing

* fix tests

* Fix invalid payload recovery tests

* Merge branch 'unstable' into update-to-spec-v1.4.0-beta.6

* Revert "fix tests"

This reverts commit 0c875b02e0.

* Fix fork choice def. tests

* Update beacon_node/beacon_chain/tests/payload_invalidation.rs
2024-02-08 18:08:21 +00:00
Pawan Dhananjay
31044402ee Sidecar inclusion proof (#4900)
* Refactor BlobSidecar to new type

* Fix some compile errors

* Gossip verification compiles

* Fix http api types take 1

* Fix another round of compile errors

* Beacon node crate compiles

* EF tests compile

* Remove all blob signing from VC

* fmt

* Tests compile

* Fix some tests

* Fix more http tests

* get compiling

* Fix gossip conditions and tests

* Add basic proof generation and verification

* remove unnecessary ssz decode

* add back build_sidecar

* remove default at fork for blobs

* fix beacon chain tests

* get relase tests compiling

* fix lints

* fix existing spec tests

* add new ef tests

* fix gossip duplicate rule

* lints

* add back sidecar signature check in gossip

* add finalized descendant check to blob sidecar gossip

* fix error conversion

* fix release tests

* sidecar inclusion self review cleanup

* Add proof verification and computation metrics

* Remove accidentally committed file

* Unify some block and blob errors; add slashing conditions for sidecars

* Address review comment

* Clean up re-org tests (#4957)

* Address more review comments

* Add Comments & Eliminate Unnecessary Clones

* update names

* Update beacon_node/beacon_chain/src/metrics.rs

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

* Update beacon_node/network/src/network_beacon_processor/tests.rs

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

* pr feedback

* fix test compile

* Sidecar Inclusion proof small refactor and updates (#4967)

* Update some comments, variables and small cosmetic fixes.

* Couple blobs and proofs into a tuple in `PayloadAndBlobs` for simplicity and safety.

* Update function comment.

* Update testing/ef_tests/src/cases/merkle_proof_validity.rs

Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>

* Rename the block and blob wrapper types used in the beacon API interfaces.

* make sure gossip invalid blobs are passed to the slasher (#4970)

* Add blob headers to slasher before adding to DA checker

* Replace Vec with HashSet in BlockQueue

* fmt

* Rename gindex -> index

* Simplify gossip condition

---------

Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>
2023-12-05 11:19:59 -05:00
Eitan Seri-Levi
4ce01ddd11 Activate clippy::manual_let_else lint (#4889)
## Issue Addressed

#4888

## Proposed Changes

Enabled `clippy::manual_let_else` lint and resolved the warning messages.
2023-10-31 10:31:02 +00:00
realbigsean
4555e33048 Remove serde derive references (#4830)
* remove remaining uses of serde_derive

* fix lockfile

---------

Co-authored-by: João Oliveira <hello@jxs.pt>
2023-10-11 13:01:30 -04:00
João Oliveira
dcd69dfc62 Move dependencies to workspace (#4650)
## Issue Addressed

Synchronize dependencies and edition on the workspace `Cargo.toml`

## Proposed Changes

with https://github.com/rust-lang/cargo/issues/8415 merged it's now possible to synchronize details on the workspace `Cargo.toml` like the metadata and dependencies.
By only having dependencies that are shared between multiple crates aligned on the workspace `Cargo.toml` it's easier to not miss duplicate versions of the same dependency and therefore ease on the compile times.

## Additional Info
this PR also removes the no longer required direct dependency of the `serde_derive` crate.

should be reviewed after https://github.com/sigp/lighthouse/pull/4639 get's merged.
closes https://github.com/sigp/lighthouse/issues/4651


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-09-22 04:30:56 +00:00
Michael Sproul
20067b9465 Remove checkpoint alignment requirements and enable historic state pruning (#4610)
## Issue Addressed

Closes #3210
Closes #3211

## Proposed Changes

- Checkpoint sync from the latest finalized state regardless of its alignment.
- Add the `block_root` to the database's split point. This is _only_ added to the in-memory split in order to avoid a schema migration. See `load_split`.
- Add a new method to the DB called `get_advanced_state`, which looks up a state _by block root_, with a `state_root` as fallback. Using this method prevents accidental accesses of the split's unadvanced state, which does not exist in the hot DB and is not guaranteed to exist in the freezer DB at all. Previously Lighthouse would look up this state _from the freezer DB_, even if it was required for block/attestation processing, which was suboptimal.
- Replace several state look-ups in block and attestation processing with `get_advanced_state` so that they can't hit the split block's unadvanced state.
- Do not store any states in the freezer database by default. All states will be deleted upon being evicted from the hot database unless `--reconstruct-historic-states` is set. The anchor info which was previously used for checkpoint sync is used to implement this, including when syncing from genesis.

## Additional Info

Needs further testing. I want to stress-test the pruned database under Hydra.

The `get_advanced_state` method is intended to become more relevant over time: `tree-states` includes an identically named method that returns advanced states from its in-memory cache.

Co-authored-by: realbigsean <seananderson33@gmail.com>
2023-08-21 05:02:32 +00:00
zhiqiangxu
dfab24bf92 opt maybe_update_best_child_and_descendant: remove an impossible case (#4583)
Here `child.weight == best_child.weight` is impossible since it's already checked [above](dfcb3363c7/consensus/proto_array/src/proto_array.rs (L878)).
2023-08-14 03:16:04 +00:00
zhiqiangxu
f1ac12f23a Fix some typos (#4565) 2023-08-14 00:29:43 +00:00
Mac L
3c029d48bf DB migration for fork choice cleanup (#4265)
## Issue Addressed

#4233

## Proposed Changes

Remove the `best_justified_checkpoint` from the `PersistedForkChoiceStore` type as it is now unused.
Additionally, remove the `Option`'s wrapping the `justified_checkpoint` and `finalized_checkpoint` fields on `ProtoNode` which were only present to facilitate a previous migration.

Include the necessary code to facilitate the migration to a new DB schema.
2023-05-15 02:10:42 +00:00
Eitan Seri-Levi
b1416c8a43 simplify calculate_committee_fraction (#4213)
## Issue Addressed

[#4211](https://github.com/sigp/lighthouse/issues/4211)

## Proposed Changes

This PR conforms the helper function `calculate_committee_fraction` to the [v1.3.0 spec](https://github.com/ethereum/consensus-specs/blob/v1.3.0/specs/phase0/fork-choice.md#get_weight)

## Additional Info

the old definition of `calculate_committee_fraction` is almost identical, but the new definition is simpler.
2023-05-03 09:02:58 +00:00
Michael Sproul
c11638c36c Split common crates out into their own repos (#3890)
## Proposed Changes

Split out several crates which now exist in separate repos under `sigp`.

- [`ssz` and `ssz_derive`](https://github.com/sigp/ethereum_ssz)
- [`tree_hash` and `tree_hash_derive`](https://github.com/sigp/tree_hash)
- [`ethereum_hashing`](https://github.com/sigp/ethereum_hashing)
- [`ethereum_serde_utils`](https://github.com/sigp/ethereum_serde_utils)
- [`ssz_types`](https://github.com/sigp/ssz_types)

For the published crates see: https://crates.io/teams/github:sigp:crates-io?sort=recent-updates.

## Additional Info

- [x] Need to work out how to handle versioning. I was hoping to do 1.0 versions of several crates, but if they depend on `ethereum-types 0.x` that is not going to work. EDIT: decided to go with 0.5.x versions.
- [x] Need to port several changes from `tree-states`, `capella`, `eip4844` branches to the external repos.
2023-04-28 01:15:40 +00:00
Michael Sproul
b90c0c3fb1 Make re-org strat more cautious and add more config (#4151)
## Proposed Changes

This change attempts to prevent failed re-orgs by:

1. Lowering the re-org cutoff from 2s to 1s. This is informed by a failed re-org attempted by @yorickdowne's node. The failed block was requested in the 1.5-2s window due to a Vouch failure, and failed to propagate to the majority of the network before the attestation deadline at 4s.
2. Allow users to adjust their re-org cutoff depending on observed network conditions and their risk profile. The static 2 second cutoff was too rigid.
3. Add a `--proposer-reorg-disallowed-offsets` flag which can be used to prohibit reorgs at certain slots. This is intended to help workaround an issue whereby reorging blocks at slot 1 are currently taking ~1.6s to propagate on gossip rather than ~500ms. This is suspected to be due to a cache miss in current versions of Prysm, which should be fixed in their next release.

## Additional Info

I'm of two minds about removing the `shuffling_stable` check which checks for blocks at slot 0 in the epoch. If we removed it users would be able to configure Lighthouse to try reorging at slot 0, which likely wouldn't work very well due to interactions with the proposer index cache. I think we could leave it for now and revisit it later.
2023-04-13 07:05:01 +00:00
Christopher Chong
6bb28bc806 Add debug fork choice api (#4003)
## Issue Addressed

Which issue # does this PR address?
https://github.com/sigp/lighthouse/issues/3669

## Proposed Changes

Please list or describe the changes introduced by this PR.
- A new API to fetch fork choice data, as specified [here](https://github.com/ethereum/beacon-APIs/pull/232)
- A new integration test to test the new API

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.

- `extra_data` field specified in the beacon-API spec is not implemented, please let me know if I should instead.

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-03-29 02:56:37 +00:00
Paul Hauner
23ea1481e0 Fix fork choice error message (#4122)
## Issue Addressed

NA

## Proposed Changes

Ensures that we log the values of the *head* block rather than the *justified* block.

## Additional Info

NA
2023-03-23 07:16:49 +00:00
Paul Hauner
1f8c17b530 Fork choice modifications and cleanup (#3962)
## Issue Addressed

NA

## Proposed Changes

- Implements https://github.com/ethereum/consensus-specs/pull/3290/
- Bumps `ef-tests` to [v1.3.0-rc.4](https://github.com/ethereum/consensus-spec-tests/releases/tag/v1.3.0-rc.4).

The `CountRealizedFull` concept has been removed and the `--count-unrealized-full` and `--count-unrealized` BN flags now do nothing but log a `WARN` when used.

## Database Migration Debt

This PR removes the `best_justified_checkpoint` from fork choice. This field is persisted on-disk and the correct way to go about this would be to make a DB migration to remove the field. However, in this PR I've simply stubbed out the value with a junk value. I've taken this approach because if we're going to do a DB migration I'd love to remove the `Option`s around the justified and finalized checkpoints on `ProtoNode` whilst we're at it. Those options were added in #2822 which was included in Lighthouse v2.1.0. The options were only put there to handle the migration and they've been set to `Some` ever since v2.1.0. There's no reason to keep them as options anymore.

I started adding the DB migration to this branch but I started to feel like I was bloating this rather critical PR with nice-to-haves. I've kept the partially-complete migration [over in my repo](https://github.com/paulhauner/lighthouse/tree/fc-pr-18-migration) so we can pick it up after this PR is merged.
2023-03-21 07:34:41 +00:00
Michael Sproul
18c8cab4da Merge remote-tracking branch 'origin/unstable' into capella-merge 2023-02-14 12:07:27 +11:00
Paul Hauner
5276dd0cb0 Fix edge-case when finding the finalized descendant (#3924)
## Issue Addressed

NA

## Description

We were missing an edge case when checking to see if a block is a descendant of the finalized checkpoint. This edge case is described for one of the tests in this PR:

a119edc739/consensus/proto_array/src/proto_array_fork_choice.rs (L1018-L1047)

This bug presented itself in the following mainnet log:

```
Jan 26 15:12:42.841 ERRO Unable to validate attestation error: MissingBeaconState(0x7c30cb80ec3d4ec624133abfa70e4c6cfecfca456bfbbbff3393e14e5b20bf25), peer_id: 16Uiu2HAm8RPRciXJYtYc5c3qtCRdrZwkHn2BXN3XP1nSi1gxHYit, type: "unaggregated", slot: Slot(5660161), beacon_block_root: 0x4a45e59da7cb9487f4836c83bdd1b741b4f31c67010c7ae343fa6771b3330489
```

Here the BN is rejecting an attestation because of a "missing beacon state". Whilst it was correct to reject the attestation, it should have rejected it because it attests to a block that conflicts with finality rather than claiming that the database is inconsistent.

The block that this attestation points to (`0x4a45`) is block `C` in the above diagram. It is a non-canonical block in the first slot of an epoch that conflicts with the finalized checkpoint. Due to our lazy pruning of proto array, `0x4a45` was still present in proto-array. Our missed edge-case in [`ForkChoice::is_descendant_of_finalized`](38514c07f2/consensus/fork_choice/src/fork_choice.rs (L1375-L1379)) would have indicated to us that the block is a descendant of the finalized block. Therefore, we would have accepted the attestation thinking that it attests to a descendant of the finalized *checkpoint*.

Since we didn't have the shuffling for this erroneously processed block, we attempted to read its state from the database. This failed because we prune states from the database by keeping track of the tips of the chain and iterating back until we find a finalized block. This would have deleted `C` from the database, hence the `MissingBeaconState` error.
2023-02-09 23:51:18 +00:00
Michael Sproul
991e4094f8 Merge remote-tracking branch 'origin/unstable' into capella-update 2022-12-14 13:00:41 +11:00
Michael Sproul
775d222299 Enable proposer boost re-orging (#2860)
## Proposed Changes

With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004

There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034

Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: pawan <pawandhananjay@gmail.com>
2022-12-13 09:57:26 +00:00
realbigsean
c45b809b76 Cleanup payload types (#3675)
* Add transparent support

* Add `Config` struct

* Deprecate `enum_behaviour`

* Partially remove enum_behaviour from project

* Revert "Partially remove enum_behaviour from project"

This reverts commit 46ffb7fe77.

* Revert "Deprecate `enum_behaviour`"

This reverts commit 89b64a6f53.

* Add `struct_behaviour`

* Tidy

* Move tests into `ssz_derive`

* Bump ssz derive

* Fix comment

* newtype transaparent ssz

* use ssz transparent and create macros for  per fork implementations

* use superstruct map macros

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-11-02 10:30:41 -04:00
realbigsean
cae40731a2 Strict count unrealized (#3522)
## Issue Addressed

Add a flag that can increase count unrealized strictness, defaults to false

## Proposed Changes

Please list or describe the changes introduced by this PR.

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: sean <seananderson33@gmail.com>
2022-09-05 04:50:47 +00:00
Paul Hauner
1a833ecc17 Add more logging for invalid payloads (#3515)
## Issue Addressed

NA

## Proposed Changes

Adds more `debug` logging to help troubleshoot invalid execution payload blocks. I was doing some of this recently and found it to be challenging.

With this PR we should be able to grep `Invalid execution payload` and get one-liners that will show the block, slot and details about the proposer.

I also changed the log in `process_invalid_execution_payload` since it was a little misleading; the `block_root` wasn't necessary the block which had an invalid payload.

## Additional Info

NA
2022-08-29 14:34:42 +00:00
Paul Hauner
8609cced0e Reset payload statuses when resuming fork choice (#3498)
## Issue Addressed

NA

## Proposed Changes

This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for an `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and EL devs to troubleshoot.

The risk introduced with this PR is that genuinely INVALID payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second-time-around.

## Additional Info

This implementation has the following intricacies:

1. Instead of just resetting *invalid* payloads to optimistic, we'll also reset *valid* payloads. This is an artifact of our existing implementation.
1. We will only reset payload statuses when we detect an invalid payload present in `proto_array`
    - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks.
1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that *does not* have reverted payload statuses.
    - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.
2022-08-29 14:34:41 +00:00
Paul Hauner
4fc0cb121c Remove some "wontfix" TODOs for the merge (#3449)
## Issue Addressed

NA

## Proposed Changes

Removes three types of TODOs:

1. `execution_layer/src/lib.rs`: It was [determined](https://github.com/ethereum/consensus-specs/issues/2636#issuecomment-988688742) that there is no action required here.
2. `beacon_processor/worker/gossip_methods.rs`: Removed TODOs relating to peer scoring that have already been addressed via `epe.penalize_peer()`.
    - It seems `cargo fmt` wanted to adjust some things here as well 🤷 
3. `proto_array_fork_choice.rs`: it would be nice to remove that useless `bool` for cleanliness, but I don't think it's something we need to do and the TODO just makes things look messier IMO.


## Additional Info

There should be no functional changes to the code in this PR.

There are still some TODOs lingering, those ones require actual changes or more thought.
2022-08-10 13:06:46 +00:00
Paul Hauner
d23437f726 Ensure FC uses the current slot from the store (#3402)
## Issue Addressed

NA

## Proposed Changes

Ensure that we read the current slot from the `fc_store` rather than the slot clock. This is because the `fc_store` will never allow the slot to go backwards, even if the system clock does. The `ProtoArray::find_head` function assumes a non-decreasing slot.

This issue can cause logs like this:

```
ERRO Error whist recomputing head, error: ForkChoiceError(ProtoArrayError("find_head failed: InvalidBestNode(InvalidBestNodeInfo { start_root: 0xb22655aa2ae23075a60bd40797b3ba220db33d6fb86fa7910f0ed48e34bda72f, justified_checkpoint: Checkpoint { epoch: Epoch(111569), root: 0xb22655aa2ae23075a60bd40797b3ba220db33d6fb86fa7910f0ed48e34bda72f }, finalized_checkpoint: Checkpoint { epoch: Epoch(111568), root: 0x6140797e40c587b0d3f159483bbc603accb7b3af69891979d63efac437f9896f }, head_root: 0xb22655aa2ae23075a60bd40797b3ba220db33d6fb86fa7910f0ed48e34bda72f, head_justified_checkpoint: Some(Checkpoint { epoch: Epoch(111568), root: 0x6140797e40c587b0d3f159483bbc603accb7b3af69891979d63efac437f9896f }), head_finalized_checkpoint: Some(Checkpoint { epoch: Epoch(111567), root: 0x59b913d37383a158a9ea5546a572acc79e2cdfbc904c744744789d2c3814c570 }) })")), service: beacon, module: beacon_chain::canonical_head:499
```

We expect nodes to automatically recover from this issue within seconds without any major impact. However, having *any* errors in the path of fork choice is undesirable and should be avoided.

## Additional Info

NA
2022-08-02 00:58:25 +00:00
Paul Hauner
bcfde6e7df Indicate that invalid blocks are optimistic (#3383)
## Issue Addressed

NA

## Proposed Changes

This PR will make Lighthouse return blocks with invalid payloads via the API with `execution_optimistic = true`. This seems a bit awkward, however I think it's better than returning a 404 or some other error.

Let's consider the case where the only possible head is invalid (#3370 deals with this). In such a scenario all of the duties endpoints will start failing because the head is invalid. I think it would be better if the duties endpoints continue to work, because it's likely that even though the head is invalid the duties are still based upon valid blocks and we want the VC to have them cached. There's no risk to the VC here because we won't actually produce an attestation pointing to an invalid head.

Ultimately, I don't think it's particularly important for us to distinguish between optimistic and invalid blocks on the API. Neither should be trusted and the only *real* reason that we track this is so we can try and fork around the invalid blocks.


## Additional Info

- ~~Blocked on #3370~~
2022-07-30 05:08:57 +00:00
Paul Hauner
25f0e261cb Don't return errors when fork choice fails (#3370)
## Issue Addressed

NA

## Proposed Changes

There are scenarios where the only viable head will have an invalid execution payload, in this scenario the `get_head` function on `proto_array` will return an error. We must recover from this scenario by importing blocks from the network.

This PR stops `BeaconChain::recompute_head` from returning an error so that we can't accidentally start down-scoring peers or aborting block import just because the current head has an invalid payload.

## Reviewer Notes

The following changes are included:

1. Allow `fork_choice.get_head` to fail gracefully in `BeaconChain::process_block` when trying to update the `early_attester_cache`; simply don't add the block to the cache rather than aborting the entire process.
1. Don't return an error from `BeaconChain::recompute_head_at_current_slot` and `BeaconChain::recompute_head` to defensively prevent calling functions from aborting any process just because the fork choice function failed to run.
    - This should have practically no effect, since most callers were still continuing if recomputing the head failed.
    - The outlier is that the API will return 200 rather than a 500 when fork choice fails.
1. Add the `ProtoArrayForkChoice::set_all_blocks_to_optimistic` function to recover from the scenario where we've rebooted and the persisted fork choice has an invalid head.
2022-07-28 13:57:09 +00:00
Michael Sproul
d04fde3ba9 Remove equivocating validators from fork choice (#3371)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/3241
Closes https://github.com/sigp/lighthouse/issues/3242

## Proposed Changes

* [x] Implement logic to remove equivocating validators from fork choice per https://github.com/ethereum/consensus-specs/pull/2845
* [x] Update tests to v1.2.0-rc.1. The new test which exercises `equivocating_indices` is passing.
* [x] Pull in some SSZ abstractions from the `tree-states` branch that make implementing Vec-compatible encoding for types like `BTreeSet` and `BTreeMap`.
* [x] Implement schema upgrades and downgrades for the database (new schema version is V11).
* [x] Apply attester slashings from blocks to fork choice

## Additional Info

* This PR doesn't need the `BTreeMap` impl, but `tree-states` does, and I don't think there's any harm in keeping it. But I could also be convinced to drop it.

Blocked on #3322.
2022-07-28 09:43:41 +00:00
realbigsean
20ebf1f3c1 Realized unrealized experimentation (#3322)
## Issue Addressed

Add a flag that optionally enables unrealized vote tracking.  Would like to test out on testnets and benchmark differences in methods of vote tracking. This PR includes a DB schema upgrade to enable to new vote tracking style.


Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: sean <seananderson33@gmail.com>
Co-authored-by: Mac L <mjladson@pm.me>
2022-07-25 23:53:26 +00:00
Paul Hauner
1f54e10b7b Do not interpret "latest valid hash" as identifying a valid hash (#3327)
## Issue Addressed

NA

## Proposed Changes

After some discussion in Discord with @mkalinin it was raised that it was not the intention of the engine API to have CLs validate the `latest_valid_hash` (LVH) and all ancestors.

Whilst I believe the engine API is being updated such that the LVH *must* identify a valid hash or be set to some junk value, I'm not confident that we can rely upon the LVH as being valid (at least for now) due to the confusion surrounding it.

Being able to validate blocks via the LVH is a relatively minor optimisation; if the LVH value ends up becoming our head we'll send an fcU and get the VALID status there.

Falsely marking a block as valid has serious consequences and since it's a minor optimisation to use LVH I think that we don't take the risk.

For clarity, we will still *invalidate* the *descendants* of the LVH, we just wont *validate* the *ancestors*.

## Additional Info

NA
2022-07-13 23:07:49 +00:00
Paul Hauner
be4e261e74 Use async code when interacting with EL (#3244)
## Overview

This rather extensive PR achieves two primary goals:

1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state.
2. Refactors fork choice, block production and block processing to `async` functions.

Additionally, it achieves:

- Concurrent forkchoice updates to the EL and cache pruning after a new head is selected.
- Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production.
- Concurrent per-block-processing and execution payload verification during block processing.
- The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?):
    - I had to do this to deal with sending blocks into spawned tasks.
    - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones.
    - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap.
    - Avoids cloning *all the blocks* in *every chain segment* during sync.
    - It also has the potential to clean up our code where we need to pass an *owned* block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅)
- The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs.

For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273

## Changes to `canonical_head` and `fork_choice`

Previously, the `BeaconChain` had two separate fields:

```
canonical_head: RwLock<Snapshot>,
fork_choice: RwLock<BeaconForkChoice>
```

Now, we have grouped these values under a single struct:

```
canonical_head: CanonicalHead {
  cached_head: RwLock<Arc<Snapshot>>,
  fork_choice: RwLock<BeaconForkChoice>
} 
```

Apart from ergonomics, the only *actual* change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously.

## Breaking Changes

### The `state` (root) field in the `finalized_checkpoint` SSE event

Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event:

1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`.
4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots.

Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)) it uses [`getStateRootFromBlockRoot`](de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)) which uses (1).

I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku.

## Notes for Reviewers

I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct.

I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking".

I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run *at least once* means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows *when* it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it.

I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around.

Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2.

You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests:
- Changing tests to be `tokio::async` tests.
- Adding `.await` to fork choice, block processing and block production functions.
- Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`.
- Wrapping `SignedBeaconBlock` in an `Arc`.
- In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant.

I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic.

Co-authored-by: Mac L <mjladson@pm.me>
2022-07-03 05:36:50 +00:00