Commit Graph

34 Commits

Author SHA1 Message Date
Pawan Dhananjay
382b5abbee Merge branch 'deneb-free-blobs' into merge-unstable-deneb-june-6th 2023-07-17 11:47:46 -07:00
realbigsean
a618830f8f fix smol lint 2023-07-17 10:28:37 -04:00
realbigsean
42f54ee561 fix merge conflict issues 2023-07-14 16:01:57 -04:00
realbigsean
2b93c0eb0d remove spec minimal feature gating in tests (#4468)
* remove spec minimal feature gating in tests

* do merge transition in overflow cache test
2023-07-14 12:59:15 -07:00
realbigsean
a6f48f5ecb Merge branch 'unstable' of https://github.com/sigp/lighthouse into merge-unstable-deneb-june-6th 2023-07-12 13:05:30 -04:00
Paul Hauner
c25825a539 Move the BeaconProcessor into a new crate (#4435)
*Replaces #4434. It is identical, but this PR has a smaller diff due to a curated commit history.*

## Issue Addressed

NA

## Proposed Changes

This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate.

This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`.

The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for:

1. HTTP API requests.
1. Scheduled tasks in the `BeaconChain` (e.g., state advance).

Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests).

## Additional Info

This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see #4462) can build upon these works to actually use the `BeaconProcessor` for more operations.

I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.
2023-07-10 07:45:54 +00:00
realbigsean
cfe2452533 Merge branch 'remove-into-gossip-verified-block' of https://github.com/realbigsean/lighthouse into merge-unstable-deneb-june-6th 2023-07-06 16:51:35 -04:00
Eitan Seri-Levi
edd093293a added debounce to log (#4269)
## Issue Addressed

[#4259](https://github.com/sigp/lighthouse/issues/4259)

## Proposed Changes

debounce spammy `Unable to send message to the beacon processor` log messages

## Additional Info

We could potentially debounce other logs that have the potential to be "spammy". 

After some feedback we decided to additionally add the following change:

create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.
2023-06-30 01:13:03 +00:00
realbigsean
adbb62f7f3 Devnet6 (#4404)
* some blob reprocessing work

* remove ForceBlockLookup

* reorder enum match arms in sync manager

* a lot more reprocessing work

* impl logic for triggerng blob lookups along with block lookups

* deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker.

* make single block lookup generic

* more work

* add delayed processing logic and combine some requests

* start fixing some compile errors

* fix compilation in main block lookup mod

* much work

* get things compiling

* parent blob lookups

* fix compile

* revert red/stevie changes

* fix up sync manager delay message logic

* add peer usefulness enum

* should remove lookup refactor

* consolidate retry error handling

* improve peer scoring during certain failures in parent lookups

* improve retry code

* drop parent lookup if either req has a peer disconnect during download

* refactor single block processed method

* processing peer refactor

* smol bugfix

* fix some todos

* fix lints

* fix lints

* fix compile in lookup tests

* fix lints

* fix lints

* fix existing block lookup tests

* renamings

* fix after merge

* cargo fmt

* compilation fix in beacon chain tests

* fix

* refactor lookup tests to work with multiple forks and response types

* make tests into macros

* wrap availability check error

* fix compile after merge

* add random blobs

* start fixing up lookup verify error handling

* some bug fixes and the start of deneb only tests

* make tests work for all forks

* track information about peer source

* error refactoring

* improve peer scoring

* fix test compilation

* make sure blobs are sent for processing after stream termination, delete copied tests

* add some tests and fix a bug

* smol bugfixes and moar tests

* add tests and fix some things

* compile after merge

* lots of refactoring

* retry on invalid block/blob

* merge unknown parent messages before current slot lookup

* get tests compiling

* penalize blob peer on invalid blobs

* Check disk on in-memory cache miss

* Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs

* Update beacon_node/network/src/sync/network_context.rs

Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>

* fix bug in matching blocks and blobs in range sync

* pr feedback

* fix conflicts

* upgrade logs from warn to crit when we receive incorrect response in range

* synced_and_connected_within_tolerance -> should_search_for_block

* remove todo

* add data gas used and update excess data gas to u64

* Fix Broken Overflow Tests

* payload verification with commitments

* fix merge conflicts

* restore payload file

* Restore payload file

* remove todo

* add max blob commitments per block

* c-kzg lib update

* Fix ef tests

* Abstract over minimal/mainnet spec in kzg crate

* Start integrating new KZG

* checkpoint sync without alignment

* checkpoint sync without alignment

* add import

* add import

* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)

* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)

* loosen check

* get state first and query by most recent block root

* Revert "loosen check"

This reverts commit 069d13dd63.

* get state first and query by most recent block root

* merge max blobs change

* simplify delay logic

* rename unknown parent sync message variants

* rename parameter, block_slot -> slot

* add some docs to the lookup module

* use interval instead of sleep

* drop request if blocks and blobs requests both return `None` for `Id`

* clean up `find_single_lookup` logic

* add lookup source enum

* clean up `find_single_lookup` logic

* add docs to find_single_lookup_request

* move LookupSource our of param where unnecessary

* remove unnecessary todo

* query for block by `state.latest_block_header.slot`

* fix lint

* fix merge transition ef tests

* fix test

* fix test

* fix observed  blob sidecars test

* Add some metrics (#33)

* fix protocol limits for blobs by root

* Update Engine API for 1:1 Structure Method

* make beacon chain tests to fix devnet 6 changes

* get ckzg working and fix some tests

* fix remaining tests

* fix lints

* Fix KZG linking issues

* remove unused dep

* lockfile

* test fixes

* remove dbgs

* remove unwrap

* cleanup tx generator

* small fixes

* fixing fixes

* more self reivew

* more self review

* refactor genesis header initialization

* refactor mock el instantiations

* fix compile

* fix network test, make sure they run for each fork

* pr feedback

* fix last test (hopefully)

---------

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-06-29 15:35:43 -04:00
realbigsean
a62e52f319 Single blob lookups (#4152)
* some blob reprocessing work

* remove ForceBlockLookup

* reorder enum match arms in sync manager

* a lot more reprocessing work

* impl logic for triggerng blob lookups along with block lookups

* deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker.

* make single block lookup generic

* more work

* add delayed processing logic and combine some requests

* start fixing some compile errors

* fix compilation in main block lookup mod

* much work

* get things compiling

* parent blob lookups

* fix compile

* revert red/stevie changes

* fix up sync manager delay message logic

* add peer usefulness enum

* should remove lookup refactor

* consolidate retry error handling

* improve peer scoring during certain failures in parent lookups

* improve retry code

* drop parent lookup if either req has a peer disconnect during download

* refactor single block processed method

* processing peer refactor

* smol bugfix

* fix some todos

* fix lints

* fix lints

* fix compile in lookup tests

* fix lints

* fix lints

* fix existing block lookup tests

* renamings

* fix after merge

* cargo fmt

* compilation fix in beacon chain tests

* fix

* refactor lookup tests to work with multiple forks and response types

* make tests into macros

* wrap availability check error

* fix compile after merge

* add random blobs

* start fixing up lookup verify error handling

* some bug fixes and the start of deneb only tests

* make tests work for all forks

* track information about peer source

* error refactoring

* improve peer scoring

* fix test compilation

* make sure blobs are sent for processing after stream termination, delete copied tests

* add some tests and fix a bug

* smol bugfixes and moar tests

* add tests and fix some things

* compile after merge

* lots of refactoring

* retry on invalid block/blob

* merge unknown parent messages before current slot lookup

* get tests compiling

* penalize blob peer on invalid blobs

* Check disk on in-memory cache miss

* Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs

* Update beacon_node/network/src/sync/network_context.rs

Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>

* fix bug in matching blocks and blobs in range sync

* pr feedback

* fix conflicts

* upgrade logs from warn to crit when we receive incorrect response in range

* synced_and_connected_within_tolerance -> should_search_for_block

* remove todo

* Fix Broken Overflow Tests

* fix merge conflicts

* checkpoint sync without alignment

* add import

* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)

* get state first and query by most recent block root

* simplify delay logic

* rename unknown parent sync message variants

* rename parameter, block_slot -> slot

* add some docs to the lookup module

* use interval instead of sleep

* drop request if blocks and blobs requests both return `None` for `Id`

* clean up `find_single_lookup` logic

* add lookup source enum

* clean up `find_single_lookup` logic

* add docs to find_single_lookup_request

* move LookupSource our of param where unnecessary

* remove unnecessary todo

* query for block by `state.latest_block_header.slot`

* fix lint

* fix test

* fix test

* fix observed  blob sidecars test

* PR updates

* use optional params instead of a closure

* create lookup and trigger request in separate method calls

* remove `LookupSource`

* make sure duplicate lookups are not dropped

---------

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
2023-06-15 12:59:10 -04:00
Jimmy Chen
70c4ae35ab Merge branch 'unstable' into deneb-free-blobs
# Conflicts:
#	.github/workflows/docker.yml
#	.github/workflows/local-testnet.yml
#	.github/workflows/test-suite.yml
#	Cargo.lock
#	Cargo.toml
#	beacon_node/beacon_chain/src/beacon_chain.rs
#	beacon_node/beacon_chain/src/builder.rs
#	beacon_node/beacon_chain/src/test_utils.rs
#	beacon_node/execution_layer/src/engine_api/json_structures.rs
#	beacon_node/network/src/beacon_processor/mod.rs
#	beacon_node/network/src/beacon_processor/worker/gossip_methods.rs
#	beacon_node/network/src/sync/backfill_sync/mod.rs
#	beacon_node/store/src/config.rs
#	beacon_node/store/src/hot_cold_store.rs
#	common/eth2_network_config/Cargo.toml
#	consensus/ssz/src/decode/impls.rs
#	consensus/ssz_derive/src/lib.rs
#	consensus/ssz_derive/tests/tests.rs
#	consensus/ssz_types/src/serde_utils/mod.rs
#	consensus/tree_hash/src/impls.rs
#	consensus/tree_hash/src/lib.rs
#	consensus/types/Cargo.toml
#	consensus/types/src/beacon_state.rs
#	consensus/types/src/chain_spec.rs
#	consensus/types/src/eth_spec.rs
#	consensus/types/src/fork_name.rs
#	lcli/Cargo.toml
#	lcli/src/main.rs
#	lcli/src/new_testnet.rs
#	scripts/local_testnet/el_bootnode.sh
#	scripts/local_testnet/genesis.json
#	scripts/local_testnet/geth.sh
#	scripts/local_testnet/setup.sh
#	scripts/local_testnet/start_local_testnet.sh
#	scripts/local_testnet/vars.env
#	scripts/tests/doppelganger_protection.sh
#	scripts/tests/genesis.json
#	scripts/tests/vars.env
#	testing/ef_tests/Cargo.toml
#	validator_client/src/block_service.rs
2023-05-30 22:44:05 +10:00
Age Manning
616bee6757 Maintain trusted peers (#4159)
## Issue Addressed
#4150 

## Proposed Changes

Maintain trusted peers in the pruning logic. ~~In principle the changes here are not necessary as a trusted peer has a max score (100) and all other peers can have at most 0 (because we don't implement positive scores). This means that we should never prune trusted peers unless we have more trusted peers than the target peer count.~~

This change shifts this logic to explicitly never prune trusted peers which I expect is the intuitive behaviour. 

~~I suspect the issue in #4150 arises when a trusted peer disconnects from us for one reason or another and then we remove that peer from our peerdb as it becomes stale. When it re-connects at some large time later, it is no longer a trusted peer.~~

Currently we do disconnect trusted peers, and this PR corrects this to maintain trusted peers in the pruning logic.

As suggested in #4150 we maintain trusted peers in the db and thus we remember them even if they disconnect from us.
2023-05-03 04:12:10 +00:00
realbigsean
a5addf661c Rename eip4844 to deneb (#4129)
* rename 4844 to deneb

* rename 4844 to deneb

* move excess data gas field

* get EF tests working

* fix ef tests lint

* fix the blob identifier ef test

* fix accessed files ef test script

* get beacon chain tests passing
2023-03-26 11:49:16 -04:00
Emilia Hane
2672cf40bb Better fix for debug tests 2023-02-15 11:47:56 +01:00
Emilia Hane
9e4abc79fb Comment out tests that use system time 2023-02-14 14:12:50 +01:00
Emilia Hane
73c7ad73b8 Disable use of system time in tests 2023-02-14 13:33:38 +01:00
Emilia Hane
7545ae9e9b fixup! Fix block lookup debug tests 2023-02-10 15:34:46 +01:00
Emilia Hane
d292a3a6a8 Fix conflicts rebasing eip4844 2023-02-10 09:41:23 +01:00
Emilia Hane
8365d76277 fixup! Debug tests 2023-02-10 09:39:22 +01:00
Emilia Hane
7220f35ff6 Debug tests 2023-02-10 09:39:21 +01:00
Emilia Hane
995b2715f2 Fix network block_lookups test 2023-02-10 09:39:21 +01:00
Emilia Hane
3676ce78b5 Fix rebase conflicts 2023-02-10 09:39:21 +01:00
Divma
240854750c cleanup: remove unused imports, unusued fields (#3834) 2022-12-23 17:16:10 -05:00
realbigsean
a0d4aecf30 requests block + blob always post eip4844 2022-12-07 15:30:08 -05:00
realbigsean
8102a01085 merge with upstream 2022-12-01 11:13:07 -05:00
Diva M
805df307f6 wip 2022-11-28 14:13:12 -05:00
Divma
84c7d8cc70 Blocklookup data inconsistencies (#3677)
## Issue Addressed
Closes #3649 

## Proposed Changes

Add a regression test for the data inconsistency, catching the problem in 31e88c5533 [here](https://github.com/sigp/lighthouse/actions/runs/3379894044/jobs/5612044797#step:6:2043).
When a chain is sent for processing, move it to a separate collection and now the test works, yay!

## Additional Info

na
2022-11-07 06:48:34 +00:00
Paul Hauner
fa6ad1a11a Deduplicate block root computation (#3590)
## Issue Addressed

NA

## Proposed Changes

This PR removes duplicated block root computation.

Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merke root of each transaction inside an `ExecutionPayload`.

Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations *after* the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times.

## Additional Info

NA
2022-09-23 03:52:42 +00:00
Divma
8c69d57c2c Pause sync when EE is offline (#3428)
## Issue Addressed

#3032

## Proposed Changes

Pause sync when ee is offline. Changes include three main parts:
- Online/offline notification system
- Pause sync
- Resume sync

#### Online/offline notification system
- The engine state is now guarded behind a new struct `State` that ensures every change is correctly notified. Notifications are only sent if the state changes. The new `State` is behind a `RwLock` (as before) as the synchronization mechanism.
- The actual notification channel is a [tokio::sync::watch](https://docs.rs/tokio/latest/tokio/sync/watch/index.html) which ensures only the last value is in the receiver channel. This way we don't need to worry about message order etc.
- Sync waits for state changes concurrently with normal messages.

#### Pause Sync
Sync has four components, pausing is done differently in each:
- **Block lookups**: Disabled while in this state. We drop current requests and don't search for new blocks. Block lookups are infrequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it.
- **Parent lookups**: Disabled while in this state. We drop current requests and don't search for new parents. Parent lookups are even less frequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it.
- **Range**: Chains don't send batches for processing to the beacon processor. This is easily done by guarding the channel to the beacon processor and giving it access only if the ee is responsive. I find this the simplest and most powerful approach since we don't need to deal with new sync states and chain segments that are added while the ee is offline will follow the same logic without needing to synchronize a shared state among those. Another advantage of passive pause vs active pause is that we can still keep track of active advertised chain segments so that on resume we don't need to re-evaluate all our peers.
- **Backfill**: Not affected by ee states, we don't pause.

#### Resume Sync
- **Block lookups**: Enabled again.
- **Parent lookups**: Enabled again.
- **Range**: Active resume. Since the only real pause range does is not sending batches for processing, resume makes all chains that are holding read-for-processing batches send them.
- **Backfill**: Not affected by ee states, no need to resume.

## Additional Info

**QUESTION**: Originally I made this to notify and change on synced state, but @pawanjay176 on talks with @paulhauner concluded we only need to check online/offline states. The upcheck function mentions extra checks to have a very up to date sync status to aid the networking stack. However, the only need the networking stack would have is this one. I added a TODO to review if the extra check can be removed

Next gen of #3094

Will work best with #3439 

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
2022-08-24 23:34:56 +00:00
Divma
f4ffa9e0b4 Handle processing results of non faulty batches (#3439)
## Issue Addressed
Solves #3390 

So after checking some logs @pawanjay176 got, we conclude that this happened because we blacklisted a chain after trying it "too much". Now here, in all occurrences it seems that "too much" means we got too many download failures. This happened very slowly, exactly because the batch is allowed to stay alive for very long times after not counting penalties when the ee is offline. The error here then was not that the batch failed because of offline ee errors, but that we blacklisted a chain because of download errors, which we can't pin on the chain but on the peer. This PR fixes that.

## Proposed Changes

Adds a missing piece of logic so that if a chain fails for errors that can't be attributed to an objectively bad behavior from the peer, it is not blacklisted. The issue at hand occurred when new peers arrived claiming a head that had wrongfully blacklisted, even if the original peers participating in the chain were not penalized.

Another notable change is that we need to consider a batch invalid if it processed correctly but its next non empty batch fails processing. Now since a batch can fail processing in non empty ways, there is no need to mark as invalid previous batches.

Improves some logging as well.

## Additional Info

We should do this regardless of pausing sync on ee offline/unsynced state. This is because I think it's almost impossible to ensure a processing result will reach in a predictable order with a synced notification from the ee. Doing this handles what I think are inevitable data races when we actually pause sync

This also fixes a return that reports which batch failed and caused us some confusion checking the logs
2022-08-12 00:56:38 +00:00
Age Manning
2ed51c364d Improve block-lookup functionality (#3287)
Improves some of the functionality around single and parent block lookup. 

Gives extra information about whether failures for lookups are related to processing or downloading.

This is entirely untested.


Co-authored-by: Diva M <divma@protonmail.com>
2022-07-17 23:26:58 +00:00
Pawan Dhananjay
28b0ff27ff Ignored sync jobs 2 (#3317)
## Issue Addressed

Duplicate of #3269. Making this since @divagant-martian opened the previous PR and she can't approve her own PR 😄 


Co-authored-by: Diva M <divma@protonmail.com>
2022-07-15 07:31:20 +00:00
Paul Hauner
be4e261e74 Use async code when interacting with EL (#3244)
## Overview

This rather extensive PR achieves two primary goals:

1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state.
2. Refactors fork choice, block production and block processing to `async` functions.

Additionally, it achieves:

- Concurrent forkchoice updates to the EL and cache pruning after a new head is selected.
- Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production.
- Concurrent per-block-processing and execution payload verification during block processing.
- The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?):
    - I had to do this to deal with sending blocks into spawned tasks.
    - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones.
    - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap.
    - Avoids cloning *all the blocks* in *every chain segment* during sync.
    - It also has the potential to clean up our code where we need to pass an *owned* block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅)
- The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs.

For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273

## Changes to `canonical_head` and `fork_choice`

Previously, the `BeaconChain` had two separate fields:

```
canonical_head: RwLock<Snapshot>,
fork_choice: RwLock<BeaconForkChoice>
```

Now, we have grouped these values under a single struct:

```
canonical_head: CanonicalHead {
  cached_head: RwLock<Arc<Snapshot>>,
  fork_choice: RwLock<BeaconForkChoice>
} 
```

Apart from ergonomics, the only *actual* change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously.

## Breaking Changes

### The `state` (root) field in the `finalized_checkpoint` SSE event

Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event:

1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`.
4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots.

Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)) it uses [`getStateRootFromBlockRoot`](de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)) which uses (1).

I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku.

## Notes for Reviewers

I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct.

I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking".

I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run *at least once* means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows *when* it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it.

I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around.

Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2.

You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests:
- Changing tests to be `tokio::async` tests.
- Adding `.await` to fork choice, block processing and block production functions.
- Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`.
- Wrapping `SignedBeaconBlock` in an `Arc`.
- In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant.

I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic.

Co-authored-by: Mac L <mjladson@pm.me>
2022-07-03 05:36:50 +00:00
Divma
788b6af3c4 Remove sync await points (#3036)
## Issue Addressed

Removes the await points in sync waiting for a processor response for rpc block processing. Built on top of #3029 
This also handles a couple of bugs in the previous code and adds a relatively comprehensive test suite.
2022-03-23 01:09:39 +00:00