*Replaces #4434. It is identical, but this PR has a smaller diff due to a curated commit history.*
## Issue Addressed
NA
## Proposed Changes
This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate.
This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`.
The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for:
1. HTTP API requests.
1. Scheduled tasks in the `BeaconChain` (e.g., state advance).
Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests).
## Additional Info
This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see #4462) can build upon these works to actually use the `BeaconProcessor` for more operations.
I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.
## Issue Addressed
[Users on Twitter](https://twitter.com/ashekhirin/status/1676334843192397824) are getting checkpoint sync URL timeouts with the default of 60s, so this PR increases the default timeout to 3 minutes.
I've also added a short section to the book about adjusting the timeout with `--checkpoint-sync-url-timeout`.
## Issue Addressed
#4331
## Proposed Changes
- Use comparison rather than strict equality between the earliest epoch we know about and the backfill target (which will be the most recent WSP by default or genesis)
- Add helper function `BackFillSync<T>::would_complete` to achieve this in one location
## Additional Info
- There's an ad hoc test for this in #4461
Co-authored-by: Age Manning <Age@AgeManning.com>
We now officially have ipv6 support. The mainnet bootnodes have been updated to support ipv6. This PR updates lighthouse's internal bootnodes for mainnet to avoid fetching them on initial load.
## Issue Addressed
#4118
## Proposed Changes
This PR introduces a "progressive balances" cache on the `BeaconState`, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive).
This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled|checked|strict|fast` flag is introduced to support this:
- `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. **This is the default mode for now.**
- `strict`: enabled with checks against participation cache, returns error if there is a mismatch. **Used for testing only**.
- `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release.
- `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs.
### Tasks
- [x] Initial cache implementation in `BeaconState`
- [x] Perform checks in fork choice to compare the progressive balances cache against results from `ParticipationCache`
- [x] Add CLI flag, and disable the optimization by default
- [x] Testing on Goerli & Benchmarking
- [x] Move caching logic from state processing to the `ProgressiveBalancesCache` (see [this comment](https://github.com/sigp/lighthouse/pull/4362#discussion_r1230877001))
- [x] Add attesting balance metrics
Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
## Issue Addressed
[#4292](https://github.com/sigp/lighthouse/issues/4292)
## Proposed Changes
Updated the node health endpoint
will return a 200 status code if `!syncing && !el_offline && !optimistic`
wil return a 206 if `(syncing || optimistic) && !el_offline`
will return a 503 if `el_offline`
## Additional Info
## Issue Addressed
[#4259](https://github.com/sigp/lighthouse/issues/4259)
## Proposed Changes
debounce spammy `Unable to send message to the beacon processor` log messages
## Additional Info
We could potentially debounce other logs that have the potential to be "spammy".
After some feedback we decided to additionally add the following change:
create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.
## Issue Addressed
- #4293
- #4264
## Proposed Changes
*Changes largely follow those suggested in the main issue*.
- Add new routes to HTTP API
- `post_beacon_blocks_v2`
- `post_blinded_beacon_blocks_v2`
- Add new routes to `BeaconNodeHttpClient`
- `post_beacon_blocks_v2`
- `post_blinded_beacon_blocks_v2`
- Define new Eth2 common types
- `BroadcastValidation`, enum representing the level of validation to apply to blocks prior to broadcast
- `BroadcastValidationQuery`, the corresponding HTTP query string type for the above type
- ~~Define `_checked` variants of both `publish_block` and `publish_blinded_block` that enforce a validation level at a type level~~
- Add interactive tests to the `bn_http_api_tests` test target covering each validation level (to their own test module, `broadcast_validation_tests`)
- `beacon/blocks`
- `broadcast_validation=gossip`
- Invalid (400)
- Full Pass (200)
- Partial Pass (202)
- `broadcast_validation=consensus`
- Invalid (400)
- Only gossip (400)
- Only consensus pass (i.e., equivocates) (200)
- Full pass (200)
- `broadcast_validation=consensus_and_equivocation`
- Invalid (400)
- Invalid due to early equivocation (400)
- Only gossip (400)
- Only consensus (400)
- Pass (200)
- `beacon/blinded_blocks`
- `broadcast_validation=gossip`
- Invalid (400)
- Full Pass (200)
- Partial Pass (202)
- `broadcast_validation=consensus`
- Invalid (400)
- Only gossip (400)
- ~~Only consensus pass (i.e., equivocates) (200)~~
- Full pass (200)
- `broadcast_validation=consensus_and_equivocation`
- Invalid (400)
- Invalid due to early equivocation (400)
- Only gossip (400)
- Only consensus (400)
- Pass (200)
- Add a new trait, `IntoGossipVerifiedBlock`, which allows type-level guarantees to be made as to gossip validity
- Modify the structure of the `ObservedBlockProducers` cache from a `(slot, validator_index)` mapping to a `((slot, validator_index), block_root)` mapping
- Modify `ObservedBlockProducers::proposer_has_been_observed` to return a `SeenBlock` rather than a boolean on success
- Punish gossip peer (low) for submitting equivocating blocks
- Rename `BlockError::SlashablePublish` to `BlockError::SlashableProposal`
## Additional Info
This PR contains changes that directly modify how blocks are verified within the client. For more context, consult [comments in-thread](https://github.com/sigp/lighthouse/pull/4316#discussion_r1234724202).
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
NA
## Proposed Changes
Implements the `PrettyReqwestError` to wrap a `reqwest::Error` and give nicer `Debug` formatting. It also wraps the `Url` component in a `SensitiveUrl` to avoid leaking sensitive info in logs.
### Before
```
Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(9999), path: "/eth/v1/node/version", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 61, kind: ConnectionRefused, message: "Connection refused" })) })
```
### After
```
HttpClient(url: http://localhost:9999/, kind: request, detail: error trying to connect: tcp connect error: Connection refused (os error 61))
```
## Additional Info
I've also renamed the `Reqwest` error enum variants to `HttpClient`, to give people a better chance at knowing what's going on. Reqwest is pretty odd and looks like a typo.
I've implemented it in the `eth2` and `execution_layer` crates. This should affect most logs in the VC and EE-related ones in the BN.
I think the last crate that could benefit from the is the `beacon_node/eth1` crate. I haven't updated it in this PR since its error type is not so amenable to it (everything goes into a `String`). I don't have a whole lot of time to jig around with that at the moment and I feel that this PR as it stands is a significant enough improvement to merge on its own. Leaving it as-is is fine for the time being and we can always come back for it later (or implement in-protocol deposits!).
## Issue Addressed
Resolves#3238
## Proposed Changes
Please list or describe the changes introduced by this PR.
## Additional Info
Please provide any additional information. For example, future considerations
or information useful for reviewers.
## Issue Addressed
This PR adds a new `lint-fix` task to automatically fix simple Clippy warnings using `cargo clippy --fix`.
Usage:
```
make lint-fix
```
## Issue Addressed
#4386
## Proposed Changes
The original proposal described in the issue adds a new endpoint to support updating validator graffiti, but I realized we already have an endpoint that we use for updating various validator fields in memory and in the validator definitions file, so I think that would be the best place to add this to.
### API endpoint
`PATCH lighthouse/validators/{validator_pubkey}`
This endpoint updates the graffiti in both the [ validator definition file](https://lighthouse-book.sigmaprime.io/graffiti.html#2-setting-the-graffiti-in-the-validator_definitionsyml) and the in memory `InitializedValidators`. In the next block proposal, the new graffiti will be used.
Note that the [`--graffiti-file`](https://lighthouse-book.sigmaprime.io/graffiti.html#1-using-the---graffiti-file-flag-on-the-validator-client) flag has a priority over the validator definitions file, so if the caller attempts to update the graffiti while the `--graffiti-file` flag is present, the endpoint will return an error (Bad request 400).
## Tasks
- [x] Add graffiti update support to `PATCH lighthouse/validators/{validator_pubkey}`
- [x] Return error if user tries to update graffiti while the `--graffiti-flag` is set
- [x] Update Lighthouse book
## Issue Addressed
NA
## Proposed Changes
Adds the `--validator-registration-batch-size` flag to the VC to allow runtime configuration of the number of validators POSTed to the [`validator/register_validator`](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Validator/registerValidator) endpoint.
There are builders (Agnostic and Eden) that are timing out with `regsiterValidator` requests with ~400 validators, even with a 9 second timeout. Exposing the batch size will help tune batch sizes to (hopefully) avoid this.
This PR should not change the behavior of Lighthouse when the new flag is not provided (i.e., the same default value is used).
## Additional Info
NA
## Proposed Changes
Remove `max-skip-slots` checks when processing blocks.
This was legacy code which was previously used in the Medalla testnet to sync to the correct fork.
With the addition of checkpoint sync which allows us to sync to any arbitrary fork, this is no longer a necessary feature, so it has been removed for simplicity.
## Additional Notes
The CLI flag and checks for attestation processing have been retained as it still may have uses in DoS protection.
Currently, the ENR of the node may not be correctly updated when specifying ipv6 fields through the CLI if an ENR exists on disk.
This remedies a bug where we were not checking for ipv6 fields when comparing whether to use an on-disk ENR or updating based on CLI configuration parameters.