Adds a new `/lighthouse` API call to the VC which allows the list of beacon nodes to be updated dynamically at runtime.
An entirely new beacon node list is provided to the VC so it effectively adds, removes or reorders nodes to match the new list.
This can then be used in Siren, which will enable a "drag to reorder" system along with adding and removing beacon nodes while the VC is running. This makes it unnecessary to restart the VC when users simply want to add or remove a BN from the list.
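A hypothetical client-side sketch of the call (the exact route and payload shape are assumptions based on the description above, not the final API):

```rust
// Hypothetical sketch only: route and payload shape are assumed.
use reqwest::Client;

async fn update_beacon_nodes(
    vc_url: &str,
    api_token: &str,
    beacon_nodes: &[&str],
) -> Result<(), reqwest::Error> {
    Client::new()
        // Assumed path; the PR only says the call lives under `/lighthouse`.
        .put(format!("{vc_url}/lighthouse/beacon_nodes"))
        .bearer_auth(api_token)
        // The VC diffs this list against its current set, adding, removing
        // or reordering nodes to match.
        .json(&beacon_nodes)
        .send()
        .await?
        .error_for_status()?;
    Ok(())
}
```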
N/A
After the electra fork, which includes EIP-6110, the beacon node no longer needs the eth1 bridging mechanism to include new deposits, as they are provided by the EL as a `deposit_request`. So after electra, plus a transition period during which pre-fork finalized bridge deposits are still included through the old mechanism, we no longer need the elaborate machinery we had for getting deposit contract data from the execution layer.
Since holesky has already forked to electra and completed the transition period, this PR checks whether removing all the eth1-related logic leads to any surprises.
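For context, the EL-side replacement is the EIP-6110 deposit request container; a simplified sketch of its shape (plain Rust types as illustrative stand-ins, not Lighthouse's actual definitions):

```rust
// EIP-6110 deposit request, provided by the EL with each payload
// (simplified field types; see the consensus specs for the real container).
pub struct DepositRequest {
    pub pubkey: [u8; 48],                 // BLS public key
    pub withdrawal_credentials: [u8; 32], // Bytes32
    pub amount: u64,                      // Gwei
    pub signature: [u8; 96],              // BLS signature
    pub index: u64,                       // deposit index
}
```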
We would like to reuse the `notifier` and `latency_service` in Anchor. To make this possible, this PR moves these from `validator_client` to `validator_services` and makes them use the new `ValidatorStore` trait so that the code can be reused in Anchor.
- Create trait `ValidatorStore` with all functions used by the `validator_services`
- Make `validator_services` generic on `S: ValidatorStore` (see the sketch after this list)
- Introduce `LighthouseValidatorStore`, which has identical functionality to the old `ValidatorStore`
- Remove dependencies (especially `environment`) from `validator_services` and `beacon_node_fallback` in order to be able to cleanly use them in Anchor
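A minimal sketch of the resulting shape (the trait surface and service internals here are illustrative; only the `ValidatorStore` trait and the generic bound come from the PR):

```rust
use std::sync::Arc;

/// Illustrative trait surface; the real trait exposes all functions the
/// `validator_services` need.
pub trait ValidatorStore: Send + Sync {
    fn num_voting_validators(&self) -> usize;
}

/// A service generic over the store, so Anchor can supply its own
/// implementation instead of `LighthouseValidatorStore`.
pub struct Notifier<S: ValidatorStore> {
    validator_store: Arc<S>,
}

impl<S: ValidatorStore> Notifier<S> {
    pub fn report(&self) {
        println!(
            "managing {} validators",
            self.validator_store.num_voting_validators()
        );
    }
}
```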
Cleaned up and isolated version of the `--disable-attesting` flag for the VC, from the `holesky-rescue` branch:
- https://github.com/sigp/lighthouse/pull/7041
I figured we don't need the `--disable-attesting` flag on the BN for now, and it was a much more invasive implementation.
* Add cli flag for HTTP API token path (VC)
* Add http_token_path_flag test
* Add pre-check for directory case & Fix test utils
* Update docs
* Apply review: move http_token_path into validator_http_api config
* Lint
* Make diff smaller by replacing PK_FILENAME
* Merge branch 'unstable' into feature/cli-token-path
* Apply review: help_vc.md
Co-authored-by: chonghe <44791194+chong-he@users.noreply.github.com>
* Fix help for cli
* Fix issues on ci
* Merge branch 'unstable' into feature/cli-token-path
* Merge branch 'unstable' into feature/cli-token-path
* Merge branch 'unstable' into feature/cli-token-path
* Merge branch 'unstable' into feature/cli-token-path
* Rework Validator Client fallback mechanism
* Add CI workflow for fallback simulator
* Tie-break with sync distance for non-synced nodes
* Fix simulator
* Cleanup unused code
* More improvements
* Add IsOptimistic enum for readability
* Use configurable sync distance tiers
* Fix tests
* Combine status and health and improve logging
* Fix nodes not being marked as available
* Fix simulator
* Fix tests again
* Increase fallback simulator tolerance
* Add http api endpoint
* Fix todos and tests
* Update simulator
* Merge branch 'unstable' into vc-fallback
* Add suggestions
* Add id to ui endpoint
* Remove unnecessary clones
* Formatting
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Fix flag tests
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Fix conflicts
* Merge branch 'unstable' into vc-fallback
* Remove unnecessary pubs
* Simplify `compute_distance_tier` and reduce notifier awaits
* Use the more descriptive `user_index` instead of `id`
* Combine sync distance tolerance flags into one
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* wip
* Use new simulator from unstable
* Fix cli text
* Remove leftover files
* Remove old commented code
* Merge branch 'unstable' into vc-fallback
* Update cli text
* Silence candidate errors when pre-genesis
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Retry on failure
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Remove disable_run_on_all
* Remove unused error variant
* Fix out of date comment
* Merge branch 'unstable' into vc-fallback
* Remove unnecessary as_u64
* Remove more out of date comments
* Use tokio RwLock and remove parking_lot
* Merge branch 'unstable' into vc-fallback
* Formatting
* Ensure nodes are still added to total when not available
* Allow VC to detect when BN comes online
* Fix ui endpoint
* Don't have block_service as an Option
* Merge branch 'unstable' into vc-fallback
* Clean up lifetimes and futures
* Revert "Don't have block_service as an Option"
This reverts commit b5445a09e9.
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Improve rwlock sanitation using clones
* Merge branch 'unstable' into vc-fallback
* Drop read lock immediately by cloning the vec.
* Reduce frequency of polling unknown validators.
* Move slot calculation into for loop.
* Simplify logic.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Fix formatting
* multiple broadcast flags
* rewrite with single --broadcast option (see the sketch after this list)
* satisfy cargo fmt
* shorten sync-committee-messages
* fix a doc comment and a test
* use strum
* Add broadcast test to simulator
* bring --disable-run-on-all flag back with deprecation notice
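A rough sketch of what the single `--broadcast` option with strum might look like (the variant set and names are assumptions, not the merged code):

```rust
use strum::{Display, EnumString};

// Topics the VC may broadcast to all beacon nodes; the strum derives give
// free string parsing/printing for the CLI flag.
#[derive(Debug, Clone, Copy, PartialEq, Display, EnumString)]
#[strum(serialize_all = "kebab-case")]
pub enum ApiTopic {
    Attestations,
    Blocks,
    Subscriptions,
    // Shortened name, per the "shorten sync-committee-messages" commit.
    SyncCommittee,
}

/// Parse a comma-separated `--broadcast` value, e.g. "blocks,attestations".
fn parse_broadcast_flag(value: &str) -> Result<Vec<ApiTopic>, strum::ParseError> {
    value.split(',').map(|s| s.trim().parse()).collect()
}
```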
## Issue Addressed
#4531
## Proposed Changes
Add SSZ support to the following block production endpoints:
- `GET /eth/v2/validator/blocks/{slot}`
- `GET /eth/v1/validator/blinded_blocks/{slot}`
## Additional Info
I updated a few existing tests to use SSZ instead of writing completely new tests.
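For illustration, a client can select the SSZ encoding in the usual beacon-API way, via the `Accept` header (a sketch; the hex-encoded `randao_reveal` handling is simplified):

```rust
use reqwest::Client;

// Fetch an SSZ-encoded block from the v2 block production endpoint.
async fn get_block_ssz(
    bn_url: &str,
    slot: u64,
    randao_reveal_hex: &str,
) -> Result<Vec<u8>, reqwest::Error> {
    let bytes = Client::new()
        .get(format!("{bn_url}/eth/v2/validator/blocks/{slot}"))
        .query(&[("randao_reveal", randao_reveal_hex)])
        // Ask for SSZ rather than the default JSON.
        .header("Accept", "application/octet-stream")
        .send()
        .await?
        .error_for_status()?
        .bytes()
        .await?;
    Ok(bytes.to_vec())
}
```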
## Issue Addressed
On a new network a user might require importing validators before waiting until genesis has occurred.
## Proposed Changes
Starts the validator client HTTP API before waiting for genesis.
## Additional Info
cc @antondlr
## Issue Addressed
Addresses #2557
## Proposed Changes
Adds the `lighthouse validator-manager` command, which provides:
- `lighthouse validator-manager create`
  - Creates a `validators.json` file and a `deposits.json` file (same format as https://github.com/ethereum/staking-deposit-cli)
- `lighthouse validator-manager import`
  - Imports validators from a `validators.json` file to the VC via the HTTP API.
- `lighthouse validator-manager move`
  - Moves validators from one VC to another, using only the VC API.
## Additional Info
In 98bcb947c I've reduced some VC `ERRO` and `CRIT` warnings to `WARN` or `DEBG` for the case where a pubkey is missing from the validator store. These were being triggered when we removed a validator but still had it in caches. It seems to me that `UnknownPubkey` will only happen in the case where we've removed a validator, so downgrading the logs is prudent. All the logs are `DEBG` apart from attestations and blocks which are `WARN`. I thought having *some* logging about this condition might help us down the track.
In 856cd7e37d I've made the VC delete the corresponding password file when it's deleting a keystore. This seemed like nice hygiene. Notably, it'll only delete that password file after it scans the validator definitions and finds that no other validator is also using that password file.
## Issue Addressed
NA
## Proposed Changes
Adds the `--validator-registration-batch-size` flag to the VC to allow runtime configuration of the number of validators POSTed to the [`validator/register_validator`](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Validator/registerValidator) endpoint.
There are builders (Agnostic and Eden) whose `registerValidator` requests are timing out with ~400 validators, even with a 9 second timeout. Exposing the batch size will help tune batch sizes to (hopefully) avoid this.
This PR should not change the behavior of Lighthouse when the new flag is not provided (i.e., the same default value is used).
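As an illustration (not the PR's code), the batching itself is just slice chunking with the configured size:

```rust
/// Split registrations into batches of at most `batch_size`, one POST to
/// `validator/register_validator` per batch.
fn registration_batches<T>(
    registrations: &[T],
    batch_size: usize,
) -> impl Iterator<Item = &[T]> {
    // `max(1)` guards against a zero batch size, which `chunks` would reject.
    registrations.chunks(batch_size.max(1))
}
// e.g. `--validator-registration-batch-size 200` turns 950 registrations
// into five requests of at most 200 entries each.
```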
## Additional Info
NA
This PR adds the ability to read the Lighthouse logs from the HTTP API for both the BN and the VC.
This is done in such a way as to minimize any performance hit from adding this feature.
The current design creates a tokio broadcast channel and mixes it into a form of slog drain that combines with our main global logger drain, but only if the HTTP API is enabled.
The drain gets the logs, checks the log level and drops them if they are below INFO. If they are INFO or higher, it sends them via a broadcast channel, but only if there are users subscribed to the HTTP API channel. If not, it drops the logs.
If there is more than one subscriber, the channel clones the log records and converts them to JSON in each subscriber's independent HTTP API task.
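A simplified sketch of the idea (not Lighthouse's actual drain; assumes slog and a tokio broadcast channel, as described above):

```rust
use slog::{Drain, Level, OwnedKVList, Record};
use tokio::sync::broadcast;

pub struct HttpLogDrain {
    sender: broadcast::Sender<String>,
}

impl Drain for HttpLogDrain {
    type Ok = ();
    type Err = slog::Never;

    fn log(&self, record: &Record, _values: &OwnedKVList) -> Result<(), slog::Never> {
        // Drop anything below INFO before doing any work.
        if !record.level().is_at_least(Level::Info) {
            return Ok(());
        }
        // Only pay the send cost if the HTTP API has active subscribers;
        // each subscriber's task receives a clone of the message.
        if self.sender.receiver_count() > 0 {
            let _ = self.sender.send(record.msg().to_string());
        }
        Ok(())
    }
}
```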
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613.
## Proposed Changes
- Implement the `el_offline` field on `/eth/v1/node/syncing` (condensed in the sketch after this list). We set `el_offline=true` if:
- The EL's internal status is `Offline` or `AuthFailed`, _or_
- The most recent call to `newPayload` resulted in an error (more on this in a moment).
- Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes.
- Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite of its intended effect, so it has been deprecated.
- Add tests for the EL being offline on the upcheck call, and being marked offline due to the `newPayload` check.
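A condensed sketch of the first rule above (the state names are illustrative, not the exact internals):

```rust
// Illustrative engine states; the real enum lives in the execution layer.
enum EngineState {
    Online,
    Offline,
    AuthFailed,
}

/// `el_offline` as described above: the EL's internal status is bad, or the
/// most recent `newPayload` call resulted in an error.
fn el_offline(state: &EngineState, last_new_payload_errored: bool) -> bool {
    matches!(state, EngineState::Offline | EngineState::AuthFailed)
        || last_new_payload_errored
}
```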
## Why track `newPayload` errors?
Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because:
- If the EL is timing out on some calls, it's unlikely to time out on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)), which has the effect of masking the failure and keeping the status _online_.
- The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms).
- If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them.
## Additional Changes
- Add the utility method `ForkName::latest`, which is quite convenient for writing tests, and probably other things too.
- Delete some stale comments from when we used to support multiple execution nodes.