Commit Graph

12 Commits

Author SHA1 Message Date
Age Manning
49536ff103 Add distributed flag to VC to enable support for DVT (#4867)
* Initial flag building

* Update validator_client/src/cli.rs

Co-authored-by: Abhishek Kumar <43061995+xenowits@users.noreply.github.com>

* Merge latest unstable

* Per slot aggregates

* One slot lookahead for sync committee aggregates

* Update validator_client/src/duties_service.rs

Co-authored-by: Abhishek Kumar <43061995+xenowits@users.noreply.github.com>

* Rename selection_look_ahead

* Merge branch 'unstable' into vc-distributed

* Merge remote-tracking branch 'origin/unstable' into vc-distributed

* Update CLI text
2024-02-15 12:23:58 +00:00
Alexander Uvizhev
b4556a3d62 Broadcast various requests as per #4684 (#4920)
* multiple broadcast flags

* rewrite with single --broadcast option

* satisfy cargo fmt

* shorten sync-committee-messages

* fix a doc comment and a test

* use strum

* Add broadcast test to simulator

* bring --disable-run-on-all flag back with deprecation notice
2023-11-27 15:39:37 +11:00
Eitan Seri-Levi
4ce01ddd11 Activate clippy::manual_let_else lint (#4889)
## Issue Addressed

#4888

## Proposed Changes

Enabled `clippy::manual_let_else` lint and resolved the warning messages.
2023-10-31 10:31:02 +00:00
Paul Hauner
1373dcf076 Add validator-manager (#3502)
## Issue Addressed

Addresses #2557

## Proposed Changes

Adds the `lighthouse validator-manager` command, which provides:

- `lighthouse validator-manager create`
    - Creates a `validators.json` file and a `deposits.json` (same format as https://github.com/ethereum/staking-deposit-cli)
- `lighthouse validator-manager import`
    - Imports validators from a `validators.json` file to the VC via the HTTP API.
- `lighthouse validator-manager move`
    - Moves validators from one VC to the other, utilizing only the VC API.

## Additional Info

In 98bcb947c I've reduced some VC `ERRO` and `CRIT` warnings to `WARN` or `DEBG` for the case where a pubkey is missing from the validator store. These were being triggered when we removed a validator but still had it in caches. It seems to me that `UnknownPubkey` will only happen in the case where we've removed a validator, so downgrading the logs is prudent. All the logs are `DEBG` apart from attestations and blocks which are `WARN`. I thought having *some* logging about this condition might help us down the track.

In 856cd7e37d I've made the VC delete the corresponding password file when it's deleting a keystore. This seemed like nice hygiene. Notably, it'll only delete that password file after it scans the validator definitions and finds that no other validator is also using that password file.
2023-08-08 00:03:22 +00:00
Michael Sproul
3052db29fe Implement el_offline and use it in the VC (#4295)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613.

## Proposed Changes

- Implement the `el_offline` field on `/eth/v1/node/syncing`. We set `el_offline=true` if:
  - The EL's internal status is `Offline` or `AuthFailed`, _or_
  - The most recent call to `newPayload` resulted in an error (more on this in a moment).

- Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes.
- Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite effect to intended, so it has been deprecated.
- Add tests for the EL being offline on the upcheck call, and being offline due to the newPayload check.


## Why track `newPayload` errors?

Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because:

- If the EL is timing out to some calls, it's unlikely to timeout on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)), which would have the effect of masking the failure and keeping the status _online_.
- The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms).
- If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them.

## Additional Changes

- Add utility method `ForkName::latest` which is quite convenient for test writing, but probably other things too.
- Delete some stale comments from when we used to support multiple execution nodes.
2023-05-17 05:51:56 +00:00
tim gretler
5dba89e43b Sync committee sign bn fallback (#3624)
## Issue Addressed

Closes #3612

## Proposed Changes

- Iterates through BNs until it finds a non-optimistic head.

A slight change in error behavior: 
- Previously: `spawn_contribution_tasks` did not return an error for a non-optimistic block head. It returned `Ok(())` logged a warning.
- Now: `spawn_contribution_tasks` returns an error if it cannot find a non-optimistic block head. The caller of `spawn_contribution_tasks` then logs the error as a critical error.


Co-authored-by: Michael Sproul <micsproul@gmail.com>
2022-11-13 22:40:43 +00:00
Pawan Dhananjay
6779912fe4 Publish subscriptions to all beacon nodes (#3529)
## Issue Addressed

Resolves #3516 

## Proposed Changes

Adds a beacon fallback function for running a beacon node http query on all available fallbacks instead of returning on a first successful result. Uses the new `run_on_all` method for attestation and sync committee subscriptions. 

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.
2022-09-28 19:53:35 +00:00
realbigsean
2ce86a0830 Validator registration request failures do not cause us to mark BNs offline (#3488)
## Issue Addressed

Relates to https://github.com/sigp/lighthouse/issues/3416

## Proposed Changes

- Add an `OfflineOnFailure` enum to the `first_success` method for querying beacon nodes so that a val registration request failure from the BN -> builder does not result in the BN being marked offline. This seems important because these failures could be coming directly from a connected relay and actually have no bearing on BN health.  Other messages that are sent to a relay have a local fallback so shouldn't result in errors 

- Downgrade the following log to a `WARN`

```
ERRO Unable to publish validator registrations to the builder network, error: All endpoints failed https://BN_B => RequestFailed(ServerMessage(ErrorMessage { code: 500, message: "UNHANDLED_ERROR: BuilderMissing", stacktraces: [] })), https://XXXX/ => Unavailable(Offline), [omitted]
```

## Additional Info

I think this change at least improves the UX of having a VC connected to some builder and some non-builder beacon nodes. I think we need to balance potentially alerting users that there is a BN <> VC misconfiguration and also allowing this type of fallback to work. 

If we want to fully support this type of configuration we may want to consider adding a flag `--builder-beacon-nodes` and track whether a VC should be making builder queries on a per-beacon node basis.  But I think the changes in this PR are independent of that type of extension.

PS: Sorry for the big diff here, it's mostly formatting changes after I added a new arg to a bunch of methods calls.




Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-08-29 11:35:59 +00:00
Mac L
44fae52cd7 Refuse to sign sync committee messages when head is optimistic (#3191)
## Issue Addressed

Resolves #3151 

## Proposed Changes

When fetching duties for sync committee contributions, check the value of `execution_optimistic` of the head block from the BN and refuse to sign any sync committee messages `if execution_optimistic == true`.

## Additional Info
- Is backwards compatible with older BNs
- Finding a way to add test coverage for this would be prudent. Open to suggestions.
2022-07-27 00:51:05 +00:00
Michael Sproul
c2e9354126 Gracefully handle missing sync committee duties (#3086)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/3085
Closes https://github.com/sigp/lighthouse/issues/2953

## Proposed Changes

Downgrade some of the warnings logged by the VC which were useful during development of the sync committee service but are creating trouble now that we avoid populating the `sync_duties` map with 0 active validators.
2022-03-14 06:16:49 +00:00
Paul Hauner
c5c7476518 Web3Signer support for VC (#2522)
[EIP-3030]: https://eips.ethereum.org/EIPS/eip-3030
[Web3Signer]: https://consensys.github.io/web3signer/web3signer-eth2.html

## Issue Addressed

Resolves #2498

## Proposed Changes

Allows the VC to call out to a [Web3Signer] remote signer to obtain signatures.


## Additional Info

### Making Signing Functions `async`

To allow remote signing, I needed to make all the signing functions `async`. This caused a bit of noise where I had to convert iterators into `for` loops.

In `duties_service.rs` there was a particularly tricky case where we couldn't hold a write-lock across an `await`, so I had to first take a read-lock, then grab a write-lock.

### Move Signing from Core Executor

Whilst implementing this feature, I noticed that we signing was happening on the core tokio executor. I suspect this was causing the executor to temporarily lock and occasionally trigger some HTTP timeouts (and potentially SQL pool timeouts, but I can't verify this). Since moving all signing into blocking tokio tasks, I noticed a distinct drop in the "atttestations_http_get" metric on a Prater node:

![http_get_times](https://user-images.githubusercontent.com/6660660/132143737-82fd3836-2e7e-445b-a143-cb347783baad.png)

I think this graph indicates that freeing the core executor allows the VC to operate more smoothly.

### Refactor TaskExecutor

I noticed that the `TaskExecutor::spawn_blocking_handle` function would fail to spawn tasks if it were unable to obtain handles to some metrics (this can happen if the same metric is defined twice). It seemed that a more sensible approach would be to keep spawning tasks, but without metrics. To that end, I refactored the function so that it would still function without metrics. There are no other changes made.

## TODO

- [x] Restructure to support multiple signing methods.
- [x] Add calls to remote signer from VC.
- [x] Documentation
- [x] Test all endpoints
- [x] Test HTTPS certificate
- [x] Allow adding remote signer validators via the API
- [x] Add Altair support via [21.8.1-rc1](https://github.com/ConsenSys/web3signer/releases/tag/21.8.1-rc1)
- [x] Create issue to start using latest version of web3signer. (See #2570)

## Notes

- ~~Web3Signer doesn't yet support the Altair fork for Prater. See https://github.com/ConsenSys/web3signer/issues/423.~~
- ~~There is not yet a release of Web3Signer which supports Altair blocks. See https://github.com/ConsenSys/web3signer/issues/391.~~
2021-09-16 03:26:33 +00:00
Michael Sproul
17a2c778e3 Altair validator client and HTTP API (#2404)
## Proposed Changes

* Implement the validator client and HTTP API changes necessary to support Altair


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-08-06 00:47:31 +00:00