Ensure that we don't log a warning for HTTP 202s, which are expected on the blinded block endpoints after Fulu.
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Anchor currently depends on `lighthouse_network` for a few types and utilities that live within. As we use our own libp2p behaviours, we actually do not use the core logic in that crate. This makes us transitively depend on a bunch of unneeded crates (even a whole separate libp2p if the versions mismatch!)
Move the things we require into their own lightweight crate.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
A performance issue was discovered when devnet-3 was under non-finality: some of the Lighthouse nodes got "stuck" syncing because they were busy handling proposer duties HTTP requests.
These validator requests have higher priority than Status processing, and if they take a long time to process, the node cannot make progress. Worse, under a long period of non-finality, the proposer duties calculation function tries to advance the state across a large number of [slots](d545ddcbc7/beacon_node/beacon_chain/src/beacon_proposer_cache.rs (L183)), causing the node to spend all its CPU time on a task that doesn't really help, e.g. the computed duties aren't useful if the node is 20000 slots behind.
To solve this issue, we use the `not_while_syncing` filter to prevent serving these requests, until the node is synced. This should allow the node to focus on sync under non-finality situations.
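As a rough illustration of the idea, here is a minimal sketch of a guard that short-circuits while syncing. It does not reproduce the real `not_while_syncing` filter; `SyncState` and the function signature are assumptions made for the example.

```rust
/// Hypothetical sync states; the real sync state type has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    Synced,
    Syncing,
}

/// Runs `handler` only when the node is synced. While syncing, the request is
/// rejected before any work (such as a state advance) is started.
pub fn not_while_syncing<T>(
    state: SyncState,
    handler: impl FnOnce() -> T,
) -> Result<T, String> {
    match state {
        SyncState::Synced => Ok(handler()),
        SyncState::Syncing => Err("beacon node is syncing".to_string()),
    }
}
```

The important property is that the handler is never invoked while syncing, so the expensive duties computation is skipped entirely rather than started and abandoned.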
This PR fixes a bug where wrong columns could get processed immediately after a CGC increase.
Scenario:
- The node's CGC increased due to additional validators attached to it (let's say from 10 to 11).
- The new CGC is advertised and new subnets are subscribed immediately, however the change won't be effective in the data availability check until the next epoch (See [this](ab0e8870b4/beacon_node/beacon_chain/src/validator_custody.rs (L93-L99))). The data availability checker still only requires 10 columns for the current epoch.
- During this time, data columns for the additional custody column (let's say column 11) may arrive via gossip as we're already subscribed to the topic. Such a column may be incorrectly used to satisfy the existing data availability requirement (10 columns), so this additional column (instead of a required one) gets persisted, resulting in database inconsistency.
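The scenario above can be sketched as follows. This is an illustrative model only: `CustodySchedule` and its fields are hypothetical names, and column indices are zero-based here, so the "11th column" is index 10.

```rust
/// Hypothetical view of the custody requirement over time.
pub struct CustodySchedule {
    /// Columns required before the pending change takes effect.
    pub current: Vec<u64>,
    /// `(effective_epoch, columns)` for a CGC change that is not yet effective.
    pub pending: Option<(u64, Vec<u64>)>,
}

impl CustodySchedule {
    /// Columns the data availability check requires at `epoch`.
    pub fn required_columns(&self, epoch: u64) -> &[u64] {
        match &self.pending {
            Some((effective_epoch, columns)) if epoch >= *effective_epoch => columns.as_slice(),
            _ => self.current.as_slice(),
        }
    }

    /// A gossip column should only count towards availability if it is
    /// required at the epoch of the block it belongs to.
    pub fn should_process(&self, column_index: u64, epoch: u64) -> bool {
        self.required_columns(epoch).contains(&column_index)
    }
}
```

With a change from 10 to 11 columns effective at epoch 5, index 10 is ignored for epoch 4 and accepted from epoch 5 onwards.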
I noticed that we are serving preset values for Fulu on mainnet nodes prior to the fork. This has already gone live in v7.1.0, but should hopefully be handled in a graceful way by API consumers.
This PR _reverts_ the serving of Fulu data prior to Fulu, by serving Fulu data only if Fulu is scheduled.
#7647
Introduces a new record in the blobs db `DataColumnCustodyInfo`
When `DataColumnCustodyInfo` exists in the db this indicates that a recent cgc change has occurred and/or that a custody backfill sync is currently in progress (custody backfill will be added as a separate PR). When a cgc change has occurred `earliest_available_slot` will be equal to the slot at which the cgc change occurred. During custody backfill sync `earliest_available_slot` should be updated incrementally as it progresses.
~~Note that if `advertise_false_custody_group_count` is enabled we do not add a `DataColumnCustodyInfo` record in the db as that would affect the status v2 response.~~
(See comment https://github.com/sigp/lighthouse/pull/7648#discussion_r2212403389)
~~If `DataColumnCustodyInfo` doesn't exist in the db this indicates that we have fulfilled our custody requirements up to the DA window.~~
(It now always exists, and the slot will be set to `None` once backfill is complete)
StatusV2 now uses `DataColumnCustodyInfo` to calculate the `earliest_available_slot` if a `DataColumnCustodyInfo` record exists in the db. If it's `None`, we return the `oldest_block_slot`.
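A minimal sketch of that fallback, assuming a simplified stand-in for the record and a hypothetical helper name (`status_earliest_available_slot`); slots are plain `u64` values here:

```rust
/// Simplified stand-in for the `DataColumnCustodyInfo` db record.
pub struct DataColumnCustodyInfo {
    /// `None` once custody backfill is complete.
    pub earliest_available_slot: Option<u64>,
}

/// Picks the slot to advertise in StatusV2: the custody info slot when one is
/// set, otherwise the oldest block slot.
pub fn status_earliest_available_slot(
    info: Option<&DataColumnCustodyInfo>,
    oldest_block_slot: u64,
) -> u64 {
    info.and_then(|i| i.earliest_available_slot)
        .unwrap_or(oldest_block_slot)
}
```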
N/A
The Lighthouse BN HTTP endpoints `validator/duties/attester` and `validator/prepare_beacon_proposer` would return a server error pre-genesis because `slot_clock.now()` returns `None` pre-genesis.
The Prysm VC depends on these endpoints pre-genesis and was having issues interoperating with the Lighthouse BN for this reason.
The proposer duties endpoint explicitly handles the pre-genesis case here
538067f1ff/beacon_node/http_api/src/proposer_duties.rs (L23-L28)
I see no reason why we can't make the other endpoints more flexible to work pre-genesis. This PR handles the pre-genesis case on the attester and prepare_beacon_proposer endpoints as well.
Thanks for raising @james-prysm.
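For illustration, the pre-genesis fallback can be sketched like this. The `SlotClock` struct below is a toy model, not the real slot clock trait, and `now_or_genesis` is used here as an assumed helper name.

```rust
/// Toy slot clock: `now` returns `None` before genesis.
pub struct SlotClock {
    pub genesis_time: u64,
    pub seconds_per_slot: u64,
    pub genesis_slot: u64,
}

impl SlotClock {
    /// Current slot at `unix_time`, or `None` if genesis has not happened yet.
    pub fn now(&self, unix_time: u64) -> Option<u64> {
        let elapsed = unix_time.checked_sub(self.genesis_time)?;
        Some(self.genesis_slot + elapsed / self.seconds_per_slot)
    }

    /// Falls back to the genesis slot instead of failing pre-genesis.
    pub fn now_or_genesis(&self, unix_time: u64) -> u64 {
        self.now(unix_time).unwrap_or(self.genesis_slot)
    }
}
```

An endpoint that uses the fallback can answer with genesis-slot data instead of returning a server error.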
N/A
This PR switches to using `prepare_beacon_proposer` instead of `beacon_committee_subscriptions` endpoint to register validators with the custody context.
We currently use the `beacon_committee_subscriptions` endpoint for registering validators in the custody context.
Using the subscriptions endpoint has a few disadvantages:
1. The lighthouse VC tries to optimise the number of calls it makes to this endpoint to reduce the load on the subscriptions endpoint. So we would be getting a different subset of the total number of validators in each call. This will lead to a ramp up of the validator custody units instead of a one time bump. For example, see these logs:
```
Jun 30 22:36:05.012 DEBUG Validator count at head updated old_count: 0, new_count: 19
Jun 30 22:36:11.016 DEBUG Validator count at head updated old_count: 19, new_count: 24
Jun 30 22:36:17.017 DEBUG Validator count at head updated old_count: 24, new_count: 27
Jun 30 22:36:23.020 DEBUG Validator count at head updated old_count: 27, new_count: 32
Jun 30 22:36:29.016 DEBUG Validator count at head updated old_count: 32, new_count: 36
Jun 30 22:36:35.005 DEBUG Validator count at head updated old_count: 36, new_count: 42
Jun 30 22:36:41.014 DEBUG Validator count at head updated old_count: 42, new_count: 44
Jun 30 22:36:47.017 DEBUG Validator count at head updated old_count: 44, new_count: 46
Jun 30 22:36:53.007 DEBUG Validator count at head updated old_count: 46, new_count: 48
Jun 30 22:36:59.009 DEBUG Validator count at head updated old_count: 48, new_count: 49
Jun 30 22:37:05.014 DEBUG Validator count at head updated old_count: 49, new_count: 50
Jun 30 22:37:11.007 DEBUG Validator count at head updated old_count: 50, new_count: 53
Jun 30 22:37:17.007 DEBUG Validator count at head updated old_count: 53, new_count: 55
Jun 30 22:37:35.008 DEBUG Validator count at head updated old_count: 55, new_count: 58
Jun 30 22:37:41.007 DEBUG Validator count at head updated old_count: 58, new_count: 59
Jun 30 22:37:53.010 DEBUG Validator count at head updated old_count: 59, new_count: 60
Jun 30 22:38:05.013 DEBUG Validator count at head updated old_count: 60, new_count: 61
Jun 30 22:38:23.006 DEBUG Validator count at head updated old_count: 61, new_count: 62
Jun 30 22:38:29.009 DEBUG Validator count at head updated old_count: 62, new_count: 63
Jun 30 22:38:41.009 DEBUG Validator count at head updated old_count: 63, new_count: 64
```
2. Different VCs would probably have different behaviours in terms of sending subscriptions.
In contrast, the `prepare_beacon_proposer` endpoint usage would be more standard across different VCs without any filtering of validators. Not doing so could mean potentially missing proposals so VCs are incentivised to make this call on any change in the validators managed by them.
Lighthouse calls this endpoint every slot.
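To make the difference concrete, here is a small sketch of tracking registered validators as a set. `RegisteredValidators` is a hypothetical type for the example; it only shows that a call carrying the full validator set produces a single count change, and that repeating it is a no-op.

```rust
use std::collections::HashSet;

/// Hypothetical tracker of validator indices seen via `prepare_beacon_proposer`.
#[derive(Default)]
pub struct RegisteredValidators {
    indices: HashSet<u64>,
}

impl RegisteredValidators {
    /// Records the indices and returns `Some((old_count, new_count))` only if
    /// the number of known validators changed.
    pub fn register(&mut self, validator_indices: &[u64]) -> Option<(usize, usize)> {
        let old = self.indices.len();
        self.indices.extend(validator_indices.iter().copied());
        let new = self.indices.len();
        (new != old).then_some((old, new))
    }
}
```

With 64 validators sent in one call, the count goes from 0 to 64 in a single step, and the per-slot repeat calls report no change.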
N/A
After the electra fork which includes EIP 6110, the beacon node no longer needs the eth1 bridging mechanism to include new deposits as they are provided by the EL as a `deposit_request`. So after electra + a transition period where the finalized bridge deposits pre-fork are included through the old mechanism, we no longer need the elaborate machinery we had to get deposit contract data from the execution layer.
Since holesky has already forked to electra and completed the transition period, this PR basically checks to see if removing all the eth1 related logic leads to any surprises.
Partially https://github.com/sigp/lighthouse/issues/6291
This PR removes the reprocess event channel from being externally exposed. All work events are now sent through the single `BeaconProcessorSend` channel. I've introduced a new `Work::Reprocess` enum variant which we then use to schedule jobs for reprocess. I've also created a new scheduler module which will eventually house the different scheduler impls.
This is all needed as an initial step to generalize the beacon processor
A "full" implementation for the generalized beacon processor can be found here
https://github.com/sigp/lighthouse/pull/6448
I'm going to try to break up the full implementation into smaller PRs so it can actually be reviewed.
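A rough sketch of the single-channel shape, using `std::sync::mpsc` for simplicity. The variant payloads, the `delay` field and the helper methods are assumptions for illustration, not the actual definitions.

```rust
use std::sync::mpsc;
use std::time::Duration;

/// Simplified work type; the real enum has many more variants.
#[derive(Debug)]
pub enum Work {
    GossipBlock(u64),
    /// Work that should be scheduled again later (payload shape is assumed).
    Reprocess { delay: Duration, inner: Box<Work> },
}

/// Single entry point for all work events.
pub struct BeaconProcessorSend(pub mpsc::Sender<Work>);

impl BeaconProcessorSend {
    pub fn send_now(&self, work: Work) -> Result<(), mpsc::SendError<Work>> {
        self.0.send(work)
    }

    /// Reprocess requests travel over the same channel, wrapped in a variant.
    pub fn send_for_reprocess(
        &self,
        work: Work,
        delay: Duration,
    ) -> Result<(), mpsc::SendError<Work>> {
        self.0.send(Work::Reprocess { delay, inner: Box::new(work) })
    }
}

/// Describes how a receiver would dispatch a piece of work.
pub fn describe(work: &Work) -> String {
    match work {
        Work::GossipBlock(slot) => format!("process block at slot {slot}"),
        Work::Reprocess { delay, inner } => {
            format!("after {}ms: {}", delay.as_millis(), describe(inner))
        }
    }
}
```

Because reprocess requests are just another `Work` variant, callers no longer need a handle to a second channel.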
#6970
This allows for us to receive `SingleAttestation` over gossip and process it without converting. There is still a conversion to `Attestation` as a final step in the attestation verification process, but by then the `SingleAttestation` is fully verified.
I've also fully removed the `submitPoolAttestationsV1` endpoint as it's been deprecated.
I've also pre-emptively deprecated supporting `Attestation` in the `submitPoolAttestationsV2` endpoint. See here for more info: https://github.com/ethereum/beacon-APIs/pull/531
I tried to minimize the diff here by only making the "required" changes. There are some unnecessary complexities with the way we manage the different attestation verification wrapper types. We could probably consolidate this to one wrapper type and refactor this even further. We could leave that to a separate PR if we feel like cleaning things up in the future.
Note that I've also updated the test harness to always submit `SingleAttestation` regardless of fork variant. I don't see a problem in that approach and it allows us to delete more code :)
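The final conversion step can be sketched roughly as below. The structs are heavily simplified (no attestation data or signature), and `to_attestation` is a hypothetical helper: it locates the attester in its committee and sets the matching aggregation bit.

```rust
/// Simplified `SingleAttestation`: one validator, one committee.
pub struct SingleAttestation {
    pub committee_index: u64,
    pub attester_index: u64,
    pub slot: u64,
}

/// Simplified aggregate-style attestation with one bit per committee member.
#[derive(Debug, PartialEq, Eq)]
pub struct Attestation {
    pub slot: u64,
    pub committee_index: u64,
    pub aggregation_bits: Vec<bool>,
}

/// Converts after verification. Returns `None` if the attester is not a
/// member of the supplied committee.
pub fn to_attestation(single: &SingleAttestation, committee: &[u64]) -> Option<Attestation> {
    let position = committee.iter().position(|&v| v == single.attester_index)?;
    let mut aggregation_bits = vec![false; committee.len()];
    aggregation_bits[position] = true;
    Some(Attestation {
        slot: single.slot,
        committee_index: single.committee_index,
        aggregation_bits,
    })
}
```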
Resolves #6767
This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info regarding number of validators attached to a node and the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of validators attached, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Anytime there's a change in the `custody_group_count` due to addition/removal of validators, the custody context should send an event on a broadcast channel. The only subscriber for the channel exists in the network service which simply subscribes to more subnets. There can be additional subscribers in sync that will start a backfill once the cgc changes.
TODO
- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS` passes after updating the current cgc. This event should be picked up by a subscriber which updates the enr and metadata.
- [x] Add more tests
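A minimal sketch of the notification flow described above, using `std::sync::mpsc` senders in place of a real broadcast channel. The `CgcChanged` event and the method names are assumptions for the example. It mirrors the current limitation that only increases are handled.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical event emitted when the custody group count changes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CgcChanged {
    pub old: u64,
    pub new: u64,
}

/// Toy custody context that fans events out to all subscribers.
pub struct CustodyContext {
    cgc: u64,
    subscribers: Vec<Sender<CgcChanged>>,
}

impl CustodyContext {
    pub fn new(initial_cgc: u64) -> Self {
        Self { cgc: initial_cgc, subscribers: Vec::new() }
    }

    /// Registers a subscriber, e.g. the network service.
    pub fn subscribe(&mut self) -> Receiver<CgcChanged> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    pub fn custody_group_count(&self) -> u64 {
        self.cgc
    }

    /// Applies a new cgc. Decreases are ignored, matching the TODO above.
    pub fn update_cgc(&mut self, new_cgc: u64) {
        if new_cgc <= self.cgc {
            return;
        }
        let event = CgcChanged { old: self.cgc, new: new_cgc };
        self.cgc = new_cgc;
        // Drop subscribers whose receiving end has gone away.
        self.subscribers.retain(|tx| tx.send(event).is_ok());
    }
}
```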
The endpoint `/eth/v1/beacon/states/head/validator_balances` returns empty data when the request body is `[]`. According to the beacon API spec, it should return the balances of all validators:
Reference: https://ethereum.github.io/beacon-APIs/#/Beacon/postStateValidatorBalances
`If the supplied list is empty (i.e. the body is []) or no body is supplied then balances will be returned for all validators.`
This PR changes the behaviour so that `curl -X 'POST' 'http://localhost:5052/eth/v1/beacon/states/head/validator_balances' -d '[]' | jq` returns the balances of all validators.
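The intended selection rule can be sketched as follows. `select_balances` is a hypothetical helper that only models requests by validator index; skipping unknown indices is an assumption of the sketch.

```rust
/// Returns `(validator_index, balance)` pairs. An absent or empty request
/// selects every validator, as the spec text quoted above requires.
pub fn select_balances(balances: &[u64], requested: Option<&[usize]>) -> Vec<(usize, u64)> {
    match requested {
        None | Some([]) => balances.iter().copied().enumerate().collect(),
        Some(indices) => indices
            .iter()
            .filter_map(|&i| balances.get(i).map(|&b| (i, b)))
            .collect(),
    }
}
```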
Fix clippy lints for `rustc` 1.87
Clippy complains about `BeaconChainError` being too large. I went on a bit of a boxing spree because of this. We may instead want to `Box` some of the `BeaconChainError` variants?
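To illustrate the trade-off, here is a small sketch comparing the two options on a made-up error type (the real `BeaconChainError` is not reproduced). It shows how boxing either the whole error or only the large variant shrinks the `Result`.

```rust
use std::mem::size_of;

/// Stand-in for a large error payload.
#[derive(Debug)]
pub struct BigPayload {
    pub bytes: [u8; 512],
}

/// Error with a large inline variant: every `Result` carrying it is large.
#[derive(Debug)]
pub enum LargeError {
    Payload(BigPayload),
    Other(u8),
}

/// Alternative: box only the large variant.
#[derive(Debug)]
pub enum SlimError {
    Payload(Box<BigPayload>),
    Other(u8),
}

/// Sizes of: unboxed result, result with the whole error boxed, and result
/// with only the large variant boxed.
pub fn result_sizes() -> (usize, usize, usize) {
    (
        size_of::<Result<(), LargeError>>(),
        size_of::<Result<(), Box<LargeError>>>(),
        size_of::<Result<(), SlimError>>(),
    )
}
```

Boxing the variants keeps call sites unchanged, while boxing the whole error touches every function signature that returns it.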
#7294
Fix the filtering logic so that we actually filter by committee index for both `Base` and `Electra` attestations.
Added a tiny optimization when calculating committee_index to prevent unneeded memory allocations
Added a regression test
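A simplified sketch of the filtering, assuming made-up types: for `Base` the committee index is carried in the attestation data, while for `Electra` it is derived from the committee bits. The sketch assumes a single committee bit is set and reads the first set bit directly instead of collecting all set bits into a `Vec`.

```rust
/// Simplified attestation variants.
#[derive(Debug, PartialEq, Eq)]
pub enum Attestation {
    /// Pre-Electra: the committee index lives in the attestation data.
    Base { data_index: u64 },
    /// Electra: the committee is identified by the committee bits.
    Electra { committee_bits: Vec<bool> },
}

impl Attestation {
    /// Committee index without allocating an intermediate collection.
    pub fn committee_index(&self) -> Option<u64> {
        match self {
            Attestation::Base { data_index } => Some(*data_index),
            Attestation::Electra { committee_bits } => {
                committee_bits.iter().position(|&bit| bit).map(|i| i as u64)
            }
        }
    }
}

/// Keeps only attestations for the requested committee, for both variants.
pub fn filter_by_committee(atts: &[Attestation], index: u64) -> Vec<&Attestation> {
    atts.iter()
        .filter(|att| att.committee_index() == Some(index))
        .collect()
}
```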
N/A
Adds endpoints to add and remove trusted peers from the http api. The added peers are trusted peers so they won't be disconnected for bad scores. We try to maintain a connection to each of these peers: if one disconnects from us, we try to dial it every heartbeat.
- #6452 (partially)
Remove dependencies on `store` and `lighthouse_network` from `eth2`. This was achieved as follows:
- depend on `enr` and `multiaddr` directly instead of using `lighthouse_network`'s reexports.
- make `lighthouse_network` responsible for converting between API and internal types.
- in two cases, remove complex internal types and use the generic `serde_json::Value` instead - this is not ideal, but should be fine for now, as this affects two internal non-spec endpoints which are meant for debugging, unstable, and subject to change without notice anyway. Inspired by #6679. The alternative is to move all relevant types to `eth2` or `types` instead - what do you think?
Delete duplicate sync tolerance epoch config in the HTTP API which is unused.
We introduced the `sync-tolerance-epochs` flag in this PR:
- https://github.com/sigp/lighthouse/pull/7030
Then refined it in this PR:
- https://github.com/sigp/lighthouse/pull/7044
Somewhere in the merge of `release-v7.0.0` into `unstable`, the config from the original PR which had been deleted came back. I think I resolved these conflicts, so my bad.
Related to #6880, an issue that's usually observed on local devnets with a small number of nodes.
When testing range sync, I usually shut down a node for some period of time and restart it again. However, if it's within `SYNC_TOLERANCE_EPOCHS` (8), Lighthouse would consider the node as synced, and it may attempt to produce a block if requested by a validator. On a local devnet, nodes frequently produce blocks, and when this happens the node ends up producing a block that would revert finality and would get disconnected from peers immediately.
NOTE: This is PR#7030 cherry-picked from `unstable` to `release-v7.0.0`.
Run Lighthouse BN with this flag to override:
```
--sync-tolerance-epochs 0
```
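The effect of the override can be sketched as below. The helper name is hypothetical, the default of 8 comes from the description above, and 32 slots per epoch is an assumption (the mainnet preset).

```rust
/// Default tolerance, as described above.
pub const DEFAULT_SYNC_TOLERANCE_EPOCHS: u64 = 8;
/// Assumption for the sketch: mainnet preset.
pub const SLOTS_PER_EPOCH: u64 = 32;

/// Returns true if a node whose head is at `head_slot` is treated as synced
/// at wall-clock `current_slot`. `tolerance_override` models the CLI flag.
pub fn is_considered_synced(
    head_slot: u64,
    current_slot: u64,
    tolerance_override: Option<u64>,
) -> bool {
    let tolerance_epochs = tolerance_override.unwrap_or(DEFAULT_SYNC_TOLERANCE_EPOCHS);
    let tolerance_slots = tolerance_epochs.saturating_mul(SLOTS_PER_EPOCH);
    current_slot.saturating_sub(head_slot) <= tolerance_slots
}
```

A node that is 100 slots behind counts as synced with the default tolerance, but not when the tolerance is overridden to 0.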
Related to #6880, an issue that's usually observed on local devnets with a small number of nodes.
When testing range sync, I usually shut down a node for some period of time and restart it again. However, if it's within `SYNC_TOLERANCE_EPOCHS` (8), Lighthouse would consider the node as synced, and it may attempt to produce a block if requested by a validator. On a local devnet, nodes frequently produce blocks, and when this happens the node ends up producing a block that would revert finality and would get disconnected from peers immediately.
### Usage
Run Lighthouse BN with this flag to override:
```
--sync-tolerance-epochs 0
```
Partly addresses
- https://github.com/sigp/lighthouse/issues/6959
Use the `enable_light_client_server` field from the beacon chain config in the HTTP API. I think we can make this the single source of truth, as I think the network crate also has access to the beacon chain config.
N/A
2 changes:
1. Replace `Option::map_or(true, ...)` with `is_none_or(...)`.
2. Remove unnecessary `Into::into` blocks where the type conversion is apparent from the types.
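For reference, a small sketch of the first change on a made-up predicate (the function names are illustrative only). Both forms return `true` for `None` and otherwise apply the closure. Note that `Option::is_none_or` requires Rust 1.82 or later.

```rust
/// Old style: `map_or(true, ...)`.
pub fn within_limit_old(limit: Option<u64>, value: u64) -> bool {
    limit.map_or(true, |max| value <= max)
}

/// New style: `is_none_or(...)`, which states the intent directly.
pub fn within_limit_new(limit: Option<u64>, value: u64) -> bool {
    limit.is_none_or(|max| value <= max)
}
```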