Since https://github.com/sigp/lighthouse/pull/6847, invalid `BlocksByRange`/`BlobsByRange` requests, which do not comply with the spec, are [handled in the Handler](3d16d1080f/beacon_node/lighthouse_network/src/rpc/handler.rs (L880-L911)). Any peer that sends an invalid request is penalized and disconnected.
However, other kinds of invalid RPC requests, which result in decoding errors, are simply dropped: no penalty is applied and the connection with the peer remains open.
I have added handling for the `ListenUpgradeError` event to notify the application of an `RPCError::InvalidData` error and disconnect the peer that sent the invalid RPC request.
I also added tests for handling invalid rpc requests.
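Roughly, the new handling does the following (a sketch with hypothetical event/error types, not the real libp2p handler API):
```rust
// Hypothetical, simplified types standing in for the real handler/libp2p ones.
#[derive(Debug)]
enum HandlerEvent {
    // Emitted when an inbound substream fails to negotiate or decode.
    ListenUpgradeError { peer_id: String },
}

#[derive(Debug)]
enum RPCError {
    InvalidData(String),
}

fn on_handler_event(event: HandlerEvent) {
    match event {
        HandlerEvent::ListenUpgradeError { peer_id } => {
            // Surface the failure to the application as an InvalidData error...
            let err = RPCError::InvalidData("malformed inbound RPC request".into());
            println!("reporting {err:?} for peer {peer_id}");
            // ...and disconnect the offending peer (placeholder for the real call).
            println!("disconnecting peer {peer_id}");
        }
    }
}

fn main() {
    on_handler_event(HandlerEvent::ListenUpgradeError { peer_id: "peer-1".into() });
}
```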
Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>
Just visual clean-up, making logging statements look uniform. There's no reason to use `tracing::debug` instead of `debug`. If we ever need to migrate our logging lib in the future it would make things easier too.
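For illustration, the uniform style just means importing the macro once and calling it unqualified (sketch only):
```rust
use tracing::debug;

fn report_progress(slot: u64) {
    // Preferred: plain `debug!` after a single `use tracing::debug;` import.
    debug!("processing slot {}", slot);
    // Avoided: qualifying the macro at every call site, i.e. `tracing::debug!(...)`.
}

fn main() {
    report_progress(42);
}
```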
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
Which issue # does this PR address?
#8586
Please list or describe the changes introduced by this PR.
Remove `service_name` from `TaskExecutor`
Co-Authored-By: Abhivansh <31abhivanshj@gmail.com>
There are certain crates which we re-export within `types`, which creates a fragmented DevEx where there are various ways to import the same items.
```rust
// consensus/types/src/lib.rs
pub use bls::{
    AggregatePublicKey, AggregateSignature, Error as BlsError, Keypair, PUBLIC_KEY_BYTES_LEN,
    PublicKey, PublicKeyBytes, SIGNATURE_BYTES_LEN, SecretKey, Signature, SignatureBytes,
    get_withdrawal_credentials,
};
pub use context_deserialize::{ContextDeserialize, context_deserialize};
pub use fixed_bytes::FixedBytesExtended;
pub use milhouse::{self, List, Vector};
pub use ssz_types::{BitList, BitVector, FixedVector, VariableList, typenum, typenum::Unsigned};
pub use superstruct::superstruct;
```
This PR removes these re-exports and makes it explicit that these types are imported from a non-`consensus/types` crate.
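For example (illustrative snippet assuming the Lighthouse workspace crates), call sites now import each item from the crate that defines it rather than via `types`:
```rust
// Illustrative only; assumes the Lighthouse workspace crates are available.

// Before: items pulled in through the `types` re-exports.
// use types::{PublicKey, VariableList};

// After: each item is imported from the crate that actually defines it.
use bls::PublicKey;
use ssz_types::{typenum::U16, VariableList};

// A signature written against the explicitly-imported types.
fn first_key(keys: &VariableList<PublicKey, U16>) -> Option<&PublicKey> {
    keys.first()
}
```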
Co-Authored-By: Mac L <mjladson@pm.me>
Organize and categorize `consensus/types` into modules based on their relation to key consensus structures/concepts.
This is a precursor to a sensible public interface.
While this refactor is very opinionated, I am open to suggestions on module names, or type groupings if my current ones are inappropriate.
Co-Authored-By: Mac L <mjladson@pm.me>
Fixes #7785
- [x] Update all integration tests with more than one file to follow the `main` pattern.
- [x] `crypto/eth2_key_derivation/tests`
- [x] `crypto/eth2_keystore/tests`
- [x] `crypto/eth2_wallet/tests`
- [x] `slasher/tests`
- [x] `common/eth2_interop_keypairs/tests`
- [x] `beacon_node/lighthouse_network/tests`
- [x] Set `debug_assertions` to false in `.vscode/settings.json`.
- [x] Document how to make rust-analyzer work on integration test files, in `book/src/contributing_setup.md`.
---
Tracking a `rust-analyzer.toml` with settings like the ones provided in `.vscode/settings.json` would be nicer, but this is not possible yet. For now, that config should be a good enough indicator for devs using editors other than VSCode.
Co-Authored-By: Daniel Ramirez-Chiquillo <hi@danielrachi.com>
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Take 2 of #8390.
Fixes the race condition properly instead of propagating the error. I think this is a better alternative, and doesn't seem to look that bad.
* Lift node id loading or generation from `NetworkService` startup to the `ClientBuilder`, so that it can be used to compute custody columns for the beacon chain without waiting for network bootstrap.
I've considered and implemented a few alternatives:
1. passing `node_id` to the beacon chain builder and computing columns when creating `CustodyContext`. This approach isn't good for separation of concerns and isn't great for testability.
2. passing `ordered_custody_groups` to the beacon chain. `CustodyContext` only uses this to compute ordered custody columns, so we might as well lift this logic out so we don't have to do error handling in `CustodyContext` construction. Fewer tests to update.
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
https://github.com/sigp/lighthouse/issues/8012
Replace all instances of `VariableList::from` and `FixedVector::from` with their `try_from` variants.
While I tried to use proper error handling in most cases, in situations where `try_from` trivially can never fail I added an `expect` to avoid a lot of extra complexity.
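A minimal sketch of the migration pattern (the exact `TryFrom` impl and error type in `ssz_types` are assumed here):
```rust
use ssz_types::{typenum::U4, VariableList};

fn build_list(values: Vec<u64>) -> Result<VariableList<u64, U4>, String> {
    // Before: `VariableList::from(values)` gave no way to handle oversized input.
    // After: `try_from` surfaces the length violation to the caller.
    VariableList::<u64, U4>::try_from(values).map_err(|e| format!("list too long: {e:?}"))
}

fn main() {
    // Where the length trivially fits, an `expect` avoids extra error plumbing.
    let small = VariableList::<u64, U4>::try_from(vec![1, 2, 3]).expect("3 <= 4, cannot fail");
    assert_eq!(small.len(), 3);

    // Oversized input is now an error rather than being silently accepted.
    assert!(build_list(vec![0u64; 5]).is_err());
}
```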
Co-Authored-By: Mac L <mjladson@pm.me>
Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Anchor currently depends on `lighthouse_network` for a few types and utilities that live within it. As we use our own libp2p behaviours, we do not actually use the core logic in that crate. This makes us transitively depend on a bunch of unneeded crates (even a whole separate libp2p if the versions mismatch!).
Move the things we require into their own lightweight crate.
Co-Authored-By: Daniel Knopik <daniel@dknopik.de>
This method is a footgun because it truncates the list. It is the source of a recent bug:
- https://github.com/sigp/lighthouse/pull/7927
- Delete uses of `RuntimeVariableList::from_vec` and replace them with `::new` which does validation and can fail.
- Propagate errors where possible, unwrap in tests and use `expect` for obviously-safe uses (in `chain_spec.rs`).
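To make the footgun concrete, here is a simplified stand-in for the two constructors (illustration only, not the real `RuntimeVariableList`):
```rust
// Simplified stand-in for the real type, for illustration only.
struct BoundedList {
    items: Vec<u64>,
    max_len: usize,
}

impl BoundedList {
    // The footgun: silently truncates input that exceeds `max_len`.
    fn from_vec(mut items: Vec<u64>, max_len: usize) -> Self {
        items.truncate(max_len);
        Self { items, max_len }
    }

    // The replacement: validates the length and lets the caller handle the error.
    fn new(items: Vec<u64>, max_len: usize) -> Result<Self, String> {
        if items.len() > max_len {
            return Err(format!("{} items exceeds max {}", items.len(), max_len));
        }
        Ok(Self { items, max_len })
    }
}

fn main() {
    // Truncation hides the bug: 5 items quietly become 3.
    let truncated = BoundedList::from_vec(vec![1, 2, 3, 4, 5], 3);
    assert_eq!((truncated.items.len(), truncated.max_len), (3, 3));

    // Validation surfaces it instead.
    assert!(BoundedList::new(vec![1, 2, 3, 4, 5], 3).is_err());
}
```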
Adds the required boilerplate code for the Gloas (Glamsterdam) hard fork. This allows PRs testing Gloas-candidate features to test fork transition.
This also includes de-duplication of post-Bellatrix readiness notifiers from #6797 (credit to @dapplion)
Fixes #7926
This was a bug I introduced in #7890. @pawanjay176 noticed it on some running nodes and added an RPC test to confirm it.
The culprit is this line, where I failed to fill the vec to its max size, so the max size isn't calculated properly, resulting in all `DataColumnByRoot` requests exceeding the max size during validation:
d24a6d2a45/consensus/types/src/chain_spec.rs (L1984)
The PR fixes this and includes new regression tests for this fix.
Fix Clippy for the recently released Rust 1.90 beta. There may be more changes required when Rust 1.89 stable is released in a few days, but possibly not 🤞
Closes #7467.
This PR primarily addresses [the P2P changes](https://github.com/ethereum/EIPs/pull/9840) in [fusaka-devnet-2](https://fusaka-devnet-2.ethpandaops.io/). Specifically:
* [the new `nfd` parameter added to the `ENR`](https://github.com/ethereum/EIPs/pull/9840)
* [the modified `compute_fork_digest()` changes for every BPO fork](https://github.com/ethereum/EIPs/pull/9840)
90% of this PR was absolutely hacked together as fast as possible during the Berlinterop while running between Glamsterdam debates. Luckily, it seems to work. But I was unable to be as careful in avoiding bugs as I usually am. I've cleaned up the things *I remember* wanting to come back and have a closer look at, but I'm still working on this.
Progress:
* [x] get it working on `fusaka-devnet-2`
* [ ] [*optional* disconnect from peers with incorrect `nfd` at the fork boundary](https://github.com/ethereum/consensus-specs/pull/4407) - Can be addressed in a future PR if necessary
* [x] first pass clean-up
* [x] fix up all the broken tests
* [x] final self-review
* [x] more thorough review from people more familiar with affected code
Resolves #6767
This PR implements a basic version of validator custody.
- It introduces a new `CustodyContext` object which contains info regarding the number of validators attached to a node and the custody count they contribute to the cgc.
- The `CustodyContext` is added in the da_checker and has methods for returning the current cgc and the number of columns to sample at head. Note that the logic for returning the cgc existed previously in the network globals.
- To estimate the number of attached validators, we use the `beacon_committee_subscriptions` endpoint. This might overestimate the number of validators actually publishing attestations from the node in the case of multi-BN setups. We could also potentially use the `publish_attestations` endpoint to get a more conservative estimate at a later point.
- Any time there's a change in the `custody_group_count` due to the addition/removal of validators, the custody context should send an event on a broadcast channel. The only subscriber for the channel exists in the network service, which simply subscribes to more subnets. There can be additional subscribers in sync that will start a backfill once the cgc changes.
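A hedged sketch of that notification flow using `tokio::sync::broadcast` (names are illustrative, not the real `CustodyContext` API):
```rust
use tokio::sync::broadcast;

/// Illustrative stand-in for the custody context's change notification.
#[derive(Clone, Copy, Debug)]
struct CustodyCountChanged {
    new_custody_group_count: u64,
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = broadcast::channel::<CustodyCountChanged>(16);

    // Subscriber side (e.g. the network service): react to a cgc increase by
    // subscribing to more subnets (placeholder print here).
    let subscriber = tokio::spawn(async move {
        while let Ok(event) = rx.recv().await {
            println!(
                "cgc is now {}, subscribing to more subnets",
                event.new_custody_group_count
            );
        }
    });

    // Producer side (the custody context): a validator registration bumps the cgc.
    tx.send(CustodyCountChanged { new_custody_group_count: 8 }).unwrap();

    drop(tx); // Closing the channel ends the subscriber loop.
    subscriber.await.unwrap();
}
```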
TODO
- [ ] **NOT REQUIRED:** Currently, the logic only handles an increase in validator count and does not handle a decrease. We should ideally unsubscribe from subnets when the cgc has decreased.
- [ ] **NOT REQUIRED:** Add a service in the `CustodyContext` that emits an event once `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS` passes after updating the current cgc. This event should be picked up by a subscriber which updates the ENR and metadata.
- [x] Add more tests
Currently `test_delayed_rpc_response` is flaky (possibly specific to Windows?), but I'm not sure why.
Enabled stdout logging in rpc_tests. Note that in nextest, std output is only displayed when a test fails.
closes https://github.com/sigp/lighthouse/issues/5785
The diagram below shows the differences in how the receiver (responder) behaves before and after this PR. The following sentences will detail the changes.
```mermaid
flowchart TD
subgraph "*** After ***"
Start2([START]) --> AA[Receive request]
AA --> COND1{Is there already an active request <br> with the same protocol?}
COND1 --> |Yes| CC[Send error response]
CC --> End2([END])
%% COND1 --> |No| COND2{Request is too large?}
%% COND2 --> |Yes| CC
COND1 --> |No| DD[Process request]
DD --> EE{Rate limit reached?}
EE --> |Yes| FF[Wait until tokens are regenerated]
FF --> EE
EE --> |No| GG[Send response]
GG --> End2
end
subgraph "*** Before ***"
Start([START]) --> A[Receive request]
A --> B{Rate limit reached <br> or <br> request is too large?}
B -->|Yes| C[Send error response]
C --> End([END])
B -->|No| E[Process request]
E --> F[Send response]
F --> End
end
```
### `Is there already an active request with the same protocol?`
This check is not performed in `Before`. It is taken from the consensus-specs PR that proposes updates regarding rate limiting and response timeouts.
https://github.com/ethereum/consensus-specs/pull/3767/files
> The requester MUST NOT make more than two concurrent requests with the same ID.
The PR mentions the requester side. In this PR, I introduced the `ActiveRequestsLimiter` on the `responder` side to prevent more than two requests on the same protocol from running simultaneously per peer. If the limiter disallows a request, the responder sends a rate-limited error and penalizes the requester.
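An illustrative sketch of what such a limiter tracks (hypothetical names and a simple counter, not the real `ActiveRequestsLimiter`):
```rust
use std::collections::HashMap;

/// Illustrative: counts active inbound requests per (peer, protocol) so that
/// requests beyond the allowed concurrency are rejected with a rate-limited error.
struct ActiveRequestsLimiter {
    max_concurrent: usize,
    active: HashMap<(String, String), usize>,
}

impl ActiveRequestsLimiter {
    fn new(max_concurrent: usize) -> Self {
        Self { max_concurrent, active: HashMap::new() }
    }

    /// Returns false (reject + penalize) if the peer already has the maximum
    /// number of active requests on this protocol.
    fn allows(&mut self, peer_id: &str, protocol: &str) -> bool {
        let count = self.active.entry((peer_id.into(), protocol.into())).or_insert(0);
        if *count >= self.max_concurrent {
            false
        } else {
            *count += 1;
            true
        }
    }

    /// Called when the response stream for a request completes.
    fn request_completed(&mut self, peer_id: &str, protocol: &str) {
        if let Some(count) = self.active.get_mut(&(peer_id.into(), protocol.into())) {
            *count = count.saturating_sub(1);
        }
    }
}

fn main() {
    let mut limiter = ActiveRequestsLimiter::new(2);
    assert!(limiter.allows("peer-1", "blocks_by_range"));
    assert!(limiter.allows("peer-1", "blocks_by_range"));
    assert!(!limiter.allows("peer-1", "blocks_by_range")); // third concurrent request rejected
    limiter.request_completed("peer-1", "blocks_by_range");
    assert!(limiter.allows("peer-1", "blocks_by_range"));
}
```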
### `Rate limit reached?` and `Wait until tokens are regenerated`
UPDATE: I moved the limiter logic to the behaviour side. https://github.com/sigp/lighthouse/pull/5923#issuecomment-2379535927
~~The rate limiter is shared between the behaviour and the handler (`Arc<Mutex<RateLimiter>>`). The handler checks the rate limit and queues the response if the limit is reached. The behaviour handles pruning.~~
~~I considered not sharing the rate limiter between the behaviour and the handler, and performing all of these either within the behaviour or handler. However, I decided against this for the following reasons:~~
- ~~Regarding performing everything within the behaviour: The behaviour is unable to recognize the response protocol when `RPC::send_response()` is called, especially when the response is `RPCCodedResponse::Error`. Therefore, the behaviour can't rate limit responses based on the response protocol.~~
- ~~Regarding performing everything within the handler: When multiple connections are established with a peer, there could be multiple handlers interacting with that peer. Thus, we cannot enforce rate limiting per peer solely within the handler. (Any ideas? 🤔 )~~
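For intuition, a toy token bucket showing the "wait until tokens are regenerated" step (not the real `RateLimiter` implementation):
```rust
use std::time::{Duration, Instant};

/// Toy token bucket, for illustration only.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last_refill: Instant::now() }
    }

    fn refill(&mut self) {
        let elapsed = self.last_refill.elapsed().as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = Instant::now();
    }

    /// Instead of erroring when the limit is hit, report how long the responder
    /// should wait before the response can be sent.
    fn try_consume(&mut self, cost: f64) -> Result<(), Duration> {
        self.refill();
        if self.tokens >= cost {
            self.tokens -= cost;
            Ok(())
        } else {
            let deficit = cost - self.tokens;
            Err(Duration::from_secs_f64(deficit / self.refill_per_sec))
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0); // 2 tokens, refills 1 token/sec
    assert!(bucket.try_consume(1.0).is_ok());
    assert!(bucket.try_consume(1.0).is_ok());
    // A third response exceeds the limit: wait roughly a second, then retry.
    if let Err(wait) = bucket.try_consume(1.0) {
        println!("rate limited, retry in {wait:?}");
    }
}
```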
Resolves #6811
Rename `GOSSIP_MAX_SIZE` to `MAX_PAYLOAD_SIZE` and remove `MAX_CHUNK_SIZE` in accordance with the spec.
The spec also "clarifies" the message size limits at different levels. The RPC limits are equivalent to what we had before, imo.
The gossip limits have additional checks.
I have gotten rid of the `is_bellatrix_enabled` checks that used a lower limit (1 MiB) pre-merge. Since all networks we run start from the merge, I don't think this will break any setups.
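For reference, the shape of the new spec-level bound, sketched from memory (the exact constants and framing allowance should be checked against the spec):
```rust
/// Renamed constant: the uncompressed payload limit (was GOSSIP_MAX_SIZE), 10 MiB.
const MAX_PAYLOAD_SIZE: u64 = 10 * 1024 * 1024;

/// Worst-case snappy-compressed length for a payload of `n` bytes.
fn max_compressed_len(n: u64) -> u64 {
    32 + n + n / 6
}

/// Upper bound on any message on the wire, replacing the old per-domain
/// MAX_CHUNK_SIZE (framing allowance sketched from memory).
fn max_message_size() -> u64 {
    max_compressed_len(MAX_PAYLOAD_SIZE).max(1024 * 1024)
}

fn main() {
    println!("max message size: {} bytes", max_message_size());
}
```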
I've been working at updating another library to latest Lighthouse and got very confused with RPC request Ids.
There were types that had fields called `request_id` and `id`, which could interchangeably be of type `PeerRequestId`, `rpc::RequestId`, `AppRequestId`, `api_types::RequestId` or even `Request.id`.
I couldn't keep track of which Id was linked to what and what each type meant.
So this PR mainly does a few things:
- Changes the field naming to match the actual type. So any field that has an `AppRequestId` will be named `app_request_id` rather than `id` or `request_id` for example.
- I simplified the types. I removed the two different `RequestId` types (one in `lighthouse_network`, the other in the rpc) and grouped them into one. It has one downside though: I had to add a few unreachable lines of code in the beacon processor, which the extra type would prevent, but I feel like it might be worth it. Happy to add an extra type to avoid those few lines.
- I also removed the concept of `PeerRequestId`, which sometimes went alongside a `request_id`. There were times where we had a `PeerRequest` and a `Request` being returned, both of which contain a `RequestId`, so we had redundant information. I've simplified the logic by removing `PeerRequestId` and making a `ResponseId`. I think if you look at the code changes, it simplifies things a bit and removes the redundant extra info.
I think with this PR it's a little bit easier to reason about what is going on with all these RPC Ids.
NOTE: I did this with the help of AI, so probably should be checked
N/A
In https://github.com/sigp/lighthouse/pull/6329 we changed `max_blobs_per_block` from a preset to a config value.
We weren't using the right value based on fork in that PR. This is a follow up PR to use the fork dependent values.
In the process, I also updated other places where we weren't using fork-dependent values from the `ChainSpec`.
Note to reviewer: easier to go through by commit
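The pattern of the fix, illustrated with hypothetical, simplified names (the real accessor lives on `ChainSpec`):
```rust
// Hypothetical, simplified sketch of a fork-dependent config lookup.
#[derive(Clone, Copy)]
enum ForkName {
    Deneb,
    Electra,
}

struct ChainSpec {
    max_blobs_per_block: u64,         // Deneb value
    max_blobs_per_block_electra: u64, // Electra value
}

impl ChainSpec {
    // The fix: callers ask for the value *at a fork* instead of reading one field.
    fn max_blobs_per_block_by_fork(&self, fork: ForkName) -> u64 {
        match fork {
            ForkName::Deneb => self.max_blobs_per_block,
            ForkName::Electra => self.max_blobs_per_block_electra,
        }
    }
}

fn main() {
    let spec = ChainSpec { max_blobs_per_block: 6, max_blobs_per_block_electra: 9 };
    assert_eq!(spec.max_blobs_per_block_by_fork(ForkName::Deneb), 6);
    assert_eq!(spec.max_blobs_per_block_by_fork(ForkName::Electra), 9);
}
```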
* add id to rpc requests
* rename rpc request and response types for more accurate meaning
* remove unrequired build_request function
* remove unrequired Request wrapper types and unify Outbound and Inbound Request
* add RequestId to NetworkMessage::SendResponse, NetworkMessage::SendErrorResponse to be passed to Rpc::send_response
* Remove all batches related to a peer on disconnect
* Cleanup map entries after disconnect
* Allow lookups to continue in case of disconnections
* Pretty response types
* fmt
* Fix lints
* Remove lookup if it cannot progress
* Fix tests
* Remove poll_close on rpc behaviour
* Remove redundant test
* Fix issue raised by lion
* Revert pretty response types
* Cleanup
* Fix test
* Merge remote-tracking branch 'origin/release-v5.2.1' into rpc-error-on-disconnect-revert
* Apply suggestions from joao
Co-authored-by: João Oliveira <hello@jxs.pt>
* Fix log
* update request status on no peers found
* Do not remove lookup after peer disconnection
* Add comments about expected event api
* Update single_block_lookup.rs
* Update mod.rs
* Merge branch 'rpc-error-on-disconnect-revert' into 5969-review
* Merge pull request #10 from dapplion/5969-review
Add comments about expected event api
* Return and error if peer has disconnected
* Report errors for rate limited requests
* Code improvement
* Bump rust version to 1.78
* Downgrade to 1.77
* Update beacon_node/lighthouse_network/src/service/mod.rs
Co-authored-by: João Oliveira <hello@jxs.pt>
* fix fmt
* Merge branch 'unstable' of https://github.com/sigp/lighthouse into rpc-peer-disconnect-error
* update lockfile
* Report RPC Errors to the application on peer disconnections
Co-authored-by: Age Manning <Age@AgeManning.com>
* Expect RPCError::Disconnect to fail ongoing requests
* Drop lookups after peer disconnect and not awaiting events
* Allow RPCError disconnect through network service
* Update beacon_node/lighthouse_network/src/service/mod.rs
Co-authored-by: Age Manning <Age@AgeManning.com>
* Merge branch 'unstable' into rpc-error-on-disconnect
* move gossipsub into a separate crate
* Merge branch 'unstable' of github.com:sigp/lighthouse into separate-gossipsub
* address review 2
* clippy beta
* update logging to log gossipsub logs
* remove exit-future usage, as it is unmaintained, and replace with async-channel which is already in the repo
* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove-exit-future
* Merge branch 'unstable' of https://github.com/sigp/lighthouse into remove-exit-future
* Lint fixes
* More fixes for beta compiler.
* Format fixes
* Move `#[allow(dead_code)]` to field level.
* Remove old comment.
* Update beacon_node/execution_layer/src/test_utils/mod.rs
Co-authored-by: João Oliveira <hello@jxs.pt>
* remove duplicate line