Fix local testnet to generate keys in the correct folders when `BN_COUNT` and `VC_COUNT` don't match.
The current script place the generated validator keys in validator folders based on the `BN_COUNT` config, e.g. `node_1/validators`, `node_2/validators`..etc. We should be using `VC_COUNT` here instead, otherwise the number of validator clients may not match the number of directories generated, and would result in either:
1. a VC not having any keys (when `BN_COUNT` < `VC_COUNT`)
2. a validator key directory not being used (when `BN_COUNT` > `VC_COUNT`).
There is an issue with the file `scripts/local_testnet/start_local_testnet.sh` - when we use a non-default `$SPEC-PRESET` in `vars.env` it runs into an error:
```
executing: ./setup.sh >> /home/ck/.lighthouse/local-testnet/testnet/setup.log
parse error: Invalid numeric literal at line 1, column 7
```
@jimmygchen found the issue and the updated script includes the flag `--spec $SPEC-PRESET`
## Issue Addressed
#4402
## Proposed Changes
This PR adds QUIC support to Lighthouse. As this is not officially spec'd this will only work between lighthouse <-> lighthouse connections. We attempt a QUIC connection (if the node advertises it) and if it fails we fallback to TCP.
This should be a backwards compatible modification. We want to test this functionality on live networks to observe any improvements in bandwidth/latency.
NOTE: This also removes the websockets transport as I believe no one is really using it. It should be mentioned in our release however.
Co-authored-by: João Oliveira <hello@jxs.pt>
## Issue Addressed
Upgrade libp2p to v0.52
## Proposed Changes
- **Workflows**: remove installation of `protoc`
- **Book**: remove installation of `protoc`
- **`Dockerfile`s and `cross`**: remove custom base `Dockerfile` for cross since it's no longer needed. Remove `protoc` from remaining `Dockerfiles`s
- **Upgrade `discv5` to `v0.3.1`:** we have some cool stuff in there: no longer needs `protoc` and faster ip updates on cold start
- **Upgrade `prometheus` to `0.21.0`**, now it no longer needs encoding checks
- **things that look like refactors:** bunch of api types were renamed and need to be accessed in a different (clearer) way
- **Lighthouse network**
- connection limits is now a behaviour
- banned peers no longer exist on the swarm level, but at the behaviour level
- `connection_event_buffer_size` now is handled per connection with a buffer size of 4
- `mplex` is deprecated and was removed
- rpc handler now logs the peer to which it belongs
## Additional Info
Tried to keep as much behaviour unchanged as possible. However, there is a great deal of improvements we can do _after_ this upgrade:
- Smart connection limits: Connection limits have been checked only based on numbers, we can now use information about the incoming peer to decide if we want it
- More powerful peer management: Dial attempts from other behaviours can be rejected early
- Incoming connections can be rejected early
- Banning can be returned exclusively to the peer management: We should not get connections to banned peers anymore making use of this
- TCP Nat updates: We might be able to take advantage of confirmed external addresses to check out tcp ports/ips
Co-authored-by: Age Manning <Age@AgeManning.com>
Co-authored-by: Akihito Nakano <sora.akatsuki@gmail.com>
## Proposed Changes
- Fix bad `state_root` reuse in `lcli transition-blocks` that resulted in invalid results at skipped slots.
- Modernise `lcli pretty-ssz` to include fork-generic decoders for `SignedBeaconBlock` and `BeaconState` which respect the `--network`/`--testnet-dir` flag.
## Additional Info
Breaking change: the underscore names like `signed_block_merge` are removed in favour of the fork-generic name `SignedBeaconBlock`, and fork-specific names which match the superstruct variants, e.g. `SignedBeaconBlockMerge`.
## Issue Addressed
Addresses an issue where CI could fail due to an nonexisting file error:
```
Run ./clean.sh
rm: cannot remove '/home/runner/.lighthouse/local-testnet/geth_datadir4/geth/fastcache.tmp.1549331618': No such file or directory
Error: Process completed with exit code 1.
```
This seems to happen quite frequently now, I'm not sure exactly why but perhaps worth trying suppressing the prompt?
https://github.com/sigp/lighthouse/actions/runs/5455027574/jobs/9925916159?pr=4463
## Issue Addressed
N/A
## Proposed Changes
Geth's latest release breaks our CI with the following message
```
Fatal: Failed to register the Ethereum service: ethash is only supported as a historical component of already merged networks
Shutting down
```
Latest geth version has removed support for PoW networks. Hence, we need to add an extra `terminalTotalDifficultyPassed ` parameter in the genesis config to start from a merged network.
## Issue Addressed
N/A
## Proposed Changes
Modifies the local testnet scripts to start a network with genesis validators embedded into the genesis state. This allows us to start a local testnet without the need for deploying a deposit contract or depositing validators pre-genesis.
This also enables us to start a local test network at any fork we want without going through fork transitions. Also adds scripts to start multiple geth clients and peer them with each other and peer the geth clients with beacon nodes to start a post merge local testnet.
## Additional info
Adds a new lcli command `mnemonics-validators` that generates validator directories derived from a given mnemonic.
Adds a new `derived-genesis-state` option to the `lcli new-testnet` command to generate a genesis state populated with validators derived from a mnemonic.
## Issue Addressed
N/A
## Proposed Changes
Replace ganache-cli with anvil https://github.com/foundry-rs/foundry/blob/master/anvil/README.md
We can lose all js dependencies in CI as a consequence.
## Additional info
Also changes the ethers-rs version used in the execution layer (for the transaction reconstruction) to a newer one. This was necessary to get use the ethers utils for anvil. The fixed execution engine integration tests should catch any potential issues with the payload reconstruction after #3592
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
N/A
## Proposed Changes
The doppelganger tests were failing silently since the `PROPOSER_BOOST` config was not set. Sets the config and script returns an error if any subprocess fails.
## Proposed Changes
With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.
This PR adds three flags to the BN to control this behaviour:
* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.
For safety Lighthouse will only attempt a re-org under very specific conditions:
1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.
## Additional Info
For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004
There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: pawan <pawandhananjay@gmail.com>
## Issue Addressed
The release CI is currently broken due to the addition of the `protoc` dependency. Here's a failure of the release flow running on my fork: https://github.com/michaelsproul/lighthouse/actions/runs/3155541478/jobs/5134317334
## Proposed Changes
- Install `protoc` on Windows and Mac so that it's available for `cargo install`.
- Install an x86_64 binary in the Cross image for the aarch64 platform: we need a binary that runs on the host, _not_ on the target.
- Fix `macos` local testnet CI by using the Github API key to dodge rate limiting (this issue: https://github.com/actions/runner-images/issues/602).
## Issue Addressed
https://github.com/sigp/lighthouse/issues/3091
Extends https://github.com/sigp/lighthouse/pull/3062, adding pre-bellatrix block support on blinded endpoints and allowing the normal proposal flow (local payload construction) on blinded endpoints. This resulted in better fallback logic because the VC will not have to switch endpoints on failure in the BN <> Builder API, the BN can just fallback immediately and without repeating block processing that it shouldn't need to. We can also keep VC fallback from the VC<>BN API's blinded endpoint to full endpoint.
## Proposed Changes
- Pre-bellatrix blocks on blinded endpoints
- Add a new `PayloadCache` to the execution layer
- Better fallback-from-builder logic
## Todos
- [x] Remove VC transition logic
- [x] Add logic to only enable builder flow after Merge transition finalization
- [x] Tests
- [x] Fix metrics
- [x] Rustdocs
Co-authored-by: Mac L <mjladson@pm.me>
Co-authored-by: realbigsean <sean@sigmaprime.io>
## Proposed Changes
Adds some improvements I found when playing around with local testnets in #3335:
- When trying to kill processes, do not exit on a failure. (If a node fails to start due to a bug, the PID associated with it no longer exists. When trying to tear down the testnets, an error will be raised when it tries that PID and then will not try any PIDs following it. This change means it will continue and tear down the rest of the network.
- When starting the testnet, set `ulimit` to a high number. This allows the VCs to import 1000s of validators without running into limitations.
## Issue Addressed
#3006
## Proposed Changes
This PR changes the default behaviour of lighthouse to ignore discovered IPs that are not globally routable. It adds a CLI flag, --enable-local-discovery to permit the non-global IPs in discovery.
NOTE: We should take care in merging this as I will break current set-ups that rely on local IP discovery. I made this the non-default behaviour because we dont really want to be wasting resources attempting to connect to non-routable addresses and we dont want to propagate these to others (on the chance we can connect to one of these local nodes), improving discoveries efficiency.