5256 Commits

Author SHA1 Message Date
Michael Sproul
cae73a4d82 Merge tag 'v4.5.0' into tree-states
v4.5.0
2023-09-26 11:21:44 +10:00
Paul Hauner
441fc1691b Release v4.5.0 (#4768)
## Issue Addressed

NA

## Proposed Changes

Bump versions from v4.4.1 to v4.5.0.

## Additional Info

NA
v4.5.0
2023-09-25 05:14:01 +00:00
João Oliveira
0f05499e30 Fix cli options (#4772)
## Issue Addressed

Fixes a breaking change introduced in https://github.com/sigp/lighthouse/pull/4674/ that doesn't allow multiple `http_enabled` `ArgGroup` flags.
2023-09-22 12:00:51 +00:00
Paul Hauner
fbb6997309 Fix release CI for self-hosted runners (#4770)
## Issue Addressed

NA

## Proposed Changes

Disables some commands for self-hosted runners to prevent failures.

## Additional Info

NA
2023-09-22 11:04:47 +00:00
Michael Sproul
364074da26 Tree states release v4.5.111-exp (#4769) v4.5.111-exp 2023-09-22 15:52:23 +10:00
Michael Sproul
d24875ff74 Merge remote-tracking branch 'origin/unstable' into tree-states 2023-09-22 15:11:42 +10:00
Michael Sproul
cd23c89adb Improve state cache eviction and reduce mem usage (#4762)
* Improve state cache eviction and reduce mem usage

* Fix epochs_per_state_diff tests
2023-09-22 14:49:15 +10:00
João Oliveira
dcd69dfc62 Move dependencies to workspace (#4650)
## Issue Addressed

Synchronize dependencies and edition on the workspace `Cargo.toml`

## Proposed Changes

With https://github.com/rust-lang/cargo/issues/8415 merged, it's now possible to synchronize details such as metadata and dependencies in the workspace `Cargo.toml`.
Keeping the dependencies that are shared between multiple crates aligned in the workspace `Cargo.toml` makes it harder to miss duplicate versions of the same dependency, which in turn helps compile times.

## Additional Info
This PR also removes the no longer required direct dependency on the `serde_derive` crate.

Should be reviewed after https://github.com/sigp/lighthouse/pull/4639 gets merged.
Closes https://github.com/sigp/lighthouse/issues/4651


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-09-22 04:30:56 +00:00
antondlr
69c39ad1e5 Use release workflow runners (#4765)
## Issue Addressed

Build releases on self-hosted hardware to speed the process up
2023-09-22 02:33:13 +00:00
Age Manning
6b02e8525a Add new teku bootnodes (#4724)
Adds new Teku bootnodes

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-09-22 02:33:12 +00:00
Jimmy Chen
c4e907de9f Update the voluntary exit endpoint to comply with the key manager specification (#4679)
## Issue Addressed

#4635 

## Proposed Changes

Wrap the `SignedVoluntaryExit` object in a `GenericResponse` container, adding an additional `data` layer, to ensure compliance with the key manager API specification.

The new response would look like this:

```json
{"data":{"message":{"epoch":"196868","validator_index":"505597"},"signature":"0xhexsig"}}
```

This is a backward incompatible change and will affect Siren as well.
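
As a rough illustration of the extra `data` layer, the sketch below wraps a payload in a generic container and renders it by hand. The `ToJson` trait and the struct layouts are assumptions made for this example (the real types live in Lighthouse and use its own serialization), so treat it as a shape check rather than the actual implementation.

```rust
/// Hypothetical hand-rolled JSON rendering, used only to keep the sketch dependency-free.
trait ToJson {
    fn to_json(&self) -> String;
}

/// Assumed shape of the exit message (integers are rendered as quoted strings).
struct VoluntaryExit {
    epoch: u64,
    validator_index: u64,
}

/// Assumed shape of the signed exit.
struct SignedVoluntaryExit {
    message: VoluntaryExit,
    signature: String,
}

/// Generic container that adds the `data` layer around any payload.
struct GenericResponse<T> {
    data: T,
}

impl ToJson for VoluntaryExit {
    fn to_json(&self) -> String {
        format!(
            "{{\"epoch\":\"{}\",\"validator_index\":\"{}\"}}",
            self.epoch, self.validator_index
        )
    }
}

impl ToJson for SignedVoluntaryExit {
    fn to_json(&self) -> String {
        format!(
            "{{\"message\":{},\"signature\":\"{}\"}}",
            self.message.to_json(),
            self.signature
        )
    }
}

impl<T: ToJson> ToJson for GenericResponse<T> {
    fn to_json(&self) -> String {
        // The only change relative to the bare payload is this outer `data` key.
        format!("{{\"data\":{}}}", self.data.to_json())
    }
}
```

Clients that previously read `message` and `signature` at the top level would now need to look under `data`.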
2023-09-22 02:33:11 +00:00
João Oliveira
c5588eb66e require http and metrics for respective flags (#4674)
## Issue Addressed

Following the discussion in https://github.com/sigp/lighthouse/pull/4639#discussion_r1305183750, this PR makes the `http` and `metrics` sub-flags require their respective main flags to be enabled.
2023-09-22 02:33:10 +00:00
Paul Hauner
2441a247ab Bump quinn-proto to address rustsec vuln (#4767)
## Issue Addressed

NA

## Proposed Changes

Bumps `quinn-proto` to address a QUIC-related vulnerability: https://rustsec.org/advisories/RUSTSEC-2023-0063

Fixes a `cargo audit` failure.

## Additional Info

NA
2023-09-21 22:37:00 +00:00
antondlr
d0b1abc6fa Update Holesky boot ENR (#4763)
## Issue Addressed

Update the boot ENR for the Holesky relaunch.
2023-09-21 22:36:59 +00:00
Michael Sproul
0074a3b5f5 Fix block & state queries prior to genesis (#4761)
## Issue Addressed

Closes #4751

## Proposed Changes

Prevent `state_root_at_slot` and `block_root_at_slot` from erroring out due to a call to `self.slot()?` that fails before genesis. This fixes pre-genesis queries for:

- block at slot 0
- block by genesis block root
- state at slot 0
- state by genesis state root
- state at `finalized` tag
- state at `justified` tag
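
One way to picture the fix is a slot lookup that falls back to the genesis slot instead of propagating the clock error. The sketch below is a simplified model, not the code from this PR: `current_slot`, `can_serve_slot`, and their parameters are hypothetical names chosen for illustration.

```rust
/// Hypothetical slot clock: returns `None` before genesis, mirroring a
/// `self.slot()` call that fails when the clock has not started yet.
fn current_slot(now: u64, genesis_time: u64, seconds_per_slot: u64) -> Option<u64> {
    now.checked_sub(genesis_time)
        .map(|elapsed| elapsed / seconds_per_slot)
}

/// Decide whether a query for `requested` can be answered.
/// Before genesis the head is treated as slot 0 rather than returning an error,
/// so queries for the genesis block and state still succeed.
fn can_serve_slot(requested: u64, now: u64, genesis_time: u64, seconds_per_slot: u64) -> bool {
    let head = current_slot(now, genesis_time, seconds_per_slot).unwrap_or(0);
    requested <= head
}
```

With this shape, a request for slot 0 succeeds before genesis while a request for slot 1 is still rejected.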
2023-09-21 06:38:33 +00:00
Jimmy Chen
d3fe3ad337 Update holesky config for relaunch (#4760)
## Issue Addressed

#4759 

Note: Sigma Prime ENR hasn't been updated, tracking it in #4759
2023-09-21 06:38:32 +00:00
Eitan Seri-Levi
992b476eac Add SSZ support to validator block production endpoints (#4534)
## Issue Addressed

#4531 

## Proposed Changes

Add SSZ support to the following block production endpoints:

GET /eth/v2/validator/blocks/{slot}
GET /eth/v1/validator/blinded_blocks/{slot}

## Additional Info

I updated a few existing tests to use SSZ instead of writing completely new tests.
2023-09-21 06:38:31 +00:00
Jimmy Chen
a0478da990 Fix genesis state download panic when running in debug mode (#4753)
## Issue Addressed

#4738 

## Proposed Changes

See the above issue for details. Went with option #2 to use the async reqwest client in `Eth2NetworkConfig` and propagate the async-ness.
2023-09-21 04:17:25 +00:00
realbigsean
082bb2d638 Self hosted docker builds (#4592)
## Issue Addressed

We're OOM'ing on Docker builds on the Deneb branch https://github.com/sigp/lighthouse/issues/3929

Are we ok to self host automated docker builds?


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: antondlr <anton@delaruelle.net>
2023-09-21 04:17:24 +00:00
Jimmy Chen
fe3bd03234 Fix local testnet to generate keys in the correct folders (#4752)
Fix local testnet to generate keys in the correct folders when `BN_COUNT` and `VC_COUNT` don't match.

The current script places the generated validator keys in validator folders based on the `BN_COUNT` config, e.g. `node_1/validators`, `node_2/validators`, etc. We should be using `VC_COUNT` here instead; otherwise the number of validator clients may not match the number of directories generated, which would result in either:
1. a VC not having any keys (when `BN_COUNT` < `VC_COUNT`)
2. a validator key directory not being used (when `BN_COUNT` > `VC_COUNT`).
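
The intended behaviour can be sketched as deriving the key directories from the validator client count alone. The real logic lives in the shell scripts; `validator_dirs` is a hypothetical helper written in Rust purely for illustration.

```rust
/// Build one key directory per validator client, numbered from 1.
/// The beacon node count plays no part in the result.
fn validator_dirs(vc_count: usize) -> Vec<String> {
    (1..=vc_count)
        .map(|i| format!("node_{}/validators", i))
        .collect()
}
```

Because the list depends only on `vc_count`, every validator client gets exactly one directory and no directory is left unused.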
2023-09-21 00:26:56 +00:00
chonghe
f9a3c00518 Update local testnet script (#4733)
There is an issue with the file `scripts/local_testnet/start_local_testnet.sh`: when we use a non-default `$SPEC-PRESET` in `vars.env`, it runs into an error:
```
executing: ./setup.sh >> /home/ck/.lighthouse/local-testnet/testnet/setup.log
parse error: Invalid numeric literal at line 1, column 7
```
@jimmygchen found the issue and the updated script includes the flag `--spec $SPEC-PRESET`
2023-09-21 00:26:55 +00:00
Michael Sproul
5a35278aea Add more checks and logging before genesis (#4730)
## Proposed Changes

This PR adds more logging prior to genesis, particularly on networks that start with execution enabled.

There are new checks using `eth_getBlockByHash/Number` to verify that the genesis state's `latest_execution_payload_header` matches the execution node's genesis block.

The first commit also runs the merge-readiness/Capella-readiness checks prior to genesis. This has two effects:

- Give more information on the execution node's status and its readiness for genesis.
- Prevent the `el_offline` status from being set on `/eth/v1/node/syncing`, which previously caused the VC to complain loudly.

I would like to include this for the Holesky reboot. It would have caught the misconfig that doomed the first Holesky.

## Additional Info

- Geth doesn't serve payload bodies prior to genesis, which is why we use the legacy methods. I haven't checked with other ELs yet.
- Currently this is logging errors with _Capella_ genesis states generated by `ethereum-genesis-generator` because the `withdrawals_root` is not set correctly (it is 0x0). This is not a blocker for Holesky, as it starts from Bellatrix (Pari is investigating).
2023-09-21 00:26:53 +00:00
Jimmy Chen
1e9925435e Reuse fork choice read lock instead of re-acquiring it immediately (#4688)
## Issue Addressed

I went through the code base looking for places where we acquire fork choice locks (after the deadlock bug was found and fixed in #4687), and discovered an instance where we re-acquire a lock immediately after dropping it. This shouldn't cause a deadlock like the other issue, but it is slightly less efficient.
2023-09-21 00:26:52 +00:00
Michael Sproul
4b6cb3db2c Prevent port re-use in HTTP API tests (#4745)
## Issue Addressed

CI is plagued by `AddrAlreadyInUse` failures, which are caused by race conditions in allocating free ports.

This PR removes all usages of the `unused_port` crate for Lighthouse's HTTP API, in favour of passing `:0` as the listen address. As a result, the listen address isn't known ahead of time and must be read from the listening socket after it binds. This requires tying some self-referential knots, which is a little disruptive, but hopefully doesn't clash too much with Deneb 🤞
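
The general technique of binding to port `0` and reading the address back can be sketched with the standard library alone. This is a minimal example, not the HTTP API code touched by this PR; `bind_ephemeral` is a hypothetical helper name.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Bind to an OS-assigned port and report the address that was actually chosen.
/// The port is only known after `bind` returns, so it is read from the socket.
fn bind_ephemeral() -> io::Result<(TcpListener, SocketAddr)> {
    // Port 0 asks the operating system to pick any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}
```

Since the listener stays open while the address is handed out, there is no window in which another process can grab the same port, which is the race that picking a free port up front suffers from.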

There are still a few usages of `unused_tcp4_port` left in cases where we start external processes, like the `watch` Postgres DB, Anvil, Geth, Nethermind, etc. Removing these usages is non-trivial because it's hard to read the port back from an external process after starting it with `--port 0`. We might be able to do something on Linux where we read from `/proc/`, but I'll leave that for future work.
2023-09-20 01:19:03 +00:00
João Oliveira
d386a07b0c validator client: start http api before genesis (#4714)
## Issue Addressed

On a new network, a user might need to import validators before genesis has occurred.

## Proposed Changes

Starts the validator client HTTP API before waiting for genesis.

## Additional Info

cc @antondlr
quic-test
2023-09-15 10:08:30 +00:00
Jimmy Chen
b88e57c989 Update Java runtime requirement to 17 for Web3Signer tests (#4681)
Web3Signer now requires Java runtime v17, see [v23.8.0 release](https://github.com/Consensys/web3signer/releases/tag/23.8.0).

We have some Web3Signer tests that require a compatible Java runtime to be installed on dev machines. This PR updates the `setup` documentation in the Lighthouse book, and also fixes a small typo.
2023-09-15 08:49:14 +00:00
Age Manning
e4ed317b76 Add Experimental QUIC support (#4577)
## Issue Addressed

#4402 

## Proposed Changes

This PR adds QUIC support to Lighthouse. As this is not officially spec'd, this will only work between lighthouse <-> lighthouse connections. We attempt a QUIC connection (if the node advertises it) and if it fails we fall back to TCP.
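
The try-QUIC-then-TCP behaviour described above can be modelled with a small generic helper. This is only a sketch of the control flow: `connect_with_fallback` and the two dial closures are hypothetical and do not correspond to the libp2p transport code.

```rust
/// Attempt QUIC first when the peer advertises it, otherwise (or on failure) dial TCP.
/// `dial_quic` and `dial_tcp` stand in for whatever actually opens a connection.
fn connect_with_fallback<C, E>(
    peer_advertises_quic: bool,
    dial_quic: impl FnOnce() -> Result<C, E>,
    dial_tcp: impl FnOnce() -> Result<C, E>,
) -> Result<C, E> {
    if peer_advertises_quic {
        // A QUIC failure is not fatal; it just means we try TCP next.
        if let Ok(conn) = dial_quic() {
            return Ok(conn);
        }
    }
    dial_tcp()
}
```

In this model the QUIC error is discarded, so the caller only ever sees the outcome of the TCP attempt when QUIC does not succeed.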

This should be a backwards compatible modification. We want to test this functionality on live networks to observe any improvements in bandwidth/latency.

NOTE: This also removes the websockets transport as I believe no one is really using it. It should be mentioned in our release however.


Co-authored-by: João Oliveira <hello@jxs.pt>
2023-09-15 03:07:24 +00:00
Michael Sproul
1b4bc8818b Release v4.4.111-exp (#4729) v4.4.111-exp 2023-09-14 10:07:26 +10:00
Michael Sproul
5cb2ed3696 Restore custom image for Cross 2023-09-13 14:43:02 +10:00
Michael Sproul
f7c6b7d64b Bump schema version to v24 2023-09-13 14:00:28 +10:00
Michael Sproul
68f80cc862 Change default epochs-per-state-diff to 16
This should make replaying diffs during non-finality a bit quicker.
2023-09-13 13:56:53 +10:00
Michael Sproul
838e104b25 Attempt to fix flaky test 2023-09-13 13:54:03 +10:00
Michael Sproul
d961d2c9ed Disable ARM docker builds 2023-09-13 12:51:20 +10:00
Michael Sproul
b8e04ce5a4 Merge remote-tracking branch 'origin/unstable' into tree-states 2023-09-13 11:25:18 +10:00
Jack McPherson
35f47f454f Await listening address from libp2p in RPC tests setup (#4705)
## Issue Addressed

#4704 

## Proposed Changes

 - Receive multiaddr from libp2p by awaiting listener setup

## Additional Info

See also: #4675
2023-09-11 06:14:56 +00:00
Jimmy Chen
1e4ee7aa5e Tree states to support per-slot state diffs (#4652)
* Support per slot state diffs

* Store HierarchyConfig on disk. Support storing hdiffs at per slot level.

* Revert HierarchyConfig change for testing.

* Add validity check for the hierarchy config when opening the DB.

* Update HDiff tests.

* Fix `get_cold_state` panic when the diff for the slot isn't stored.

* Use slots instead of epochs for storing snapshots in freezer DB.

* Add snapshot buffer to `diff_buffer_cache` instead of loading it from db every time.

* Add `hierarchy-exponents` cli flag to beacon node.

* Add test for `StorageStrategy::ReplayFrom` and ignore a flaky test.

* Drop hierarchy_config in tests for more frequent snapshots and fix an issue where hdiff wasn't stored unless it's an epoch boundary slot.
2023-09-11 10:19:40 +10:00
Jimmy Chen
1ff4033830 Remove Node.js from release-tests CI job since we no longer use ganache (#4691)
## Issue Addressed

I noticed a node.js version warning on our CI, and thought about updating the version to get rid of this warning, but then realized we may not need node.js anymore now that we're using `anvil` instead of `ganache`.

> The following actions uses node12 which is deprecated and will be forced to run on node16: actions/setup-node@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
2023-09-06 04:37:05 +00:00
Ricki Moore
0caf2af771 Feat: siren faq update (#4685)
## Issue Addressed

The Siren FAQ requires more information regarding network connections to the Lighthouse BN/VC.

## Proposed Changes

Added more info regarding ports, BN/VC flags, SSH tunneling, and VPN access.
2023-09-06 04:37:04 +00:00
Daichuan Wu
48e2b205e8 Fix some typos in "Advanced Networking" documentation (#4672)
## Issue Addressed
N/A

## Proposed Changes
The current Advanced Networking page references the ["--listen-addresses"](14924dbc95/book/src/advanced_networking.md (L124C8-L124C8)) argument, which does not exist in the beacon node. This PR changes such instances of "--listen-addresses" to "--listen-address".

Additionally, the page mentions using sockets that [both listen to IPv6](14924dbc95/book/src/advanced_networking.md (L151)) in a dual-stack setup, which appears to be a mistake. Hence, this PR also changes said line to "using one socket for IPv4 and another socket for IPv6".

## Additional Info
None.
2023-09-06 04:37:03 +00:00
chonghe
291ff640f1 Minor revision to Lighthouse Book on validator-manager (#4638)
Correct the formatting and remove `http-port 5062` to make the command simpler

Co-authored-by: chonghe <44791194+chong-he@users.noreply.github.com>
2023-09-06 04:37:02 +00:00
Michael Sproul
2841f60686 Release v4.4.1 (#4690)
## Proposed Changes

New release to replace the cancelled v4.4.0 release.

This release includes the bugfix #4687 which avoids a deadlock that was present in v4.4.0.

## Additional Info

Pending testing over the weekend, this will be merged on Monday, September 4th.
v4.4.1
2023-09-04 02:56:52 +00:00
Michael Sproul
74eb267643 Remove double-locking deadlock from HTTP API (#4687)
## Issue Addressed

Fix a deadlock introduced in #4236 which was caught during the v4.4.0 release testing cycle (with thanks to @paulhauner and `gdb`).

## Proposed Changes

Avoid re-locking the fork choice read lock when querying a state by root in the HTTP API. This avoids a deadlock due to the lock already being held.

## Additional Info

The [RwLock docs](https://docs.rs/lock_api/latest/lock_api/struct.RwLock.html#method.read) explicitly advise against re-locking:

> Note that attempts to recursively acquire a read lock on a RwLock when the current thread already holds one may result in a deadlock.
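
The pattern of reusing a held read guard instead of locking again can be sketched as follows. Note that `std::sync::RwLock` is swapped in here for the `lock_api` lock referenced above, and `ForkChoice`, `head_slot`, and `slots_since_finalization` are hypothetical names, not the HTTP API code.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the data protected by the fork choice lock.
struct ForkChoice {
    head_slot: u64,
    finalized_slot: u64,
}

/// Helper that works on already-borrowed data, so it never touches the lock itself.
fn head_slot(fork_choice: &ForkChoice) -> u64 {
    fork_choice.head_slot
}

/// Take the read lock once and pass the borrowed data down.
/// Calling `lock.read()` again inside `head_slot` while `guard` is alive
/// would be the recursive acquisition that can deadlock.
fn slots_since_finalization(lock: &RwLock<ForkChoice>) -> u64 {
    let guard = lock.read().expect("lock poisoned");
    let head = head_slot(&guard);
    head - guard.finalized_slot
}
```

Having helpers accept a reference to the protected data, rather than the lock, makes it hard to re-lock by accident further down the call chain.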
2023-08-31 11:18:00 +00:00
Paul Hauner
e99ba3a14e Release v4.4.0 (#4673)
## Issue Addressed

NA

## Proposed Changes

Bump versions from `v4.3.0` to `v4.4.0`.

## Additional Info

NA
v4.4.0
2023-08-31 02:12:35 +00:00
Jimmy Chen
41ac9b6a08 Pin foundry toolchain version to fix stuck CI jobs 2023-08-30 21:41:35 +10:00
Philippe Schommers
0c23c86849 feat: add chiado (#4530)
## Issue Addressed

N/A

## Proposed Changes

Adds the Chiado (Gnosis testnet) network to the built-in networks.

## Additional Info

It's a fairly trivial change all things considered as the preset already exists, so shouldn't be hard to maintain.

It compiles and seems to work, but I'm sure I missed something?

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-08-29 05:56:30 +00:00
Michael Sproul
f284e0e264 Fix bug in block root storage (#4663)
## Issue Addressed

Fix a bug in the storage of the linear block roots array in the freezer DB. Previously this array was always written as part of state storage (or block backfill). With state pruning enabled by #4610, these states were no longer being written and as a result neither were the block roots.

The impact is quite low: we would just log an error when trying to forwards-iterate the block roots, which for validating nodes only happens when they try to look up blocks for peers:

> Aug 25 03:42:36.980 ERRO Missing chunk in forwards iterator      chunk index: 49726, service: freezer_db

Any node checkpoint synced off `unstable` is affected and has a corrupt database. If you see the log above, you need to re-sync with the fix. Nodes that haven't checkpoint synced recently should _not_ be corrupted, even if they ran the buggy version.

## Proposed Changes

- Use a `ChunkWriter` to write the block roots when states are not being stored.
- Tweak the usage of `get_latest_restore_point` so that it doesn't return a nonsense value when state pruning is enabled.
- Tweak the guarantee on the block roots array so that block roots are assumed available up to the split slot (exclusive). This is a bit nicer than relying on anything to do with the latest restore point, which is a nonsensical concept when there aren't any restore points.

## Additional Info

I'm looking forward to deleting the chunked vector code for good when we merge tree-states 😁
2023-08-28 05:34:28 +00:00
Paul Hauner
d61f507184 Add Holesky (#4653)
## Issue Addressed

NA

## Proposed Changes

Add the Holesky network config as per 36e4ff2d51/custom_config_data.

Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see #4564 for context). To download this file we have:

- A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs)
- If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket.
- If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag.
- Whenever a genesis state is downloaded it is checked against a checksum baked into the binary.
- A genesis state will never be downloaded if it's already included in the binary.
- There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file.
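
The precedence rules in the list above can be summarised in a small selection function. This is a sketch under stated assumptions: `GenesisSource` and `select_genesis_source` are hypothetical names, and the checksum verification and timeout handling are deliberately left out.

```rust
/// Where the genesis state comes from (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum GenesisSource<'a> {
    /// The state is baked into the binary, so nothing is downloaded.
    Bundled,
    /// The state is downloaded from this URL.
    Download(&'a str),
    /// Not bundled and no URL is known.
    Unavailable,
}

/// Pick a source using the order described above:
/// bundled state, then `--genesis-state-url`, then `--checkpoint-sync-url`,
/// then the hard-coded default URL (if the network has one).
fn select_genesis_source<'a>(
    bundled: bool,
    genesis_state_url: Option<&'a str>,
    checkpoint_sync_url: Option<&'a str>,
    default_url: Option<&'a str>,
) -> GenesisSource<'a> {
    if bundled {
        return GenesisSource::Bundled;
    }
    genesis_state_url
        .or(checkpoint_sync_url)
        .or(default_url)
        .map(GenesisSource::Download)
        .unwrap_or(GenesisSource::Unavailable)
}
```

The `Unavailable` case corresponds to the situation where the state is not in the binary and no download URL is known, at which point the user has to supply one of the two flags.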

## Log Output

Example of log output when a state is downloaded:

```
Aug 23 05:40:13.424 INFO Logging to file                         path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log"
Aug 23 05:40:13.425 INFO Lighthouse started                      version: Lighthouse/v4.3.0-bd9931f+
Aug 23 05:40:13.425 INFO Configured for network                  name: holesky
Aug 23 05:40:13.426 INFO Data directory initialised              datadir: /Users/paul/.lighthouse/holesky
Aug 23 05:40:13.427 INFO Deposit contract                        address: 0x4242424242424242424242424242424242424242, deploy_block: 0
Aug 23 05:40:13.427 INFO Downloading genesis state               info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/
Aug 23 05:40:29.895 INFO Starting from known genesis state       service: beacon
```

Example of log output when there are no URLs specified:

```
Aug 23 06:29:51.645 INFO Logging to file                         path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log"
Aug 23 06:29:51.646 INFO Lighthouse started                      version: Lighthouse/v4.3.0-666a39c+
Aug 23 06:29:51.646 INFO Configured for network                  name: goerli
Aug 23 06:29:51.647 INFO Data directory initialised              datadir: /Users/paul/.lighthouse/goerli
Aug 23 06:29:51.647 INFO Deposit contract                        address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322
The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url.
```

## Additional Info

I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 

My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there.

I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states.

When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
2023-08-28 05:34:27 +00:00
Mac L
e056c279aa Increase web3signer_tests timeouts (#4662)
## Issue Addressed

`web3signer_tests` can sometimes time out.

## Proposed Changes

Increase the `web3signer_tests` timeout from 20s to 30s

## Additional Info
Previously I believed the consistent CI failures were due to this, but it ended up being something different. See below:

---

The timing of this makes it very likely it is related to the [latest release of `web3-signer`](https://github.com/Consensys/web3signer/releases/tag/23.8.1).

I now believe this is due to an out of date Java runtime on our runners. A newer version of Java became a requirement with the new `web3-signer` release.

However, I was getting timeouts locally, which implies that the margin before timeout is quite small at 20s so bumping it up to 30s could be a good idea regardless.
2023-08-28 00:55:34 +00:00
Mac L
55e02e7c3f Show --gui flag in help text (#4660)
## Issue Addressed

N/A

## Proposed Changes

Remove the `hidden(true)` modifier on the `--gui` flag so it shows up when running `lighthouse bn --help`

## Additional Info

We need to include this now that Siren has had its first stable release.
2023-08-28 00:55:33 +00:00
Jimmy Chen
9c24cd4ad4 Do not log slot clock error prior to genesis (#4657)
## Issue Addressed

#4654 

## Proposed Changes

Only log error if we're unable to read slot clock after genesis. 

I thought about simply downgrading the `error` to a `warn`, but feel like it's still unnecessary noise before genesis, and it would be good to retain the error log if we're past genesis. But I'd be ok with just downgrading the log level, too.
2023-08-28 00:55:32 +00:00