lighthouse

mirror of https://github.com/sigp/lighthouse.git synced 2026-06-17 02:38:34 +00:00

Author	SHA1	Message	Date
realbigsean	4008da6c60	sync tx blobs	2022-09-29 12:32:55 -04:00
realbigsean	4cdf1b546d	add shanghai fork version and epoch	2022-09-29 12:28:58 -04:00
realbigsean	de44b300c0	add/update types	2022-09-29 12:25:56 -04:00
Age Manning	27bb9ff07d	Handle Lodestar's new agent string (#3620 ) ## Issue Addressed #3561 ## Proposed Changes Recognize Lodestar's new agent string and appropriately count these peers as Lodestar peers.	2022-09-29 01:50:13 +00:00
Age Manning	01b6bf7a2d	Improve logging a little (#3619 ) Some of the logs in combination with others could be improved. It will save some time debugging by improving the wording slightly.	2022-09-29 01:50:12 +00:00
Divma	b1d2510d1b	Libp2p v0.48.0 upgrade (#3547 ) ## Issue Addressed Upgrades libp2p to v0.48.0. This is the compilation of - [x] #3495 - [x] #3497 - [x] #3491 - [x] #3546 - [x] #3553 Co-authored-by: Age Manning <Age@AgeManning.com>	2022-09-29 01:50:11 +00:00
Paul Hauner	01e84b71f5	v3.1.2 (#3603 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v3.1.2 ## Additional Info - ~~Blocked on several PRs.~~ - ~~Requires further testing.~~	2022-09-26 01:17:36 +00:00
Divma	bd873e7162	New rust lints for rustc 1.64.0 (#3602 ) ## Issue Addressed fixes lints from the last rust release ## Proposed Changes Fix the lints, most of the lints by `clippy::question-mark` are false positives in the form of https://github.com/rust-lang/rust-clippy/issues/9518 so it's allowed for now ## Additional Info	2022-09-23 03:52:46 +00:00
Divma	9bd384a573	send attnet unsubscription event on random subnet expiry (#3600 ) ## Issue Addressed 🐞 in which we don't actually unsubscribe from a random long lived subnet when it expires ## Proposed Changes Remove code addressing a specific case in which we are subscribed to all subnets and handle the removal of the long lived subnet. I don't think the special case code is particularly important as, if someone is running with that many validators to be subscribed to all subnets, it should use `--subscribe-all-subnets` instead ## Additional Info Noticed on some test nodes climbing bandwidth usage periodically (around 27hours, the time of subnet expirations) I'm running this code to test this does not happen anymore, but I think it should be good now	2022-09-23 03:52:45 +00:00
Paul Hauner	9246a92d76	Make garbage collection test less failure prone (#3599 ) ## Issue Addressed NA ## Proposed Changes This PR attempts to fix the following spurious CI failure: ``` ---- store_tests::garbage_collect_temp_states_from_failed_block stdout ---- thread 'store_tests::garbage_collect_temp_states_from_failed_block' panicked at 'disk store should initialize: DBError { message: "Error { message: \"IO error: lock /tmp/.tmp6DcBQ9/cold_db/LOCK: already held by process\" }" }', beacon_node/beacon_chain/tests/store_tests.rs:59:10 ``` I believe that some async task is taking a clone of the store and holding it in some other thread for a short time. This creates a race-condition when we try to open a new instance of the store. ## Additional Info NA	2022-09-23 03:52:44 +00:00
Paul Hauner	fa6ad1a11a	Deduplicate block root computation (#3590 ) ## Issue Addressed NA ## Proposed Changes This PR removes duplicated block root computation. Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merkle root of each transaction inside an `ExecutionPayload`. Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations after the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times. ## Additional Info NA	2022-09-23 03:52:42 +00:00
Paul Hauner	3128b5b430	v3.1.1 (#3585 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info - ~~Requires additional testing~~ - ~~Blocked on:~~ - ~~#3589~~ - ~~#3540~~ - ~~#3587~~	2022-09-22 06:08:52 +00:00
Paul Hauner	96692b8e43	Impl `oneshot_broadcast` for committee promises (#3595 ) ## Issue Addressed NA ## Proposed Changes Fixes an issue introduced in #3574 where I erroneously assumed that a `crossbeam_channel` multiple receiver queue was a broadcast queue. This is incorrect, each message will be received by only one receiver. The effect of this mistake is these logs: ``` Sep 20 06:56:17.001 INFO Synced slot: 4736079, block: 0xaa8a…180d, epoch: 148002, finalized_epoch: 148000, finalized_root: 0x2775…47f2, exec_hash: 0x2ca5…ffde (verified), peers: 6, service: slot_notifier Sep 20 06:56:23.237 ERRO Unable to validate attestation error: CommitteeCacheWait(RecvError), peer_id: 16Uiu2HAm2Jnnj8868tb7hCta1rmkXUf5YjqUH1YPj35DCwNyeEzs, type: "aggregated", slot: Slot(4736047), beacon_block_root: 0x88d318534b1010e0ebd79aed60b6b6da1d70357d72b271c01adf55c2b46206c1 ``` ## Additional Info NA	2022-09-21 01:01:50 +00:00
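The difference described in #3595 is that a multi-receiver queue hands each message to only one receiver, while a broadcast delivers the same value to every waiter. The following is a minimal sketch of a one-shot broadcast built on `std::sync::{Mutex, Condvar}`. The names `oneshot`, `Sender`, `Receiver` and the internal `State` enum are hypothetical and chosen for illustration; this is not the code from the PR.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// State shared between the single sender and all receivers.
enum State<T> {
    Pending,
    Ready(T),
    SenderDropped,
}

struct Shared<T> {
    state: Mutex<State<T>>,
    condvar: Condvar,
}

/// Fulfils the promise at most once (hypothetical type).
pub struct Sender<T>(Arc<Shared<T>>);

/// Waits for the promise; may be cloned so that many threads can wait.
pub struct Receiver<T>(Arc<Shared<T>>);

impl<T> Clone for Receiver<T> {
    fn clone(&self) -> Self {
        Receiver(Arc::clone(&self.0))
    }
}

pub fn oneshot<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        state: Mutex::new(State::Pending),
        condvar: Condvar::new(),
    });
    (Sender(Arc::clone(&shared)), Receiver(shared))
}

impl<T> Sender<T> {
    /// Store the value and wake every waiting receiver.
    pub fn send(self, value: T) {
        *self.0.state.lock().unwrap() = State::Ready(value);
        self.0.condvar.notify_all();
        // `self` is dropped here; `Drop` sees `Ready` and leaves it alone.
    }
}

impl<T> Drop for Sender<T> {
    /// If the sender goes away without sending, unblock receivers with `None`.
    fn drop(&mut self) {
        let mut state = self.0.state.lock().unwrap();
        if matches!(*state, State::Pending) {
            *state = State::SenderDropped;
            self.0.condvar.notify_all();
        }
    }
}

impl<T: Clone> Receiver<T> {
    /// Block until the value is sent (every receiver gets its own clone)
    /// or the sender is dropped.
    pub fn recv(&self) -> Option<T> {
        let mut state = self.0.state.lock().unwrap();
        loop {
            match &*state {
                State::Ready(value) => return Some(value.clone()),
                State::SenderDropped => return None,
                State::Pending => {}
            }
            state = self.0.condvar.wait(state).unwrap();
        }
    }
}
```

Because the value stays in the shared state and each receiver clones it, any number of threads waiting on the same promise observe the same result, which is the behaviour the erroneous multi-receiver queue did not provide.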
Paul Hauner	a95bcba2ab	Avoid holding write-lock whilst waiting on shuffling cache promise (#3589 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug which hogged the write-lock for the `shuffling_cache`. ## Additional Info NA	2022-09-19 07:58:50 +00:00
Michael Sproul	507bb9dad4	Refined payload pruning (#3587 ) ## Proposed Changes Improve the payload pruning feature in several ways: - Payload pruning is now entirely optional. It is enabled by default but can be disabled with `--prune-payloads false`. The previous `--prune-payloads-on-startup` flag from #3565 is removed. - Initial payload pruning on startup now runs in a background thread. This thread will always load the split state, which is a small fraction of its total work (up to ~300ms) and then backtrack from that state. This pruning process ran in 2m5s on one Prater node with good I/O and 16m on a node with slower I/O. - To work with the optional payload pruning the database function `try_load_full_block` will now attempt to load execution payloads for finalized slots _if_ pruning is currently disabled. This gives users an opt-out for the extensive traffic between the CL and EL for reconstructing payloads. ## Additional Info If the `prune-payloads` flag is toggled on and off then the on-startup check may not see any payloads to delete and fail to clean them up. In this case the `lighthouse db prune_payloads` command should be used to force a manual sweep of the database.	2022-09-19 07:58:49 +00:00
Michael Sproul	f2ac0738d8	Implement `skip_randao_verification` and blinded block rewards API (#3540 ) ## Issue Addressed https://github.com/ethereum/beacon-APIs/pull/222 ## Proposed Changes Update Lighthouse's randao verification API to match the `beacon-APIs` spec. We implemented the API before spec stabilisation, and it changed slightly in the course of review. Rather than a flag `verify_randao` taking a boolean value, the new API uses a `skip_randao_verification` flag which takes no argument. The new spec also requires the randao reveal to be present and equal to the point-at-infinity when `skip_randao_verification` is set. I've also updated the `POST /lighthouse/analysis/block_rewards` API to take blinded blocks as input, as the execution payload is irrelevant and we may want to assess blocks produced by builders. ## Additional Info This is technically a breaking change, but seeing as I suspect I'm the only one using these parameters/APIs, I think we're OK to include this in a patch release.	2022-09-19 07:58:48 +00:00
Marius van der Wijden	6f7d21c542	enable 4844 at epoch 3	2022-09-18 12:13:03 +02:00
Marius van der Wijden	285dbf43ed	hacky hacks	2022-09-18 11:34:46 +02:00
Marius van der Wijden	8b71b978e0	new round of hacks (config etc)	2022-09-17 23:42:49 +02:00
Daniel Knopik	750c594f5f	forgor something	2022-09-17 21:38:57 +02:00
Daniel Knopik	eab1fce0e5	Merge branch 'eip4844' of github.com:dknopik/lighthouse into eip4844	2022-09-17 20:55:36 +02:00
Daniel Knopik	76572db9d5	add network config	2022-09-17 20:55:21 +02:00
Marius van der Wijden	f43532d3de	implement handle blobs by range req	2022-09-17 20:05:51 +02:00
Marius van der Wijden	f9209e2d08	more network stuff	2022-09-17 16:39:40 +02:00
Marius van der Wijden	aeb52ff186	network stuff	2022-09-17 16:10:42 +02:00
Daniel Knopik	d4d40be870	storable blobs	2022-09-17 15:58:52 +02:00
Marius van der Wijden	36a0add0cd	network stuff	2022-09-17 15:23:28 +02:00
Daniel Knopik	0518665949	Merge remote-tracking branch 'fork/eip4844' into eip4844	2022-09-17 14:58:33 +02:00
Daniel Knopik	292a16a6eb	gossip boilerplate	2022-09-17 14:58:27 +02:00
Marius van der Wijden	acace8ab31	network: blobs by range message	2022-09-17 14:55:18 +02:00
Daniel Knopik	bcc738cb9d	progress on gossip stuff	2022-09-17 14:31:57 +02:00
Marius van der Wijden	8473f08d10	beacon: consensus: implement engine api getBlobs	2022-09-17 14:10:15 +02:00
Daniel Knopik	dcfae6c5cf	implement From<FullPayload> for Payload	2022-09-17 13:29:20 +02:00
Marius van der Wijden	fe6be28e6b	beacon: consensus: implement engine api getBlobs	2022-09-17 13:20:18 +02:00
Daniel Knopik	ca1e17b386	it compiles!	2022-09-17 12:23:03 +02:00
Daniel Knopik	95203c51d4	fix some bugs, adjust structs	2022-09-17 11:26:18 +02:00
Michael Sproul	ca42ef2e5a	Prune finalized execution payloads (#3565 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/3556 ## Proposed Changes Delete finalized execution payloads from the database in two places: 1. When running the finalization migration in `migrate_database`. We delete the finalized payloads between the last split point and the new updated split point. _If_ payloads are already pruned prior to this then this is sufficient to prune _all_ payloads as non-canonical payloads are already deleted by the head pruner, and all canonical payloads prior to the previous split will already have been pruned. 2. To address the fact that users will update to this code _after_ the merge on mainnet (and testnets), we need a one-off scan to delete the finalized payloads from the canonical chain. This is implemented in `try_prune_execution_payloads` which runs on startup and scans the chain back to the Bellatrix fork or the anchor slot (if checkpoint synced after Bellatrix). In the case where payloads are already pruned this check only imposes a single state load for the split state, which shouldn't be _too slow_. Even so, a flag `--prune-payloads-on-startup=false` is provided to turn this off after it has run the first time, which provides faster start-up times. There is also a new `lighthouse db prune_payloads` subcommand for users who prefer to run the pruning manually. ## Additional Info The tests have been updated to not rely on finalized payloads in the database, instead using the `MockExecutionLayer` to reconstruct them. Additionally a check was added to `check_chain_dump` which asserts the non-existence or existence of payloads on disk depending on their slot.	2022-09-17 02:27:01 +00:00
Paul Hauner	2cd3e3a768	Avoid duplicate committee cache loads (#3574 ) ## Issue Addressed NA ## Proposed Changes I have observed scenarios on Goerli where Lighthouse was receiving attestations which reference the same, un-cached shuffling on multiple threads at the same time. Lighthouse was then loading the same state from database and determining the shuffling on multiple threads at the same time. This is unnecessary load on the disk and RAM. This PR modifies the shuffling cache so that each entry can be either: - A committee - A promise for a committee (i.e., a `crossbeam_channel::Receiver`) Now, in the scenario where we have thread A and thread B simultaneously requesting the same un-cached shuffling, we will have the following: 1. Thread A will take the write-lock on the shuffling cache, find that there's no cached committee and then create a "promise" (a `crossbeam_channel::Sender`) for a committee before dropping the write-lock. 1. Thread B will then be allowed to take the write-lock for the shuffling cache and find the promise created by thread A. It will block the current thread waiting for thread A to fulfill that promise. 1. Thread A will load the state from disk, obtain the shuffling, send it down the channel, insert the entry into the cache and then continue to verify the attestation. 1. Thread B will then receive the shuffling from the receiver, be un-blocked and then continue to verify the attestation. In the case where thread A fails to generate the shuffling and drops the sender, the next time that specific shuffling is requested we will detect that the channel is disconnected and return a `None` entry for that shuffling. This will cause the shuffling to be re-calculated. ## Additional Info NA	2022-09-16 08:54:03 +00:00
Paul Hauner	7d3948c8fe	Add metric for re-org distance (#3566 ) ## Issue Addressed NA ## Proposed Changes Add a metric to track the re-org distance. ## Additional Info NA	2022-09-13 17:19:27 +00:00
tim gretler	98815516a1	Support histogram buckets (#3391 ) ## Issue Addressed #3285 ## Proposed Changes Adds support for specifying histogram with buckets and adds new metric buckets for metrics mentioned in issue. ## Additional Info Need some help for the buckets. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-09-13 01:57:44 +00:00
Nils Effinghausen	f682df51a1	fix description for BALANCES_CACHE_MISSES metric (#3545 ) ## Issue Addressed fixes metric description Co-authored-by: Nils Effinghausen <nils.effinghausen@t-systems.com>	2022-09-10 01:35:10 +00:00
realbigsean	d1a8d6cf91	Pin mev rs deps (#3557 ) ## Issue Addressed We were unable to update lighthouse by running `cargo update` because some of the `mev-build-rs` deps weren't pinned. But `mev-build-rs` is now pinned here and includes its own pinned commits for `ssz-rs` and `ethereum-consensus` Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-08 23:46:03 +00:00
Michael Sproul	9a7f7f1c1e	Configurable monitoring endpoint frequency (#3530 ) ## Issue Addressed Closes #3514 ## Proposed Changes - Change default monitoring endpoint frequency to 120 seconds to fit with 30k requests/month limit. - Allow configuration of the monitoring endpoint frequency using `--monitoring-endpoint-frequency N` where `N` is a value in seconds.	2022-09-05 08:29:00 +00:00
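The request budget behind the 120-second default in #3530 can be checked with quick arithmetic. This is a minimal sketch assuming a 30-day month; the helper name `requests_per_month` is hypothetical and not part of Lighthouse.

```rust
/// Number of requests sent in one month when posting once every
/// `frequency_secs` seconds.
/// Hypothetical helper; the 30-day month is an assumption.
pub fn requests_per_month(frequency_secs: u64) -> u64 {
    const SECS_PER_MONTH: u64 = 30 * 24 * 60 * 60; // 2_592_000
    SECS_PER_MONTH / frequency_secs
}
```

Under that assumption, `requests_per_month(120)` gives 21,600 requests, which fits inside a 30k requests/month limit, whereas a 60-second interval would give 43,200 and exceed it.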
realbigsean	177aef8f1e	Builder profit threshold flag (#3534 ) ## Issue Addressed Resolves https://github.com/sigp/lighthouse/issues/3517 ## Proposed Changes Adds a `--builder-profit-threshold <wei value>` flag to the BN. If an external payload's value field is less than this value, the local payload will be used. The value of the local payload will not be checked (it can't really be checked until the engine API is updated to support this). Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-05 04:50:49 +00:00
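The selection rule described in #3534 can be sketched as a single comparison. The enum `PayloadSource`, the function `choose_payload` and the use of `u128` for wei values are assumptions made for illustration; they are not Lighthouse's actual types.

```rust
/// Which payload the beacon node ends up using (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum PayloadSource {
    Builder,
    Local,
}

/// Use the external (builder) payload only if its value reaches the
/// threshold. The local payload's value is not consulted, matching the
/// PR description.
pub fn choose_payload(builder_value_wei: u128, threshold_wei: u128) -> PayloadSource {
    if builder_value_wei < threshold_wei {
        PayloadSource::Local
    } else {
        PayloadSource::Builder
    }
}
```

A builder value exactly equal to the threshold is accepted in this sketch, because the description only falls back to the local payload when the value is less than the threshold.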
realbigsean	cae40731a2	Strict count unrealized (#3522 ) ## Issue Addressed Add a flag that can increase count unrealized strictness, defaults to false ## Proposed Changes Please list or describe the changes introduced by this PR. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: sean <seananderson33@gmail.com>	2022-09-05 04:50:47 +00:00
Mac L	80359d8ddb	Fix attestation performance API `InvalidValidatorIndex` error (#3503 ) ## Issue Addressed When requesting an index which is not active during `start_epoch`, Lighthouse returns: ``` curl "http://localhost:5052/lighthouse/analysis/attestation_performance/999999999?start_epoch=100000&end_epoch=100000" ``` ```json { "code": 500, "message": "INTERNAL_SERVER_ERROR: ParticipationCache(InvalidValidatorIndex(999999999))", "stacktraces": [] } ``` This error occurs even when the index in question becomes active before `end_epoch` which is undesirable as it can prevent larger queries from completing. ## Proposed Changes In the event the index is out-of-bounds (has not yet been activated), simply return all fields as `false`: ``` -> curl "http://localhost:5052/lighthouse/analysis/attestation_performance/999999999?start_epoch=100000&end_epoch=100000" ``` ```json [ { "index": 999999999, "epochs": { "100000": { "active": false, "head": false, "target": false, "source": false } } } ] ``` By doing this, we cover the case where a validator becomes active sometime between `start_epoch` and `end_epoch`. ## Additional Info Note that this error only occurs for epochs after the Altair hard fork.	2022-09-05 04:50:45 +00:00
Divma	473abc14ca	Subscribe to subnets only when needed (#3419 ) ## Issue Addressed We currently subscribe to attestation subnets as soon as the subscription arrives (one epoch in advance). This PR makes it so that subscriptions for future slots are scheduled instead of done immediately. ## Proposed Changes - Schedule subscriptions to subnets for future slots. - Finish removing hashmap_delay, in favor of [delay_map](https://github.com/AgeManning/delay_map). This was the only remaining service to do this. - Subscriptions for past slots are now rejected; previously we would subscribe for one slot. - Add a new test for subscriptions that are not consecutive. ## Additional Info This is also an effort in making the code easier to understand	2022-09-05 00:22:48 +00:00
Paul Hauner	aa022f4685	v3.1.0 (#3525 ) ## Issue Addressed NA ## Proposed Changes - Bump versions ## Additional Info - ~~Blocked on #3508~~ - ~~Blocked on #3526~~ - ~~Requires additional testing.~~ - Expected release date is 2022-09-01	2022-08-31 22:21:55 +00:00
Paul Hauner	661307dce1	Separate committee subscriptions queue (#3508 ) ## Issue Addressed NA ## Proposed Changes As we've seen on Prater, there seems to be a correlation between these messages ``` WARN Not enough time for a discovery search subnet_id: ExactSubnet { subnet_id: SubnetId(19), slot: Slot(3742336) }, service: attestation_service ``` ... and nodes falling 20-30 slots behind the head for short periods. These nodes are running ~20k Prater validators. After running some metrics, I can see that the `network_recv` channel is processing ~250k `AttestationSubscribe` messages per minute. It occurred to me that perhaps the `AttestationSubscribe` messages are "washing out" the `SendRequest` and `SendResponse` messages. In this PR I separate the `AttestationSubscribe` and `SyncCommitteeSubscribe` messages into their own queue so the `tokio::select!` in the `NetworkService` can still process the other messages in the `network_recv` channel without necessarily having to clear all the subscription messages first. ~~I've also added filter to the HTTP API to prevent duplicate subscriptions going to the network service.~~ ## Additional Info - Currently being tested on Prater	2022-08-30 05:47:31 +00:00
Michael Sproul	7a50684741	Harden slot notifier against clock drift (#3519 ) ## Issue Addressed Partly resolves #3518 ## Proposed Changes Change the slot notifier to use `duration_to_next_slot` rather than an interval timer. This makes it robust against underlying clock changes.	2022-08-29 14:34:43 +00:00
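The idea in #3519 is to recompute the wait from the current wall-clock time on every iteration instead of relying on a fixed interval timer. The following is a minimal sketch of such a computation. The free function below and its parameters are hypothetical; Lighthouse's own `duration_to_next_slot` may have a different signature.

```rust
use std::time::Duration;

/// Time remaining until the start of the next slot.
/// `now` and `genesis` are durations since the Unix epoch.
/// Hypothetical free function for illustration only.
pub fn duration_to_next_slot(
    now: Duration,
    genesis: Duration,
    slot_duration: Duration,
) -> Option<Duration> {
    if slot_duration.is_zero() {
        return None;
    }
    if now < genesis {
        // Before genesis, the next slot start is genesis itself.
        return Some(genesis - now);
    }
    let slot_nanos = slot_duration.as_nanos();
    let into_slot = (now - genesis).as_nanos() % slot_nanos;
    // `slot_nanos - into_slot` is at most `slot_nanos`; the conversion only
    // fails for slot durations that do not fit in a u64 of nanoseconds.
    u64::try_from(slot_nanos - into_slot)
        .ok()
        .map(Duration::from_nanos)
}
```

A notifier loop that sleeps for this duration and then recomputes it picks up any clock adjustment on its next iteration, whereas an interval timer started at a fixed instant keeps its original phase.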


2297 Commits