Commit Graph

4402 Commits

Author SHA1 Message Date
Michael Sproul
ff649f0b26 Implement committee cache diffs 2022-03-15 17:08:14 +11:00
Michael Sproul
1a261e1d3b Implement database downgrade 2022-03-14 17:52:18 +11:00
Michael Sproul
b4c60807dd Implement DB upgrade migration 2022-03-10 15:31:32 +11:00
Michael Sproul
0ee31a0a69 Add lighthouse db command! 2022-03-08 13:39:24 +11:00
Michael Sproul
e48ab54dcc Jemalloc tuning via Cargo config 2022-03-07 18:47:05 +11:00
Michael Sproul
f93dfd0c28 Arc-ify immutable Validator fields 2022-03-07 17:33:59 +11:00
Michael Sproul
73af0b6282 CLI flags for state cache and compression level 2022-03-02 18:52:35 +11:00
Michael Sproul
64f0e3e13d New state pruning algorithm 2022-03-02 15:40:56 +11:00
Michael Sproul
ebe8e30171 Merge remote-tracking branch 'origin/unstable' into tree-states 2022-03-01 16:03:41 +11:00
Michael Sproul
98629ce741 Several changes
* Fix state cache pruning of finalized state from block map
* Update to latest `milhouse`
* Check beacon state diffs in EF tests
2022-03-01 15:54:14 +11:00
Age Manning
a1b730c043 Cleanup small issues (#3027)
Downgrades some excessive networking logs and corrects some metrics.
2022-03-01 01:49:22 +00:00
Paul Hauner
27e83b888c Retrospective invalidation of exec. payloads for opt. sync (#2837)
## Issue Addressed

NA

## Proposed Changes

Adds the functionality to allow blocks to be validated/invalidated after their import as per the [optimistic sync spec](https://github.com/ethereum/consensus-specs/blob/dev/sync/optimistic.md#how-to-optimistically-import-blocks). This means:

- Updating `ProtoArray` to allow flipping the `execution_status` of ancestors/descendants based on payload validity updates (see the sketch after this list).
- Creating separation between `execution_layer` and the `beacon_chain` by creating a `PayloadStatus` struct.
- Refactoring how the `execution_layer` selects a `PayloadStatus` from the multiple statuses returned from multiple EEs.
- Adding testing framework for optimistic imports.
- Add `ExecutionBlockHash(Hash256)` new-type struct to avoid confusion between *beacon block roots* and *execution payload hashes*.
- Add `merge` to [`FORKS`](c3a793fd73/Makefile (L17)) in the `Makefile` to ensure we test the beacon chain with merge settings.
    - Fix some tests here that were failing due to a missing execution layer.
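Since flipping `execution_status` is the crux of the change, here is a minimal sketch of the idea, assuming a simplified array-backed block tree; all types and method names below are illustrative, not the actual `ProtoArray` code:

```rust
// Illustrative sketch only, not the actual `ProtoArray` implementation: a
// simplified array-backed block tree whose execution statuses can be flipped
// after import, per the optimistic sync spec.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ExecutionStatus {
    Valid,
    Invalid,
    Optimistic, // imported without an EE verdict yet
}

struct Node {
    parent: Option<usize>,
    execution_status: ExecutionStatus,
}

struct ProtoArray {
    // Insertion order, so children always appear after their parents.
    nodes: Vec<Node>,
}

impl ProtoArray {
    /// A payload judged VALID implies all ancestor payloads are valid too.
    fn validate_ancestors(&mut self, mut index: usize) {
        loop {
            self.nodes[index].execution_status = ExecutionStatus::Valid;
            match self.nodes[index].parent {
                Some(parent) => index = parent,
                None => break,
            }
        }
    }

    /// A payload judged INVALID invalidates every descendant block.
    fn invalidate_descendants(&mut self, root: usize) {
        let mut invalid = vec![root];
        for i in root + 1..self.nodes.len() {
            if let Some(parent) = self.nodes[i].parent {
                if invalid.contains(&parent) {
                    invalid.push(i);
                }
            }
        }
        for i in invalid {
            self.nodes[i].execution_status = ExecutionStatus::Invalid;
        }
    }
}

fn main() {
    let mut tree = ProtoArray {
        nodes: vec![
            Node { parent: None, execution_status: ExecutionStatus::Optimistic },
            Node { parent: Some(0), execution_status: ExecutionStatus::Optimistic },
            Node { parent: Some(1), execution_status: ExecutionStatus::Optimistic },
        ],
    };
    tree.validate_ancestors(1); // blocks 0 and 1 become Valid
    tree.invalidate_descendants(2); // block 2 (and any children) become Invalid
    assert_eq!(tree.nodes[0].execution_status, ExecutionStatus::Valid);
    assert_eq!(tree.nodes[2].execution_status, ExecutionStatus::Invalid);
}
```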

## TODO

- [ ] Balance tests

Co-authored-by: Mark Mackey <mark@sigmaprime.io>
2022-02-28 22:07:48 +00:00
Michael Sproul
143cf59504 Beacon state diffs! 2022-02-25 19:35:45 +11:00
Michael Sproul
5e1f8a8480 Update to Rust 1.59 and 2021 edition (#3038)
## Proposed Changes

Lots of lint updates related to `flat_map`, `unwrap_or_else` and string patterns. I did a little more creative refactoring in the op pool, but otherwise followed Clippy's suggestions.

## Additional Info

We need this PR to unblock CI.
2022-02-25 00:10:17 +00:00
Mac L
c1df5d29cb Ensure logfile respects the validators-dir CLI flag (#3003)
## Issue Addressed

Closes #2990 

## Proposed Changes

Add a check to see if the `--validators-dir` CLI flag is set and if so store validator logs into it.
Ensure that if the log directory cannot be created, we emit a `WARN` and disable file logging rather than panicking.
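A minimal sketch of that non-panicking fallback, using a hypothetical helper (Lighthouse's real logger setup goes through `slog`, not `eprintln!`):

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper: returns whether file logging should be enabled,
/// emitting a WARN (instead of panicking) when the directory can't be made.
fn try_create_log_dir(log_dir: &Path) -> bool {
    match fs::create_dir_all(log_dir) {
        Ok(()) => true,
        Err(e) => {
            // In Lighthouse this would go through the slog stdout logger.
            eprintln!(
                "WARN: unable to create log dir {:?}: {e}; file logging disabled",
                log_dir
            );
            false
        }
    }
}

fn main() {
    let enabled = try_create_log_dir(Path::new("/tmp/lighthouse/validators/logs"));
    println!("file logging enabled: {enabled}");
}
```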

## Additional Info

Panics associated with logfiles can still occur in these scenarios:
1. The `$datadir/validators/logs` directory already exists with the wrong permissions (or was changed after creation).
1. The logfile already exists with the wrong permissions (or was changed after creation).
> These panics are cosmetic only since only the logfile thread panics. Following the panics, LH will continue to function as normal. 

I believe this is due to the use of [`slog::Fuse`](https://docs.rs/slog/latest/slog/struct.Fuse.html) when initializing the logger.
I'm not sure whether there is a better way of handling logfile errors.
I think ideally, rather than panicking, we would emit a `WARN` to the stdout logger with the panic reason, then exit the logfile thread gracefully.
2022-02-24 00:31:35 +00:00
Mac L
696de58141 Add aliases for validator-dir flags (#3034)
## Issue Addressed

#3020

## Proposed Changes

- Alias the `validators-dir` arg to `validator-dir` in the `validator_client` subcommand.
- Alias the `validator-dir` arg to `validators-dir` in the `account_manager validator` subcommand.
- Add test for the validator_client alias.
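For illustration, a sketch of how such an alias is typically declared with clap 2.x (the argument parser Lighthouse's CLI is built on at this point); the exact arg definitions in the PR may differ:

```rust
use clap::{App, Arg};

// Sketch using clap 2.x; illustrative only, not the PR's exact wiring.
fn main() {
    let matches = App::new("validator_client")
        .arg(
            Arg::with_name("validators-dir")
                .long("validators-dir")
                // `--validator-dir` becomes a hidden alias of the canonical flag.
                .alias("validator-dir")
                .takes_value(true),
        )
        .get_matches_from(vec!["validator_client", "--validator-dir", "/tmp/validators"]);

    assert_eq!(matches.value_of("validators-dir"), Some("/tmp/validators"));
}
```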
2022-02-22 03:09:02 +00:00
Paul Hauner
5a0b049049 Avoid hogging the fallback status lock in the VC (#3022)
## Issue Addressed

Addresses https://github.com/sigp/lighthouse/issues/2926

## Proposed Changes

Appropriated from https://github.com/sigp/lighthouse/issues/2926#issuecomment-1039676768:

When a node returns *any* error we call [`CandidateBeaconNode::set_offline`](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L424)) which sets its `status` to `CandidateError::Offline`. That node will then be ignored until the routine [`fallback_updater_service`](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L44)) manages to reconnect to it.

However, I believe there was an issue in the [`CandidateBeaconNode::refresh_status`](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L157-L178)) method, which is used by the updater service to see if the node has come good again. It was holding a [write lock on the `status` field](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L165)) whilst it polled the node status. This means a long timeout would hog the write lock and starve other processes.

When a VC is trying to access a beacon node for whatever purpose (getting duties, posting blocks, etc), it performs [three passes](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L432-L482)) through the lists of nodes, trying to run some generic `function` (closure, lambda, etc) on each node:

- 1st pass: only try running `function` on all nodes which are both synced and online.
- 2nd pass: try running `function` on all nodes that are online, but not necessarily synced.
- 3rd pass: for each offline node, try refreshing its status and then running `function` on it.

So, it turns out that if the `CandidateBeaconNode::refresh_status` function from the routine update service is hogging the write-lock, the 1st pass gets blocked whilst trying to read the status of the first node. Nodes that should be left until the 3rd pass thus block the 1st and 2nd passes, hence the behaviour described in #2926.
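A sketch of the three-pass strategy described above, with hypothetical types standing in for the real `beacon_node_fallback.rs` code:

```rust
// Hypothetical types; the shape of the three passes is the point, not the API.
#[derive(Clone, Copy, PartialEq)]
enum Status {
    SyncedAndOnline,
    Online, // online but not necessarily synced
    Offline,
}

struct Candidate {
    status: Status,
}

impl Candidate {
    /// Re-poll an offline node to see if it has come good again.
    fn refresh_status(&mut self) {
        // (Would query the node here, ideally without holding a write lock.)
    }
}

fn first_success<T>(
    nodes: &mut [Candidate],
    mut function: impl FnMut(&Candidate) -> Result<T, ()>,
) -> Result<T, ()> {
    // 1st pass: only nodes that are both synced and online.
    for node in nodes.iter().filter(|n| n.status == Status::SyncedAndOnline) {
        if let Ok(val) = function(node) {
            return Ok(val);
        }
    }
    // 2nd pass: any online node, synced or not.
    for node in nodes.iter().filter(|n| n.status == Status::Online) {
        if let Ok(val) = function(node) {
            return Ok(val);
        }
    }
    // 3rd pass: refresh each offline node's status, then try it.
    for node in nodes.iter_mut().filter(|n| n.status == Status::Offline) {
        node.refresh_status();
        if let Ok(val) = function(node) {
            return Ok(val);
        }
    }
    Err(())
}

fn main() {
    let mut nodes = vec![
        Candidate { status: Status::Offline },
        Candidate { status: Status::SyncedAndOnline },
    ];
    // Succeeds in the 1st pass without ever touching the offline node.
    let result = first_success(&mut nodes, |_node| Ok::<_, ()>("done"));
    assert_eq!(result, Ok("done"));
}
```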

## Additional Info

NA
2022-02-22 03:09:00 +00:00
Michael Sproul
b37d5db8df Increase Bors timeout, refine target-branch-check (#3035)
## Issue Addressed

Timeouts due to Windows builds running for 2h 20m.

## Proposed Changes

* Increase Bors timeout to 3h
* Refine the target branch check so that it will pass when we make PRs to feature branches. This is just an extra change I've been meaning to sneak in for a while.

## Additional Info

* I think it would also be cool to try caching for CI again, but that's a separate issue and we'll still need the long timeout on a cache miss.
2022-02-21 23:21:03 +00:00
Mac L
104e3104f9 Add API to compute block packing efficiency data (#2879)
## Issue Addressed
N/A

## Proposed Changes
Add an HTTP API which can be used to compute the block packing data for all blocks over a discrete range of epochs.

## Usage
### Request
```
curl "http:localhost:5052/lighthouse/analysis/block_packing_efficiency?start_epoch=57730&end_epoch=57732"
```
### Response
```
[
  {
    "slot": "1847360",
    "block_hash": "0xa7dc230659802df2f99ea3798faede2e75942bb5735d56e6bfdc2df335dcd61f",
    "proposer_info": {
      "validator_index": 1686,
      "graffiti": ""
    },
    "available_attestations": 7096,
    "included_attestations": 6459,
    "prior_skip_slots": 0
  },
  ...
]
```
## Additional Info

This is notably different to the existing lcli code:
- Uses `BlockReplayer` (#2863) and as such runs significantly faster than the previous method.
- Corrects the off-by-one error (#2878).
- Removes the `offline` validators component. This was only a "best guess" used to estimate the "true" packing efficiency, and it was generally not helpful for direct comparisons between different packing methods. As such it has been removed from the API; any future estimate of "offline" validators would be better suited to a separate, more targeted API, or to 'beacon watch' (#2873).
- Includes `prior_skip_slots`.
2022-02-21 23:21:02 +00:00
Michael Sproul
0a4dcdd4e3 Very spicy consensus optimisations 2022-02-18 17:34:53 +11:00
eklm
56b2ec6b29 Allow proposer duties request for the next epoch (#2963)
## Issue Addressed

Closes #2880 

## Proposed Changes

Support requests for the next epoch in the proposer_duties API.

## Additional Info

Implemented by skipping the proposer cache for this case: the cache entry for the future epoch would be missed every new slot anyway, because its dependent_root changes, and we don't want to "wash out" useful entries by saving these additional values.
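A sketch of the described control flow, with hypothetical helper functions (the real implementation works on beacon state and the proposer cache):

```rust
// Illustrative control flow only; the helpers are hypothetical stand-ins.
fn proposer_duties(request_epoch: u64, current_epoch: u64) -> Result<Vec<u64>, String> {
    if request_epoch == current_epoch {
        // Current epoch: the cache is useful, since dependent_root is stable.
        cached_or_computed_duties(request_epoch)
    } else if request_epoch == current_epoch + 1 {
        // Next epoch: skip the cache. Its dependent_root changes every slot,
        // so cached entries would miss anyway and only wash out live ones.
        compute_duties(request_epoch)
    } else {
        Err(format!("epoch {request_epoch} is not the current or next epoch"))
    }
}

fn cached_or_computed_duties(epoch: u64) -> Result<Vec<u64>, String> {
    // (Would consult the proposer cache before falling back to compute.)
    compute_duties(epoch)
}

fn compute_duties(_epoch: u64) -> Result<Vec<u64>, String> {
    // (Would compute proposer indices from the appropriate beacon state.)
    Ok(vec![])
}

fn main() {
    assert!(proposer_duties(11, 10).is_ok()); // next-epoch request now allowed
    assert!(proposer_duties(12, 10).is_err());
}
```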
2022-02-18 05:32:00 +00:00
Michael Sproul
82bf8a3351 Delete current epoch vals from ParticipationCache 2022-02-18 14:22:25 +11:00
tim gretler
c8019caba6 Fix sync committee polling for 0 validators (#2999)
## Issue Addressed

#2953

## Proposed Changes

Adds empty local validator check. 
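A minimal sketch of such a guard, using hypothetical names:

```rust
// Hypothetical guard: skip sync committee polling with no local validators.
fn poll_sync_committee_duties(local_indices: &[u64]) -> Result<(), String> {
    if local_indices.is_empty() {
        // Nothing to poll for: return before issuing a request for 0 validators.
        return Ok(());
    }
    // ... fetch and process sync committee duties for `local_indices` ...
    Ok(())
}

fn main() {
    assert!(poll_sync_committee_duties(&[]).is_ok());
}
```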

## Additional Info

Two other options:
- Add the check inside the `local_index` collection, instead of after collection.
- Move the `local_index` collection to the beginning of the `poll_sync_committee_duties` function and combine the sync committee check with the Altair fork check.
2022-02-18 02:36:44 +00:00
Age Manning
3ebb8b0244 Improved peer management (#2993)
## Issue Addressed

I noticed some excess and unnecessary discovery queries in some logs. What was happening was that we were pruning our peers down to our outbound target and then having some disconnect. When we are below this threshold we try to find more peers (even if we are at our peer limit). The request becomes futile because we have no more peer slots.

This PR corrects this issue and advances the pruning mechanism to favour subnet peers. 

An overview of the new logic added is:
- We prune peers down to a target outbound peer count which is higher than the minimum outbound peer count.
- We only search for more peers if there is room to do so and we are below the minimum outbound peer count, not the target. This gives us some buffer for peers to disconnect. The buffer is currently 10%.

The modified pruning logic is documented in the code but for reference it should do the following:
- Prune peers with bad scores first.
- If we need to prune more peers, prune peers that are not subscribed to a long-lived subnet (consistent with favouring subnet peers).
- If we still need to prune peers, prune peers that we have a higher density of on any given subnet, which should drive towards uniform peers across all subnets.
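A rough sketch of that pruning order expressed as a sort, with illustrative fields (the real peer manager works on much richer peer metadata):

```rust
use std::cmp::Ordering;

// Illustrative fields only; not the actual peer manager types.
#[derive(Debug)]
struct Peer {
    score: f64,
    long_lived_subnets: usize,
    subnet_density: usize, // how crowded this peer's subnets already are
}

/// Sort so the peers to prune come first: worst scores, then peers on no
/// long-lived subnet, then peers on over-represented subnets.
fn sort_for_pruning(peers: &mut [Peer]) {
    peers.sort_by(|a, b| {
        a.score
            .partial_cmp(&b.score)
            .unwrap_or(Ordering::Equal)
            .then_with(|| a.long_lived_subnets.cmp(&b.long_lived_subnets))
            .then_with(|| b.subnet_density.cmp(&a.subnet_density))
    });
}

fn main() {
    let mut peers = vec![
        Peer { score: 0.0, long_lived_subnets: 1, subnet_density: 1 },
        Peer { score: -5.0, long_lived_subnets: 2, subnet_density: 0 },
    ];
    sort_for_pruning(&mut peers);
    assert_eq!(peers[0].score, -5.0); // worst score is pruned first
}
```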

This will need a bit of testing as it modifies some significant peer management behaviours in Lighthouse.
2022-02-18 02:36:43 +00:00
Michael Sproul
da4ca024f1 Use SmallVec in Bitfield (#3025)
## Issue Addressed

Alternative to #2935

## Proposed Changes

Replace the `Vec<u8>` inside `Bitfield` with a `SmallVec<[u8; 32]>`. This eliminates heap allocations for attestation bitfields until we reach 500K validators, at which point we can consider increasing `SMALLVEC_LEN` to 40 or 48.

While running Lighthouse under `heaptrack` I found that SSZ encoding and decoding of bitfields corresponded to 22% of all allocations by count. I've confirmed that with this change applied those allocations disappear entirely.
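To illustrate the inline-until-spill behaviour this relies on (assuming the `smallvec` crate, 1.x):

```rust
use smallvec::{smallvec, SmallVec};

// Bytes live inline until the bitfield exceeds 32 bytes (256 bits), after
// which smallvec transparently spills to the heap.
type BitfieldBytes = SmallVec<[u8; 32]>;

fn main() {
    let small: BitfieldBytes = smallvec![0u8; 16];
    assert!(!small.spilled()); // stack-allocated: no heap allocation

    let large: BitfieldBytes = smallvec![0u8; 64];
    assert!(large.spilled()); // > 32 bytes: spilled to the heap
}
```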

## Additional Info

We can win another 8 bytes of space by using `smallvec`'s [`union` feature](https://docs.rs/smallvec/1.8.0/smallvec/#union), although I might leave that for a future PR because I don't know how experimental that feature is and whether it uses some spicy `unsafe` blocks.
2022-02-17 23:55:04 +00:00
Paul Hauner
0a6a8ea3b0 Engine API v1.0.0.alpha.6 + interop tests (#3024)
## Issue Addressed

NA

## Proposed Changes

This PR extends #3018 to address my review comments there and add automated integration tests with Geth (and other implementations, in the future).

I've also de-duplicated the "unused port" logic by creating a `common/unused_port` crate.

## Additional Info

I'm not sure if we want to merge this PR, or update #3018 and merge that. I don't mind, I'm primarily opening this PR to make sure CI works.


Co-authored-by: Mark Mackey <mark@sigmaprime.io>
2022-02-17 21:47:06 +00:00
Michael Sproul
0b171cf097 Use rustc-hash in participation cache 2022-02-17 17:32:40 +11:00
Michael Sproul
c88fcfed2b Implement ConsensusContext 2022-02-17 16:40:32 +11:00
Michael Sproul
1db0e32bfb Optimisations and bug fixes for state advance
This commit is reasonably performant on Prater!
2022-02-17 14:00:57 +11:00
Michael Sproul
f5dae9106e Inline safe_arith methods 2022-02-16 17:34:00 +11:00
Michael Sproul
062720f62e Use SmallVec in Bitfield 2022-02-15 17:45:53 +11:00
Michael Sproul
5340c49de7 Use smallvec for tree hash packed encoding 2022-02-15 16:52:33 +11:00
Michael Sproul
e86cff2f8b Load all states relative to finalized state 2022-02-15 15:37:24 +11:00
Michael Sproul
b8709fdcab Fixups (still loading epoch boundary states) 2022-02-15 12:10:02 +11:00
Michael Sproul
5ff4868280 Merge remote-tracking branch 'michael/state-root-summary' into tree-states 2022-02-15 12:05:54 +11:00
Michael Sproul
5ed951d84c Merge remote-tracking branch 'origin/unstable' into tree-states 2022-02-15 12:00:52 +11:00
Paul Hauner
2f8531dc60 Update to consensus-specs v1.1.9 (#3016)
## Issue Addressed

Closes #3014

## Proposed Changes

- Rename `receipt_root` to `receipts_root`
- Rename `execute_payload` to `notify_new_payload`
   - This is slightly weird since we modify everything except the actual HTTP call to the engine API. That change is expected to be implemented in #2985 (cc @ethDreamer)
- Enable "random" tests for Bellatrix.

## Notes

This will *partially* break compatibility with Kintsugi testnets in order to gain compatibility with [Kiln](https://hackmd.io/@n0ble/kiln-spec) testnets. I think it will only break the BN APIs due to the `receipts_root` change, however it might have some other effects too.

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2022-02-14 23:57:23 +00:00
Michael Sproul
f888a08f15 Revamp state advance, delete snapshot cache 2022-02-14 16:16:12 +11:00
Michael Sproul
886afd684a Update block reward API docs (#3013)
## Proposed Changes

Fix the URLs and source code link in the docs for the block rewards API.
2022-02-11 11:02:09 +00:00
Michael Sproul
42e4675c97 Persistent PubkeyCache on the state 2022-02-11 18:15:34 +11:00
Michael Sproul
c97f6dcc06 Persistent committee caches and exit cache 2022-02-11 17:41:43 +11:00
Paul Hauner
c3a793fd73 v2.1.3 (#3017)
## Issue Addressed

NA

## Proposed Changes

Bump versions

## Additional Info

NA
v2.1.3
2022-02-11 01:54:33 +00:00
Zachinquarantine
b5921e4248 Remove Pyrmont testnet (#2543)
## Issue Addressed

N/A

## Proposed Changes

Removes all configurations and hard-coded rules related to the deprecated Pyrmont testnet.

## Additional Info

Pyrmont is deprecated/will be shut down after being used for scenario testing, this PR removes configurations related to it.

Co-authored-by: Zachinquarantine <zachinquarantine@yahoo.com>
2022-02-10 06:02:55 +00:00
Divma
1306b2db96 libp2p upgrade + gossipsub interval fix (#3012)
## Issue Addressed
Lighthouse gossiping late messages

## Proposed Changes
Point LH to our fork using a tokio interval, which 1) works as expected and 2) is more performant than the previous version that actually worked as expected.
Upgrade libp2p.
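For reference, `tokio::time::interval` ticks on a fixed schedule, with the first tick completing immediately. A minimal illustration (assuming a `tokio` runtime with the `time` and `macros` features enabled):

```rust
use std::time::Duration;
use tokio::time::interval;

// Illustration only: a fixed-schedule heartbeat loop, as a gossipsub
// heartbeat timer would use it.
#[tokio::main]
async fn main() {
    let mut heartbeat = interval(Duration::from_millis(700));
    for i in 0..3 {
        heartbeat.tick().await; // first tick completes immediately
        println!("heartbeat {i}");
    }
}
```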

## Additional Info
https://github.com/libp2p/rust-libp2p/issues/2497
2022-02-10 04:12:03 +00:00
Paul Hauner
7e38d203ce Add "update priority" (#2988)
## Issue Addressed

NA

## Proposed Changes

Add the "Update Priority" section which has featured in many of our previous releases (e.g., [Poñeta](https://github.com/sigp/lighthouse/releases/v2.1.1)).

Previously this section has been copied in manually.

## Additional Info

NA
2022-02-09 07:44:42 +00:00
Michael Sproul
4340ba01b5 More tree fields, fix bugs 2022-02-09 17:42:58 +11:00
Philipp K
5388183884 Allow per validator fee recipient via flag or file in validator client (similar to graffiti / graffiti-file) (#2924)
## Issue Addressed

#2883 

## Proposed Changes

* Added `suggested-fee-recipient` & `suggested-fee-recipient-file` flags to validator client (similar to graffiti / graffiti-file implementation).
* Added a proposer preparation service to the VC, which sends the fee recipient of all known validators to the BN via the [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) API once per slot
* Added the [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) API endpoint and preparation data caching
* Added a cleanup routine to remove cached proposer preparations when not updated for 2 epochs (sketched below)
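A sketch of such a cleanup routine, with hypothetical types (the real preparation data holds an execution address and lives in the BN):

```rust
use std::collections::HashMap;

// Hypothetical cache: validator index -> (fee recipient, last update epoch).
struct PreparationData {
    _fee_recipient: [u8; 20],
    last_updated_epoch: u64,
}

/// Drop cached preparations that haven't been refreshed for 2 epochs.
fn prune_preparations(cache: &mut HashMap<u64, PreparationData>, current_epoch: u64) {
    cache.retain(|_index, prep| prep.last_updated_epoch + 2 >= current_epoch);
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert(42u64, PreparationData { _fee_recipient: [0; 20], last_updated_epoch: 5 });
    prune_preparations(&mut cache, 8); // 5 + 2 < 8, so the entry is stale
    assert!(cache.is_empty());
}
```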

## Additional Info

Changed the implementation following the discussion in #2883.



Co-authored-by: pk910 <philipp@pk910.de>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: Philipp K <philipp@pk910.de>
2022-02-08 19:52:20 +00:00
Paul Hauner
d172c0b9fc Bump crossbeam-utils to fix cargo-audit CI failure (#3004)
## Issue Addressed

Bump `crossbeam-utils` to `0.8.7` since `0.8.6` was yanked and that made `cargo audit` fail.
2022-02-07 23:25:09 +00:00
ladidan
1fd883d79a Fix Docker run -p for both TCP and UDP (#2998)
## Issue Addressed

`docker run ... -p 9000:9000` defaults to exposing TCP only.

## Proposed Changes

Add "-p 9000:9000/udp" for UDP peer discovery.
2022-02-07 23:25:08 +00:00
Divma
36fc887a40 Gossip cache timeout adjustments (#2997)
## Proposed Changes

- Do not retry publishing sync committee messages.
- Give a more lenient timeout to slashings and exits.
2022-02-07 23:25:06 +00:00