lighthouse

mirror of https://github.com/sigp/lighthouse.git synced 2026-06-01 21:57:15 +00:00

Author	SHA1	Message	Date
Jimmy Chen	2d223575d6	Avoid unnecessary database lookups in data column RPC requests (#7897 ) This PR is an optimisation to avoid unnecessary database lookups when peer requests data columns that the node doesn't custody (advertised via `cgc`). e.g. an extreme but realistic example - a full node only store 4 custody columns by default, but it may receive a range request of 32 slots with all 128 columns, and this would result in 4096 database lookups but the node is only able to get 128 (4 * 32) of them. - Filter data column RPC requests (`DataColumnsByRoot`, `DataColumnsByRange`) to only lookup columns the node custodies - Prevents unnecessary database queries that would always fail for non-custody columns	2025-08-20 05:08:53 +00:00
Jimmy Chen	f6859b1137	Add tempo to local testnet config and update fulu kurtosis config files (#7898 ) This PR adds tempo to kurtosis config and will collect lighthouse traces on kurtosis local testnet. The traces can be viewed / queried from Grafana. Also updated fulu kurtosis configs to use latest geth image.	2025-08-20 02:30:11 +00:00
Jimmy Chen	b4704eab4a	Fulu update to spec v1.6.0-alpha.4 (#7890 ) Fulu update to spec [v1.6.0-alpha.4](https://github.com/ethereum/consensus-specs/releases/tag/v1.6.0-alpha.4). - Make `number_of_columns` a preset - Optimise `get_custody_groups` to avoid computing if cgc = 128 - Add support for additional typenum values in type_dispatch macro	2025-08-20 02:05:04 +00:00
Jimmy Chen	95882bfa66	Add `--telemetry-service-name` CLI flag for OpenTelemetry service name override (#7903 ) Allows users to customize the OpenTelemetry service name instead of using the hardcoded default `lighthouse`. Defaults to 'lighthouse-bn' for beacon node, 'lighthouse-vc' for validator client, or 'lighthouse' for other subcommands. This is useful when analysing traces from multiple nodes, see Grafana screenshot below with service name overrides in Kurtosis (`ethereum-package` PR: https://github.com/ethpandaops/ethereum-package/pull/1160): <img width="1148" height="627" alt="image" src="https://github.com/user-attachments/assets/7e875639-10f7-4756-837f-2006fa4b12e0" />	2025-08-20 01:16:34 +00:00
Jimmy Chen	34dd1b27ae	Revise data column rpc limits and queue sizes (#7887 ) Revise data column rpc limits and queue sizes. Also removed some outdated TODOs for Fulu / das.	2025-08-19 03:48:08 +00:00
Daniel Knopik	1fd7ead010	Do not filter validators by status if filter is an empty list (#7884 ) `69d2feb12a/apis/beacon/states/validators.yaml (L128-L130)` says we need to not filter if the filter is an empty list. Add a check for `statuses.is_empty()`.	2025-08-18 07:46:37 +00:00
Michael Sproul	836c39efaa	Shrink persisted fork choice data (#7805 ) Closes: - https://github.com/sigp/lighthouse/issues/7760 - [x] Remove `balances_cache` from `PersistedForkChoiceStore` (~65 MB saving on mainnet) - [x] Remove `justified_balances` from `PersistedForkChoiceStore` (~16 MB saving on mainnet) - [x] Remove `balances` from `ProtoArray`/`SszContainer`. - [x] Implement zstd compression for votes - [x] Fix bug in justified state usage - [x] Bump schema version to V28 and implement migration.	2025-08-18 06:03:28 +00:00
Michael Sproul	08234b2823	Add rustfmt config with edition 2024 (#7888 ) Since we updated to edition 2024 my Vim plugin for rustfmt is formatting code incorrectly, with 2018 settings: `889b9a7515/autoload/rustfmt.vim (L74-L75)` Arguably this plugin is a bit junk, but I think it's fairly harmless to add this config. Add `rustfmt.toml`. This is a generic config file for `rustfmt` which is probably useful for `rustfmt` integration with other editors too. We may want to add other config to `rustfmt.toml` over time as well, I think this was discussed recently.	2025-08-18 04:32:58 +00:00
Jimmy Chen	aa8cba3741	Upgrade rust-eth-kzg to `0.8.0` (#7870 ) #7864 The main breaking change in v0.8.0 is the `TrustedSetup` initialisation - it now requires a json string via `PeerDASTrustedSetup::from_json`.	2025-08-18 02:52:39 +00:00
Age Manning	9200042910	Transition network key to hex format (#7665 ) #7181 Instead of storing the network key as binary data we store it as hex, allowing users to modify it via the file. We can read old-binary forms, however we will migrate binary to hex as it will be the new standard.	2025-08-15 07:12:19 +00:00
Michael Sproul	317dc0f56c	Fix malloc_utils features (sysmalloc) (#7770 ) Follow-up to: - https://github.com/sigp/lighthouse/pull/7764 The `heaptrack` feature added in my previous PR was ineffective, because the jemalloc feature was turned on by the Linux target-specific dependency. This PR tweaks the features such that: - The jemalloc feature is just used to control whether jemalloc is compiled in. It is enabled on Linux by the target-specific dependency (see `lighthouse/Cargo.toml`), and completely disabled on Windows. - If the `sysmalloc` feature is set on Linux then it overrides jemalloc when selecting an allocator, _even if_ the jemalloc feature is enabled (and the jemalloc dep was compiled).	2025-08-15 03:46:38 +00:00
Eitan Seri-Levi	90fa7c216e	Fix ssz formatting for `/light_client/updates` beacon API endpoint (#7806 ) #7759 We were incorrectly encoding the full response from `/light_client/updates` instead of only encoding the light client update	2025-08-15 03:17:29 +00:00
Michael Sproul	42f6d7b02d	Yeet env_logger into the sun (#7872 ) - Remove explicit `env_logger` usage from `state_processing` tests and `lcli`. - Set up tracing correctly for `lcli` (I've checked that we can see logs after this change). - I didn't do anything to set up logging for the `state_processing` tests, as these are rarely run manually (they never fail). We could add `test_logger` in there on an as-needed basis.	2025-08-15 03:17:26 +00:00
antondlr	5ebb44e222	Try using sccache instead of disabling (#7873 ) We temporarily can't build sccache on windows runners, but it's still available on linux. this smol change lets us use it when available, instead of disabling across the board. The Windows runners now have a conditional check to disable (unset the `rustc-wrapper` env var) sccache in their entrypoint, just like the Linux ones have. Also the workflows no longer fail when `sccache --show-stats` fails.	2025-08-14 00:10:13 +00:00
Age Manning	ee1b0ae2ff	Allow for sync state where batch is unknown (#7391 )	2025-08-13 06:00:49 +00:00
chonghe	522bd9e9c6	Update Rust Edition to 2024 (#7766 ) * #7749 Thanks @dknopik and @michaelsproul for your help!	2025-08-13 03:04:31 +00:00
Michael Sproul	bd6b8b6a65	Disable sccache to fix Windows builds (#7867 ) Quick fix to unblock Windows CI. I have a hammer and I'm using it.	2025-08-13 01:51:19 +00:00
antondlr	6604fd10b4	Deprecate macOS-x86 binaries (#7862 ) Rust is demoting x86 for macOS: https://blog.rust-lang.org/2025/08/07/Rust-1.89.0/ This makes it unfeasible to maintain such a build going forward. Stop publishing `x86_64-apple-darwin` binaries.	2025-08-12 07:23:31 +00:00
Jimmy Chen	4ef4bdc38b	Initial Claude.md draft (#7848 ) Add Initial `Claude.md` draft. Feel free to comment and make suggestions.	2025-08-12 05:16:23 +00:00
Mac L	152f2bb2e4	Re-export `context_deserialize_derive` inside `context_deserialize` (#7852 ) Re-export `context_deserialize_derive` inside of `context_deserialize` so they are both available from the same interface, which matches how popular crates (like `serde`) handle this. This also nests both crates inside a new `context_deserialize` directory which will make it easier to eventually spin out into a different repo (if/when) we decide to do that (plus I prefer it aesthetically).	2025-08-12 05:16:19 +00:00
Michael Sproul	918121e313	Fix bugs in rebasing of states prior to finalization (#7849 ) Attempt to fix this error reported by `beaconcha.in` on their Hoodi archive nodes: > {"code":500,"message":"UNHANDLED_ERROR: DBError(CacheBuildError(BeaconState(MilhouseError(OutOfBoundsIterFrom { index: 1199549, len: 1060000 }))))","stacktraces":[]} There are only a handful of places where we call `iter_from`. This one is safe by construction (the check immediately prior ensures `self.pubkeys.len()` is not out of bounds): `cfb1f73310/beacon_node/beacon_chain/src/validator_pubkey_cache.rs (L84-L90)` This one should also be safe, and the indexes used here would not be as large as the ones in the reported error: `cfb1f73310/consensus/state_processing/src/per_epoch_processing/single_pass.rs (L365-L368)` Which leaves one remaining usage which must be the culprit: `cfb1f73310/consensus/types/src/beacon_state.rs (L2109-L2113)` This indexing relies on the invariant that `self.pubkey_cache().len() <= self.validators.len()`. We mostly maintain that invariant, except for in `rebase_caches_on` (fixed in this PR). The other bug, is that we were calling `rebase_on_finalized` for all "hot" states, which post-v7.1.0 includes states prior to the split which are required by the hdiff grid. This is how we end up calling something like `genesis_state.rebase_on(&split_state)`, which then corrupts the pubkey cache of the genesis state using the newer pubkey cache from the split state.	2025-08-12 02:19:24 +00:00
Pawan Dhananjay	80ba0b169b	Backfill peer attribution (#7762 ) Partly addresses https://github.com/sigp/lighthouse/issues/7744 Implement similar peer sync attribution like in #7733 for backfill sync.	2025-08-12 02:11:56 +00:00
Eitan Seri-Levi	122f16776f	Add metrics to track beacon processor queue times (#7808 ) This PR adds a created_timestamp to the beacon processor send channel. When work items are sent through that channel `try_send` will forward the work event along with the current timestamp to the beacon processor. When the work event is completed the `Drop` impl for `SendOnDrop` will track the time it took from work event creation to its completion. Previously we only had data on how long a work event took to process, but didn't have data on how long it sat in the queue + how long it took to process.	2025-08-12 01:06:42 +00:00
Pawan Dhananjay	4262ad3e01	Add a flag to disable getBlobs (#7853 ) N/A Add a flag to disable get blobs. I configured the flag to disable it regardless of version because its most likely something we use for testing anyway.	2025-08-11 23:17:00 +00:00
Jimmy Chen	40c2fd5ff4	Instrument tracing spans for block processing and import (#7816 ) #7815 - removes all existing spans, so some span fields that appear in logs like `service_name` may be lost. - instruments a few key code paths in the beacon node, starting from root spans named below: * Gossip block and blobs * `process_gossip_data_column_sidecar` * `process_gossip_blob` * `process_gossip_block` * Rpc block and blobs * `process_rpc_block` * `process_rpc_blobs` * `process_rpc_custody_columns` * Rpc blocks (range and backfill) * `process_chain_segment` * `PendingComponents` lifecycle * `pending_components` To test locally: * Run Grafana and Tempo with https://github.com/sigp/lighthouse-metrics/pull/57 * Run Lighthouse BN with `--telemetry-collector-url http://localhost:4317` Some captured traces can be found here: https://hackmd.io/@jimmygchen/r1sLOxPPeg Removing the old spans seem to have reduced the memory usage quite a lot - i think we were using them on long running tasks and too excessively: <img width="910" height="495" alt="image" src="https://github.com/user-attachments/assets/5208bbe4-53b2-4ead-bc71-0b782c788669" />	2025-08-08 05:32:22 +00:00
Jimmy Chen	6dfab22267	Fix Rust 1.89 compiler warnings in slasher tests. (#7844 ) As described in title, failing test here https://github.com/sigp/lighthouse/actions/runs/16818997885/job/47646515894	2025-08-08 04:41:08 +00:00
Daniel Ramirez-Chiquillo	cafb3644e2	Fix Makefile line continuation syntax in test-release target (#7834 ) #7833 Fix a typo on the `Makefile` that was causing `make test` to run `http_api` tests when they should have been ignored.	2025-08-07 08:32:52 +00:00
Jimmy Chen	3a02bdd94a	Adjust DA checker cache size (#7825 ) The current `OVERFLOW_LRU_CAPACITY` of `1024` seems a bit excessive now we rarely store more than 1 `PendingComponents` (under normal networking components). Additionally given the blob count increases, the max size of `PendingComponents` has also increased and is expected to increase further. This PR brings the max capacity of the cache down to `64`, which should be more than enough headroom but also give us better protection from the network.	2025-08-07 05:11:38 +00:00
Jimmy Chen	8bc6693dac	Fix wrong columns getting processed on a CGC change (#7792 ) This PR fixes a bug where wrong columns could get processed immediately after a CGC increase. Scenario: - The node's CGC increased due to additional validators attached to it (lets say from 10 to 11) - The new CGC is advertised and new subnets are subscribed immediately, however the change won't be effective in the data availability check until the next epoch (See [this](`ab0e8870b4/beacon_node/beacon_chain/src/validator_custody.rs (L93-L99)`)). Data availability checker still only require 10 columns for the current epoch. - During this time, data columns for the additional custody column (lets say column 11) may arrive via gossip as we're already subscribed to the topic, and it may be incorrectly used to satisfy the existing data availability requirement (10 columns), and result in this additional column (instead of a required one) getting persisted, resulting in database inconsistency.	2025-08-07 00:45:04 +00:00
Daniel Ramirez-Chiquillo	9c972201bc	Fix: RPC test failures (#7734 ) Fixes #7735 Use `tracing::subscriber::set_default` to ensure that each test/thread has its own subscirber.	2025-08-06 14:59:41 +00:00
Eric Tu	c06ac81c67	Shuffling for 32 bit platforms (#7725 ) - In shuffling, a the raw_pivot (u64) is cast to a usize which will break on 32 bit systems. Now it is modulo'ed with the list_size first then cast to a usize. - ruint doesn't implement shifting with u64's on 32-bit arch. Since `prefix_bits` is u8 and NODE_ID_BITS = 256, we use them as u32's instead. See: https://docs.rs/ruint/latest/src/ruint/bits.rs.html#711	2025-08-06 02:37:07 +00:00
Michael Sproul	0dcce40ccb	Fix Clippy for Rust 1.90 beta (#7826 ) Fix Clippy for recently released Rust 1.90 beta. There may be more changes required when Rust 1.89 stable is released in a few days, but possibly not 🤞	2025-08-05 13:52:26 +00:00
Jimmy Chen	adf6ad70f0	Update fetch blobs metrics buckets (#7823 ) While looking at metrics I noticed that `beacon_blobs_from_el_expected` and `beacon_blobs_from_el_received_total` have different buckets, this PR adds more buckets to both (to prepare for Fusaka) and make them both consistent.	2025-08-01 18:27:53 +00:00
Age Manning	2f59d5208a	Filter dependencies from SSE logging (#7819 )	2025-08-01 04:45:20 +00:00
Michael Sproul	134039d014	Simplify ConfigAndPreset (#7777 ) I noticed that we are serving preset values for Fulu on mainnet nodes prior to the fork. This has already gone live in v7.1.0, but should hopefully be handled in a graceful way by API consumers. This PR _reverts_ the serving of Fulu data prior to Fulu, by serving Fulu data only if Fulu is scheduled.	2025-07-25 08:53:24 +00:00
Pawan Dhananjay	09065a851f	Add builder blinded_blocks v2 (#7778 ) Partially addresses https://github.com/sigp/lighthouse/issues/7381 Add blinded_blocks v2 method specified in https://github.com/ethereum/builder-specs/pull/123/	2025-07-25 08:29:19 +00:00
Jimmy Chen	2aae08a8aa	Remove KZG verification on blobs fetched from the EL (#7771 ) Continuation of #7713, addresses comment about skipping KZG verification on EL fetched blobs: https://github.com/sigp/lighthouse/pull/7713#discussion_r2198542501	2025-07-25 06:49:50 +00:00
Eitan Seri-Levi	6a52454647	Update spec tests to 1.6.0-alpha.3 (#7786 ) #7782	2025-07-25 06:49:47 +00:00
Jimmy Chen	1a6eeb228c	Bump Rust version to 1.88 (#7787 ) In #7743, rust version was bumped: - msrv to 1.87 - `Dockerfile` to 1.88 We also need to bump the other docker images as well, and might as well keep them all consistent at 1.88.	2025-07-25 05:52:51 +00:00
Michael Sproul	b904956074	Skip serializing blob_schedule before Fulu (#7779 ) Alternative to: - https://github.com/sigp/lighthouse/pull/7758 Serve the `blob_schedule` field on `/eth/v1/config/spec` _only_ when Fulu is enabled. If the blob schedule is empty, we will still serve it as `[]`, so long as Fulu is enabled.	2025-07-24 18:14:25 +00:00
Eric Tu	9911f348bc	Feature gate arbitrary crate in the consensus types crate (#7743 ) Which issue # does this PR address? Puts the `arbitrary` crate behind a feature flag in the `types` crate.	2025-07-23 16:55:02 +00:00
Jimmy Chen	4daa015971	Remove peer sampling code (#7768 ) Peer sampling has been completely removed from the spec. This PR removes our partial implementation from the codebase. https://github.com/ethereum/consensus-specs/pull/4393	2025-07-23 03:24:45 +00:00
chonghe	c4b973f5ba	Use SSZ by default when calling /eth/v3/validator/blocks (#7727 ) * #7698	2025-07-23 00:29:21 +00:00
Michael Sproul	ce99e0c383	Refine delayed head block logging (#7705 ) Small tweak to `Delayed head block` logging to make it more representative of actual issues. Previously we used the total import delay to determine whether a block was late, but this includes the time taken for IO (and now hdiff computation) which happens _after_ the block is made attestable. This PR changes the logic to use the attestable delay (where possible) falling back to the previous value if the block doesn't have one; e.g. if it didn't meet the conditions to make it into the attestable cache.	2025-07-23 00:29:18 +00:00
Mac L	e6089fe7db	Control span data through tracing Extensions (#7239 ) #7234 Removes the `Arc<Mutex<_>` which was used to store and manage span data and replaces it with the inbuilt `Extension` for managing span-specific data. This also avoids an `unwrap` which was used when acquiring the lock over the mutex'd span data.	2025-07-22 14:22:03 +00:00
Eitan Seri-Levi	db8b6be9df	Data column custody info (#7648 ) #7647 Introduces a new record in the blobs db `DataColumnCustodyInfo` When `DataColumnCustodyInfo` exists in the db this indicates that a recent cgc change has occurred and/or that a custody backfill sync is currently in progress (custody backfill will be added as a separate PR). When a cgc change has occurred `earliest_available_slot` will be equal to the slot at which the cgc change occured. During custody backfill sync`earliest_available_slot` should be updated incrementally as it progresses. ~~Note that if `advertise_false_custody_group_count` is enabled we do not add a `DataColumnCustodyInfo` record in the db as that would affect the status v2 response.~~ (See comment https://github.com/sigp/lighthouse/pull/7648#discussion_r2212403389) ~~If `DataColumnCustodyInfo` doesn't exist in the db this indicates that we have fulfilled our custody requirements up to the DA window.~~ (It now always exist, and the slot will be set to `None` once backfill is complete) StatusV2 now uses `DataColumnCustodyInfo` to calculate the `earliest_available_slot` if a `DataColumnCustodyInfo` record exists in the db, if it's `None`, then we return the `oldest_block_slot`.	2025-07-22 13:30:30 +00:00
Jimmy Chen	b48879a566	Remove KZG verification from local block production and blobs fetched from the EL (#7713 ) #7700 As described in title, the EL already performs KZG verification on all blobs when they entered the mempool, so it's redundant to perform extra validation on blobs returned from the EL. This PR removes - KZG verification for both blobs and data columns during block production - KZG verification for data columns after fetch engine blobs call. I have not done this for blobs because it requires extra changes to check the observed cache, and doesn't feel like it's a worthy optimisation given the number of blobs per block. This PR does not remove KZG verification on the block publishing path yet.	2025-07-22 10:48:49 +00:00
Michael Sproul	4a3e248b7e	Add heaptrack support (#7764 ) Although we're working on jemalloc profiler support in https://github.com/sigp/lighthouse/pull/7746, heaptrack seems to be producing more sensible results. This PR adds a heaptrack profile and a heaptrack feature so that we no longer need to patch the code in order to use heaptrack. This may prove complementary to jemalloc profiling, so I think there is no harm in having both.	2025-07-21 02:11:27 +00:00
Pawan Dhananjay	1046dfbfe7	Serialize bpo schedule in asending order (#7753 ) N/A Serializes the blob_schedule in ascending order to match other clients. This is needed to keep the output of `eth/v1/config/spec` http endpoint consistent across clients. cc @barnabasbusa	2025-07-18 05:36:18 +00:00
Pawan Dhananjay	3f06e5dfba	Fix enr loading from disk with cgc (#7754 ) N/A During building an enr on startup, we weren't using the value in the custody context. This was resulting in the enr value getting updated when the cgc updates, the change getting persisted, but getting set back to the default on restart. This PR takes the value explicitly from the custody context.	2025-07-18 04:51:11 +00:00

1 2 3 4 5 ...

6991 Commits