83 Commits

Author SHA1 Message Date
Eitan Seri-Levi
278111f4c4 Resolve merge conflicts 2026-05-17 12:13:00 +03:00
Eitan Seri-Levi
ac7fc2206a Resolve merge conflicts 2026-05-17 11:55:52 +03:00
Lion - dapplion
7148bfcdd1 Implement beacon_blocks_by_head (#9237)
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-05-07 02:41:01 +00:00
Devnet Bot
2356bdd256 chore: cargo fmt + fix clippy warnings 2026-05-06 16:08:52 +00:00
Devnet Bot
39b6f58bc2 feat(focil): retry envelope on transient EL errors
When engine_newPayloadV6 fails with a transient error (e.g. Besu's
ConcurrentModificationException), queue the envelope for retry instead
of permanently rejecting it. Matches Lodestar's behavior of retrying
on the next BlockImported event.

- Add RetryEnvelope variant to ReprocessQueueMessage
- On BlockImported, immediately dispatch any pending retry envelopes
- Fallback timeout of 1 slot in case no block arrives
- Max 3 retries per envelope to prevent infinite loops
- Only retry non-penalizing EL errors (transient failures)
2026-05-06 14:29:10 +00:00
Devnet Bot
0aa7e43e85 wip: early envelope reprocessing and is_heze_fork plumbing 2026-05-06 13:16:05 +00:00
jking-aus
330348ea14 fix: prevent duplicate column reconstruction dispatch (#9250)
Fixes a flaky CI failure in `data_column_reconstruction_at_deadline` where 2 `column_reconstruction` events are emitted instead of the expected 1.


  - Change `queued_column_reconstructions` from `HashMap<Hash256, DelayKey>` to `HashMap<Hash256, Option<DelayKey>>`, where `None` indicates reconstruction was already dispatched.
- On dispatch (`ReadyColumnReconstruction`), set the entry to `None` instead of removing it. This prevents a subsequent gossip column from inserting a fresh reconstruction request into the now-vacant slot.
- Prune stale `None` entries on each dispatch to keep the map bounded.


Co-Authored-By: Josh King <josh@sigmaprime.io>
2026-05-01 12:44:25 +00:00
Eitan Seri-Levi
2585096de6 FMT 2026-04-30 15:03:42 +02:00
Eitan Seri-Levi
bab71429f8 resolve merge conflicts 2026-04-30 01:51:26 +02:00
Daniel Knopik
8a384ff445 Cell Dissemination (Partial messages) (#8314)
- https://github.com/ethereum/consensus-specs/pull/4558
- https://eips.ethereum.org/EIPS/eip-8136


  


Co-Authored-By: Daniel Knopik <daniel@dknopik.de>

Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-04-23 18:52:28 +00:00
Lion - dapplion
bc5d8c9f90 Add range sync tests (#8989)
Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-03-31 05:07:22 +00:00
Eitan Seri-Levi
c7055b604f Gloas serve envelope rpc (#8896)
Serves envelope by range and by root requests. Added PayloadEnvelopeStreamer so that we dont need to alter upstream code when we introduce blinded payload envelopes.


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>

Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-03-25 06:45:24 +00:00
Eitan Seri-Levi
17d183eb5b Unknown block for envelope (#8992)
Add a queue that allows us to reprocess an envelope when it arrives over gossip references a unknown block root. When the block is finally imported, we immediately reprocess the queued envelope.

Note that we don't trigger a block lookup sync. Incoming attestations for this block root will already trigger a lookup for us. I think thats good enough


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2026-03-17 07:35:05 +00:00
Lion - dapplion
f4a6b8d9b9 Tree-sync friendly lookup sync tests (#8592)
- Step 0 of the tree-sync roadmap https://github.com/sigp/lighthouse/issues/7678

Current lookup sync tests are written in an explicit way that assume how the internals of lookup sync work. For example the test would do:

- Emit unknown block parent message
- Expect block request for X
- Respond with successful block request
- Expect block processing request for X
- Response with successful processing request
- etc..

This is unnecessarily verbose. And it will requires a complete re-write when something changes in the internals of lookup sync (has happened a few times, mostly for deneb and fulu).

What we really want to assert is:

- WHEN: we receive an unknown block parent message
- THEN: Lookup sync can sync that block
- ASSERT: Without penalizing peers, without unnecessary retries


  Keep all existing tests and add new cases but written in the new style described above. The logic to serve and respond to request is in this function `fn simulate` 2288a3aeb1/beacon_node/network/src/sync/tests/lookups.rs (L301)
- It controls peer behavior based on a `CompleteStrategy` where you can set for example "respond to BlocksByRoot requests with empty"
- It actually runs beacon processor messages running their clousures. Now sync tests actually import blocks, increasing the test coverage to the interaction of sync and the da_checker.
- To achieve the above the tests create real blocks with the test harness. To make the tests as fast as before, I disabled crypto with `TestConfig`

Along the way I found a couple bugs, which I documented on the diff.


Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>
2026-02-13 04:24:51 +00:00
Eitan Seri-Levi
b202e98dd9 Gloas gossip boilerplate (#8700)
All the required boilerplate for gloas gossip. We'll include the gossip message processing logic in a separate PR


  


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2026-01-28 07:12:48 +00:00
Eitan Seri-Levi
3fac61e0c8 Re-introduce clearer variable names in beacon processor work queue (#8649)
#8648


  


Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2026-01-12 23:36:01 +00:00
Eitan Seri-Levi
b8c386d38d Move beacon processor work queue implementation to its own file (#8141)
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2026-01-11 01:22:28 +00:00
Eitan Seri-Levi
e211a7325f Resolve merge conflicts 2026-01-02 08:52:14 -06:00
kevaundray
b3df0d1985 fix: clarify bb vs bl variable names in BeaconProcessorQueue (#8315)
since block and blob both start with `bl`, it was not clear how to differentiate between `blbroots_queue` and `bbroots_queue`

After renaming, there also seems to be a discrepancy


  


Co-Authored-By: Kevaundray Wedderburn <kevtheappdev@gmail.com>
2025-11-11 05:23:44 +00:00
Jimmy Chen
66f88f6bb4 Use millis_from_slot_start when comparing against reconstruction deadline (#8246)
This recent PR below changes the max reconstruction delay to be a function of slot time. However it uses `seconds_from_slot_start` when comparing (and dropping `nano`), so it might delay reconstruction on networks where the slot time isn’t a multiple of 4, e.g. on gnosis this only happens at 2s instead of 1.25s.:
- https://github.com/sigp/lighthouse/pull/8067#discussion_r2443875068


  Use `millis_from_slot_start` when comparing against reconstruction deadline

Also added some tests for reconstruction delay.


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
2025-10-21 02:24:43 +00:00
Odinson
79716f6ec1 Max reconstruction delay as a function of slot time (#8067)
Fixes #8054


  


Co-Authored-By: PoulavBhowmick03 <bpoulav@gmail.com>
2025-10-17 08:49:13 +00:00
Eitan Seri-Levi
af274029e8 Run reconstruction inside a scoped rayon pool (#8075)
Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-09-24 06:37:34 +00:00
Jimmy Chen
4111bcb39b Use scoped rayon pool for backfill chain segment processing (#7924)
Part of #7866

- Continuation of #7921

In the above PR, we enabled rayon for batch KZG verification in chain segment processing. However, using the global rayon thread pool for backfill is likely to create resource contention with higher-priority beacon processor work.


  This PR introduces a dedicated low-priority rayon thread pool `LOW_PRIORITY_RAYON_POOL` and uses it for processing backfill chain segments.

This prevents backfill KZG verification from using the global rayon thread pool and competing with high-priority beacon processor tasks for CPU resources.

However, this PR by itself doesn't prevent CPU oversubscription because other tasks could still fill up the global rayon thread pool, and having an extra thread pool could make things worse. To address this we need the beacon
processor to coordinate total CPU allocation across all tasks, which is covered in:
- #7789


Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>

Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-09-18 07:10:23 +00:00
Michael Sproul
684632df73 Fix reprocess queue memory leak (#8065)
Fix a memory leak in the reprocess queue.


  If the vec of attestation IDs for a block is never evicted from the reprocess queue by a `BlockImported` event, then it stays in the map forever consuming memory. The fix is to remove the entry when its last attestation times out. We do similarly for light client updates.

In practice this will only occur if there is a race between adding an attestation to the queue and processing the `BlockImported` event, or if there are attestations for block roots that we never import (e.g. random block roots, block roots of invalid blocks).


Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
2025-09-18 05:16:59 +00:00
Eitan Seri-Levi
aba3627099 Reduce reconstruction queue capacity (#8053)
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
2025-09-16 05:18:28 +00:00
Eitan Seri-Levi
aef8291f94 Add max delay to reconstruction (#7976)
#7697


  If we're three seconds into the current slot just trigger reconstruction. I don't know what the correct reconstruction deadline number is, but it should probably be at least half a second before the attestation deadline


Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>

Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
2025-09-12 06:05:42 +00:00
Jimmy Chen
3e78034de6 Add BEACON_PROCESSOR_WORKERS_ACTIVE_GAUGE_BY_TYPE metric (#7935)
Similar to `BEACON_PROCESSOR_WORKERS_ACTIVE_TOTAL` but this metric also records the work type.

This is useful in identifying the task when a worker is stuck due to a deadlock or something else, and usually difficult to debug in production / release mode.
2025-08-26 06:46:14 +00:00
Jimmy Chen
34dd1b27ae Revise data column rpc limits and queue sizes (#7887)
Revise data column rpc limits and queue sizes. Also removed some outdated TODOs for Fulu / das.
2025-08-19 03:48:08 +00:00
chonghe
522bd9e9c6 Update Rust Edition to 2024 (#7766)
* #7749

Thanks @dknopik and @michaelsproul for your help!
2025-08-13 03:04:31 +00:00
Eitan Seri-Levi
122f16776f Add metrics to track beacon processor queue times (#7808)
This PR adds a created_timestamp to the beacon processor send channel. When work items are sent through that channel `try_send` will forward the work event along with the current timestamp to the beacon processor. When the work event is completed the `Drop` impl for `SendOnDrop` will track the time it took from work event creation to its completion. Previously we only had data on how long a work event took to process, but didn't have data on how long it sat in the queue + how long it took to process.
2025-08-12 01:06:42 +00:00
Jimmy Chen
4daa015971 Remove peer sampling code (#7768)
Peer sampling has been completely removed from the spec. This PR removes our partial implementation from the codebase.
https://github.com/ethereum/consensus-specs/pull/4393
2025-07-23 03:24:45 +00:00
Eitan Seri-Levi
f67084a571 Remove reprocess channel (#7437)
Partially https://github.com/sigp/lighthouse/issues/6291


  This PR removes the reprocess event channel from being externally exposed. All work events are now sent through the single `BeaconProcessorSend` channel. I've introduced a new `Work::Reprocess` enum variant which we then use to schedule jobs for reprocess. I've also created a new scheduler module which will eventually house the different scheduler impls.

This is all needed as an initial step to generalize the beacon processor

A "full" implementation for the generalized beacon processor can be found here
https://github.com/sigp/lighthouse/pull/6448

I'm going to try to break up the full implementation into smaller PR's so it can actually be reviewed
2025-06-20 02:52:16 +00:00
Eitan Seri-Levi
6786b9d12a Single attestation "Full" implementation (#7444)
#6970


  This allows for us to receive `SingleAttestation` over gossip and process it without converting. There is still a conversion to `Attestation` as a final step in the attestation verification process, but by then the `SingleAttestation` is fully verified.

I've also fully removed the `submitPoolAttestationsV1` endpoint as its been deprecated

I've also pre-emptively deprecated supporting `Attestation` in `submitPoolAttestationsV2` endpoint. See here for more info: https://github.com/ethereum/beacon-APIs/pull/531

I tried to the minimize the diff here by only making the "required" changes. There are some unnecessary complexities with the way we manage the different attestation verification wrapper types. We could probably consolidate this to one wrapper type and refactor this even further. We could leave that to a separate PR if we feel like cleaning things up in the future.

Note that I've also updated the test harness to always submit `SingleAttestation` regardless of fork variant. I don't see a problem in that approach and it allows us to delete more code :)
2025-06-17 09:01:26 +00:00
Daniel Knopik
ccd99c138c Wait before column reconstruction (#7588) 2025-06-13 18:19:06 +00:00
Eitan Seri-Levi
e21b578073 resolve merge conflict and migrate il service to new pardigmn 2025-05-21 12:43:43 -07:00
chonghe
c13e069c9c Revise logging when queue is full (#7324) 2025-04-22 22:46:30 +00:00
Eitan Seri-Levi
19d43a2a8e merge unstable 2025-03-26 12:42:55 -06:00
ThreeHrSleep
d60c24ef1c Integrate tracing (#6339)
Tracing Integration
- [reference](5bbf1859e9/projects/project-ideas.md (L297))


  - [x] replace slog & log with tracing throughout the codebase
- [x] implement custom crit log
- [x] make relevant changes in the formatter
- [x] replace sloggers
- [x] re-write SSE logging components

cc: @macladson @eserilev
2025-03-12 22:31:05 +00:00
Eitan Seri-Levi
43243eadd9 fix merge 2025-02-22 17:11:31 -08:00
Eitan Seri-Levi
df0aeb0ba9 resolve merge conflicts 2025-02-22 07:33:36 -08:00
Michael Sproul
25f804a111 Fix light client plumbing in beacon processor (#6993)
Our Holesky nodes running with the light client enabled were logging messages about full queues:

> Feb 12 22:09:28.949 ERRO Work queue is full                      queue: unknown_light_client_optimistic_update, queue_len: 128, msg: the system has insufficient resources for load, service: bproc

I thought this might be genuine overload, but it turns out this queue was never being read from!


  - [x] Rename light-client related queues in the beacon processor for clarity.
- [x] Ensure all light-client related queues are being popped from.
2025-02-13 00:50:17 +00:00
Michael Sproul
2bd5bbdffb Optimise and refine SingleAttestation conversion (#6934)
Closes

- https://github.com/sigp/lighthouse/issues/6805


  - Use a new `WorkEvent::GossipAttestationToConvert` to handle the conversion from `SingleAttestation` to `Attestation` _on_ the beacon processor (prevents a Tokio thread being blocked).
- Improve the error handling for single attestations. I think previously we had no ability to reprocess single attestations for unknown blocks -- we would just error. This seemed to be the case in both gossip processing and processing of `SingleAttestation`s from the HTTP API.
- Move the `SingleAttestation -> Attestation` conversion function into `beacon_chain` so that it can return the `attestation_verification::Error` type, which has well-defined error handling and peer penalties. The now-unused variants of `types::Attestation::Error` have been removed.
2025-02-07 23:18:57 +00:00
Michael Sproul
364a978f12 Fix attestation queue length metric (#6924)
We were using the wrong queue length for attestation work event metrics.
2025-02-06 07:08:20 +00:00
Eitan Seri-Levi
db8d659f2f Merge unstable 2025-02-02 00:18:00 +03:00
Eitan Seri-Levi
f31c0b69e3 add focil epoch activation 2025-02-01 23:54:50 +03:00
Jimmy Chen
70194dfc6a Implement PeerDAS Fulu fork activation (#6795)
Addresses #6706


  This PR activates PeerDAS at the Fulu fork epoch instead of `EIP_7594_FORK_EPOCH`. This means we no longer support testing PeerDAS with Deneb / Electrs, as it's now part of a hard fork.
2025-01-30 07:01:34 +00:00
Pawan Dhananjay
1f6850fae2 Rust 1.84 lints (#6781)
* Fix few lints

* Fix remaining lints

* Use fully qualified syntax
2025-01-10 01:13:29 +00:00
Mac L
b2b1faad4e Enforce alphabetically ordered cargo deps (#6678)
* Enforce alphabetically ordered cargo deps

* Fix test-suite

* Another CI fix

* Merge branch 'unstable' into cargo-sort

* Fix conflicts

* Merge remote-tracking branch 'origin/unstable' into cargo-sort
2024-12-19 05:46:03 +00:00
Age Manning
e31ac508d4 Modularize tracing executor and metrics rename (#6424)
* Tracing executor and metrics rename

* Appease clippy

* Merge branch 'unstable' into modularise-task-executor
2024-10-28 09:41:45 +00:00
Eitan Seri-Levi
d1fda938a3 Light client updates by range RPC (#6383)
* enable lc update over rpc

* resolve TODOs

* resolve merge conflicts

* move max light client updates to eth spec

* Merge branch 'unstable' of https://github.com/sigp/lighthouse into light-client-updates-by-range-rpc

* remove ethspec dependency

* Update beacon_node/network/src/network_beacon_processor/rpc_methods.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/lighthouse_network/src/rpc/methods.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2024-10-18 02:50:51 +00:00