lighthouse

mirror of https://github.com/sigp/lighthouse.git synced 2026-06-01 21:57:15 +00:00

Author	SHA1	Message	Date
Jimmy Chen	dbe474e132	Delete attester cache (#8469) Fixes attester cache write lock contention. Alternative to #8463. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2026-01-06 03:08:02 +00:00
Eitan Seri-Levi	33e21634cb	Custody backfill sync (#7907) #7603 #### Custody backfill sync service Similar in many ways to the current backfill service. There may be ways to unify the two services. The difficulty is that the current backfill service tightly couples blocks and their associated blobs/data columns. Any attempt to unify the two services should be left to a separate PR, in my opinion. #### `SyncNetworkContext` `SyncNetworkContext` manages custody sync data columns by range requests separately from other sync RPC requests. I think this is a nice separation considering that custody backfill is its own service. #### Data column import logic The import logic verifies KZG commitments and that the data column's block root matches the block root in the node's store before importing columns. #### New channel to send messages to `SyncManager` Now external services can communicate with the `SyncManager`. In this PR this channel is used to trigger a custody sync. Alternatively, we may be able to use the existing `mpsc` channel that the `SyncNetworkContext` uses to communicate with the `SyncManager`. I will spend some time reviewing this. Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com>	2025-10-22 03:51:34 +00:00
Jimmy Chen	4111bcb39b	Use scoped rayon pool for backfill chain segment processing (#7924) Part of #7866 - Continuation of #7921 In the above PR, we enabled rayon for batch KZG verification in chain segment processing. However, using the global rayon thread pool for backfill is likely to create resource contention with higher-priority beacon processor work. This PR introduces a dedicated low-priority rayon thread pool `LOW_PRIORITY_RAYON_POOL` and uses it for processing backfill chain segments. This prevents backfill KZG verification from using the global rayon thread pool and competing with high-priority beacon processor tasks for CPU resources. However, this PR by itself doesn't prevent CPU oversubscription because other tasks could still fill up the global rayon thread pool, and having an extra thread pool could make things worse. To address this we need the beacon processor to coordinate total CPU allocation across all tasks, which is covered in: - #7789 Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>	2025-09-18 07:10:23 +00:00
Eitan Seri-Levi	242bdfcf12	Add instrumentation to `recompute_head_at_slot` (#8049) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>	2025-09-16 05:18:31 +00:00
Jimmy Chen	8a4f6cf0d5	Instrument tracing on block production code path (#8017) Partially #7814. Instrument block production code path. New root spans: * `produce_block_v3` * `produce_block_v2` Example traces: <img width="518" height="432" alt="image" src="https://github.com/user-attachments/assets/a9413d25-501c-49dc-95cc-623db5988981" /> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>	2025-09-10 03:30:51 +00:00
Jimmy Chen	ee734d1456	Fix stuck data column lookups by improving peer selection and retry logic (#8005) Fixes the issue described in #7980 where Lighthouse repeatedly sends `DataColumnsByRoot` requests to the same peers that return empty responses, causing sync to get stuck. The root cause was that empty responses were not counted as failures, leading to excessive retries to unresponsive peers. - Track per-peer attempts to limit retries per peer (`MAX_CUSTODY_PEER_ATTEMPTS = 3`) - Replace random peer selection with hashing within each lookup, to avoid splitting a lookup into too many small requests and to improve request batching efficiency. - Add a `single_block_lookup` root span to track all lookups created, and add more debug logs: <img width="1264" height="501" alt="image" src="https://github.com/user-attachments/assets/983629ba-b6d0-41cf-8e93-88a5b96c2f31" /> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>	2025-09-09 06:18:05 +00:00
Jimmy Chen	c13fb2fb46	Instrument `publish_block` code path (#7945) Instrument `publish_block` code path and log dropped data columns when publishing. Example spans (running the devnet from my laptop, so the numbers aren't great): <img width="734" height="296" alt="image" src="https://github.com/user-attachments/assets/20620bf7-2b38-4392-aa75-9ba96d3a7f0d" /> <img width="718" height="625" alt="image" src="https://github.com/user-attachments/assets/61e1ff1c-65b5-4ad4-981a-d0fadc9829e1" />	2025-08-28 03:31:29 +00:00
Jimmy Chen	f19d4f6af1	Implement tracing spans for data column RPC requests and responses (#7831) #7830	2025-08-20 23:35:51 +00:00

8 Commits
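
The retry logic described in ee734d1456 (#8005) can be illustrated with a small standalone sketch. This is not Lighthouse's code: `LookupPeers`, `select_peer`, `record_attempt`, and the use of plain `u64` peer and lookup ids are all hypothetical, and "hashing within each lookup" is interpreted here as picking the eligible peer with the lowest hash of `(lookup_id, peer)`, which is only one plausible reading. Only the constant `MAX_CUSTODY_PEER_ATTEMPTS = 3` comes from the commit message.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Per-peer attempt cap; the value is taken from the commit message.
const MAX_CUSTODY_PEER_ATTEMPTS: usize = 3;

/// Hypothetical per-lookup state: how often each peer has been tried.
#[derive(Default)]
struct LookupPeers {
    attempts: HashMap<u64, usize>,
}

impl LookupPeers {
    /// Pick a peer deterministically for this lookup (assumed scheme):
    /// among peers still under the attempt cap, choose the one whose
    /// hash of (lookup_id, peer) is smallest. Requests belonging to the
    /// same lookup therefore land on the same peer instead of being
    /// scattered at random.
    fn select_peer(&self, lookup_id: u64, candidates: &[u64]) -> Option<u64> {
        candidates
            .iter()
            .copied()
            .filter(|p| self.attempts.get(p).copied().unwrap_or(0) < MAX_CUSTODY_PEER_ATTEMPTS)
            .min_by_key(|p| {
                let mut hasher = DefaultHasher::new();
                (lookup_id, p).hash(&mut hasher);
                hasher.finish()
            })
    }

    /// Count every request sent to a peer, including ones answered with
    /// an empty response, so unresponsive peers eventually drop out.
    fn record_attempt(&mut self, peer: u64) {
        *self.attempts.entry(peer).or_insert(0) += 1;
    }
}
```

Under these assumptions, a peer that keeps returning empty responses is excluded after three attempts and the lookup moves on to the next peer in hash order; once every candidate is exhausted, `select_peer` returns `None`.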