Optimise pubkey cache initialisation during beacon node startup (#8451)

Instrument beacon node startup and parallelise pubkey cache initialisation.

I instrumented beacon node startup and noticed that pubkey cache takes a long time to initialise, mostly due to decompressing all the validator pubkeys.

This PR uses rayon to parallelize the decompression on initial checkpoint sync. The pubkeys are stored uncompressed, so the decompression time is not a problem on subsequent restarts. On restarts, we still deserialize the pubkeys, but the time taken is minimal on Sepolia, so I didn't investigate further.
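
As a rough illustration of the idea (not the code in this PR): each pubkey can be decompressed independently, so the work can be split across threads and the results stitched back together in the original order. The sketch below uses only `std::thread::scope` with manual chunking instead of rayon so it stays dependency-free. `decompress`, `decompress_all_sequential` and `decompress_all_parallel` are hypothetical names, and the "decompression" is a cheap stand-in rather than real BLS point decompression.

```rust
use std::thread;

/// Hypothetical stand-in for an expensive, independent per-key operation.
/// In the real code this would be BLS pubkey decompression.
fn decompress(compressed: &[u8; 4]) -> u64 {
    // Cheap deterministic work so the example stays runnable.
    compressed
        .iter()
        .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u64))
}

/// Baseline: decompress every key on the calling thread.
fn decompress_all_sequential(keys: &[[u8; 4]]) -> Vec<u64> {
    keys.iter().map(decompress).collect()
}

/// Decompress all keys on up to `num_threads` threads, preserving input order.
fn decompress_all_parallel(keys: &[[u8; 4]], num_threads: usize) -> Vec<u64> {
    if keys.is_empty() {
        return Vec::new();
    }
    let num_threads = num_threads.max(1);
    // Ceiling division so that every key lands in exactly one chunk.
    let chunk_size = (keys.len() + num_threads - 1) / num_threads;

    thread::scope(|scope| {
        // Spawn one worker per chunk; collecting first ensures all workers
        // are started before any of them is joined.
        let handles: Vec<_> = keys
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(decompress).collect::<Vec<u64>>()))
            .collect();

        // Join in spawn order so the output matches the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `decompress_all_parallel(&keys, 4)` should give the same vector as `decompress_all_sequential(&keys)`. Preserving order matters here on the assumption that a key's position corresponds to its validator index.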

`validator_pubkey_cache_new` timing on Sepolia:
* before: 109.64ms
* with parallelization: 21ms

`validator_pubkey_cache_new` timing on Hoodi:
* before: times out with Kurtosis after 120s
* with parallelization: 12.77s to import keys

**UPDATE**: downloading the checkpoint state and genesis state takes about 2 minutes on my laptop, so it seems the BN managed to start the HTTP server just before timing out (after the optimisation).

<img width="1380" height="625" alt="image" src="https://github.com/user-attachments/assets/4c548c14-57dd-4b47-af9a-115b15791940" />

Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Jimmy Chen
2025-11-28 15:30:49 +11:00
committed by GitHub
parent 9394663155
commit 7cee5d6090
2 changed files with 94 additions and 22 deletions


@@ -42,7 +42,7 @@ use std::time::Duration;
use std::time::{SystemTime, UNIX_EPOCH};
use store::database::interface::BeaconNodeBackend;
use timer::spawn_timer;
-use tracing::{debug, info, warn};
+use tracing::{debug, info, instrument, warn};
use types::data_column_custody_group::compute_ordered_custody_column_indices;
use types::{
BeaconState, BlobSidecarList, ChainSpec, EthSpec, ExecutionBlockHash, Hash256,
@@ -151,6 +151,7 @@ where
/// Initializes the `BeaconChainBuilder`. The `build_beacon_chain` method will need to be
/// called later in order to actually instantiate the `BeaconChain`.
+#[instrument(skip_all)]
pub async fn beacon_chain_builder(
mut self,
client_genesis: ClientGenesis,
@@ -613,6 +614,7 @@ where
///
/// If type inference errors are being raised, see the comment on the definition of `Self`.
#[allow(clippy::type_complexity)]
+#[instrument(name = "build_client", skip_all)]
pub fn build(
mut self,
) -> Result<Client<Witness<TSlotClock, E, THotStore, TColdStore>>, String> {
@@ -813,6 +815,7 @@ where
TColdStore: ItemStore<E> + 'static,
{
/// Consumes the internal `BeaconChainBuilder`, attaching the resulting `BeaconChain` to self.
+#[instrument(skip_all)]
pub fn build_beacon_chain(mut self) -> Result<Self, String> {
let context = self
.runtime_context