Instrument tracing spans for block processing and import (#7816)

#7815

- removes all existing spans, so some span fields that appear in logs like `service_name` may be lost.
- instruments a few key code paths in the beacon node, starting from **root spans** named below:

* Gossip block and blobs
* `process_gossip_data_column_sidecar`
* `process_gossip_blob`
* `process_gossip_block`
* Rpc block and blobs
* `process_rpc_block`
* `process_rpc_blobs`
* `process_rpc_custody_columns`
* Rpc blocks (range and backfill)
* `process_chain_segment`
* `PendingComponents` lifecycle
* `pending_components`

To test locally:
* Run Grafana and Tempo with https://github.com/sigp/lighthouse-metrics/pull/57
* Run Lighthouse BN with `--telemetry-collector-url http://localhost:4317`

Some captured traces can be found here: https://hackmd.io/@jimmygchen/r1sLOxPPeg

Removing the old spans seem to have reduced the memory usage quite a lot - i think we were using them on long running tasks and too excessively:
<img width="910" height="495" alt="image" src="https://github.com/user-attachments/assets/5208bbe4-53b2-4ead-bc71-0b782c788669" />
This commit is contained in:
Jimmy Chen
2025-08-08 15:32:22 +10:00
committed by GitHub
parent 6dfab22267
commit 40c2fd5ff4
52 changed files with 633 additions and 1164 deletions

View File

@@ -37,7 +37,7 @@ use std::num::NonZeroUsize;
use std::path::Path;
use std::sync::Arc;
use std::time::Duration;
use tracing::{debug, error, info, warn};
use tracing::{debug, error, info, instrument, warn};
use types::data_column_sidecar::{ColumnIndex, DataColumnSidecar, DataColumnSidecarList};
use types::*;
use zstd::{Decoder, Encoder};
@@ -1040,6 +1040,7 @@ impl<E: EthSpec, Hot: ItemStore<E>, Cold: ItemStore<E>> HotColdDB<E, Hot, Cold>
/// - `result_state_root == state.canonical_root()`
/// - `state.slot() <= max_slot`
/// - `state.get_latest_block_root(result_state_root) == block_root`
#[instrument(skip(self, max_slot), level = "debug")]
pub fn get_advanced_hot_state(
&self,
block_root: Hash256,
@@ -1111,6 +1112,7 @@ impl<E: EthSpec, Hot: ItemStore<E>, Cold: ItemStore<E>> HotColdDB<E, Hot, Cold>
/// If this function returns `Some(state)` then that `state` will always have
/// `latest_block_header` matching `block_root` but may not be advanced all the way through to
/// `max_slot`.
#[instrument(skip(self), level = "debug")]
pub fn get_advanced_hot_state_from_cache(
&self,
block_root: Hash256,

View File

@@ -6,6 +6,7 @@ use crate::{
use lru::LruCache;
use std::collections::{BTreeMap, HashMap, HashSet};
use std::num::NonZeroUsize;
use tracing::instrument;
use types::{BeaconState, ChainSpec, Epoch, EthSpec, Hash256, Slot};
/// Fraction of the LRU cache to leave intact during culling.
@@ -292,6 +293,7 @@ impl<E: EthSpec> StateCache<E> {
None
}
#[instrument(skip(self), level = "debug")]
pub fn get_by_block_root(
&mut self,
block_root: Hash256,