Instrument tracing spans for block processing and import (#7816)

#7815

- Removes all existing spans, so some span fields that currently appear in logs, such as `service_name`, may be lost.
- Instruments a few key code paths in the beacon node, starting from the **root spans** listed below:

* Gossip block and blobs
  * `process_gossip_data_column_sidecar`
  * `process_gossip_blob`
  * `process_gossip_block`
* Rpc block and blobs
  * `process_rpc_block`
  * `process_rpc_blobs`
  * `process_rpc_custody_columns`
* Rpc blocks (range and backfill)
  * `process_chain_segment`
* `PendingComponents` lifecycle
  * `pending_components`

To test locally:
* Run Grafana and Tempo with https://github.com/sigp/lighthouse-metrics/pull/57
* Run Lighthouse BN with `--telemetry-collector-url http://localhost:4317`

Some captured traces can be found here: https://hackmd.io/@jimmygchen/r1sLOxPPeg

Removing the old spans seems to have reduced memory usage considerably. I think we were using them too heavily, including on long-running tasks:
<img width="910" height="495" alt="image" src="https://github.com/user-attachments/assets/5208bbe4-53b2-4ead-bc71-0b782c788669" />
Jimmy Chen
2025-08-08 15:32:22 +10:00
committed by GitHub
parent 6dfab22267
commit 40c2fd5ff4
52 changed files with 633 additions and 1164 deletions


@@ -25,7 +25,7 @@ use std::sync::{
};
use task_executor::TaskExecutor;
use tokio::time::{sleep, sleep_until, Instant};
-use tracing::{debug, error, warn};
+use tracing::{debug, debug_span, error, instrument, warn, Instrument};
use types::{AttestationShufflingId, BeaconStateError, EthSpec, Hash256, RelativeEpoch, Slot};
/// If the head slot is more than `MAX_ADVANCE_DISTANCE` from the current slot, then don't perform
@@ -253,7 +253,8 @@ async fn state_advance_timer<T: BeaconChainTypes>(
},
"fork_choice_advance_signal_tx",
);
-},
+}
+.instrument(debug_span!("fork_choice_advance")),
"fork_choice_advance",
);
}
@@ -264,6 +265,7 @@ async fn state_advance_timer<T: BeaconChainTypes>(
/// slot then placed in the `state_cache` to be used for block verification.
///
/// See the module-level documentation for rationale.
+#[instrument(skip_all)]
fn advance_head<T: BeaconChainTypes>(beacon_chain: &Arc<BeaconChain<T>>) -> Result<(), Error> {
let current_slot = beacon_chain.slot()?;