mirror of
https://github.com/sigp/lighthouse.git
synced 2026-03-06 10:11:44 +00:00
Reduce some EE and builder related ERRO logs to WARN (#3966)
## Issue Addressed NA ## Proposed Changes Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf". The modified logs are: #### `ERRO Execution engine call failed` I'm seeing this quite frequently on Geth nodes. They seem to timeout when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues. #### `ERRO "Builder failed to reveal payload"` In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the networ. I think it's worth dropping this to `WARN` since it's rarely interesting. I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise. #### `ERRO "Relay error when registering validator(s)"` It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it. #### `ERRO Error fetching block for peer ExecutionLayerErrorPayloadReconstruction` This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers. ## Additional Info NA
This commit is contained in:
@@ -1575,10 +1575,10 @@ impl<T: EthSpec> ExecutionLayer<T> {
|
||||
&metrics::EXECUTION_LAYER_BUILDER_REVEAL_PAYLOAD_OUTCOME,
|
||||
&[metrics::FAILURE],
|
||||
);
|
||||
error!(
|
||||
warn!(
|
||||
self.log(),
|
||||
"Builder failed to reveal payload";
|
||||
"info" => "this relay failure may cause a missed proposal",
|
||||
"info" => "this is common behaviour for some builders and may not indicate an issue",
|
||||
"error" => ?e,
|
||||
"relay_response_ms" => duration.as_millis(),
|
||||
"block_root" => ?block_root,
|
||||
|
||||
Reference in New Issue
Block a user