lighthouse/beacon_node/client/src/config.rs
Paul Hauner b60304b19f Use BeaconProcessor for API requests (#4462) 2023-08-08 23:30:15 +00:00
## Issue Addressed

NA

## Proposed Changes

Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves:

1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time).
1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued).
1. Allows the BN to prioritise HTTP requests with respect to messages coming from the P2P network (i.e., prioritise importing gossip blocks rather than serving API requests).

Presently there are two priority levels (see the sketch after this list):

- `Priority::P0`
    - The beacon processor will prioritise these above everything other than importing new blocks.
    - Roughly all validator-sensitive endpoints.
- `Priority::P1`
    - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things.
    - Everything that's not `Priority::P0`
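
To make the model concrete, here's a minimal, hypothetical sketch of the idea: tasks tagged with a priority and pushed onto a bounded queue that rejects work when full. None of these names (`ApiTask`, `submit`, the queue bound of 1024) are Lighthouse's actual types or values; the real scheduling lives in the `BeaconProcessor`.

```
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

#[derive(Debug, Clone, Copy)]
enum Priority {
    P0, // roughly: validator-sensitive endpoints
    P1, // everything else
}

// Hypothetical stand-in for an HTTP task; in practice this would carry a
// closure or future that computes the response.
struct ApiTask {
    priority: Priority,
    description: &'static str,
}

/// Try to enqueue a task. A full queue means the node is overloaded, so the
/// request is rejected instead of piling up behind everything else.
fn submit(tx: &SyncSender<ApiTask>, task: ApiTask) -> Result<(), &'static str> {
    tx.try_send(task).map_err(|e| match e {
        TrySendError::Full(_) => "queue full, dropping request",
        TrySendError::Disconnected(_) => "processor shut down",
    })
}

fn main() {
    // The bound is what limits how many requests can await a response at once.
    let (tx, rx) = sync_channel::<ApiTask>(1024);
    submit(
        &tx,
        ApiTask {
            priority: Priority::P0,
            description: "validator duties request",
        },
    )
    .unwrap();
    let task = rx.recv().unwrap();
    println!("processing {:?} task: {}", task.priority, task.description);
}
```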
    
The `--http-enable-beacon-processor false` flag can be supplied to revert to the old behaviour of spawning new `tokio` tasks for each request:

```
        --http-enable-beacon-processor <BOOLEAN>
            The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to
            "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API
            responses will be executed immediately. [default: true]
```
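
For example (an illustrative invocation only; combine with whatever other flags you normally run):

```
lighthouse bn --http --http-enable-beacon-processor false
```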
    
## New CLI Flags

I added some other new CLI flags:

```
        --beacon-processor-aggregate-batch-size <INTEGER>
            Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may
            reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile
            network. [default: 64]
        --beacon-processor-attestation-batch-size <INTEGER>
            Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU
            usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network.
            [default: 64]
        --beacon-processor-max-workers <INTEGER>
            Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource
            consumption. Reducing the value may result in decreased resource usage and diminished performance. The
            default value is the number of logical CPU cores on the host.
        --beacon-processor-reprocess-queue-len <INTEGER>
            Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent
            messages from being dropped while lower values may help protect the node from becoming overwhelmed.
            [default: 12288]
```
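
For example, on a small host you might pin the worker count explicitly rather than letting it default to the core count (illustrative invocation only):

```
lighthouse bn --beacon-processor-max-workers 2
```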


I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the GitHub runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with.

## Additional Info

I bumped the timeouts on the "simulator" flavor tests from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses; I guess this is what we signed up for 🤷

The `validator/register` endpoint has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response.
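
The pattern is just handing the long wait off to the runtime so it doesn't occupy a worker. A hedged sketch (`wait_for_builder_response` is a hypothetical stand-in for the real builder API call):

```
use std::time::Duration;

// Hypothetical stand-in for the builder API call, which may be slow to respond.
async fn wait_for_builder_response() -> Result<(), String> {
    tokio::time::sleep(Duration::from_secs(1)).await;
    Ok(())
}

#[tokio::main]
async fn main() {
    // The caller moves on as soon as the task is spawned; the slow wait runs
    // on the tokio executor instead of holding a `BeaconProcessor` worker.
    let handle = tokio::spawn(async {
        if let Err(e) = wait_for_builder_response().await {
            eprintln!("builder registration failed: {e}");
        }
    });
    handle.await.unwrap();
}
```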

I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](efbabe3252). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

use beacon_chain::validator_monitor::DEFAULT_INDIVIDUAL_TRACKING_THRESHOLD;
use beacon_processor::BeaconProcessorConfig;
use directory::DEFAULT_ROOT_DIR;
use environment::LoggerConfig;
use network::NetworkConfig;
use sensitive_url::SensitiveUrl;
use serde_derive::{Deserialize, Serialize};
use std::fs;
use std::path::PathBuf;
use types::{Graffiti, PublicKeyBytes};

/// Default directory name for the freezer database under the top-level data dir.
const DEFAULT_FREEZER_DB_DIR: &str = "freezer_db";

/// Defines how the client should initialize the `BeaconChain` and other components.
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub enum ClientGenesis {
    /// Creates a genesis state as per the 2019 Canada interop specifications.
    Interop {
        validator_count: usize,
        genesis_time: u64,
    },
    /// Reads the genesis state and other persisted data from the `Store`.
    FromStore,
    /// Connects to an eth1 node and waits until it can create the genesis state from the deposit
    /// contract.
    #[default]
    DepositContract,
    /// Loads the genesis state from SSZ-encoded `BeaconState` bytes.
    ///
    /// We include the bytes instead of the `BeaconState<E>` because the `EthSpec` type
    /// parameter would be very annoying.
    SszBytes { genesis_state_bytes: Vec<u8> },
    WeakSubjSszBytes {
        genesis_state_bytes: Vec<u8>,
        anchor_state_bytes: Vec<u8>,
        anchor_block_bytes: Vec<u8>,
    },
    CheckpointSyncUrl {
        genesis_state_bytes: Vec<u8>,
        url: SensitiveUrl,
    },
}

/// The core configuration of a Lighthouse beacon node.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
    data_dir: PathBuf,
    /// Name of the directory inside the data directory where the main "hot" DB is located.
    pub db_name: String,
    /// Path where the freezer database will be located.
    pub freezer_db_path: Option<PathBuf>,
    pub log_file: PathBuf,
    /// If true, the node will use co-ordinated junk for eth1 values.
    ///
    /// This is the method used for the 2019 client interop in Canada.
    pub dummy_eth1_backend: bool,
    pub sync_eth1_chain: bool,
    /// Graffiti to be inserted every time we create a block.
    pub graffiti: Graffiti,
    /// When true, automatically monitor validators using the HTTP API.
    pub validator_monitor_auto: bool,
    /// A list of validator pubkeys to monitor.
    pub validator_monitor_pubkeys: Vec<PublicKeyBytes>,
    /// Once the number of monitored validators goes above this threshold, we
    /// will stop tracking metrics on a per-validator basis. This prevents large
    /// validator counts causing infeasibly high cardinality for Prometheus and
    /// high log volumes.
    pub validator_monitor_individual_tracking_threshold: usize,
    #[serde(skip)]
    /// The `genesis` field is not serialized or deserialized by `serde` to ensure it is defined
    /// via the CLI at runtime, instead of from a configuration file saved to disk.
    pub genesis: ClientGenesis,
    pub store: store::StoreConfig,
    pub network: network::NetworkConfig,
    pub chain: beacon_chain::ChainConfig,
    pub eth1: eth1::Config,
    pub execution_layer: Option<execution_layer::Config>,
    pub http_api: http_api::Config,
    pub http_metrics: http_metrics::Config,
    pub monitoring_api: Option<monitoring_api::Config>,
    pub slasher: Option<slasher::Config>,
    pub logger_config: LoggerConfig,
    pub always_prefer_builder_payload: bool,
    pub beacon_processor: BeaconProcessorConfig,
}

impl Default for Config {
    fn default() -> Self {
        Self {
            data_dir: PathBuf::from(DEFAULT_ROOT_DIR),
            db_name: "chain_db".to_string(),
            freezer_db_path: None,
            log_file: PathBuf::from(""),
            genesis: <_>::default(),
            store: <_>::default(),
            network: NetworkConfig::default(),
            chain: <_>::default(),
            dummy_eth1_backend: false,
            sync_eth1_chain: false,
            eth1: <_>::default(),
            execution_layer: None,
            graffiti: Graffiti::default(),
            http_api: <_>::default(),
            http_metrics: <_>::default(),
            monitoring_api: None,
            slasher: None,
            validator_monitor_auto: false,
            validator_monitor_pubkeys: vec![],
            validator_monitor_individual_tracking_threshold: DEFAULT_INDIVIDUAL_TRACKING_THRESHOLD,
            logger_config: LoggerConfig::default(),
            always_prefer_builder_payload: false,
            beacon_processor: <_>::default(),
        }
    }
}

impl Config {
    /// Updates the data directory for the Client.
    pub fn set_data_dir(&mut self, data_dir: PathBuf) {
        self.data_dir = data_dir.clone();
        self.http_api.data_dir = data_dir;
    }

    /// Gets the config's data_dir.
    pub fn data_dir(&self) -> &PathBuf {
        &self.data_dir
    }

    /// Get the database path without initialising it.
    pub fn get_db_path(&self) -> PathBuf {
        self.get_data_dir().join(&self.db_name)
    }

    /// Get the database path, creating it if necessary.
    pub fn create_db_path(&self) -> Result<PathBuf, String> {
        ensure_dir_exists(self.get_db_path())
    }

    /// Fetch default path to use for the freezer database.
    fn default_freezer_db_path(&self) -> PathBuf {
        self.get_data_dir().join(DEFAULT_FREEZER_DB_DIR)
    }

    /// Returns the path to which the client may initialize the on-disk freezer database.
    ///
    /// Will attempt to use the user-supplied path from e.g. the CLI, or will default
    /// to a directory in the data_dir if no path is provided.
    pub fn get_freezer_db_path(&self) -> PathBuf {
        self.freezer_db_path
            .clone()
            .unwrap_or_else(|| self.default_freezer_db_path())
    }

    /// Get the freezer DB path, creating it if necessary.
    pub fn create_freezer_db_path(&self) -> Result<PathBuf, String> {
        ensure_dir_exists(self.get_freezer_db_path())
    }

    /// Returns the "modern" path to the data_dir.
    ///
    /// See `Self::get_data_dir` documentation for more info.
    fn get_modern_data_dir(&self) -> PathBuf {
        self.data_dir.clone()
    }

    /// Returns the "legacy" path to the data_dir.
    ///
    /// See `Self::get_data_dir` documentation for more info.
    pub fn get_existing_legacy_data_dir(&self) -> Option<PathBuf> {
        dirs::home_dir()
            .map(|home_dir| home_dir.join(&self.data_dir))
            // Return `None` if the legacy directory does not exist or if it is identical to the modern.
            .filter(|dir| dir.exists() && *dir != self.get_modern_data_dir())
    }

    /// Returns the core path for the client.
    ///
    /// Will not create any directories.
    ///
    /// ## Legacy Info
    ///
    /// Legacy versions of Lighthouse did not properly handle relative paths for `--datadir`.
    ///
    /// For backwards compatibility, we still compute the legacy path and check if it exists. If
    /// it does exist, we use that directory rather than the modern path.
    ///
    /// For more information, see:
    ///
    /// https://github.com/sigp/lighthouse/pull/2843
    pub fn get_data_dir(&self) -> PathBuf {
        let existing_legacy_dir = self.get_existing_legacy_data_dir();
        if let Some(legacy_dir) = existing_legacy_dir {
            legacy_dir
        } else {
            self.get_modern_data_dir()
        }
    }

    /// Returns the core path for the client.
    ///
    /// Creates the directory if it does not exist.
    pub fn create_data_dir(&self) -> Result<PathBuf, String> {
        ensure_dir_exists(self.get_data_dir())
    }
}

/// Ensure that the directory at `path` exists, by creating it and all parents if necessary.
fn ensure_dir_exists(path: PathBuf) -> Result<PathBuf, String> {
    fs::create_dir_all(&path).map_err(|e| format!("Unable to create {}: {}", path.display(), e))?;
    Ok(path)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn serde() {
        let config = Config::default();
        let serialized =
            serde_yaml::to_string(&config).expect("should serde encode default config");
        serde_yaml::from_str::<Config>(&serialized).expect("should serde decode default config");
    }
}