mirror of
https://github.com/sigp/lighthouse.git
synced 2026-04-17 12:58:31 +00:00
Fix I/O atomicity issues with checkpoint sync (#2671)
## Issue Addressed This PR addresses an issue found by @YorickDowne during testing of v2.0.0-rc.0. Due to a lack of atomic database writes on checkpoint sync start-up, it was possible for the database to get into an inconsistent state from which it couldn't recover without `--purge-db`. The core of the issue was that the store's anchor info was being stored _before_ the `PersistedBeaconChain`. If a crash occured so that anchor info was stored but _not_ the `PersistedBeaconChain`, then on restart Lighthouse would think the database was unitialized and attempt to compare-and-swap a `None` value, but would actually find the stale info from the previous run. ## Proposed Changes The issue is fixed by writing the anchor info, the split point, and the `PersistedBeaconChain` atomically on start-up. Some type-hinting ugliness was required, which could possibly be cleaned up in future refactors.
This commit is contained in:
@@ -182,7 +182,7 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
|
||||
};
|
||||
let backfill_complete = new_anchor.block_backfill_complete();
|
||||
self.store
|
||||
.compare_and_set_anchor_info(Some(anchor_info), Some(new_anchor))?;
|
||||
.compare_and_set_anchor_info_with_write(Some(anchor_info), Some(new_anchor))?;
|
||||
|
||||
// If backfill has completed and the chain is configured to reconstruct historic states,
|
||||
// send a message to the background migrator instructing it to begin reconstruction.
|
||||
|
||||
Reference in New Issue
Block a user