Commit Graph

58 Commits

realbigsean
51c4506c53 smol bugfixes and moar tests 2023-05-09 19:29:03 -04:00
realbigsean
a0f6159cae make sure blobs are sent for processing after stream termination, delete copied tests 2023-05-03 12:02:02 -04:00
realbigsean
af15789b6f improve peer scoring 2023-05-02 18:28:45 -04:00
realbigsean
e3f4218624 error refactoring 2023-05-02 14:09:53 -04:00
realbigsean
56b2365e17 track information about peer source 2023-05-02 12:28:32 -04:00
realbigsean
93bcd6281c some bug fixes and the start of deneb only tests 2023-04-28 15:56:54 -04:00
realbigsean
bfb5242ee3 start fixing up lookup verify error handling 2023-04-28 09:54:09 -04:00
realbigsean
d224fce084 wrap availability check error 2023-04-27 14:15:52 -04:00
realbigsean
69e5e00350 renamings 2023-04-26 14:45:07 -04:00
realbigsean
4390036887 fix existing block lookup tests 2023-04-26 14:44:32 -04:00
realbigsean
83c3ee173f fix lints 2023-04-26 12:05:07 -04:00
realbigsean
b5440f740d fix lints 2023-04-25 09:30:16 -04:00
realbigsean
53c0356f8d smol bugfix 2023-04-24 21:10:52 -04:00
realbigsean
b8708e38de processing peer refactor 2023-04-24 20:47:02 -04:00
realbigsean
91594adc77 refactor single block processed method 2023-04-24 20:15:45 -04:00
realbigsean
76c09dea21 drop parent lookup if either req has a peer disconnect during download 2023-04-24 19:00:21 -04:00
realbigsean
1d18756303 improve retry code 2023-04-24 18:56:19 -04:00
realbigsean
0560b7d1a5 improve peer scoring during certain failures in parent lookups 2023-04-24 16:58:13 -04:00
realbigsean
274aba95c7 consolidate retry error handling 2023-04-24 15:05:49 -04:00
realbigsean
b6531aa1b1 should remove lookup refactor 2023-04-24 13:04:44 -04:00
realbigsean
381044abe7 add peer usefulness enum 2023-04-24 12:27:49 -04:00
realbigsean
3e854ae2d1 fix compile 2023-04-21 16:57:53 -04:00
realbigsean
bacec52017 parent blob lookups 2023-04-20 19:42:33 -04:00
realbigsean
c7142495fd get things compiling 2023-04-20 13:38:05 -04:00
realbigsean
374ec4800a much work 2023-04-19 16:44:19 -04:00
realbigsean
0ad9fdfbbf fix compilation in main block lookup mod 2023-04-19 14:02:41 -04:00
realbigsean
195d802931 start fixing some compile errors 2023-04-17 16:58:18 -04:00
realbigsean
8618c301b5 add delayed processing logic and combine some requests 2023-04-14 16:50:41 -04:00
realbigsean
25ff6e8a5f more work 2023-04-11 13:13:13 -04:00
realbigsean
38e0994dc4 make single block lookup generic 2023-04-04 12:38:01 -04:00
realbigsean
6f12df37cf deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker. 2023-04-04 10:29:07 -04:00
realbigsean
8403402620 a lot more reprocessing work 2023-03-31 09:09:56 -04:00
realbigsean
9642ec02fa remove ForceBlockLookup 2023-03-29 09:53:37 -04:00
realbigsean
8d80200bc4 some blob reprocessing work 2023-03-28 18:29:56 -04:00
Pawan Dhananjay
b276af98b7 Rework block processing (#4092)
* introduce availability pending block

* add intoavailableblock trait

* small fixes

* add 'gossip blob cache' and start to clean up processing and transition types

* shard memory blob cache

* Initial commit

* Fix after rebase

* Add gossip verification conditions

* cache cleanup

* general chaos

* extended chaos

* cargo fmt

* more progress

* more progress

* tons of changes, just tryna compile

* everything, everywhere, all at once

* Reprocess an ExecutedBlock on unavailable blobs

* Add sus gossip verification for blobs

* Merge stuff

* Remove reprocessing cache stuff

* lint

* Add a wrapper to allow construction of only valid `AvailableBlock`s

* rename blob arc list to blob list

* merge cleanuo

* Revert "merge cleanuo"

This reverts commit 5e98326878.

* Revert "Revert "merge cleanuo""

This reverts commit 3a4009443a.

* fix rpc methods

* move beacon block and blob to eth2/types

* rename gossip blob cache to data availability checker

* lots of changes

* fix some compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* cargo fmt

* use a common data structure for block import types

* fix availability check on proposal import

* refactor the blob cache and split the block wrapper into two types

* add type conversion for signed block and block wrapper

* fix beacon chain tests and do some renaming, add some comments

* Partial processing (#4)

* move beacon block and blob to eth2/types

* rename gossip blob cache to data availability checker

* lots of changes

* fix some compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* cargo fmt

* use a common data structure for block import types

* fix availability check on proposal import

* refactor the blob cache and split the block wrapper into two types

* add type conversion for signed block and block wrapper

* fix beacon chain tests and do some renaming, add some comments

* cargo update (#6)

---------

Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: realbigsean <seananderson33@gmail.com>
2023-03-24 17:30:41 -04:00
realbigsean
cbd09dc281 finish refactor 2023-01-21 04:48:25 -05:00
Divma
240854750c cleanup: remove unused imports, unusued fields (#3834) 2022-12-23 17:16:10 -05:00
realbigsean
5de4f5b8d0 handle parent blob request edge cases correctly. fix data availability boundary check 2022-12-19 11:39:09 -05:00
realbigsean
8102a01085 merge with upstream 2022-12-01 11:13:07 -05:00
Diva M
979a95d62f handle unknown parents for block-blob pairs
wip

handle unknown parents for block-blob pairs
2022-11-30 17:21:54 -05:00
realbigsean
2157d91b43 process single block and blob 2022-11-30 11:51:18 -05:00
realbigsean
422d145902 chain segment processing for blobs 2022-11-30 09:40:15 -05:00
Diva M
805df307f6 wip 2022-11-28 14:13:12 -05:00
realbigsean
45897ad4e1 remove blob wrapper 2022-11-19 15:18:42 -05:00
realbigsean
7162e5e23b add a bunch of blob coupling boiler plate, add a blobs by root request 2022-11-15 16:43:56 -05:00
Divma
84c7d8cc70 Blocklookup data inconsistencies (#3677)
## Issue Addressed
Closes #3649 

## Proposed Changes

Add a regression test for the data inconsistency, catching the problem in 31e88c5533 [here](https://github.com/sigp/lighthouse/actions/runs/3379894044/jobs/5612044797#step:6:2043).
When a chain is sent for processing, move it to a separate collection and now the test works, yay!

## Additional Info

na
2022-11-07 06:48:34 +00:00
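The fix described above (move a chain into a separate collection once it is sent for processing) can be sketched in Rust. This is a hypothetical illustration; `ParentLookups`, `send_for_processing`, and the field names are invented for the sketch, not Lighthouse's actual types:

```rust
use std::collections::HashMap;

// Hypothetical sketch: once a chain is handed off for processing it is
// moved out of the active-downloads map into its own collection, so
// download-side bookkeeping can no longer touch it and create the
// inconsistency the regression test catches.
struct ParentLookups {
    downloading: HashMap<u64, Vec<&'static str>>, // chain id -> block roots
    processing: HashMap<u64, Vec<&'static str>>,
}

impl ParentLookups {
    fn new() -> Self {
        Self { downloading: HashMap::new(), processing: HashMap::new() }
    }

    // Returns true if the chain existed and was moved for processing.
    fn send_for_processing(&mut self, chain_id: u64) -> bool {
        match self.downloading.remove(&chain_id) {
            Some(chain) => {
                self.processing.insert(chain_id, chain);
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut lookups = ParentLookups::new();
    lookups.downloading.insert(7, vec!["block_a", "block_b"]);
    assert!(lookups.send_for_processing(7)); // moved, not copied
    assert!(lookups.processing.contains_key(&7));
    assert!(!lookups.send_for_processing(7)); // no longer downloadable
}
```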
Paul Hauner
fa6ad1a11a Deduplicate block root computation (#3590)
## Issue Addressed

NA

## Proposed Changes

This PR removes duplicated block root computation.

Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merkle root of each transaction inside an `ExecutionPayload`.

Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations *after* the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times.

## Additional Info

NA
2022-09-23 03:52:42 +00:00
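The deduplication above amounts to computing the root once and carrying it alongside the block instead of re-hashing at each stage. A minimal Rust sketch of that shape, using a stand-in hash in place of the real SSZ `hash_tree_root` (the `RootedBlock` type and all field names here are hypothetical, not Lighthouse's actual API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct SignedBeaconBlock {
    slot: u64,
    body: Vec<u8>,
}

impl SignedBeaconBlock {
    // Stand-in for the expensive `canonical_root` (post-merge this
    // merkle-izes every transaction in the ExecutionPayload, ~10ms on
    // a mainnet block per the PR description).
    fn canonical_root(&self) -> u64 {
        let mut h = DefaultHasher::new();
        self.slot.hash(&mut h);
        self.body.hash(&mut h);
        h.finish()
    }
}

// A block paired with its already-computed root; downstream processing
// stages take this and never recompute the hash.
struct RootedBlock {
    root: u64,
    block: SignedBeaconBlock,
}

impl RootedBlock {
    fn new(block: SignedBeaconBlock) -> Self {
        let root = block.canonical_root(); // computed exactly once
        Self { root, block }
    }
}

fn main() {
    let block = SignedBeaconBlock { slot: 4704236, body: vec![1, 2, 3] };
    let expected = block.canonical_root();
    let rooted = RootedBlock::new(block);
    assert_eq!(rooted.root, expected);
    assert_eq!(rooted.block.slot, 4704236);
}
```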
Divma
8c69d57c2c Pause sync when EE is offline (#3428)
## Issue Addressed

#3032

## Proposed Changes

Pause sync when ee is offline. Changes include three main parts:
- Online/offline notification system
- Pause sync
- Resume sync

#### Online/offline notification system
- The engine state is now guarded behind a new struct `State` that ensures every change is correctly notified. Notifications are only sent if the state changes. The new `State` is behind a `RwLock` (as before) as the synchronization mechanism.
- The actual notification channel is a [tokio::sync::watch](https://docs.rs/tokio/latest/tokio/sync/watch/index.html) which ensures only the last value is in the receiver channel. This way we don't need to worry about message order etc.
- Sync waits for state changes concurrently with normal messages.

#### Pause Sync
Sync has four components, pausing is done differently in each:
- **Block lookups**: Disabled while in this state. We drop current requests and don't search for new blocks. Block lookups are infrequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it.
- **Parent lookups**: Disabled while in this state. We drop current requests and don't search for new parents. Parent lookups are even less frequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it.
- **Range**: Chains don't send batches for processing to the beacon processor. This is easily done by guarding the channel to the beacon processor and giving it access only if the ee is responsive. I find this the simplest and most powerful approach since we don't need to deal with new sync states and chain segments that are added while the ee is offline will follow the same logic without needing to synchronize a shared state among those. Another advantage of passive pause vs active pause is that we can still keep track of active advertised chain segments so that on resume we don't need to re-evaluate all our peers.
- **Backfill**: Not affected by ee states, we don't pause.

#### Resume Sync
- **Block lookups**: Enabled again.
- **Parent lookups**: Enabled again.
- **Range**: Active resume. Since the only thing pausing really does for range is stop sending batches for processing, resuming makes all chains that are holding ready-for-processing batches send them.
- **Backfill**: Not affected by ee states, no need to resume.

## Additional Info

**QUESTION**: Originally I made this notify and change on synced state, but in talks with @paulhauner, @pawanjay176 concluded we only need to check online/offline states. The upcheck function mentions extra checks to keep a very up-to-date sync status to aid the networking stack. However, the only need the networking stack would have is this one. I added a TODO to review whether the extra check can be removed.

Next gen of #3094

Will work best with #3439 

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
2022-08-24 23:34:56 +00:00
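The notification system described above hinges on one invariant: writes go through a single guard, and waiters are notified only when the value actually changes. A std-only Rust sketch of that invariant (the PR itself uses a `tokio::sync::watch` channel, which additionally keeps only the latest value in the receiver; the `State` shown here is a hypothetical stand-in, not the PR's actual struct):

```rust
use std::sync::{Condvar, Mutex};

#[derive(Clone, Copy, PartialEq, Debug)]
enum EngineState {
    Online,
    Offline,
}

// Every change must go through `update`, which notifies waiters only on
// a real transition, so sync never sees spurious wake-ups.
struct State {
    inner: Mutex<EngineState>,
    changed: Condvar,
}

impl State {
    fn new(initial: EngineState) -> Self {
        Self { inner: Mutex::new(initial), changed: Condvar::new() }
    }

    // Returns true if this call changed the state (and notified waiters).
    fn update(&self, next: EngineState) -> bool {
        let mut guard = self.inner.lock().unwrap();
        if *guard == next {
            return false; // same value: no notification sent
        }
        *guard = next;
        self.changed.notify_all();
        true
    }

    fn get(&self) -> EngineState {
        *self.inner.lock().unwrap()
    }
}

fn main() {
    let state = State::new(EngineState::Online);
    assert!(!state.update(EngineState::Online)); // no change, no notify
    assert!(state.update(EngineState::Offline)); // transition: notify
    assert_eq!(state.get(), EngineState::Offline);
}
```

With `tokio::sync::watch` the receiver side would `await` `changed()` concurrently with normal sync messages, which is how "sync waits for state changes concurrently" is achieved.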
Divma
f4ffa9e0b4 Handle processing results of non faulty batches (#3439)
## Issue Addressed
Solves #3390 

After checking some logs @pawanjay176 got, we concluded that this happened because we blacklisted a chain after trying it "too much". In every occurrence, "too much" meant we got too many download failures. This built up very slowly, precisely because a batch is allowed to stay alive for a very long time once penalties stop being counted while the ee is offline. The error, then, was not that the batch failed because of offline ee errors, but that we blacklisted a chain because of download errors, which we can't pin on the chain, only on the peer. This PR fixes that.

## Proposed Changes

Adds a missing piece of logic so that if a chain fails for errors that can't be attributed to objectively bad behavior from the peer, it is not blacklisted. The issue at hand occurred when new peers arrived claiming a head that had been wrongfully blacklisted, even though the original peers participating in the chain were not penalized.

Another notable change is that we need to consider a batch invalid if it processed correctly but its next non-empty batch fails processing. Now, since a batch can fail processing in non-empty ways, there is no need to mark previous batches as invalid.

Improves some logging as well.

## Additional Info

We should do this regardless of pausing sync on ee offline/unsynced state. This is because I think it's almost impossible to ensure a processing result will arrive in a predictable order relative to a synced notification from the ee. Doing this handles what I think are inevitable data races for when we actually pause sync.

This also fixes a return that reported which batch failed and caused us some confusion when checking the logs.
2022-08-12 00:56:38 +00:00
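The core of the fix above is attribution: only blame the chain for failures that are objectively the chain's fault. A hypothetical Rust sketch of that decision (the `BatchFailure` variants are invented for illustration and do not match Lighthouse's actual error types):

```rust
// Classify why a batch failed, so the blame lands in the right place:
// download errors are pinned on the peer, offline-ee errors on nobody,
// and only genuine processing failures count against the chain.
#[derive(Debug, PartialEq)]
enum BatchFailure {
    DownloadError,     // peer's fault: penalize the peer, retry the batch
    ProcessingInvalid, // evidence the chain itself is bad
    ExecutionOffline,  // local condition: don't count a penalty at all
}

fn should_blacklist_chain(failure: &BatchFailure) -> bool {
    // Only objectively-bad chain behavior blacklists the chain.
    matches!(failure, BatchFailure::ProcessingInvalid)
}

fn main() {
    assert!(!should_blacklist_chain(&BatchFailure::DownloadError));
    assert!(!should_blacklist_chain(&BatchFailure::ExecutionOffline));
    assert!(should_blacklist_chain(&BatchFailure::ProcessingInvalid));
}
```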
ethDreamer
7c3ff903ca Fix Gossip Penalties During Optimistic Sync Window (#3350)
## Issue Addressed
* #3344 

## Proposed Changes

There are a number of cases during block processing where we might get an `ExecutionPayloadError` but we shouldn't penalize peers. We were forgetting to enumerate all of the non-penalizing errors in every single match statement where we are making that decision. I created a function to make it explicit when we should and should not penalize peers and I used that function in all places where this logic is needed. This way we won't make the same mistake if we add another variant of `ExecutionPayloadError` in the future.
2022-07-20 20:59:38 +00:00
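The approach above, a single function that answers "should this error penalize the peer?", can be sketched in Rust. The variant names below are illustrative, not necessarily Lighthouse's actual `ExecutionPayloadError` variants; the point is that a new variant forces the penalize/don't-penalize decision to be made in exactly one place, instead of in every match statement:

```rust
enum ExecutionPayloadError {
    // The execution engine examined the payload and rejected it:
    // objectively invalid, so the sending peer is at fault.
    RejectedByExecutionEngine,
    // Our own ee was unreachable; not the peer's fault.
    RequestFailed,
    // Optimistic-sync window case: we simply can't verify yet.
    UnverifiedNonOptimisticCandidate,
}

impl ExecutionPayloadError {
    // The single decision point: adding a variant without handling it
    // here is a compile error, so the case can't be silently missed.
    fn penalize_peer(&self) -> bool {
        match self {
            ExecutionPayloadError::RejectedByExecutionEngine => true,
            ExecutionPayloadError::RequestFailed
            | ExecutionPayloadError::UnverifiedNonOptimisticCandidate => false,
        }
    }
}

fn main() {
    assert!(ExecutionPayloadError::RejectedByExecutionEngine.penalize_peer());
    assert!(!ExecutionPayloadError::RequestFailed.penalize_peer());
    assert!(!ExecutionPayloadError::UnverifiedNonOptimisticCandidate.penalize_peer());
}
```

Because the `match` has no wildcard arm, forgetting to classify a future variant fails at compile time rather than penalizing peers incorrectly at runtime.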