Implement SSZ union type (#2579)

## Issue Addressed

NA

## Proposed Changes

Implements the "union" type from the SSZ spec for `ssz`, `ssz_derive`, `tree_hash` and `tree_hash_derive` so it may be derived for `enums`:

https://github.com/ethereum/consensus-specs/blob/v1.1.0-beta.3/ssz/simple-serialize.md#union

The union type is required for the merge, since the `Transaction` type is defined as a single-variant union `Union[OpaqueTransaction]`.
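
As a rough illustration of the encoding described in the spec section linked above (one selector byte followed by the serialized value), here is a minimal std-only sketch. The `Transaction` enum and the `encode_transaction`/`decode_transaction` helpers are hypothetical names used only for this example, not the code added by this PR, and the opaque transaction is assumed to be a plain byte list.

```rust
/// Hypothetical stand-in for the merge `Transaction` type: a union with one variant.
#[derive(Debug, PartialEq)]
enum Transaction {
    Opaque(Vec<u8>),
}

/// Encodes the value as a union: a 1-byte selector followed by the variant's body.
fn encode_transaction(tx: &Transaction) -> Vec<u8> {
    match tx {
        Transaction::Opaque(bytes) => {
            let mut out = Vec::with_capacity(1 + bytes.len());
            out.push(0); // selector of the first (and only) variant
            out.extend_from_slice(bytes);
            out
        }
    }
}

/// Splits off the selector byte and interprets the remaining bytes as the body.
fn decode_transaction(bytes: &[u8]) -> Result<Transaction, String> {
    let (selector, body) = bytes.split_first().ok_or("empty input")?;
    match *selector {
        0 => Ok(Transaction::Opaque(body.to_vec())),
        other => Err(format!("invalid union selector: {}", other)),
    }
}
```

Even with a single variant the selector byte is still written, so the encoding is always one byte longer than the inner value.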

### Crate Updates

This PR will (hopefully) cause CI to publish new versions for the following crates:

- `eth2_ssz_derive`: `0.2.1` -> `0.3.0`
- `eth2_ssz`: `0.3.0` -> `0.4.0`
- `eth2_ssz_types`: `0.2.0` -> `0.2.1`
- `tree_hash`: `0.3.0` -> `0.4.0`
- `tree_hash_derive`: `0.3.0` -> `0.4.0`

Since these crates depend on each other, I've had to add a workspace-level `[patch]` for them. A follow-up PR will need to remove this patch once the new versions are published.

### Union Behaviors

We already had SSZ `Encode` and `TreeHash` derives for enums; however, they just did a "transparent" pass-through of the inner value. Since the "union" decoding from the spec conflicts with the transparent method, I've required that every `enum` has exactly one of the following enum-level attributes:

#### SSZ

-  `#[ssz(enum_behaviour = "union")]`
    - matches the spec used for the merge
-  `#[ssz(enum_behaviour = "transparent")]`
    - maintains existing functionality
    - not supported for `Decode` (never was)
    
#### TreeHash

-  `#[tree_hash(enum_behaviour = "union")]`
    - matches the spec used for the merge
-  `#[tree_hash(enum_behaviour = "transparent")]`
    - maintains existing functionality

This means that we can maintain the existing transparent behaviour, but all existing users will get a compile-time error until they explicitly opt-in to being transparent.
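
To make the difference between the two behaviours concrete, the sketch below hand-writes both encodings for a hypothetical `Value` enum using only the standard library. It does not use the derive macros from this PR; `Value`, `encode_transparent` and `encode_union` are illustrative names, and the inner integers are assumed to be little-endian as in SSZ.

```rust
/// Hypothetical two-variant enum used only for illustration.
enum Value {
    Small(u8),
    Large(u16),
}

/// "transparent": only the inner value's bytes are written; the variant is not recorded.
fn encode_transparent(v: &Value) -> Vec<u8> {
    match v {
        Value::Small(x) => x.to_le_bytes().to_vec(),
        Value::Large(x) => x.to_le_bytes().to_vec(),
    }
}

/// "union": a 1-byte selector identifying the variant, then the inner value's bytes.
fn encode_union(v: &Value) -> Vec<u8> {
    let (selector, body) = match v {
        Value::Small(x) => (0u8, x.to_le_bytes().to_vec()),
        Value::Large(x) => (1u8, x.to_le_bytes().to_vec()),
    };
    let mut out = vec![selector];
    out.extend(body);
    out
}
```

Because the transparent output carries no variant information, a decoder cannot in general tell which variant produced it, which is consistent with `Decode` not being supported for that behaviour.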

### Legacy Option Encoding

Before this PR, we already had a union-esque encoding for `Option<T>`. However, this was with the *old* SSZ spec where the union selector was 4 bytes. During merge specification, the spec was changed to use 1 byte for the selector.

Whilst the 4-byte `Option` encoding was never used in the spec, we used it in our database. Writing a migration script for all occurrences of `Option` in the database would be painful, especially since it's used in the `CommitteeCache`. To avoid the migration script, I added a serde-esque `#[ssz(with = "module")]` field-level attribute to `ssz_derive` so that we can opt into the 4-byte encoding on a field-by-field basis.

The `ssz::legacy::four_byte_impl!` macro allows a one-liner to define the module required for the `#[ssz(with = "module")]` for some `Option<T> where T: Encode + Decode`.

Notably, **I have removed `Encode` and `Decode` impls for `Option`**. I've done this to force a break on downstream users. Like I mentioned, `Option` isn't used in the spec so I don't think it'll be *that* annoying. I think it's nicer than quietly having two different union implementations or quietly breaking the existing `Option` impl.
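
For readers unfamiliar with the legacy layout, here is a minimal std-only sketch of a 4-byte-selector `Option<u64>` encoding. It assumes the selector is a little-endian `u32` with `0` for `None` (empty body) and `1` for `Some` (followed by the value); the function names are hypothetical and this is not the code generated by the macro mentioned above.

```rust
use std::convert::TryInto;

/// Width of the legacy selector in bytes (the current spec uses 1 byte instead).
const LEGACY_SELECTOR_LEN: usize = 4;

/// Encodes `value` with a 4-byte little-endian selector (assumed: 0 = None, 1 = Some).
fn encode_option_legacy(value: Option<u64>) -> Vec<u8> {
    let mut out = Vec::new();
    match value {
        None => out.extend_from_slice(&0u32.to_le_bytes()),
        Some(v) => {
            out.extend_from_slice(&1u32.to_le_bytes());
            out.extend_from_slice(&v.to_le_bytes());
        }
    }
    out
}

/// Decodes bytes produced by `encode_option_legacy`.
fn decode_option_legacy(bytes: &[u8]) -> Result<Option<u64>, String> {
    if bytes.len() < LEGACY_SELECTOR_LEN {
        return Err(format!("need at least 4 bytes, got {}", bytes.len()));
    }
    let (selector_bytes, body) = bytes.split_at(LEGACY_SELECTOR_LEN);
    let selector_array: [u8; 4] = selector_bytes
        .try_into()
        .map_err(|_| "selector must be 4 bytes".to_string())?;
    match u32::from_le_bytes(selector_array) {
        0 => {
            if body.is_empty() {
                Ok(None)
            } else {
                Err("`None` must have an empty body".to_string())
            }
        }
        1 => {
            let value: [u8; 8] = body
                .try_into()
                .map_err(|_| "`Some` body must be 8 bytes".to_string())?;
            Ok(Some(u64::from_le_bytes(value)))
        }
        other => Err(format!("invalid selector: {}", other)),
    }
}
```

Under the 1-byte scheme the same `Some(5)` value would carry a single selector byte rather than four, which is why the two encodings cannot be mixed silently in stored data.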

### Crate Publish Ordering

I've modified the order in which CI publishes crates so that a crate is never published before the crates it depends upon.

## TODO

- [ ] Queue a follow-up `[patch]`-removing PR.
Paul Hauner
2021-09-25 05:58:36 +00:00
parent a844ce5ba9
commit fe52322088
63 changed files with 1515 additions and 571 deletions


@@ -48,6 +48,8 @@ pub enum DecodeError {
     ZeroLengthItem,
     /// The given bytes were invalid for some application-level reason.
     BytesInvalid(String),
+    /// The given union selector is out of bounds.
+    UnionSelectorInvalid(u8),
 }
 /// Performs checks on the `offset` based upon the other parameters provided.
@@ -172,9 +174,18 @@ impl<'a> SszDecoderBuilder<'a> {
     /// Declares that some type `T` is the next item in `bytes`.
     pub fn register_type<T: Decode>(&mut self) -> Result<(), DecodeError> {
-        if T::is_ssz_fixed_len() {
+        self.register_type_parameterized(T::is_ssz_fixed_len(), T::ssz_fixed_len())
+    }
+    /// Declares that a type with the given parameters is the next item in `bytes`.
+    pub fn register_type_parameterized(
+        &mut self,
+        is_ssz_fixed_len: bool,
+        ssz_fixed_len: usize,
+    ) -> Result<(), DecodeError> {
+        if is_ssz_fixed_len {
             let start = self.items_index;
-            self.items_index += T::ssz_fixed_len();
+            self.items_index += ssz_fixed_len;
             let slice = self.bytes.get(start..self.items_index).ok_or_else(|| {
                 DecodeError::InvalidByteLength {
@@ -300,7 +311,7 @@ impl<'a> SszDecoder<'a> {
     ///
     /// Panics when attempting to decode more items than actually exist.
     pub fn decode_next<T: Decode>(&mut self) -> Result<T, DecodeError> {
-        T::from_ssz_bytes(self.items.remove(0))
+        self.decode_next_with(|slice| T::from_ssz_bytes(slice))
     }
     /// Decodes the next item using the provided function.
@@ -312,15 +323,30 @@ impl<'a> SszDecoder<'a> {
     }
 }
-/// Reads a `BYTES_PER_LENGTH_OFFSET`-byte union index from `bytes`, where `bytes.len() >=
-/// BYTES_PER_LENGTH_OFFSET`.
-pub fn read_union_index(bytes: &[u8]) -> Result<usize, DecodeError> {
-    read_offset(bytes)
+/// Takes `bytes`, assuming it is the encoding for a SSZ union, and returns the union-selector and
+/// the body (trailing bytes).
+///
+/// ## Errors
+///
+/// Returns an error if:
+///
+/// - `bytes` is empty.
+/// - the union selector is not a valid value (i.e., larger than the maximum number of variants).
+pub fn split_union_bytes(bytes: &[u8]) -> Result<(UnionSelector, &[u8]), DecodeError> {
+    let selector = bytes
+        .first()
+        .copied()
+        .ok_or(DecodeError::OutOfBoundsByte { i: 0 })
+        .and_then(UnionSelector::new)?;
+    let body = bytes
+        .get(1..)
+        .ok_or(DecodeError::OutOfBoundsByte { i: 1 })?;
+    Ok((selector, body))
 }
 /// Reads a `BYTES_PER_LENGTH_OFFSET`-byte length from `bytes`, where `bytes.len() >=
 /// BYTES_PER_LENGTH_OFFSET`.
-fn read_offset(bytes: &[u8]) -> Result<usize, DecodeError> {
+pub fn read_offset(bytes: &[u8]) -> Result<usize, DecodeError> {
     decode_offset(bytes.get(0..BYTES_PER_LENGTH_OFFSET).ok_or_else(|| {
         DecodeError::InvalidLengthPrefix {
             len: bytes.len(),