This change speeds up trie hashing and all other activities that require
RLP encoding of trie nodes by approximately 20%. The speedup is achieved by
avoiding reflection overhead during node encoding.
The interface type trie.node now contains a method 'encode' that works with
rlp.EncoderBuffer. Management of EncoderBuffers is left to calling code.
trie.hasher, which is pooled to avoid allocations, now maintains an
EncoderBuffer. This means memory resources related to trie node encoding
are tied to the hasher pool.
Co-authored-by: Felix Lange <fjl@twurst.com>
This change makes use of the new code generator rlp/rlpgen to improve the
performance of RLP encoding for Header and StateAccount. It also speeds up
encoding of ReceiptForStorage using the new rlp.EncoderBuffer API.
The change is much less transparent than I wanted it to be, because Header and
StateAccount now have an EncodeRLP method defined with pointer receiver. It
used to be possible to encode non-pointer values of these types, but the new
method prevents that and attempting to encode unadressable values (even if
part of another value) will return an error. The error can be surprising and may
pop up in places that previously didn't expect any errors.
To make things work, I also needed to update all code paths (mostly in unit tests)
that lead to encoding of non-pointer values, and pass a pointer instead.
Benchmark results:
name old time/op new time/op delta
EncodeRLP/legacy-header-8 328ns ± 0% 237ns ± 1% -27.63% (p=0.000 n=8+8)
EncodeRLP/london-header-8 353ns ± 0% 247ns ± 1% -30.06% (p=0.000 n=8+8)
EncodeRLP/receipt-for-storage-8 237ns ± 0% 123ns ± 0% -47.86% (p=0.000 n=8+7)
EncodeRLP/receipt-full-8 297ns ± 0% 301ns ± 1% +1.39% (p=0.000 n=8+8)
name old speed new speed delta
EncodeRLP/legacy-header-8 1.66GB/s ± 0% 2.29GB/s ± 1% +38.19% (p=0.000 n=8+8)
EncodeRLP/london-header-8 1.55GB/s ± 0% 2.22GB/s ± 1% +42.99% (p=0.000 n=8+8)
EncodeRLP/receipt-for-storage-8 38.0MB/s ± 0% 64.8MB/s ± 0% +70.48% (p=0.000 n=8+7)
EncodeRLP/receipt-full-8 910MB/s ± 0% 897MB/s ± 1% -1.37% (p=0.000 n=8+8)
name old alloc/op new alloc/op delta
EncodeRLP/legacy-header-8 0.00B 0.00B ~ (all equal)
EncodeRLP/london-header-8 0.00B 0.00B ~ (all equal)
EncodeRLP/receipt-for-storage-8 64.0B ± 0% 0.0B -100.00% (p=0.000 n=8+8)
EncodeRLP/receipt-full-8 320B ± 0% 320B ± 0% ~ (all equal)
This change adds a code generator tool for creating EncodeRLP method
implementations. The generated methods will behave identically to the
reflect-based encoder, but run faster because there is no reflection overhead.
Package rlp now provides the EncoderBuffer type for incremental encoding. This
is used by generated code, but the new methods can also be useful for
hand-written encoders.
There is also experimental support for generating DecodeRLP, and some new
methods have been added to the existing Stream type to support this. Creating
decoders with rlpgen is not recommended at this time because the generated
methods create very poor error reporting.
More detail about package rlp changes:
* rlp: externalize struct field processing / validation
This adds a new package, rlp/internal/rlpstruct, in preparation for the
RLP encoder generator.
I think the struct field rules are subtle enough to warrant extracting
this into their own package, even though it means that a bunch of
adapter code is needed for converting to/from rlpstruct.Type.
* rlp: add more decoder methods (for rlpgen)
This adds new methods on rlp.Stream:
- Uint64, Uint32, Uint16, Uint8, BigInt
- ReadBytes for decoding into []byte
- MoreDataInList - useful for optional list elements
* rlp: expose encoder buffer (for rlpgen)
This exposes the internal encoder buffer type for use in EncodeRLP
implementations.
The new EncoderBuffer type is a sort-of 'opaque handle' for a pointer to
encBuffer. It is implemented this way to ensure the global encBuffer pool
is handled correctly.
Some small fixes to get the existing debug methods to conform to the spec. Mainly dropping the encoding information from the method name as it should be deduced from the debug context and allowing the method to be invoked by either block number or block hash. It also adds the method debug_getTransaction which returns the raw tx bytes by tx hash. This is pretty much equivalent to the eth_getRawTransactionByHash method.
* eth/catalyst: warn less frequently if no beacon client is available
* eth/catalyst: tweak warning frequency a bit
* eth/catalyst: some more tweaks
* Update api.go
Co-authored-by: Felix Lange <fjl@twurst.com>
* trie: fix memory leak in trie iterator
In the trie iterator, live nodes are tracked in a stack while iterating.
Popped node states should be explictly set to nil in order to get
garbage-collected.
* trie: fix empty trie iterator
* eth/fetcher: introduce some lag in tx fetching
* eth/fetcher: change conditions a bit
* eth/fetcher: use per-batch quota check
* eth/fetcher: fix some comments
* eth/fetcher: address review concerns
* eth/fetcher: fix panic + add warn log
* eth/fetcher: fix log
* eth/fetcher: fix log
* cmd/devp2p/internal/ethtest: fix ignorign tx announcements from prev. tests
* cmd/devp2p/internal/ethtest: fix TestLargeTxRequest
This increases the number of tx relay messages the test waits for. Since
go-ethereum now processes incoming txs in smaller batches, the
announcement messages it sends are also smaller.
Co-authored-by: Felix Lange <fjl@twurst.com>