* feature: do trie prefetch on state prefetch
Currently, state prefetch just pre execute the transactions and discard the results.
It is helpful to increase the snapshot cache hit rate.
It would be more helpful, if it can do trie prefetch at the same time, since the it will
preload the trie node and build the trie tree in advance.
This patch is to implement it, by reusing the main trie prefetch and doing finalize after
transaction is executed.
* some code improvements for trie prefetch
** increase pendingSize before dispatch tasks
** use throwaway StateDB for TriePrefetchInAdvance and remove the prefetcherLock
** remove the necessary drain operation in trie prefetch mainloop,
trie prefetcher won't be used after close.
* trie prefetcher for From/To address in advance
We found that trie prefetch could be not fast enough, especially trie prefetch of
the outer big state trie tree.
Instead of do trie prefetch until a transaction is finalized, we could do trie prefetch
in advance. Try to prefetch the trie node of the From/To accounts, since their root hash
are most likely to be changed.
* Parallel TriePrefetch for large trie update.
Currently, we create a subfetch for each account address to do trie prefetch. If the address
has very large state change, trie prefetch could be not fast enough, e.g. a contract modified
lots of KV pair or a large number of account's root hash is changed in a block.
With this commit, there will be children subfetcher created to do trie prefetch in parallell if
the parent subfetch's workload exceed the threshold.
* some improvemnts of parallel trie prefetch implementation
1.childrenLock is removed, since it is not necessary
APIs of triePrefetcher is not thread safe, they should be used sequentially.
A prefetch will be interrupted by trie() or clos(), so we only need mark it as
interrupted and check before call scheduleParallel to avoid the concurrent access to paraChildren
2.rename subfetcher.children to subfetcher.paraChildren
3.use subfetcher.pendingSize to replace totalSize & processedIndex
4.randomly select the start child to avoid always feed the first one
5.increase threshold and capacity to avoid create too many child routine
* fix review comments
** nil check refine
** create a separate routine for From/To prefetch, avoid blocking the cirtical path
* remove the interrupt member
* not create a signer for each transaction
* some changes to triePrefetcher
** remove the abortLoop, move the subfetcher abort operation into mainLoop
since we want to make subfetcher's create & schedule & abort within a loop to
avoid concurrent access locks.
** no wait subfetcher's term signal in abort()
it could speed up the close by closing subfetcher concurrently.
we send stop signnal to all subfetchers in burst and wait their term signal later.
* some coding improve for subfetcher.scheduleParallel
* fix a UT crash of s.prefetcher == nil
* update parallel trie prefetcher configuration
tested with different combination of parallelTriePrefetchThreshold & parallelTriePrefetchCapacity,
found the most efficient configure could be:
parallelTriePrefetchThreshold = 10
parallelTriePrefetchCapacity = 20
* fix review comments: code refine
* Redesign triePrefetcher to make it thread safe
There are 2 types of triePrefetcher instances:
1.New created triePrefetcher: it is key to do trie prefetch to speed up validation phase.
2.Copied triePrefetcher: it only copy the prefetched trie information, actually it won't do
prefetch at all, the copied tries are all kept in p.fetches.
Here we try to improve the new created one, to make it concurrent safe, while the copied one's
behavior stay unchanged(its logic is very simple).
As commented in triePrefetcher struct, its APIs are not thread safe. So callers should make sure
the created triePrefetcher should be used within a single routine.
As we are trying to improve triePrefetcher, we would use it concurrently, so it is necessary to
redesign it for concurrent access.
The design is simple:
** start a mainLoop to do all the work, APIs just send channel message.
Others:
** remove the metrics copy, since it is useless for copied triePrefetcher
** for trie(), only get subfetcher through channel to reduce the workload of mainloop
* some code enhancement for triePrefetcher redesign
* some fixup: rename, temporary trie chan for concurrent safe.
* fix review comments
* add some protection in case the trie prefetcher is already stopped
* fix review comments
** make close concurrent safe
** fix potential deadlock
* replace channel by RWMutex for a few triePrefetcher APIs
For APIs like: trie(), copy(), used(), it is simpler and more efficient to
use a RWMutex instead of channel communicaton.
Since the mainLoop would be busy handling trie request, while these trie request
can be processed in parallism.
We would only keep prefetch and close within the mainLoop, since they could update
the fetchers
* add lock for subfecter.used access to make it concurrent safe
* no need to create channel for copied triePrefetcher
* fix trie_prefetcher_test.go
trie prefetcher’s behavior has changed, prefetch() won't create subfetcher immediately.
it is reasonable, but break the UT, to fix the failed UT
This enables the following linters
- typecheck
- unused
- staticcheck
- bidichk
- durationcheck
- exportloopref
- gosec
WIth a few exceptions.
- We use a deprecated protobuf in trezor. I didn't want to mess with that, since I cannot meaningfully test any changes there.
- The deprecated TypeMux is used in a few places still, so the warning for it is silenced for now.
- Using string type in context.WithValue is apparently wrong, one should use a custom type, to prevent collisions between different places in the hierarchy of callers. That should be fixed at some point, but may require some attention.
- The warnings for using weak random generator are squashed, since we use a lot of random without need for cryptographic guarantees.
This PR fixes the flaw that @rjl493456442 found in https://github.com/ethereum/go-ethereum/pull/#issuecomment-1093817551 , namely, that the snapshot iterator uses the combined (disk + difflayers) 'view', wheres the raw iterator uses only the disk 'view'.
This PR instead splits up the work: one phase is iterating the disk layer data, another phase is loading the journalled difflayers and performing the same check there.
This commit replaces ioutil.TempDir with t.TempDir in tests. The
directory created by t.TempDir is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using ioutil.TempDir
had to be removed manually by calling os.RemoveAll, which is omitted in
some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but t.TempDir handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* core/state/snapshot: fix BAD BLOCK error when snapshot is generating
* core/state/snapshot: alternative fix for the snapshot generator
* add comments and minor update
Co-authored-by: Martin Holst Swende <martin@swende.se>