Node discovery periodically revalidates the nodes in its table by sending PING, checking
if they are still alive. I recently noticed some issues with the implementation of this
process, which can cause strange results such as nodes dropping unexpectedly, certain
nodes not getting revalidated often enough, and bad results being returned to incoming
FINDNODE queries.
In this change, the revalidation process is improved with the following logic:
- We maintain two 'revalidation lists' containing the table nodes, named 'fast' and 'slow'.
- The process chooses random nodes from each list on a randomized interval, the interval being
faster for the 'fast' list, and performs revalidation for the chosen node.
- Whenever a node is newly inserted into the table, it goes into the 'fast' list.
Once validation passes, it transfers to the 'slow' list. If a request fails, or the
node changes endpoint, it transfers back into 'fast'.
- livenessChecks is incremented by one for successful checks. Unlike the old implementation,
we will not drop the node on the first failing check. We instead quickly decay the
livenessChecks give it another chance.
- Order of nodes in bucket doesn't matter anymore.
I am also adding a debug API endpoint to dump the node table content.
Co-authored-by: Martin HS <martin@swende.se>
Package filepath implements utility routines for manipulating filename paths in a way compatible with the target operating system-defined file paths.
Package path implements utility routines for manipulating slash-separated paths.
The path package should only be used for paths separated by forward slashes, such as the paths in URLs
Improving two things here:
On hive, where we look at these tests, the Go code comment above the test
is not visible. When there is a failure, it's not obvious what the test is actually
expecting. I have converted the comments in to printed log messages to
explain the test more.
Second, I noticed that besu is failing some tests because it happens to request
a header when we want it to send transactions. Trying the minimal fix here to
serve the headers.
Co-authored-by: lightclient <14004106+lightclient@users.noreply.github.com>
Here we update the eth and snap protocol test suites with a new test chain,
created by the hivechain tool. The new test chain uses proof-of-stake. As such,
tests using PoW block propagation in the eth protocol are removed. The test suite
now connects to the node under test using the engine API in order to make it
accept transactions.
The snap protocol test suite has been rewritten to output test descriptions and
log requests more verbosely.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR replaces Geth's logger package (a fork of [log15](https://github.com/inconshreveable/log15)) with an implementation using slog, a logging library included as part of the Go standard library as of Go1.21.
Main changes are as follows:
* removes any log handlers that were unused in the Geth codebase.
* Json, logfmt, and terminal formatters are now slog handlers.
* Verbosity level constants are changed to match slog constant values. Internal translation is done to make this opaque to the user and backwards compatible with existing `--verbosity` and `--vmodule` options.
* `--log.backtraceat` and `--log.debug` are removed.
The external-facing API is largely the same as the existing Geth logger. Logger method signatures remain unchanged.
A small semantic difference is that a `Handler` can only be set once per `Logger` and not changed dynamically. This just means that a new logger must be instantiated every time the handler of the root logger is changed.
----
For users of the `go-ethereum/log` module. If you were using this module for your own project, you will need to change the initialization. If you previously did
```golang
log.Root().SetHandler(log.LvlFilterHandler(log.LvlInfo, log.StreamHandler(os.Stderr, log.TerminalFormat(true))))
```
You now instead need to do
```golang
log.SetDefault(log.NewLogger(log.NewTerminalHandlerWithLevel(os.Stderr, log.LevelInfo, true)))
```
See more about reasoning here: https://github.com/ethereum/go-ethereum/issues/28558#issuecomment-1820606613
During snap-sync, we request ranges of values: either a range of accounts or a range of storage values. For any large trie, e.g. the main account trie or a large storage trie, we cannot fetch everything at once.
Short version; we split it up and request in multiple stages. To do so, we use an origin field, to say "Give me all storage key/values where key > 0x20000000000000000". When the server fulfils this, the server provides the first key after origin, let's say 0x2e030000000000000 -- never providing the exact origin. However, the client-side needs to be able to verify that the 0x2e03.. indeed is the first one after 0x2000.., and therefore the attached proof concerns the origin, not the first key.
So, short-short version: the left-hand side of the proof relates to the origin, and is free-standing from the first leaf.
On the other hand, (pun intended), the right-hand side, there's no such 'gap' between "along what path does the proof walk" and the last provided leaf. The proof must prove the last element (unless there are no elements).
Therefore, we can simplify the semantics for trie.VerifyRangeProof by removing an argument. This doesn't make much difference in practice, but makes it so that we can remove some tests. The reason I am raising this is that the upcoming stacktrie-based verifier does not support such fancy features as standalone right-hand borders.
This change addresses an issue in snap sync, specifically when the entire sync process can be halted due to an encountered empty storage range.
Currently, on the snap sync client side, the response to an empty (partial) storage range is discarded as a non-delivery. However, this response can be a valid response, when the particular range requested does not contain any slots.
For instance, consider a large contract where the entire key space is divided into 16 chunks, and there are no available slots in the last chunk [0xf] -> [end]. When the node receives a request for this particular range, the response includes:
The proof with origin [0xf]
A nil storage slot set
If we simply discard this response, the finalization of the last range will be skipped, halting the entire sync process indefinitely. The test case TestSyncWithUnevenStorage can reproduce the scenario described above.
In addition, this change also defines the common variables MaxAddress and MaxHash.
This is a minor refactor in preparation of changes to range verifier. This PR contains no intentional functional changes but moves (and renames) the light.NodeSet
This PR makes the tool use the --bootnodes list as the input to devp2p crawl.
The flag will take effect if the input/output.json file is missing or empty.
* core/forkid: skip genesis forks by time
* core/forkid: add comment about skipping non-zero fork times
* core/forkid: skip all time based forks in genesis using loop
* core/forkid: simplify logic for dropping time-based forks
This changes the forkID calculation to ignore time-based forks that occurred before the
genesis block. It's supposed to be done this way because the spec says:
> If a chain is configured to start with a non-Frontier ruleset already in its genesis, that is NOT considered a fork.
The Go authors updated golang/x/ext to change the function signature of the slices sort method.
It's an entire shitshow now because x/ext is not tagged, so everyone's codebase just
picked a new version that some other dep depends on, causing our code to fail building.
This PR updates the dep on our code too and does all the refactorings to follow upstream...
The clean trie cache is persisted periodically, therefore Geth can
quickly warmup the cache in next restart.
However it will reduce the robustness of system. The assumption is
held in Geth that if the parent trie node is present, then the entire
sub-trie associated with the parent are all prensent.
Imagine the scenario that Geth rewinds itself to a past block and
restart, but Geth finds the root node of "future state" in clean
cache then regard this state is present in disk, while is not in fact.
Another example is offline pruning tool. Whenever an offline pruning
is performed, the clean cache file has to be removed to aviod hitting
the root node of "deleted states" in clean cache.
All in all, compare with the minor performance gain, system robustness
is something we care more.
Follow-up to #26697, makes the crawler less verbose on route53-based scenarios.
It also changes the loglevel from debug to info on Updates, which are typically the root, and can be interesting to see.
The EmptyRootHash and EmptyCodeHash are defined everywhere in the codebase, this PR replaces all of them with unified one defined in core/types package, and also defines constants for TxRoot, WithdrawalsRoot and UncleRoot
Our discovery crawler spits out a huge amount of logs, most of which is pretty non-interesting. This change moves the very verbose output to Debug, and adds a 8-second status log message giving the general idea about what's going on.
* p2p/discover: add more packet information in logs
This adds more fields to discv5 packet logs. These can be useful when
debugging multi-packet interactions.
The FINDNODE message also gets an additional field, OpID for debugging
purposes. This field is not encoded onto the wire.
I'm also removing topic system related message types in this change.
These will come back in the future, where support for them will be
guarded by a config flag.
* p2p/discover/v5wire: rename 'Total' to 'RespCount'
The new name captures the meaning of this field better.