more comments

2022-05-16 05:56:57 +00:00 · 2022-05-16 05:56:57 +00:00 · 4e142d1fb1
commit 4e142d1fb1
parent ed89d8f01a
2 changed files with 8 additions and 8 deletions
--- a/TODO.md
+++ b/TODO.md
@@ -1,16 +1,23 @@
 # Todo

+- [ ] improve caching
+  - [ ] if the eth_call (or similar) params include a block, we can cache for longer
+  - [ ] if the call is something simple like "symbol" or "decimals", cache that too
+  - [ ] when we receive a block, we should store it for later eth_getBlockByNumber, eth_blockNumber, and similar calls
 - [ ] eth_sendRawTransaction should return the most common result, not the first
 - [ ] if chain split detected, don't send transactions
 - [ ] endpoint for health checks. if no synced servers, give a 502 error
+  - [ ] move from warp to axum?
 - [ ] some production configs are occasionally stuck waiting at 100% cpu
  - looks like it's getting stuck on `futex(0x7fc15067b478, FUTEX_WAIT_PRIVATE, 1, NULL`
  - they stop processing new blocks. i'm guessing 2 blocks arrive at the same time, but i thought our locks would handle that
+  - even after removing a bunch of the locks, the deadlock still happens. i can't reliably reproduce it. i just let it run for a while and it happens.
 - [ ] proper logging with useful instrumentation
 - [ ] handle websocket disconnect and reconnect
 - [ ] warning if no blocks for too long. maybe reconnect automatically?
 - [ ] if the fastest server has hit rate limits, we won't be able to serve any traffic until another server is synced.
    - thundering herd problem if we only allow a lag of 0 blocks
+    - we can fix this by only `publish`ing the sorted list once a certain sync limit is reached 
 - [ ] tarpit hard_ratelimit at the start, but reject if incoming requests is super high?
 - [ ] add the backend server to the header?
 - [ ] the web3proxyapp object gets cloned for every call. why do we need any arcs inside that? shouldn't they be able to connect to the app's? can we just use static lifetimes
@@ -18,10 +25,6 @@
 - [ ] if a request gets a socket timeout, try on another server
  - maybe always try at least two servers in parallel? and then return the first? or only if the first one doesn't respond very quickly?
 - [ ] incoming rate limiting (by ip or by api key or what?)
- [ ] improve caching
-  - [ ] if the eth_call (or similar) params include a block, we can cache for longer
-  - [ ] if the call is something simple like "symbol" or "decimals", cache that too
-  - [ ] when we receive a block, we should store it for later eth_getBlockByNumber, eth_blockNumber, and similar calls
 - [ ] measure latency to nodes?
 - [ ] one proxy for multiple chains?
 - [ ] zero downtime deploys
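
The "improve caching" item above says that a request whose params include a block can be cached for longer. A minimal sketch of that decision, assuming the block parameter arrives as an optional string; `cache_ttl`, `is_hex_quantity`, and both TTL values are hypothetical and not part of this commit. A pinned block number can still be reorged away, so the long TTL here is an illustration rather than a safe default.

```rust
use std::time::Duration;

// Hypothetical TTLs, chosen only for illustration.
const SHORT_TTL: Duration = Duration::from_secs(1);
const LONG_TTL: Duration = Duration::from_secs(3600);

/// Pick a cache TTL from the block parameter of an eth_call-style request.
fn cache_ttl(block_param: Option<&str>) -> Duration {
    match block_param {
        // A concrete block number or hash such as "0x10d4f" pins the state being queried.
        Some(b) if is_hex_quantity(b) => LONG_TTL,
        // Tags like "latest" or "pending", or a missing parameter, follow the chain head.
        _ => SHORT_TTL,
    }
}

/// True if `s` is "0x" followed by at least one hex digit.
fn is_hex_quantity(s: &str) -> bool {
    match s.strip_prefix("0x") {
        Some(digits) => !digits.is_empty() && digits.chars().all(|c| c.is_ascii_hexdigit()),
        None => false,
    }
}
```

Responses for simple calls like "symbol" or "decimals" would need a separate rule keyed on the call data, which this sketch does not cover.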
--- a/web3-proxy/src/connections.rs
+++ b/web3-proxy/src/connections.rs
@@ -317,10 +317,6 @@ impl Web3Connections {
            .map(|x| x.inner.clone())
            .unwrap();

-        // // TODO: how should we include the soft limit? floats are slower than integer math
-        // let a = a as f32 / self.soft_limit as f32;
-        // let b = b as f32 / other.soft_limit as f32;
-
        // TODO: better key!
        let sort_cache: HashMap<String, (f32, u32)> = synced_rpc_arcs
            .iter()
@@ -330,6 +326,7 @@ impl Web3Connections {
                let active_requests = connection.active_requests();
                let soft_limit = connection.soft_limit();

+                // TODO: how should we include the soft limit? floats are slower than integer math
                let utilization = active_requests as f32 / soft_limit as f32;

                (key, (utilization, soft_limit))
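
On the TODO about floats versus integer math: one possible approach is to compare two utilization ratios by cross-multiplying instead of dividing. This is a sketch only, assuming `active_requests` and `soft_limit` both fit in `u32` and that every soft limit is non-zero. `cmp_utilization` and `sort_by_utilization` are hypothetical names, not functions from this codebase.

```rust
use std::cmp::Ordering;

/// Compare two (active_requests, soft_limit) pairs by utilization using only
/// integer math. Assumes both soft limits are non-zero.
fn cmp_utilization(a: (u32, u32), b: (u32, u32)) -> Ordering {
    // a.0 / a.1 < b.0 / b.1  <=>  a.0 * b.1 < b.0 * a.1  (for positive limits).
    // Widen to u64 so the product of two u32 values cannot overflow.
    let lhs = u64::from(a.0) * u64::from(b.1);
    let rhs = u64::from(b.0) * u64::from(a.1);
    lhs.cmp(&rhs)
}

/// Sort hypothetical (name, active_requests, soft_limit) entries, least utilized first.
fn sort_by_utilization(servers: &mut [(&str, u32, u32)]) {
    servers.sort_by(|x, y| cmp_utilization((x.1, x.2), (y.1, y.2)));
}
```

This avoids the `f32` casts and also sidesteps `f32` not implementing `Ord`, at the cost of needing both pairs at comparison time rather than a single precomputed key.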