NEP: Eliminate TTN Gas Cost by jakmeier · Pull Request #424 · near/NEPs
When executing a smart contract, gas costs are generally dynamic. But they are predictable if the code paths taken are known. With one exception: storage operations. Those are impossible to predict accurately.
Here is a suggestion for how storage costs could become predictable. The idea is to remove the dynamically counted number of touched trie nodes from the storage cost equation. The same cost can be approximated by the storage key length, which is something the dApp developer controls.
Why do we need this change?
There are two main reasons why this change is desirable.
- Predictable gas costs are necessary for a good developer experience (devX).
- Validator client implementations are currently constrained by the gas cost model. They could be more efficient if they did not have to count the exact number of trie nodes on every access. This could eventually lead to lower gas fees overall.
What exactly is the proposed change?
Current formula:
```
base_fee
+ per_value_byte_fee * value_bytes
+ per_key_byte_fee * key_bytes
+ touched_trie_node_fee * uncached_trie_nodes
+ cached_trie_node_fee * cached_trie_nodes
```
This formula exists with different parameter values for each type of storage operation (read/write/has_key/remove).
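As a sketch, the current formula can be written as a function. The struct and parameter values below are purely illustrative, not the real protocol constants:

```rust
/// Illustrative gas parameters for one type of storage operation
/// (read/write/has_key/remove each have their own set).
/// All values here are made up for demonstration only.
struct StorageFees {
    base_fee: u64,
    per_value_byte_fee: u64,
    per_key_byte_fee: u64,
    touched_trie_node_fee: u64,
    cached_trie_node_fee: u64,
}

/// Current formula: gas depends on the dynamically counted trie nodes,
/// which is exactly the part a dApp developer cannot predict.
fn current_storage_cost(
    fees: &StorageFees,
    key_bytes: u64,
    value_bytes: u64,
    uncached_trie_nodes: u64,
    cached_trie_nodes: u64,
) -> u64 {
    fees.base_fee
        + fees.per_value_byte_fee * value_bytes
        + fees.per_key_byte_fee * key_bytes
        + fees.touched_trie_node_fee * uncached_trie_nodes
        + fees.cached_trie_node_fee * cached_trie_nodes
}
```

Note that the last two terms are the only ones that depend on runtime state outside the contract's control.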
New suggested formula:
```
base_fee
+ per_value_byte_fee * value_bytes
+ per_key_byte_fee * key_bytes
```
The two trie node terms (`touched_trie_node_fee * uncached_trie_nodes` and `cached_trie_node_fee * cached_trie_nodes`) are removed, with increased values on `per_key_byte_fee` to balance costs out.
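The proposed formula, sketched in the same illustrative style (the concrete value of the increased `per_key_byte_fee` is still to be determined):

```rust
/// Proposed formula: only inputs the dApp developer controls remain.
/// `per_key_byte_fee` would be raised relative to today's value to
/// absorb the removed trie-node charges (exact value not yet decided).
fn new_storage_cost(
    base_fee: u64,
    per_value_byte_fee: u64,
    per_key_byte_fee: u64,
    key_bytes: u64,
    value_bytes: u64,
) -> u64 {
    base_fee + per_value_byte_fee * value_bytes + per_key_byte_fee * key_bytes
}
```

Every input is known before execution, so the cost of a storage operation can be computed offline.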
How does this make costs predictable?
Costs are unpredictable for two reasons.
- The number of touched trie nodes depends not only on the account's own state, but also on what other accounts exist on the chain.
- The number of cached trie nodes depends on all receipts that have been executed in the same chunk, whether on the same account or on different accounts.
Both are no longer an issue once the trie node count is removed from the gas formula.
How/why can we just remove parts of the gas cost?
The dApp should be paying for all the work validators do to execute a function call. This includes reading and writing trie nodes when storage is accessed. This is mostly true today and will still be true if the proposal is accepted and implemented.
Charging for trie node costs is not trivial to get right. When multiple storage requests are made in the same function call, many nodes are accessed more than once. The first time a node is accessed, it can potentially be an expensive fetch from disk. Following requests can be served from memory, which is at least an order of magnitude cheaper.
Today, the trie node cost is charged either as a full DB access or as an in-memory access, using different parameter values for each. But because all validators have to agree on which cost is charged, the exact caching strategy must be part of the protocol specification. For that purpose, the chunk cache is specified as a least-recently-used cache with a fixed size that starts empty for every chunk processed.
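A minimal sketch of how such a protocol-specified LRU cache decides which of the two costs to charge. All names and the linear-scan implementation are hypothetical and simplified; the real chunk cache in nearcore is specified differently (e.g. with a byte-size limit):

```rust
use std::collections::VecDeque;

/// Hypothetical LRU chunk cache over trie node hashes.
/// It starts empty for every chunk; its capacity would be part of
/// the protocol specification so all validators charge identically.
struct ChunkCache {
    capacity: usize,
    // Most recently used at the back. A real implementation would add
    // a map for O(1) lookup; a linear scan keeps the sketch short.
    nodes: VecDeque<[u8; 32]>,
}

impl ChunkCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, nodes: VecDeque::new() }
    }

    /// Returns the gas to charge for accessing `node_hash`:
    /// the full DB cost on a miss, the cheap in-memory cost on a hit.
    fn charge(&mut self, node_hash: [u8; 32], full_cost: u64, cached_cost: u64) -> u64 {
        if let Some(pos) = self.nodes.iter().position(|h| *h == node_hash) {
            // Hit: move the node to the most-recently-used position.
            let _ = self.nodes.remove(pos);
            self.nodes.push_back(node_hash);
            cached_cost
        } else {
            // Miss: evict the least recently used entry if full.
            if self.nodes.len() == self.capacity {
                self.nodes.pop_front();
            }
            self.nodes.push_back(node_hash);
            full_cost
        }
    }
}
```

The key point is that the charged amount depends on the access history within the chunk, which is why it cannot be predicted by looking at one contract in isolation.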
The chunk cache effectively saves somewhere between 30% and 90% of all storage costs today, compared to always charging the full DB cost. In reality, however, 95%–99% of trie nodes are served from memory, so in some sense storage is still overcharged. To compensate for that, the "full DB access" cost is a bit cheaper than it should be, since a validator never has to actually fetch all those values from disk.
To summarise, today’s cost model is an approximation of the trie node hit rate that relies on counting the exact number of trie nodes touched.
The new proposed model is also an approximation. However, it uses the fact that the number of trie nodes is limited by the key length. A key with 20 bytes can produce at most 40 trie nodes, for example.
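The bound holds because the trie branches on half-bytes (nibbles), so each key byte can add at most two nodes to the lookup path. A trivial sketch of the bound (the exact formula in the final proposal may differ, e.g. by a constant for the path above the account's storage):

```rust
/// Upper bound on trie nodes a single storage key can touch,
/// assuming the trie branches on nibbles: two nibbles per key byte,
/// hence at most two trie nodes per key byte.
/// (Illustrative; any extra constant terms are omitted here.)
fn max_trie_nodes(key_bytes: u64) -> u64 {
    2 * key_bytes
}
```

This is what lets a static charge on `key_bytes` stand in for the dynamic trie node count.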
What was that about client implementation becoming more efficient and lower gas fees?
Today, NEAR Protocol clients must implement storage requests in a very specific way. They have to look up trie node after trie node, starting at the trie root of the current chunk, all the way down to the leaf with the accessed value.
A more efficient way to implement storage would be to have a direct mapping from storage keys to values, rather than traversing the trie each time. This idea is already in the late implementation phase for nearcore, under the name of FlatStorage. However, any potential gas fee reduction from such optimisations cannot be realized under the current gas model, because the protocol mandates counting the exact number of trie nodes in order to subtract the right amount of gas.
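Conceptually, such a direct mapping turns a storage read into a single lookup instead of a node-by-node walk. This is a hypothetical sketch of the idea, not the actual nearcore FlatStorage API:

```rust
use std::collections::HashMap;

/// Hypothetical flat storage: storage keys map directly to values.
struct FlatStorage {
    kv: HashMap<Vec<u8>, Vec<u8>>,
}

impl FlatStorage {
    /// One lookup, no trie traversal. Note there is no way to count
    /// touched trie nodes here, which is why the current gas model
    /// prevents this optimisation from translating into lower fees.
    fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.kv.get(key)
    }
}
```

With the trie node terms removed from the gas formula, clients would be free to serve reads this way without breaking consensus on gas charged.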
Ok but what new parameter values to use and how does that affect my favourite dApp on NEAR?
This is still work-in-progress. Some analysis on this can be found in this public document: Eliminate TTN gas cost - Google Docs
More details on exact parameter values will follow. And a NEP will be written for this, should the community agree that, in principle, this is a beneficial change.