High-level overview of NEAR fees today

The goal of this post is to give an overview of how we approach NEAR fees today and suggest modifications to our approach.

Background and Motivation

Precise fee estimation is important for any blockchain, but it is even more important for NEAR for several reasons:

  • NEAR uses a so-called pre-state root, which means the transactions/receipts included in block X by validator A are required by the protocol to be executed in block X+1 by a potentially different validator B. This means that validator B is at the mercy of validator A: if a fee is underestimated, validator A can make validator B execute transactions that abuse this underestimation;
  • While validators can pick and choose which transactions to include in a block, they have no say in which receipts are included in a block, because of how receipt execution is implemented to support sharding. This means that validators have no control over what they execute most of the time;

In some other blockchains, like Ethereum or Bitcoin, validators can choose not to include transactions in a block for any reason. This is not the case with NEAR. NEAR validators cannot discard or postpone transactions/receipts even if they run out of time to produce a block. If we miscalculate the fees and execution takes longer than 1 second, multiple bad things can happen:

  • If a validator takes longer than 1 second to produce a block, then other validators attempt to skip this block. If there are many skipped blocks like this in a row, we might observe disruption of the network, which might result in dApp UIs lagging;
  • Regardless of whether the delayed block is accepted or skipped, it slows down the overall block production. Some mechanisms make assumptions about the block production rate, e.g. the inflation mechanism. If delays happen too frequently these assumptions will be broken, and we don't know the full extent of the economic side effects of that;

Therefore, NEAR developers need to be extremely careful with estimating the fees. An additional complicating factor is that, for any blockchain, it is difficult to increase fees, since doing so might break some contracts. For example, if contract A calls contract B while attaching X amount of gas, then this constant X might be forever hardcoded in the code of contract A. If we later increase a certain fee, X might become insufficient for the execution of contract B. And since contracts can be immutable, they might be forever broken by the fee increase.
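
To make the hazard concrete, here is a minimal sketch of such a hardcoded gas attachment, written against the near-sdk-rs 4.x-style API; the contract names, the `do_work` method, and the 5 Tgas budget are all hypothetical:

```rust
use near_sdk::borsh::{self, BorshDeserialize, BorshSerialize};
use near_sdk::{ext_contract, near_bindgen, AccountId, Gas, Promise};

// The "X amount of gas" from the example above, frozen into the binary.
const GAS_FOR_B: Gas = Gas(5_000_000_000_000); // 5 Tgas, hypothetical value

#[ext_contract(ext_b)]
trait ContractB {
    fn do_work(&mut self, input: String);
}

#[near_bindgen]
#[derive(BorshDeserialize, BorshSerialize, Default)]
pub struct ContractA {}

#[near_bindgen]
impl ContractA {
    pub fn call_b(&mut self, b_account: AccountId, input: String) -> Promise {
        // If a later protocol upgrade raises the fees paid inside `do_work`,
        // 5 Tgas may stop being enough, and an immutable ContractA cannot be
        // redeployed to attach more.
        ext_b::ext(b_account)
            .with_static_gas(GAS_FOR_B)
            .do_work(input)
    }
}
```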

The consequence of overestimating a fee is, however, less severe. It only causes NEAR TPS to be lower than it could actually be, since the protocol will force even full blocks to execute in significantly less than 1 second each, while still waiting a full second for each of them.

We have two major ways of addressing this constraint:

  • We try our damn hardest to make sure our fees are estimated correctly. Our current param estimator uses a CPU emulator to count the precise number of CPU instructions that each operation takes, and we have removed many sources of non-determinism, like hashmaps with random seeds, to make most of the fees reproducible within 5-15%;
  • We aggressively overestimate our fees all over the param estimator source code. In many places we consider the cost of operation A to be the cost of operation A plus another operation B. We also have a safety multiplier which artificially increases the cost of many operations by a factor of 3, until we gain full confidence in our fees;

Unfortunately, the former relies on an extreme level of diligence and code quality, which is extremely laborious, and the latter means that our TPS is several times lower than it could be.
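
To illustrate how the safety multiplier mentioned above translates into lost throughput, here is a minimal sketch with hypothetical names and numbers (the real logic lives in the param estimator in nearcore):

```rust
/// Artificial multiplier applied until we fully trust the measurement.
const SAFETY_MULTIPLIER: u64 = 3;

/// Turn a measured cost into a published gas parameter: bundle in the cost
/// of an unrelated operation B for extra margin, then multiply.
fn published_cost(measured_a_gas: u64, bundled_b_gas: u64) -> u64 {
    (measured_a_gas + bundled_b_gas) * SAFETY_MULTIPLIER
}

fn main() {
    // Hypothetical numbers: if operation A really costs 100 units of gas and
    // we fold in 20 units for B, we charge 360. A block then "fills up" at
    // roughly 3.6x below real hardware capacity, which is the TPS loss
    // described above.
    println!("published cost: {}", published_cost(100, 20));
}
```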

There is, however, a silver lining: since the protocol design requires us to be extremely precise with the fees, it pushes us into developing a very deep insight into the performance of operations, which in the long run helps us find and fix performance bottlenecks in a very data-driven way.

Suggested approach

There are several things to consider:

  • How we approach improvement of the param estimator;
  • How we approach the existing fee issues;
  • What ways of increasing fees are available to us;

Approaching improvement of param estimator

It is clear that we want to improve the experience of using the param estimator; however, this improvement should not be constrained to code cleanup only. I think we should revisit the approach to fee estimation entirely at a high level and answer the following questions:

  • Is it necessarily the case that for each new fee we need to write a new benchmark? This forces us to rely on the correctness of each individual benchmark, which increases the surface area for potential bugs. Is it possible to have some universal benchmark that allows us to test most of the fees, similarly to how perf can build a flame graph from a single execution?
  • Can we make estimation of all fees completely independent from each other? This would allow us to parallelize the estimation and would let developers test their changes by running the param estimator for one fee only;
  • What would the best developer API for the param estimator look like? Should we use a config file or command-line args to configure it? (One possible shape is sketched after this list.)
  • Can we run the param estimator on a regular basis and plot a graph of improvements/regressions in nearcore? The script that runs it could also serve as the go-to script for running the param estimator. Right now the instructions for running the param estimator are in the docs, which might get outdated;
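
For illustration, here is one possible shape of such a developer-facing interface, sketched as a small `clap`-based CLI; every flag, name, and default here is hypothetical and not the current param estimator interface:

```rust
use clap::Parser;

/// Hypothetical command-line interface for a single-fee, scriptable run.
#[derive(Parser, Debug)]
struct EstimatorArgs {
    /// Estimate only this fee (e.g. a storage-write cost); omit to estimate all fees.
    #[arg(long)]
    fee: Option<String>,

    /// Number of accounts to pre-generate in the test state.
    #[arg(long, default_value_t = 1_000_000)]
    accounts: u64,

    /// Machine-readable output, so a scheduled job can plot
    /// improvements/regressions in nearcore over time.
    #[arg(long, default_value = "estimates.json")]
    out: String,
}

fn main() {
    let args = EstimatorArgs::parse();
    // Placeholder: a real tool would run the estimation here.
    println!(
        "would estimate {:?} with {} accounts, writing results to {}",
        args.fee, args.accounts, args.out
    );
}
```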

Approaching existing fee issues

We have noticed that some fees are currently outdated (i.e. if we run the param estimator it produces different fees), and so we decided to spend some amount of time doing archeology, trying to understand the historical reasons behind these discrepancies. This is a very time-consuming approach, and I suggest an alternative: abandon the idea of trying to understand the current fees and instead focus on making the param estimator so simple and transparent that when it produces a certain fee today, we know for sure that it is correct, and can ignore the past history behind that fee.

Finding a way to increase the fees

Not being able to increase the fees is extremely constraining to development. However, there are several ways we might be able to increase the fees nonetheless.

  1. If we can find a way to know whether an increase of a fee will break any contracts, then we will likely find out that increasing some fees does not lead to breakages, either because other fees dominate a contract's cost, or because the average contract developer attaches a very large amount of gas to function calls. We could develop a tool that replays all past transactions using a new fee config and checks that none of the old successful transactions start failing (see the sketch after this list). This would give us some level of confidence. However, developing such a tool will take time;
  2. We can bundle fee changes together. E.g. if we are planning to greatly reduce a fee that affects all function calls, then we might increase another fee in the same protocol upgrade. We would, however, need to argue that the overall cost of function calls is not increasing;
  3. We can increase a fee while reducing its safety coefficient, to make the resulting fee look smaller;
  4. We can increase a fee while proportionally reducing all other fees and the total amount of gas that we allow in a block. Unfortunately, we cannot reduce the block gas limit by more than ~3x, since we allow attaching 300 Tgas to a single transaction, which still needs to fit within the block gas limit.
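
A rough sketch of the replay tool from point 1 (all types and functions below are hypothetical placeholders, not existing nearcore APIs): re-execute historically successful transactions under a candidate fee config and report which ones would start failing.

```rust
/// Stand-in for the runtime fee table we want to evaluate.
struct FeeConfig;

struct Transaction {
    id: u64,
}

enum Outcome {
    Success,
    Failure,
}

/// Placeholder: the real tool would run the transaction through the nearcore
/// runtime configured with `new_fees` and return its actual outcome.
fn execute(_tx: &Transaction, _new_fees: &FeeConfig) -> Outcome {
    Outcome::Success
}

/// Transactions that succeeded historically but fail (e.g. run out of
/// attached gas) under the candidate fee config.
fn find_breakages<'a>(
    history: &'a [(Transaction, Outcome)],
    new_fees: &FeeConfig,
) -> Vec<&'a Transaction> {
    let mut broken = Vec::new();
    for (tx, old_outcome) in history {
        if let Outcome::Success = old_outcome {
            if let Outcome::Failure = execute(tx, new_fees) {
                broken.push(tx);
            }
        }
    }
    broken
}

fn main() {
    let history = vec![(Transaction { id: 1 }, Outcome::Success)];
    let broken = find_breakages(&history, &FeeConfig);
    println!(
        "{} previously successful transaction(s) would now fail",
        broken.len()
    );
}
```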

In a follow-up private post, I will list all the fee issues that we currently have.


We also need to consider what a reasonable environment for running the parameter estimator is. Today we run it by first generating some accounts in the state, and the number of accounts we choose is somewhat arbitrary and may not reflect what happens on mainnet, for example. For certain storage-related host functions, it is quite important that we estimate them under a reasonable assumption about state size.
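
One way to frame this, sketched below with hypothetical names and placeholder numbers, is to make the estimation environment an explicit, reviewable parameter rather than something hardcoded in the estimator:

```rust
/// Hypothetical description of the state the estimator runs against.
struct EstimationEnv {
    /// Accounts pre-generated in the test state; should approximate
    /// mainnet rather than being an arbitrary constant.
    accounts: u64,
    /// Approximate total state size in bytes; storage-related host
    /// functions are the most sensitive to this, since trie depth
    /// grows with state size.
    state_size_bytes: u64,
}

fn main() {
    // Placeholder numbers only; real values should be derived from
    // recent mainnet measurements.
    let env = EstimationEnv {
        accounts: 1_000_000,
        state_size_bytes: 50 * 1024 * 1024 * 1024,
    };
    println!(
        "estimating against {} accounts, ~{} GiB of state",
        env.accounts,
        env.state_size_bytes / (1024 * 1024 * 1024)
    );
}
```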

We can also consider raising hardware requirements. Using better hardware could potentially reduce certain fees dramatically.
