Better Accounting for Blockchains
Towards 18-decimal-place accurate ETH and ERC20 token accounting
TrueBlocks is pleased to announce the release of our newest version: v0.43.0. This is our best version yet and includes much better ERC20 token accounting. In this article, we present a few thoughts on what’s changed and where we’re headed from here.
A quick thought on the word “Accounting”
TrueBlocks has always been about accounting…in a certain sense. It depends on what you mean by the word.
The word “accounting” is loaded with hundreds of years of baggage. Ever since Luca Pacioli invented double-entry bookkeeping in 1494, “accounting” has had a primarily financial sense.
This is still true, but with the advent of reproducible, immutable, worldwide-accessible data, we would like to expand that word to include EVM smart contract “state.” Our ultimate goal at TrueBlocks is perfect 18-decimal place “accounting” for smart contract “state,” not just the financial aspects of a smart contract.
What does this mean? We’re working that out…but you wouldn’t be too far off if the words “permissionlessly reproduce every state change off-chain” floated into your head.
This article, however, is about the financial sense of the word.
Where we were
Before the current version (v0.43.0), TrueBlocks could reconcile, with high accuracy, the ETH accounting for any Ethereum address. It still does that, but with this version, it brings the same high-accuracy accounting to ERC20 tokens.
What does “reconcile” mean? It means three things:
- Inter-transaction agreement (A): At each appearance of an address on the chain, an off-chain balance that was calculated at the previous appearance must equal the on-chain balance prior to the current transaction. In other words, we still have the same amount of tokens that we had the last time we looked. This ensures that there are no missing appearances.
- Intra-transaction agreement (B): The on-chain beginning balance of the current appearance (which is the same as the off-chain balance of the previous appearance; see A) plus all incoming amounts, minus all the outgoing amounts, is equal to the ending balance at the current appearance. This ensures we’ve noted all movements of value.
- Each of the two previous aspects of the reconciliation is true for all assets transferred during the appearance.
In effect, the history of the token holdings of any EVM-based address is an intertwined, time-ordered ledger. Visually, it looks like this:
Note: The above three rules apply to each individual asset (ETH and ERC20 tokens) separately for each appearance.
Note: In the above discussion, “on-chain balance” means that we query the blockchain using eth_getBalance for ETH and the ERC20’s balanceOf routine for tokens.
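As a rough illustration of what those two on-chain queries look like at the JSON-RPC level, here is a minimal Python sketch. It only builds the request payloads and does not send them. The helper names are hypothetical and are not part of TrueBlocks. The selector 0x70a08231 is the standard four-byte selector for the ERC20 function balanceOf(address). Querying at a historical block generally assumes a node that retains historical state.

```python
BALANCE_OF_SELECTOR = "0x70a08231"  # first 4 bytes of keccak256("balanceOf(address)")


def _strip_0x(value: str) -> str:
    """Drop a leading 0x so the hex digits can be padded."""
    return value[2:] if value.lower().startswith("0x") else value


def eth_balance_request(address: str, block: int, req_id: int = 1) -> dict:
    """JSON-RPC payload asking for an address's ETH balance at a block."""
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "eth_getBalance",
        "params": [address, hex(block)],
    }


def token_balance_request(token: str, holder: str, block: int, req_id: int = 1) -> dict:
    """JSON-RPC payload calling balanceOf(holder) on an ERC20 contract at a block."""
    # ABI encoding: 4-byte selector, then the address left-padded to 32 bytes.
    padded = _strip_0x(holder).lower().rjust(64, "0")
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "eth_call",
        "params": [{"to": token, "data": BALANCE_OF_SELECTOR + padded}, hex(block)],
    }


def parse_quantity(result_hex: str) -> int:
    """Both calls answer with a hex string; keep it as an integer of base units."""
    return int(result_hex, 16)
```

Keeping the parsed result as an integer (wei for ETH, base units for tokens) rather than a float is what makes an exact comparison between off-chain and on-chain balances possible.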
This is not brain science.
In fact, anyone who’s familiar with accounting (Luca?) will recognize the above as a Balance Sheet and a Profit and Loss statement.
Please don’t miss the fact that this is done every fifteen seconds, and it’s done on a laptop computer (that’s TrueBlocks’ real import: full decentralization).
So, where does the complication come from?
Three places: (1) you must make sure you have every appearance, (2) you must know where to look in the data for asset transfers, and (3) you must deal with quirks of the RPC and buggy smart contracts.
We address these three issues next.
Make sure you have every appearance
The first thing one must do if one wishes to make 18-decimal-place accurate accounting is to get an accurate list of every transaction in which a particular address appears.
Getting a list of appearances is basically impossible with the current EVM RPC implementations. This is exactly the reason we built the Unchained Index. (We won’t rehash what the Unchained Index is here, having written about it many times elsewhere).
Using chifra, our command line tool, we run chifra list <address>, which compiles a complete list of appearances for the address. Next, we run chifra export <address> --accounting, which produces the accounting output we’re looking for.
Know where to look for asset transfers
How do we know where to look for the value transfers, given the appearances?
An important thing to note is that a single transaction may transfer many different assets. For example, this transaction transfers two assets and generates eight events, three of which are Transfer events. Three of the other events are Mint, Burn, and Withdraw. And there’s some other crap as well. Which ones matter? It turns out only the Transfer events matter. The other events may add color, but they do not indicate a transfer of value.
Focusing just on ETH transfers, we can see why the accounting can get quite complicated. There are 12 different ways ETH can enter or leave an account (on the Ethereum mainnet; other chains may differ). Moreover, for each value transfer, the disposition of the money depends on which address is being accounted for. (For example, an EOA-to-EOA ETH transfer is “incoming” if we’re accounting for the recipient but “outgoing” if we’re accounting for the sender.)
These are the eight ways in which ETH can enter an account:
amountIn # a transaction's top-line value, if the recipient
internalIn # value transferred internally to smart contract call
selfDestructIn # value received from other address's self destruct
minerBaseRewardIn # value received from being the winning miner
minerTxFeeIn # transaction fees received for producing the block
minerNephewRewardIn # value received for producing a nephew
minerUncleRewardIn # value received for producing an uncle
prefundIn # value received during the genesis block
==================
totalIn # sum of the above
These are the four ways for ETH to leave an account:
amountOut # a transaction's top-line value, if the sender
internalOut # value transferred internally to smart contract call
selfDestructOut # value sent to another address when this address self-destructs
gasOut # gas spent during the transaction
=================
totalOut # sum of the above
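To make the two lists above concrete, here is a minimal Python sketch of a per-appearance record with the eight incoming and four outgoing fields, plus a helper showing how the same top-level transaction lands in different fields depending on which address is being accounted for. The class and function names are hypothetical illustrations, not TrueBlocks’ actual data model. All amounts are integers in wei.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EthFlows:
    """ETH movements for one appearance, in wei (integers, never floats)."""
    amount_in: int = 0
    internal_in: int = 0
    self_destruct_in: int = 0
    miner_base_reward_in: int = 0
    miner_tx_fee_in: int = 0
    miner_nephew_reward_in: int = 0
    miner_uncle_reward_in: int = 0
    prefund_in: int = 0
    amount_out: int = 0
    internal_out: int = 0
    self_destruct_out: int = 0
    gas_out: int = 0

    def total_in(self) -> int:
        return (self.amount_in + self.internal_in + self.self_destruct_in
                + self.miner_base_reward_in + self.miner_tx_fee_in
                + self.miner_nephew_reward_in + self.miner_uncle_reward_in
                + self.prefund_in)

    def total_out(self) -> int:
        return (self.amount_out + self.internal_out
                + self.self_destruct_out + self.gas_out)


def top_level_flows(accounted_for: str, sender: str, recipient: Optional[str],
                    value: int, gas_used: int, gas_price: int) -> EthFlows:
    """Fill in only the top-line value and gas for one transaction."""
    flows = EthFlows()
    me = accounted_for.lower()
    if recipient is not None and me == recipient.lower():
        flows.amount_in = value           # incoming from the recipient's view
    if me == sender.lower():
        flows.amount_out = value          # outgoing from the sender's view
        flows.gas_out = gas_used * gas_price  # only the sender pays gas
    return flows
```

For a plain EOA-to-EOA transfer, accounting for the sender yields a non-zero total_out (value plus gas) and a zero total_in, while accounting for the recipient yields the reverse with no gas charge.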
Once we have this information, the reconciliation is really kind of easy:
amountNet # totalIn - totalOut
prevBal # the balance calculated off-chain at the end of the previous appearance
begBal # the on-chain balance queried just before the current tx
endBal # the on-chain balance queried at the end of the current tx
expectedEndBal # prevBal + amountNet
# and an appearance is
reconciled # if prevBal == begBal (A), and
# expectedEndBal == endBal (B)
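The two checks can be sketched in a few lines of Python. This is an illustration of the arithmetic described above, not TrueBlocks’ implementation. The names are hypothetical, and every amount is assumed to be an integer number of wei.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reconciliation:
    amount_net: int
    expected_end_bal: int
    inter_ok: bool  # check (A): prevBal == begBal
    intra_ok: bool  # check (B): expectedEndBal == endBal

    @property
    def reconciled(self) -> bool:
        return self.inter_ok and self.intra_ok


def reconcile(prev_bal: int, beg_bal: int, end_bal: int,
              total_in: int, total_out: int) -> Reconciliation:
    """Apply checks (A) and (B) to a single asset at a single appearance."""
    amount_net = total_in - total_out
    expected_end_bal = prev_bal + amount_net
    return Reconciliation(
        amount_net=amount_net,
        expected_end_bal=expected_end_bal,
        inter_ok=(prev_bal == beg_bal),
        intra_ok=(expected_end_bal == end_bal),
    )
```

Integers matter here. A 64-bit float cannot distinguish 10**18 + 1 wei from 10**18 wei, so a float-based comparison would silently accept a one-wei discrepancy that integer arithmetic catches.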
ERC20 token accounting is identical to ETH but simpler. There are only two ways for a token to enter or leave an account. Again, it depends on whether you’re accounting for the sender or the recipient.
recognized # the transfer appears directly in a Transfer event
implied # the transfer does not appear in an event
recognized transfers, of course, are simple. The ERC20 Transfer event is clearly defined. The trouble is the implied transfers. These come in various flavors, as detailed below, but generally speaking, what they do not do is notate a transfer of value in a standardized way.
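For the recognized case, a minimal Python sketch of reading one Transfer log from the point of view of the address being accounted for might look like this. The function names are hypothetical. The log is assumed to be a dict with the "topics" and "data" fields found in a standard transaction receipt, and the topic constant is the keccak256 hash of Transfer(address,address,uint256).

```python
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is a 32-byte topic; the address is its last 20 bytes."""
    return "0x" + topic[-40:].lower()


def recognized_delta(log: dict, accounted_for: str) -> int:
    """Signed token amount (base units) this log moves for accounted_for."""
    topics = log["topics"]
    # An ERC20 Transfer has exactly three topics. ERC721 reuses the same
    # signature with a fourth (indexed tokenId), so those logs are skipped.
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return 0
    sender = topic_to_address(topics[1])
    recipient = topic_to_address(topics[2])
    data = log["data"]
    amount = int(data, 16) if data not in ("", "0x") else 0
    me = accounted_for.lower()
    delta = 0
    if me == recipient:
        delta += amount  # incoming when accounting for the recipient
    if me == sender:
        delta -= amount  # outgoing when accounting for the sender
    return delta
```

Summing recognized_delta over every log emitted by one token contract in a transaction gives the net recognized movement for that token. Anything the on-chain balance shows beyond that sum falls into the implied category.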
Deal with the shortcomings of the RPC
In the table shown below, you will see that our reconciliations fail in 0.1% of cases (that’s 1 in one thousand, in case you were wondering). Why do they fail? Because of implied transfers.
The reasons why this happens are too numerous to mention here, but they fall into three basic types (I’m sure there are more).
- The first type of failed reconciliation is “missing appearances.” The Unchained Index solves this problem as well as any solution we know of. We’ve written about it here and here. Think about this problem for a second. What do you think the chances are of reconciliations balancing if there are missing transactions? Pretty low.
- The second type of failed reconciliation we’ve seen is non-compliant ERC-20 transfers. Recently, certain people have been taking advantage of the fact that a smart contract may generate Transfer events even if no actual value has been transferred. Here’s an example: https://etherscan.io/tx/0x506e7978ba52886681b75797e4403579ba703b5f9df576a34602ada1709085fb. Take a look at this transaction. Notice that there are 1,000 Transfer events but no state change. This is spam, intended to “phish” people into revealing their private keys. Our solution to this problem is detailed here. (Note this is not yet implemented.)
- The third type of failed reconciliation happens due to incorrect accounting by a smart contract. For example, this transaction (https://etherscan.io/tx/0x634799410165000edaf1b1e8e5e8055b39cdd534d3c3dc9738865d39adb5d888) does not balance if one considers the Transfer events only (as one would be able to do if the smart contract was well written). The interest, which, in fact, is a transfer of value, does not generate a Transfer event. Nor does the existing Transfer event reflect this value transfer. (I’m not saying it should, just that it doesn’t.) Without Transfer events detailing each value transfer, there is no way to account for a token without resorting to the “per-smart-contract” context, which is not scalable or supportable in the long run. Our solution to this problem is detailed here. (Again, note this is also not yet implemented.)
- The fourth difficulty one encounters when trying to do off-chain accounting is slow speed. The RPC is slow, especially when one queries for account balances. We’ve also solved this problem by providing a local-first, client-side cache. You can read about this on our website.
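One way to picture how the second and third failure types differ is to compare what the Transfer events claim with what the on-chain balance actually did. The Python sketch below is a hypothetical triage and is explicitly not the solution TrueBlocks refers to above (which the article says is not yet implemented). The labels and function names are invented for illustration, and all amounts are integers in token base units.

```python
def implied_amount(beg_bal: int, end_bal: int, recognized_net: int) -> int:
    """Balance change that no Transfer event accounts for."""
    return (end_bal - beg_bal) - recognized_net


def triage(beg_bal: int, end_bal: int, recognized_net: int) -> str:
    """Roughly classify one token at one appearance (hypothetical labels)."""
    actual_change = end_bal - beg_bal
    if actual_change == recognized_net:
        return "reconciled"
    if actual_change == 0:
        # Events claim a movement, but the balance did not change:
        # the pattern seen in the spam example above.
        return "events-without-state-change"
    # The balance moved by an amount the events do not explain,
    # as with the interest example above.
    return "implied-transfer"
```

This only looks inside a single appearance. A missing appearance, the first failure type, would instead surface as a mismatch between the previously calculated balance and the on-chain beginning balance, which is check (A).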
Taking the above into consideration, we are now able to account for nearly 99.9% of all transactions we’ve tested.
Where we’re going
With this release, we’ve dug deeper into ERC20 token accounting. That’s a good thing, but it comes with some downsides.
The main downside is that the speed of our processing has nominally decreased (although, see the table below). Don’t worry. We have it covered.
One of the reasons we made these changes was to prepare for a port of our C++ code to GoLang. We made major progress on this front behind the scenes. The code from the previous version had the two aspects of the reconciliation calculation tightly coupled. Above, we call these two aspects (A) Inter-transaction agreement and (B) Intra-transaction agreement.
With this release, we’ve separated the two processes. By tightly coupling the two functions, we had inadvertently made it impossible to do either process in parallel. This is no longer true.
In the GoLang code, which we’ve committed to finishing by March, we will be able to do the balance queries of part (A) and the data acquisition portion of part (B) concurrently. We hope this will regain the performance losses (and then some) we’ve seen in this release.
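To show why separating the two processes matters, here is a small Python sketch (not the planned GoLang code) in which the balance queries and the transfer-data queries are independent callables and can therefore be issued at the same time. The function name, the two fetcher parameters, and the shapes of their return values are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def reconcile_concurrently(appearances, fetch_balances, fetch_flows, workers=8):
    """fetch_balances(app) -> (beg_bal, end_bal); fetch_flows(app) -> (total_in, total_out)."""
    apps = list(appearances)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map submits every task right away, so the two kinds of
        # query overlap instead of running one after the other.
        balance_results = pool.map(fetch_balances, apps)
        flow_results = pool.map(fetch_flows, apps)
        balances = list(balance_results)
        flows = list(flow_results)

    # The checks themselves are cheap and run in order, because each
    # appearance depends on the balance calculated at the previous one.
    results = []
    prev_bal = None
    for app, (beg_bal, end_bal), (total_in, total_out) in zip(apps, balances, flows):
        inter_ok = prev_bal is None or prev_bal == beg_bal          # check (A)
        start = beg_bal if prev_bal is None else prev_bal
        expected_end_bal = start + total_in - total_out
        results.append((app, inter_ok and expected_end_bal == end_bal))  # check (B)
        prev_bal = expected_end_bal
    return results
```

The slow part (network queries) is parallel, while the ordered ledger walk stays sequential. In this sketch the first appearance has no previous calculated balance, so it seeds the ledger from the on-chain beginning balance.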
In the table below, we present a comparison of the current version of our code with the previous version. Notice that when one looks at the actual “useful” reconciliations, the processing is actually faster. This is due to a lot of wasted querying in the previous version.
Upshot: The new version is better and poised to get even better when we port to GoLang.
Conclusion
We’re enamored with the idea of perfect accounting, not only for the financial aspects of smart contracts but for all “state changes” for any address, “smart” or not.
Why are we doing this? Why do we try to create perfect, 18-decimal-place accurate accounting for any address? Well, I guess I’d say that, even though it may seem like a high hill to climb, we do it because it’s there.
Support Our Work
Thank you to our biggest supporters, Meriam Zandi and Dawid Szlachta, for your help with this article. Also, a huge thank you to our users who inspire us.
TrueBlocks is self-funded from our own personal funds and grants from our supporters, including The Ethereum Foundation (2018, 2022), Consensys (2019), Moloch DAO (2021), Filecoin/IPFS (2021), and, of course, our GitCoin donors.
If you like this article and wish to support our work, please donate to our GitCoin grant https://gitcoin.co/grants/184/trueblocks. Even small amounts have a big impact.
If you’d rather, feel free to send ETH or any other token to us directly at trueblocks.eth or 0xf503017d7baf7fbc0fff7492b751025c6a78179b.