Compacting ERC20 Logs

Published in Coinmonks

7 min read, Dec 14, 2021

I have zero time, so I herein produce a very quick take on something I’ve been thinking about for a long time. This is more a thought experiment than anything else.

Fact: Very, very, very many of the transactions on the Ethereum blockchain are simple ERC20 token transfers.

So what?

What If ERC20 Tokens were Native to the Chain?

Every token transfer that appears on the chain uses the standard ERC20 transfer function (or transferFrom, which, as we see below, can be safely ignored).

The transfer function is defined in the ERC20 standard as:

function transfer(address _to, uint256 _value)

which translates into a four-byte signature of 0xa9059cbb.
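To make the layout of the call data concrete, here is a minimal Python sketch that splits a transfer's input field into its parts. The helper name decode_transfer_input and the sample recipient address are made up for illustration, and the sketch assumes the plain case where the input is exactly the selector followed by two 32-byte words.

```python
TRANSFER_SELECTOR = "a9059cbb"  # four-byte signature of transfer(address,uint256)

def decode_transfer_input(input_hex):
    """Split ERC20 transfer call data into (recipient, value), or return None."""
    data = input_hex[2:] if input_hex.startswith("0x") else input_hex
    # 4-byte selector + two 32-byte words = 8 + 64 + 64 hex characters
    if len(data) != 136 or data[:8].lower() != TRANSFER_SELECTOR:
        return None
    to_word, value_word = data[8:72], data[72:136]
    recipient = "0x" + to_word[24:]  # the address is the low 20 bytes of its word
    value = int(value_word, 16)      # the amount, in the token's smallest unit
    return recipient, value

# Hypothetical call data: send 1000 units to a made-up address
example = (
    "0xa9059cbb"
    + "0" * 24 + "11" * 20
    + format(1000, "064x")
)
print(decode_transfer_input(example))
```

Anything that is not a plain transfer call (including a simple ETH transfer with input 0x) returns None.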

A typical ERC20 transfer, as returned by the RPC, looks something like this:

And, the log generated by a transfer looks like this:

If you look closely, there’s a huge amount of redundant information.

Coloring Stuff Makes it More Obvious

Looking at that same gobbledygook with the bytes colored to make it clearer gives:

We see…

for the transaction and…

for the log. Notice anything?

Every piece of the token transfer’s receipt is already present in the transaction data except status.

Here’s an Idea

What if we removed the redundant information from the events that the standard requires and turned the log for transfers into a special case native to the chain?

The log for a transfer could look something like this, for example:

In other words, all we really need to know about an ERC20 token transfer is whether or not it succeeded. All the other information is already included in the transaction.
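As a rough sketch of that idea, the following Python rebuilds a standard Transfer log from nothing but the transaction and its success flag. The function name rebuild_transfer_log and the sample addresses are hypothetical. It assumes the sender called the token contract directly and that the token emits exactly the standard Transfer event (no fees, hooks, or extra events), which will not hold for every token.

```python
# keccak256("Transfer(address,address,uint256)"): topic0 of the standard ERC20 Transfer event
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
TRANSFER_SELECTOR = "a9059cbb"

def strip_0x(value):
    return value[2:] if value.startswith("0x") else value

def rebuild_transfer_log(tx, succeeded):
    """Reconstruct the Transfer log from the transaction plus a success flag."""
    if not succeeded:
        return None  # a failed transfer emits no log
    data = strip_0x(tx["input"]).lower()
    if len(data) < 136 or data[:8] != TRANSFER_SELECTOR:
        return None  # not a plain transfer(address,uint256) call
    sender_topic = "0x" + strip_0x(tx["from"]).lower().rjust(64, "0")
    return {
        "address": tx["to"],              # the token contract
        "topics": [
            TRANSFER_TOPIC,
            sender_topic,                 # from: the transaction's sender
            "0x" + data[8:72],            # to: first argument, already 32 bytes
        ],
        "data": "0x" + data[72:136],      # value: second argument
    }

# Hypothetical transaction with made-up addresses
tx = {
    "from": "0x" + "aa" * 20,
    "to": "0x" + "cc" * 20,
    "input": "0xa9059cbb" + ("bb" * 20).rjust(64, "0") + format(1000, "064x"),
}
print(rebuild_transfer_log(tx, True))
```

The only input that does not come from the transaction itself is the succeeded flag, which is the one piece of information the compact log would need to carry.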

Obvious Objections

It’s too hard to change now. All the dApps would break.

This is true. A lot of things would break. But I’m suggesting we think forward to the next 50 years, not the last five.

It’s too hard, all the client code would have to be modified.

This is also true. See my previous response and see below where I make a quick back-of-the-envelope calc on how much space this might save.

It slows down queries.

This is also true. It would slow down queries because the node software is highly optimized for delivering log-related queries.

Currently, a dApp sends the transaction and then queries for the log, which is probably why the redundant information is included. Presumably, though, the dApp was the source of the transaction, so it already has the information needed to reconstruct the full log.

In the case of an after-the-fact, off-chain scrape, the transactional information is most likely already available as well, because many off-chain scrapes scan the blocks and the transactions before scanning particular logs.

Benefits

The size of data stored by a node is decreased.

Even if this doesn’t make it into the protocol level by becoming a special case native primitive, the observation leads us to conclude that we could greatly reduce the size of the data on the machine’s hard drive. We’ve calculated an estimate below using very rough back-of-the-envelope calculations.

The number of total bytes transferred over the wire is cut in half.

The amount of space “on the wire” is infinite, isn’t it? What does high traffic even mean?

It means there are too many bytes trying to jam their way onto the wire. So every little bit counts, and this could lower the number of bytes going over the wire significantly.

Smaller data means more “regular people” can run nodes.

The whole goal of everything I do is to make running a local node easier. One of the biggest complaints about running a node is how much disk space it takes up. This would lower that amount and thereby allow more people to run more nodes.

How Much Space Might This Save?

We ran the following commands from the TrueBlocks command-line tool chifra:

chifra blocks 1756978-1757000 | grep input | cut -c1-26

This produced a file (file.txt) with data from around 24,000 transactions, extracting only the input fields. This represents about 200 blocks randomly sampled across blocks between 3,000,000 and 13,000,000.

That data looks like this:

"input": "0x",
"input": "0x18cbafe5"
"input": "0xc9807539"
"input": "0xab834bab"... plus 24,375 more rows...

Not amazing, but fairly interesting.

We ran the following command against that data file and found 1,578 different four-byte codes in the 24,379 records, with 10 of them appearing in more than 100 transactions.

cat file.txt | sort | uniq -c | sort -n
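For readers who would rather not use the shell, the same tally can be sketched in Python. This is only an illustration of what the pipeline does: the function name count_four_bytes and the four sample lines are made up, and in practice the lines would be read from file.txt.

```python
from collections import Counter

def count_four_bytes(lines):
    """Count identical lines and return (line, count) pairs, smallest count first."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return sorted(counts.items(), key=lambda item: item[1])

# Made-up sample in the same shape as file.txt
sample = [
    '"input": "0x",',
    '"input": "0xa9059cbb"',
    '"input": "0x18cbafe5"',
    '"input": "0xa9059cbb"',
]

for line, count in count_four_bytes(sample):
    print(f"{count:6d} {line}")
```

As with sort -n, the most frequent four-byte codes end up at the bottom of the output.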

Using the Ethereum Four Byte Directory, we find this information for the 10 most frequently appearing functions:

Count     Four-Byte     Signature
------------------------------------------------------------------
  105     0x23b872dd    transferFrom(address,address,uint256)
  111     0x202ee0ed    submit(uint256,int256)
  117     0x6ea056a9    sweep(address,uint256)
  127     0xef343588    trade(uint256[8],address[4],...) 
  191     0x38ed1739    swapExactTokensForTokens(uint256,uint256...)
  195     0x18cbafe5    swapExactTokensForETH(uint256,uint256,...)
  352     0x7ff36ab5    swapExactETHForTokens(uint256,address[],...)
  486     0x095ea7b3    approve(address,uint256)
 7485     0xa9059cbb    transfer(address,uint256)
10343     0x            (straight up ETH transfer)

Or, stated as percentages of the 19,512 transactions in these ten categories:

Percent   Four-Byte     Signature
------------------------------------------------------------------
  0.54%   0x23b872dd    transferFrom(address,address,uint256)
  0.57%   0x202ee0ed    submit(uint256,int256)
  0.60%   0x6ea056a9    sweep(address,uint256)
  0.65%   0xef343588    trade(uint256[8],address[4],...) 
  0.98%   0x38ed1739    swapExactTokensForTokens(uint256,uint256...)
  1.00%   0x18cbafe5    swapExactTokensForETH(uint256,uint256,...)
  1.80%   0x7ff36ab5    swapExactETHForTokens(uint256,address[],...)
  2.49%   0x095ea7b3    approve(address,uint256)
 38.36%   0xa9059cbb    transfer(address,uint256)
 53.01%   0x            (straight up ETH transfer)

So 91% of the transactions in these ten most frequent categories (17,828 of 19,512), or about 73% of all 24,379 sampled records, were either a straight-up transfer of ETH or an ERC20 token transfer.
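The arithmetic behind these shares can be checked with a few lines of Python using the counts from the table above. The variable names are made up for this sketch. Note that the percentages in the table work out relative to the ten listed categories (19,512 transactions), not to all 24,379 sampled records, so both denominators are shown.

```python
# Counts copied from the table of the ten most frequent four-byte codes
counts = {
    "0x23b872dd": 105,    # transferFrom
    "0x202ee0ed": 111,    # submit
    "0x6ea056a9": 117,    # sweep
    "0xef343588": 127,    # trade
    "0x38ed1739": 191,    # swapExactTokensForTokens
    "0x18cbafe5": 195,    # swapExactTokensForETH
    "0x7ff36ab5": 352,    # swapExactETHForTokens
    "0x095ea7b3": 486,    # approve
    "0xa9059cbb": 7485,   # transfer
    "0x": 10343,          # straight-up ETH transfer
}

top_ten_total = sum(counts.values())
all_sampled = 24379  # total records in file.txt
simple_transfers = counts["0xa9059cbb"] + counts["0x"]

share_of_top_ten = simple_transfers / top_ten_total
share_of_all = simple_transfers / all_sampled

print(f"top-ten total:    {top_ten_total:,}")
print(f"share of top ten: {share_of_top_ten:.2%}")
print(f"share of sample:  {share_of_all:.2%}")
```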

Hand Waving

We ran the following command against the same set of blocks:

chifra blocks --raw 3000000-13000000:50000 | jq | grep size

and summed the result to find that the 200 blocks we sampled take up 5,132,592 bytes (5 MB) on the hard drive, or about 25,663 bytes per block. Extending that out across the 13,800,429 blocks at the time of this writing, we get an estimated size of (5,132,592 / 200) * 13,800,429 = 354,159,857,410 bytes, or about 350 GB, for the block data alone.
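The extrapolation can be reproduced in Python from the figures quoted above. The one step worth making explicit is dividing by the 200 sampled blocks to get a per-block average before scaling up to the whole chain. The variable names are made up for this sketch.

```python
sampled_bytes = 5_132_592    # summed size of the sampled blocks
sampled_blocks = 200         # number of blocks in the sample
chain_blocks = 13_800_429    # blocks on chain at the time of writing

avg_block_bytes = sampled_bytes / sampled_blocks
estimated_bytes = avg_block_bytes * chain_blocks

print(f"average block size: {avg_block_bytes:,.2f} bytes")
print(f"estimated block data: {estimated_bytes / 1e9:,.1f} GB")
```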

A very rough guess is that there is as much log data (which isn’t stored as part of the blocks) as there is block data, and if we add 350 GB to 350 GB we get 700 GB, which is within an order of magnitude of the known chain size (2 TB).

So, let’s use 350 GB as the size of just the logs.

Extending 350 GB * .3836 * .1 (because we can decrease the size of a transfer log to 1/10 its current size) we get 13.5 GB. Is that a lot? Not really….

If we replaced all the transfer logs with a simple boolean showing success or failure, and picked up the remainder of the data from the transaction that spawned the transfer, we could decrease the size of the data on the hard drive by about 15 GB, or 1% of the total (assuming 1.5 TB in total).

Conclusion: Not worth the effort!

→[Correction — 12/30/2021]

I made a mistake in the above calc. It should have used a value of .9, not .1, since we are decreasing the size to 1/10 its original size. So 350 GB * .3836 * .9 would save 120.834 GB. That’s actually quite a lot, so different conclusion. Might be worth it.

→[Correction — 12/30/2021]
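To keep the two numbers straight, here is the same back-of-the-envelope calculation as a short Python sketch. It uses the article's own rough inputs (350 GB of logs, 38.36% of them transfers, each shrinking to 1/10 of its size); the variable names are made up.

```python
log_gb = 350                 # rough guess at total log data
transfer_share = 0.3836      # share attributed to ERC20 transfers
remaining_fraction = 0.1     # a compact transfer log keeps 1/10 of its size

transfer_logs_gb = log_gb * transfer_share
remaining_gb = transfer_logs_gb * remaining_fraction      # what is still stored
saved_gb = transfer_logs_gb * (1 - remaining_fraction)    # what is actually saved

print(f"transfer logs today: {transfer_logs_gb:.3f} GB")
print(f"still stored after:  {remaining_gb:.3f} GB")
print(f"saved:               {saved_gb:.3f} GB")
```

Multiplying by .1 gives the space the compacted logs would still occupy; the saving is the other nine tenths.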

Support Our Work

TrueBlocks is totally self-funded from our own personal funds and a few grants such as The Ethereum Foundation (2018), Consensys (2019), Moloch DAO (2021), and most recently Filecoin/IPFS (2021).

If you like this article or you simply wish to support our work go to our GitCoin grant https://gitcoin.co/grants/184/trueblocks. Donate to the next matching round. We get the added benefit of a larger matching grant. Even small amounts have a big impact.

If you’d rather, feel free to send any token to our public Ethereum address at trueblocks.eth or 0xf503017d7baf7fbc0fff7492b751025c6a78179b.

Written by Thomas Jay Rush
