Measuring the influence of on-chain metrics on Ethereum price

Mykhailo Kushnir
Published in Level Up Coding
5 min read · Mar 30, 2022

Machine learning approach to the estimation of the price factors of the second-largest crypto coin.

Ethereum price prediction

Economics has an abstraction called Homo economicus: a figure that stands for a perfectly rational participant in the financial world. These efficient, nonexistent creatures make their decisions relying purely on objective factors. While we, as Homo sapiens, are not exactly capable of maintaining such discipline for long, no one can deny us the opportunity to try.

The obstacle we have to deal with is that it’s usually hard to estimate the price of the products or services we’d like to buy or offer. Analyzing development costs is usually not enough, even if the analysis includes such complex metrics as revenue expectations or taxes, because there will always be a footprint of human nature: speculation. Cryptocurrency valuation has this footprint all over it.

Still, popular coins like Bitcoin or Ethereum expose a lot of objective data to the outside world. Network activity, the number of transactions, the difficulty of confirming transactions: all of these are available and can be used to form an unbiased opinion about the value of these instruments. At least in theory.

In practice, I will apply simple regression models to the available on-chain metrics, and then discuss how well this exercise worked and what could be done to improve it.

Data retrieval

For this part, I relied heavily on the cryptoquant.com API, where I recently started a PRO subscription. You’ll likely see a few more posts from me in the coming days based on data from this platform.

On-chain metrics are typically used to describe the technical state of networks. Here are some that I’m using for my financial analysis:

  • Token supply
  • Active addresses
  • Number of transactions
  • ETH 2.0 staked

The resulting dataframe would look like this:

Data retrieved for experiment
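The retrieval code itself is not shown here, so the following is only a minimal sketch of the assembly step. It assumes each metric has already been downloaded into its own date-indexed pandas Series; the helper name `build_metrics_frame`, the column names, and the toy values are all hypothetical and do not reflect the CryptoQuant response format.

```python
import pandas as pd


def build_metrics_frame(series_by_name):
    """Align several daily metric series on their dates, one column per metric."""
    # Outer-join all series on their date index; dict keys become column names
    frame = pd.concat(series_by_name, axis=1)
    frame.index = pd.to_datetime(frame.index)
    frame = frame.sort_index()
    # Keep only the days on which every metric is available
    return frame.dropna(how="any")


# Placeholder values for illustration only, not real on-chain data
days = pd.date_range("2022-01-01", periods=3, freq="D")
toy = {
    "supply": pd.Series([1.0, 2.0, 3.0], index=days),
    "active_addresses": pd.Series([10.0, 20.0, 30.0], index=days),
    "transactions": pd.Series([100.0, 200.0], index=days[:2]),
}
df_toy = build_metrics_frame(toy)
```

Dropping incomplete days is one possible choice; forward-filling slowly changing metrics would be another.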

I also use the yfinance package to get pricing data for the same period:

import yfinance

# Download ETH-USD prices for the period covered by the on-chain data
eth = yfinance.download('ETH-USD', df.index.min(), df.index.max())
df = df.join(eth)

Modelling with FB Prophet

Facebook Prophet encapsulates the basic tools needed for time series modelling. It’s very convenient for quick prototyping and works well with pandas as a data source. Under the hood, the framework fits a function made up of trend, seasonality, and holiday-effect components plus an error term. These components are also available as a byproduct of the forecast, as you’ll see soon.
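For reference, the decomposition described above is usually written as an additive model. The symbols below follow the common Prophet notation and are not taken from this article: g(t) is the trend, s(t) the seasonality, h(t) the holiday effects, and ε_t the error term.

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

When extra regressors are added (as the on-chain metrics are here), they enter, as far as I understand the default additive mode, as one more linear term:

```latex
y(t) = g(t) + s(t) + h(t) + \beta^{\top} x(t) + \varepsilon_t
```

Here x(t) is the vector of regressor values on day t and β the fitted coefficients.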

FB Prophet is also pretty easy to use in Python:
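The original snippet is not preserved in this text, so what follows is a minimal sketch rather than the exact code behind the results. It assumes the joined dataframe is date-indexed, that the price column is called `Close`, and that the on-chain metrics are used as extra regressors; the helper names `to_prophet_frame` and `fit_and_predict` are hypothetical.

```python
import pandas as pd


def to_prophet_frame(df, target, regressors):
    """Reshape a date-indexed frame into the layout Prophet expects:
    a `ds` date column, a `y` target column, plus the regressor columns."""
    out = df[[target] + list(regressors)].rename(columns={target: "y"})
    out = out.rename_axis("ds").reset_index()
    out["ds"] = pd.to_datetime(out["ds"])
    return out.dropna().reset_index(drop=True)


def fit_and_predict(train, test, regressors):
    """Fit Prophet with extra regressors on `train` and predict over `test`."""
    # Imported lazily so the helper above works without the package installed
    from prophet import Prophet

    model = Prophet()
    for name in regressors:
        model.add_regressor(name)
    model.fit(train)
    # Regressor values must be supplied for every date being predicted
    forecast = model.predict(test[["ds"] + list(regressors)])
    return model, forecast


# Hypothetical usage, holding out the last 30 days:
# cols = ["supply", "active_addresses", "transactions"]
# frame = to_prophet_frame(df, target="Close", regressors=cols)
# train, test = frame.iloc[:-30], frame.iloc[-30:]
# model, forecast = fit_and_predict(train, test, cols)
# forecast[["ds", "yhat", "trend"]]  # prediction plus the trend component
```

The `trend` column of the forecast is the component discussed in the Results section; `model.plot_components(forecast)` draws all components at once.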

Results

…are not promising at the beginning

The predicted price (orange) is much higher than the actual one (blue)

This model would hardly help you earn an ETH, but before you rush to conclusions, let us dig a bit deeper. First of all, let us look into one of the components I’ve mentioned above:

Trend

The model estimates the trend as rising, which is reasonable given previous performance. After all, Ethereum has been growing by three-digit percentages every year. In reality, though, at the beginning of 2022 technical progress was offset by market context. The drop in Bitcoin at the end of 2021, tightening monetary policy around the world, and post-COVID trading issues all influenced the price.

Integration of market metrics

In order to represent the economic context, I’ll introduce another set of metrics related to tokenomics and exchange data:

  • Exchange reserve in ETH and USD: the amount of ETH held in wallets belonging to crypto exchanges. There is research indicating that a rise in these numbers is often a signal of an upcoming sell-off, while a decline means that everyone’s HODL’ing.
  • Estimated leverage ratio: shows the level of leverage in derivatives trading. A good indicator of upcoming volatility in the market.
  • Open interest: the number of open positions in derivatives. It has a very high correlation with price, but it’s hard to use in actual prediction because exchanges typically publish it with some time lag. Much more useful for educational purposes, though.
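To make these definitions a bit more concrete, here is a hedged sketch of how such features could be derived with pandas. It assumes that the leverage ratio is open interest divided by exchange reserve (my reading of the metric, not something stated above), and the column names `open_interest_usd` and `reserve_usd` are hypothetical. Shifting open interest by one day is one simple way to respect the publication lag.

```python
import pandas as pd


def add_market_features(df, lag_days=1):
    """Add a leverage proxy, lagged open interest, and reserve flow columns.

    Assumes hypothetical columns `open_interest_usd` and `reserve_usd`.
    """
    out = df.copy()
    # Leverage proxy: derivatives exposure relative to coins held on exchanges
    out["leverage_ratio"] = out["open_interest_usd"] / out["reserve_usd"]
    # Use only values that would already have been published on a given day
    out["open_interest_lagged"] = out["open_interest_usd"].shift(lag_days)
    # Day-over-day change in reserves: positive means inflow to exchanges
    out["reserve_change"] = out["reserve_usd"].diff()
    return out
```

The first `lag_days` rows contain missing values after the shift and should be dropped before fitting.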

Updated results

With market metrics, predictions are much closer to the expected value

Much closer to perfection! Mean absolute error is now around $200, making it a decent estimator of the current price structure. So what has changed?

Surprisingly, open interest is not playing that “prophet” role here. It is possible to get the same 8% mean absolute percentage error without this feature. After trying several combinations of columns, I found that “Reserve” has good enough predictive power when it’s combined with on-chain metrics, but is not sufficient on its own.
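The two error metrics and the "try several column combinations" exercise can be sketched as follows. To keep the example self-contained, it uses a plain least-squares linear fit from NumPy as a stand-in for Prophet, so it illustrates the comparison procedure, not the model behind the numbers above. All function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def mae(actual, predicted):
    """Mean absolute error, in the units of the target (here, dollars)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


def ols_predict(x_train, y_train, x_test):
    """Least-squares linear fit with an intercept (a stand-in, not Prophet)."""
    a_train = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(a_train, y_train, rcond=None)
    a_test = np.column_stack([np.ones(len(x_test)), x_test])
    return a_test @ coef


def compare_subsets(features, y, n_test):
    """Score every non-empty feature subset on the last `n_test` rows.

    `features` maps a column name to a 1-D array aligned with `y`.
    Returns a dict {subset_tuple: MAPE in percent}.
    """
    y = np.asarray(y, dtype=float)
    names = list(features)
    scores = {}
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            x = np.column_stack(
                [np.asarray(features[n], dtype=float) for n in subset]
            )
            # Time-ordered split: fit on the past, score on the most recent rows
            pred = ols_predict(x[:-n_test], y[:-n_test], x[-n_test:])
            scores[subset] = mape(y[-n_test:], pred)
    return scores
```

As a rough sanity check on the reported numbers: an MAE of about $200 together with a MAPE of about 8% is consistent with an average price somewhere around $2,500, since 200 / 0.08 = 2,500. This is only approximate, because MAPE averages per-day ratios.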

Hypothesis & Summary

While it seems to be possible to fit a price function for ETH using only objective metrics, the fit is still not perfect. A mean absolute percentage error of 8% means that you’ll miss by hundreds of dollars. There could be several explanations for why this model doesn’t fit the actual price time series more closely:

  1. Basic. The model is not complex enough. For example, it doesn’t capture intra-series patterns the way deep learning models would.
  2. Incomplete. There are other available metrics. For instance, the whale ratio is often used to describe the influence of a small portion of holders on coin price. The momentum factor can also be incorporated through various moving averages of price (EMA, SMA, etc.).
  3. Speculation. At the beginning of this article, I mentioned that every product or service on the market has some x-factor in its price derived from people’s willingness to buy it. People can over- or underestimate the cost, making imperfect economic decisions.
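The momentum idea from the second hypothesis could be prototyped along these lines. This is a sketch under my own assumptions: the 7-day window is arbitrary, the function name is hypothetical, and the averages are computed on the previous days’ prices only, so that the current price (the target) does not leak into its own features.

```python
import pandas as pd


def add_momentum_features(price, window=7):
    """Return SMA and EMA of past prices as candidate momentum features."""
    # Shift by one day so each row only sees prices known before that day
    past = price.shift(1)
    return pd.DataFrame({
        "sma": past.rolling(window).mean(),
        "ema": past.ewm(span=window, adjust=False).mean(),
    })
```

The resulting columns could then be joined to the dataframe and passed to the model as two more regressors.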

I’d like to believe that all three hypotheses are involved. That is what makes price prediction in crypto so interesting: a task that can be solved with a regression on just a few features is not something you should write about :)
