Crypto Backtester 🧪🔍⚖️📈

In this guide, a cryptocurrency backtester is built to compare different rebalancing and trading strategies. It covers working with crypto price APIs to retrieve data, manipulating that data with `pandas` and `numpy` into a form suited to time series computations, and implementing different trading strategies on top of it. Interactive time series plotting is covered using `plotly`. The guide serves as a reference for creating your own crypto backtester, as well as for general best practices when dealing with time series data in Python.

Context and Background

Portfolio rebalancing refers to the practice of keeping the original proportions of your investments intact regardless of individual asset price fluctuations. There are many reasons why rebalancing your portfolio is a good idea, most of which revolve around diversifying risk by holding assets that have low correlation with each other.

The way this works in practice is simple. Let's suppose you decide to invest $100 in BTC, ETH and NEO. Your portfolio begins with three assets in equal proportions. A few days later, NEO takes off and outperforms the other assets in your portfolio. Now, your basket is disproportionately weighted in NEO's favor.

Rebalancing your portfolio, in this case, would entail selling some NEO and redistributing the profits across the rest of your portfolio to bring back the original proportions. Portfolio rebalancing spreads the wealth from your stronger assets, selling them at higher prices, and rebuys other assets in your portfolio at lower prices.
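
The arithmetic above can be sketched with a small worked example. It assumes, purely for illustration, that the portfolio holds $100 of each asset and that NEO then doubles in price while BTC and ETH stay flat; these numbers are hypothetical and not taken from real price data.

```python
# Hypothetical position values in USD after NEO doubles; BTC and ETH are unchanged.
positions = {"BTC": 100.0, "ETH": 100.0, "NEO": 200.0}
target_weights = {"BTC": 1 / 3, "ETH": 1 / 3, "NEO": 1 / 3}

total = sum(positions.values())  # total portfolio value in USD

# Trade needed per asset: positive means buy, negative means sell.
trades = {asset: target_weights[asset] * total - value
          for asset, value in positions.items()}

for asset, amount in trades.items():
    action = "buy" if amount > 0 else "sell"
    print(f"{asset}: {action} ${abs(amount):.2f}")
```

In this sketch roughly $66.67 of NEO is sold and the proceeds are split evenly between BTC and ETH, which returns every position to one third of the total.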

This guide will focus less on the economic reasoning and justification and more on the technical implementation details.

Tokens Selected and Models Tested

There are numerous kinds of rebalancing strategies that could be implemented. This guide focuses on two popular methods: time based and threshold based rebalancing. Time based rebalancing only performs a rebalance if enough time has passed since the last one. Threshold rebalancing only performs a rebalance if the allocations within the fund have deviated past a predefined threshold. There are tons of other options and flavours out there to explore, some of which can be found here. However, the processes and principles explored with simple time and threshold rebalancing can be extended to any generic rebalancing strategy you can think of, so these two are sufficient for this guide.
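
As a rough sketch, the two triggers can be written as small predicate functions. The function and parameter names here (`time_trigger`, `threshold_trigger`, `days_since_rebalance`, `period_days`) are hypothetical and chosen only for illustration; they are not part of the notebook code that follows.

```python
def time_trigger(days_since_rebalance, period_days):
    """Rebalance once at least `period_days` days have passed."""
    return days_since_rebalance >= period_days


def threshold_trigger(current_weights, target_weights, threshold):
    """Rebalance once any weight drifts more than `threshold` from its target."""
    return any(abs(current - target) > threshold
               for current, target in zip(current_weights, target_weights))


# Three days into a seven day period: no rebalance yet.
print(time_trigger(days_since_rebalance=3, period_days=7))

# A 60/40 split against a 50/50 target with a 0.05 threshold: rebalance.
print(threshold_trigger([0.6, 0.4], [0.5, 0.5], threshold=0.05))
```

Keeping the trigger separate from the trade itself is one way to swap strategies without touching the rest of the backtest loop.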

From a token perspective, there are thousands that could be included in the models. For simplicity's sake we will limit our exploration to a select few. More complex portfolios could include more tokens, but the basic ideas explored here remain the same.

The tokens explored are: ["btc", "eth", "xrp", "ltc", "dash", "xmr", "doge"] as well as USD to represent a stable asset in the fund.

Local Setup

It is recommended that all packages are installed within a Python virtual environment if you want to run this notebook locally. This ensures that packages are not installed into your system-wide environment and makes execution consistent between machines. This can be done as follows:

Install virtualenv if you don't already have it

pip install virtualenv

Create the virtual environment

virtualenv venv -p python3

Activate the virtual environment. Your shell should now say (venv) on the left side.

source venv/bin/activate

Install packages

pip install -r requirments.txt

Ensure kernel can be accessed by Jupyter

python -m ipykernel install --user --name=CryptoBacktester

Start the notebook.

jupyter notebook

Now navigate to http://localhost:8888 where you can find the live notebook. From within the notebook your kernel should be set to CryptoBacktester in the top right. If it is not, click the Kernel tab and change it to CryptoBacktester.

Initial Setup

We begin by loading all the packages that we need. Nothing too out of the ordinary here except for Cryptory, which provides a collection of API endpoints for retrieving historic price information for crypto (and other) markets. We also import plotly.offline so that we can plot an unlimited number of plot.ly figures without hitting their usage limits. Finally, we turn off SSL certificate verification, because this was required to pull the crypto prices from within the Python virtualenv, which does not contain SSL certificates.

# load package
from cryptory import Cryptory
from plotly.offline import init_notebook_mode, iplot #import offline mode

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.offline as pyo
import plotly.graph_objs as go
import ssl

#init notebook for plotly
init_notebook_mode()

#disable ssl for cryptory API & virtualenv
ssl._create_default_https_context = ssl._create_unverified_context
np.set_printoptions(suppress=True)

Pulling Data from Crypto API and Extracting to Pandas Dataframe

First we set up the Cryptory object to pull data from 2017-01-01 to the present date, then extract and print some sample data. Because the data is pulled up to the present day, it is always current as of the last time the script was run, which is a strong advantage over using static CSV files.

# initialise object 
# pull data from start of 2017 to present day
my_cryptory = Cryptory(from_date = "2017-01-01")

# get historical bitcoin prices from coinmarketcap
my_cryptory.extract_coinmarketcap("bitcoin").head()

The next step is to define all the tokens we want to get prices for and capture them within a dataframe. We use these tokens specifically because they all have a high market cap and price histories going back to 2017-01-01.

Reading in the data involves iterating over all tokens and calling extract_bitinfocharts on each symbol. This calls the Cryptory package API to get the price and date information for that token in USD. The result is then joined onto the other price records on the common date field. This can be thought of as a SQL left join on the date column: every date in the btc dataframe is kept, and a coin with no price for a given date gets a NaN in that row.

all_coins_df = my_cryptory.extract_bitinfocharts("btc") #start by filling the dataframe(df) with btc
# coins of interest
bitinfocoins = ["btc", "eth", "xrp", "ltc", "dash", "xmr", "doge"] #then fill it with all the others
for coin in bitinfocoins[1:]: # [1:] skips the first item(btc)
    all_coins_df = all_coins_df.merge(my_cryptory.extract_bitinfocharts(coin), on="date", how="left")
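
To illustrate what `how="left"` does in the merge, here is a minimal sketch on two toy price tables. The dates and prices are made up, and the second table deliberately lacks one date so the difference from an inner join is visible.

```python
import pandas as pd

# Toy price tables; the eth table has no record for 2017-01-01.
btc = pd.DataFrame({"date": ["2017-01-03", "2017-01-02", "2017-01-01"],
                    "btc_price": [1020.0, 1010.0, 970.0]})
eth = pd.DataFrame({"date": ["2017-01-03", "2017-01-02"],
                    "eth_price": [9.0, 8.5]})

left = btc.merge(eth, on="date", how="left")    # keeps every btc date
inner = btc.merge(eth, on="date", how="inner")  # keeps only shared dates

print(left)
print(len(left), len(inner))
```

The left join keeps all three btc rows and fills the missing eth price with NaN, while the inner join drops the date that eth does not have.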

We then replace all NaN values with zero, make the date column the index so that we can iterate over the rows by date later, and insert a new column named usd_price with a value of 1 for all rows. This is used later to represent holding USD within the portfolio alongside the cryptos. We then reverse the row order so that the top row (position 0) holds the oldest data. This makes sense because the backtest iterates forward in time, from the oldest row (position 0) to the most recent row (end position).

all_coins_df = all_coins_df.fillna(0) #remove nans
all_coins_df.set_index('date',inplace=True) # make date index
all_coins_df.insert(0,"usd_price",1) #add usd column
all_coins_df = all_coins_df.reindex(index=all_coins_df.index[::-1]) #re-order data
all_coins_df.head() #print sample rows

Next, we extract the names of all the tokens from the dataframe's columns. We can iterate over this list later.

tokens = all_coins_df.columns.tolist()
tokens

['usd_price',
 'btc_price',
 'eth_price',
 'xrp_price',
 'ltc_price',
 'dash_price',
 'xmr_price',
 'doge_price']

Basic Time Series Data Visualization and Plotting

At this point we have created a pandas dataframe that contains daily price data for 7 different cryptocurrencies against the USD. Next we will visualize this data in a simple time series plot.

This process involves iterating over all token names and using this as the key to extract column data from the pandas dataframe. This information is stored in a go.Scatter object which is appended to a plot_data array which is used in the plotting of the price information. Other information is also appended such as the name of the crypto as well as the type of plot to generate.

plot_data = []

for index, coin in enumerate(tokens):
    coin_chart_info = go.Scatter(
        x = all_coins_df.index, #set all the x's to the index from the dataframe. This was the date
        y = all_coins_df[coin].tolist(), #set all the y's to the current coins magnitude in the dataframe
        mode = 'lines',
        name = coin[:-6]) # the [:-6] is to remove the `_price` part of the name from the dataframe
    plot_data.append(coin_chart_info) #add this coins chart info to the plot_data array
    tokens[index] = coin[:-6]

The iplot function can also take a layout parameter which we use to define the heading for the plot as well as moving the legend to the top left. We then generate the iplot.

And now we have a beautiful interactive plot! You can zoom in on it by dragging your mouse. Double click to zoom out again. You can also toggle coins from the legend. Try turning btc off by clicking it.

layout = go.Layout(
    title = "Historic Price Over Time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))

fig = dict(data=plot_data, layout = layout)
iplot(fig)

Time based Rebalancing

Next we jump right into building a rebalancing strategy. This initial implementation will not consider fees or rebalancing intervals; it will simply take whatever the distribution within the portfolio is at the end of the day and compare this to what it should be. It will then "execute the trades" to bring the portfolio back to equilibrium. After this, we will consider how to define a collection of different portfolio strategies and compare and contrast them on the same plot.

The first thing we do is grab the top row from the all_coins_df dataframe that stores all the historic information. This row is then converted to a np.array where each position in the array corresponds to a different column from the dataframe. We will require this information to calculate the initial allocation within the fund.

initial_price = all_coins_df.head(1)
initial_price = np.array(initial_price)[0]
initial_price

array([  1.      , 970.988   ,   8.233   ,   0.00651 ,   4.389   ,
        11.356   ,  13.532   ,   0.000224])

For the simplest case we will set all tokens in our dataset to be part of the fund. Each token will get 1/8 of the total value.

initial_weights = np.ones(len(tokens)) / len(tokens)
initial_weights

array([0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125])

Next, the amount of capital the fund starts off with is set. We set this to 1000 usd as an arbitrary number.

INVEST = 1000

The initial allocations are now calculated. Each allocation is expressed in units of the respective currency. For example, the initial allocation of bitcoin is 0.12873486 BTC. This is 1/8 of the 1000 usd, divided by the starting bitcoin price of 970.988.

initial_allocation = INVEST * initial_weights / initial_price
initial_allocation[np.isinf(initial_allocation)] = 0
initial_allocation

array([   125.        ,      0.12873486,     15.18280092,  19201.22887865,
           28.48029164,     11.00739697,      9.23736329, 558035.71428571])

We can also work out the initial position. This is the value of each set of tokens purchased, based off the allocations we defined. This should logically be the total investment $\times$ the fraction per token and works out to 125 per token.

initial_positions = initial_allocation * initial_price
initial_positions

array([125., 125., 125., 125., 125., 125., 125., 125.])
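
The allocation and position steps above can be checked by hand for a single asset. This short sketch reuses the bitcoin starting price quoted earlier (970.988) and the equal 1/8 weighting; the variable names are illustrative only.

```python
INVEST = 1000
weight = 1 / 8
btc_price = 970.988  # first-row bitcoin price quoted above

btc_units = INVEST * weight / btc_price  # units of bitcoin bought
btc_position = btc_units * btc_price     # value of that holding in USD

print(round(btc_units, 8))
print(round(btc_position, 2))
```

The unit count agrees with the second entry of initial_allocation, and multiplying back by the price recovers the 125 usd position.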

At any point in time the value of the portfolio is the sum of a row in the pandas dataframe. Therefore the starting value initial_portfolio_value is simply the sum of the initial_positions row.

initial_portfolio_value = initial_positions.sum()
initial_portfolio_value

1000.0

We now know all the information we need to start doing some rebalancing simulations. This process will happen as follows:

For each row in the all_coins_df dataframe we iterate over and do a series of calculations. Each row corresponds to the prices of all listed cryptos on a given day. These prices are stored in the current_price variable on each loop, so the logic below runs once for every day in the data. On each day the following is calculated and stored:
- hodl as the value of not trading at all, based on $InitialAllocation\times CurrentPrice$. Each day this information is stored in a new slot within the hodl variable.
- rb as the value of the rebalanced portfolio, based on $CurrentAllocation \times CurrentPrice$. Each day the current value of the rebalancing portfolio is stored in rb.
- Each day, provided every price is greater than zero, current_allocation is updated to bring the portfolio back to the desired weights. The updated allocation is used on the next day and represents the "trades", based on $CurrentPortfolioValue \times \frac{InitialWeights}{CurrentPrice}$

Note that at no point are we considering trading fees.

hodl = {}
rb = {}

current_allocation = initial_allocation

for i, current_price in all_coins_df.iterrows():
    # hodl positions
    hodl[i] = initial_allocation * current_price
    
    # current positions
    current_positions = current_allocation * current_price
    rb[i] = current_positions
    
    # rebalance
    current_portfolio_value = current_positions.sum()
    if(current_price.min() > 0):
        current_allocation = current_portfolio_value * initial_weights / current_price

Next we convert the generated dictionaries to pandas DataFrames so we can do computation over them and use them in plotting more easily.

hodl = pd.DataFrame(hodl).T
rb = pd.DataFrame(rb).T
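
To make the difference between the two strategies concrete, here is a minimal pure-Python sketch of the same loop on toy data. It assumes a hypothetical two-asset portfolio (one volatile coin and USD) with made-up prices over three days, and ignores fees just like the loop above.

```python
# Toy daily prices for one volatile asset and USD (illustrative values only).
prices = [
    {"coin": 1.0, "usd": 1.0},
    {"coin": 2.0, "usd": 1.0},
    {"coin": 1.0, "usd": 1.0},
]
weights = {"coin": 0.5, "usd": 0.5}
invest = 100.0

# Units bought on day one; the hodl portfolio never changes them.
initial_units = {k: invest * weights[k] / prices[0][k] for k in weights}
units = dict(initial_units)

hodl_values, rb_values = [], []
for day in prices:
    hodl_values.append(sum(initial_units[k] * day[k] for k in weights))
    value = sum(units[k] * day[k] for k in weights)
    rb_values.append(value)
    # Rebalance at the end of every day, ignoring fees.
    units = {k: value * weights[k] / day[k] for k in weights}

print(hodl_values)  # [100.0, 150.0, 100.0]
print(rb_values)    # [100.0, 150.0, 112.5]
```

In this toy case the rebalanced portfolio sells half of its gains when the coin doubles, so it keeps part of them when the price falls back. A steadily trending price would favour hodl instead.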

At this point we have two dataframes (hodl and rb) which represent the value of each crypto on each day over the whole period. We will print out the top 10 rows of one of these frames to see what it looks like.

In the table below, each row shows the value of each crypto holding on a given day, with every holding starting from the same $\frac{1}{8}\times1000$. Every day the value of each holding changes with the change in price. As this is the hodl strategy, no rebalancing is done at all, as can be seen from the fixed usd_price column.

hodl.head(10)

We can also plot the hodl portfolio over time. Remember that at the beginning we had 12.5% in each asset (125 usd). We can see how these have evolved up to the end of the period. We will also define a useful function, fundSort, which we can use to sort the list of plot traces so that the tokens with the highest value at the end of the period come first.

This figure is really interesting as we can see how much 125 usd could have grown to if invested in each of the cryptos from the beginning of the period. We can see some absolutely crazy numbers here; XRP grew from 125 USD in January 2017 to a maximum of ~70k USD about a year later! That's growth of roughly 583 times in about a year.

def fundSort(fund):
    return fund['y'][-1]

plot_data = []

for index, coin in enumerate(hodl):
    coin_chart_info = go.Scatter(
        x = hodl.index, #set all the x's to the index from the dataframe. This was the date
        y = hodl[coin].tolist(), #set all the y's to the current coins magnitude in the dataframe
        mode = 'lines',
        # the [:-6] is to remove the `_price` part of the name from the dataframe
        name = coin[:-6] + "->\t\t$" + str(round(hodl[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info) #add this coins chart info to the plot_data array
    

#sort the data to be plotted based on the total value of the holdings after the period
plot_data.sort(key = fundSort, reverse = True) 

#specify the layout for the plot
layout = go.Layout(
    title = "Hodl Portfolio asset value over time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))

#display the plot inline
fig = dict(data=plot_data, layout = layout)
iplot(fig)

Next, we can generate a similar plot but for the rebalanced fund values over time. These should all follow a similar value as we've enforced that at all periods in time the fund maintains a $\frac{1}{8}$ of its value in each respective crypto.

plot_data = []

for index, coin in enumerate(rb):
    coin_chart_info = go.Scatter(
        x = rb.index,
        y = rb[coin].tolist(),
        mode = 'lines',
        name = coin[:-6] + "->\t\t$" + str(round(rb[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info)

plot_data.sort(key = fundSort, reverse = True) 

layout = go.Layout(
    title = "Rebalanced Portfolio asset value over time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))

fig = dict(data=plot_data, layout = layout)
iplot(fig)

Next it would be useful to compare the final values for hodl vs rebalance to get a relative comparison of these two methods. Remember that we are comparing two portfolios that start out with the exact same cryptos in them.

We now don't want to look at the value of each individual crypto within the fund, but rather the sum in value of all cryptos over the whole duration. To this end we sum with axis=1, which adds up the columns within each row to give one total per day.

Then, the performance of each fund is computed by taking the final value/initial value to get a simple relative change in value for the portfolio.

hodl_value = hodl.sum(axis=1)
rb_value = rb.sum(axis=1)

hodl_perf = hodl_value.iloc[-1] / hodl_value.iloc[0]
rb_perf = rb_value.iloc[-1] / rb_value.iloc[0]
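
When reading the resulting numbers it helps to keep apart the final value as a percentage of the initial value (which is what the ratio multiplied by 100 gives) and the percentage return. A small sketch with hypothetical values:

```python
initial_value = 1000.0
final_value = 3000.0  # hypothetical final portfolio value

perf = final_value / initial_value  # final value as a multiple of the start
pct_of_initial = perf * 100         # final value as a percentage of the start
pct_return = (perf - 1) * 100       # gain relative to the start

print(f"{pct_of_initial:.1f}% of initial value, {pct_return:.1f}% return")
```

Under these assumed numbers a portfolio that triples is labelled 300.0% by the first measure, which corresponds to a 200.0% gain.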

Next we create the data objects for the two plots and sort.

hodlInfo = go.Scatter(
        x = hodl_value.index,
        y = hodl_value.tolist(),
        mode = 'lines',
        name = 'hodl {:.1f}%'.format(hodl_perf * 100))

rebalanceInfo = go.Scatter(
        x = rb_value.index,
        y = rb_value.tolist(),
        mode = 'lines',
        name = 'rebalance {:.1f}%'.format(rb_perf * 100))

plot_data = [hodlInfo, rebalanceInfo]
plot_data.sort(key = fundSort, reverse = True)

layout = go.Layout(
    title = "Portfolio value over time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.8, y=0.9))

fig = dict(data=plot_data, layout = layout)
iplot(fig)

Custom Portfolio Backtesting

At this point we have looked at implementing a simple portfolio backtesting setup. However it is not well suited to comparing multiple strategies against each other and as a result it makes it hard to find an ideal model. Next we will define a more general process for backtesting funds by defining a JSON object structure that can be used to specify all portfolio characteristics that are to be tested.

Primarily we are concerned with:

- what tokens go into the portfolio
- what weighting each token takes on
- what frequency the rebalancing must occur at
- the name of the portfolio so we can track it later.

All this information for a bunch of different portfolios is encoded as objects within an array below for 6 different funds. The process of backtesting is very similar to the specific process outlined before except we now iterate over all elements within the funds array and create a backtest for each one sequentially.

funds = [
    {
    'coins': ['eth_price', 'btc_price', 'usd_price'],
    'ratio': [0.4, 0.4, 0.2],
    'rebalance': 7,
    'name': 'ETH40, BTC40, USD20 @ 7',
    },
    {
    'coins': ["btc_price", "eth_price", "xrp_price", "ltc_price", "dash_price", "xmr_price", "doge_price"],
    'ratio': [0.25, 0.25, 0.10, 0.10, 0.10, 0.10, 0.10],
    'rebalance': 2,
    'name': 'ETH25, BTC25, OTHERS10 @ 2',
    },
    {
    'coins': ['eth_price', 'btc_price'],
    'ratio': [0.5, 0.5],
    'rebalance': 15,
    'name': 'ETH50, BTC50 @ 15',
    },
    {
    'coins': ['eth_price', 'btc_price'],
    'ratio': [0.5, 0.5],
    'rebalance': 0,
    'name': 'ETH50, BTC50 @ HODL',
    },
    {
    'coins': ['eth_price'],
    'ratio': [1],
    'rebalance': 0,
    'name': 'ETH100 @ HODL',
    },
    {
    'coins': ['btc_price'],
    'ratio': [1],
    'rebalance': 0,
    'name': 'BTC100 @ HODL',
    }
]
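
Because the fund definitions are written by hand, a small sanity check can catch mismatched lists or ratios that do not sum to one before a backtest is run. The helper below, `validate_fund`, is a hypothetical addition and not part of the original notebook; it is a sketch that assumes the dictionary keys shown above.

```python
import math


def validate_fund(fund):
    """Raise ValueError if a fund specification is inconsistent."""
    if len(fund["coins"]) != len(fund["ratio"]):
        raise ValueError(f"{fund['name']}: coins and ratio differ in length")
    if not math.isclose(sum(fund["ratio"]), 1.0, abs_tol=1e-9):
        raise ValueError(f"{fund['name']}: ratios sum to {sum(fund['ratio'])}")
    if fund["rebalance"] < 0:
        raise ValueError(f"{fund['name']}: rebalance period must be >= 0")


example_fund = {
    "coins": ["eth_price", "btc_price", "usd_price"],
    "ratio": [0.4, 0.4, 0.2],
    "rebalance": 7,
    "name": "ETH40, BTC40, USD20 @ 7",
}

validate_fund(example_fund)  # passes silently when the spec is consistent
print("fund spec looks consistent")
```

Calling `validate_fund` on each entry of the funds list before the backtest loop would surface a typo in the ratios immediately rather than as a strange-looking plot.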

We now run two nested loops:

- First, we iterate over all the funds specified in the JSON object and, for each one, calculate the key values for that specific fund, such as the initial allocations, initial weightings and initial prices. The logic and justification for each fund is the same as before, except now generalized.
- Next, we iterate over all rows in the coin_prices dataframe, which stores the pricing information. At each point we calculate the value of the specific fund and store the results.

At the end of this process we have one dictionary fund_returns that has time series information for all 6 funds specified.

fund_returns = {}
for index, fund in enumerate(funds): # for all funds in the list of funds we run a backtest
    
    # grab only the coin prices for the coins in the fund
    coin_prices = all_coins_df.loc[:,fund['coins']]
    
    # the initial weights are specified by the fund JSON. read this in and store it
    fund_initial_weights = np.array(fund['ratio'])
    
    #calculate the initial prices of the coins in the fund
    fund_initial_price = coin_prices.head(1)
    fund_initial_price = np.array(fund_initial_price)[0]
    
    #calculate the initial allocation based on the weights and starting prices
    fund_initial_allocation = INVEST * fund_initial_weights / fund_initial_price
    fund_initial_allocation[np.isinf(fund_initial_allocation)] = 0
    
    #the initial position is defined by the initial allocation and initial price
    fund_initial_positions = fund_initial_allocation * fund_initial_price
    
    #lastly we define the current allocation as the initial allocation as we start at time = 0
    fund_current_allocation = fund_initial_allocation
    
    day_count = 1
    fund_returns[fund['name']] = {} #create the object to store all the results for this specific fund
    for i, current_price in coin_prices.iterrows(): #for each day of pricing data generate a backtest result
        fund_current_positions = fund_current_allocation * current_price
        fund_returns[fund['name']][i] = fund_current_positions

        # rebalance
        current_portfolio_value = fund_current_positions.sum()
        # a rebalance period of 0 means hodl, so never rebalance in that case
        if fund['rebalance'] > 0 and day_count >= fund['rebalance'] and current_price.min() > 0:
            #update the allocation for each token (perform the rebalance)
            fund_current_allocation = current_portfolio_value * fund_initial_weights / current_price
            #restart counting for next rebalance (the increment below brings it back to 1)
            day_count = 0
        #increment time
        day_count += 1
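
One way to reason about a periodic schedule is to list the days on which a rebalance would fire. The sketch below uses a hypothetical helper, `rebalance_days`, with two assumed conventions: days are counted from 1, and a period of 0 means the fund is never rebalanced (hodl). It is an illustration of the idea, not a copy of the counter logic in the loop.

```python
def rebalance_days(n_days, period):
    """Return the 1-based day numbers on which a rebalance would happen."""
    if period <= 0:  # assumption: 0 means "never rebalance" (hodl)
        return []
    return [day for day in range(1, n_days + 1) if day % period == 0]


print(rebalance_days(15, 7))  # [7, 14]
print(rebalance_days(15, 2))  # [2, 4, 6, 8, 10, 12, 14]
print(rebalance_days(15, 0))  # []
```

Comparing such a list against the rows where the printed positions snap back to their target ratios is a quick way to confirm that a backtest loop rebalances when intended.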

Let's print the top 15 results from one of the funds to check that they make sense given the portfolio strategy we specified. Printing ETH40, BTC40, USD20 @ 7 shows 3 different coins (eth, btc and usd) in the specified starting ratio of eth: $40\%\times 1000 = \$400$, btc: $40\%\times 1000 = \$400$ and usd: $20\%\times 1000 = \$200$. At the end of a 7 day period we see a rebalance where some ether is sold to buy btc and usd. Then on the 14th (two weeks after the start) we see another trade occurring to re-establish the ratios.

pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 7']).T.head(15)

Let's first plot the internal workings of one of the funds to see the evolution of the positions over time. We want to see the act of rebalancing and the effect that it has on the fund's components over time.

The plot that is generated shows the values of eth and btc following each other closely, which is to be expected as they both have 40% of the fund allocated to them. USD has 20% allocated to it and we can see its value following at $\approx \frac{1}{2}$ that of the other two. The jagged steps in the usd plot are the weekly rebalancing intervals.

#extract the dataframe for this specific portfolio and plot its components over time
df = pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 7']).T    

plot_data = []
for coin in df:
    coin_chart_info = go.Scatter(
        x = df.index,
        y = df[coin].tolist(),
        mode = 'lines',
        name = coin[:-6] + "->\t\t$" + str(round(df[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info)

plot_data.sort(key = fundSort, reverse = True) 

layout = go.Layout(
    title = "ETH40, BTC40, USD20 @ 7 Components Value Over Time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))

fig = dict(data=plot_data, layout = layout)
iplot(fig)

Next we will compare all 6 models defined before in one plot. For each model we calculate the performance metrics in the same way we did before, except we now do it for each and every fund and store the results in an array called fund_computed_plot which is used later on in plotting.

fund_computed_plot = []
for fund in fund_returns:
    fund_df = pd.DataFrame(fund_returns[fund]).T
    
    fund_value = fund_df.sum(axis = 1)
    fund_performance = (fund_value.iloc[-1] / fund_value.iloc[0]) * 100
    fund_performance = round(fund_performance, 3)
    fund_plot = go.Scatter(
        x = fund_value.index,
        y = fund_value.tolist(),
        mode = 'lines',
        name = '{0} -> {1}%'.format(fund, fund_performance))
    fund_computed_plot.append(fund_plot)

Sort the funds to get the best at the top of the index.

fund_computed_plot.sort(key = fundSort, reverse = True)

And lastly, plot them all on one figure.

layout = go.Layout(
    title = "Time Based Rebalancing Strategies",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.7, y=0.9))

fig = dict(data=fund_computed_plot, layout = layout)
iplot(fig)

Threshold Rebalancing Strategies

What if, rather than rebalancing based on a time period, we rebalanced based on a deviation from the desired ratios? This rebalancing method is very similar in practice and only slightly modifies the rebalancing condition.

We will start off by defining a threshold that quantifies the deviation required to perform a rebalance. To begin with we set this to 0.2. This means that if any asset's weight deviates from its desired weight by more than 0.2 (20 percentage points) on any day, the portfolio will perform a rebalance.

THR = 0.2

The initial_allocation that we calculated before, based on $\frac{1}{8}$ of a 1000 usd starting value for each asset, is used again here:

initial_allocation

array([   125.        ,      0.12873486,     15.18280092,  19201.22887865,
           28.48029164,     11.00739697,      9.23736329, 558035.71428571])

As before we will loop through all price information in the all_coins_df and for each day we will calculate the current position and then make a decision about rebalancing. The key logic here about choosing to rebalance (or not as the case may be) is defined by the logical statement if any(weights_diff > THR) where weights_diff shows the difference between the desired weight and the actual weight at the end of each period. If there is any asset in the portfolio that exceeds the desired threshold then the whole fund rebalances to accommodate this.

hodl = {}
rb = {}

current_allocation = initial_allocation

for i, current_price in all_coins_df.iterrows(): #for all days price information, we calculate positions
    # hodl positions
    hodl[i] = initial_allocation * current_price
    
    # current positions
    current_positions = current_allocation * current_price
    rb[i] = current_positions
    
    # rebalance
    current_portfolio_value = current_positions.sum()
    current_weights = current_positions / current_portfolio_value
    weights_diff = np.abs(current_weights - initial_weights)
    if any(weights_diff > THR) and current_price.min() > 0:
        current_allocation = current_portfolio_value * initial_weights / current_price

hodl = pd.DataFrame(hodl).T
rb = pd.DataFrame(rb).T
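
The rebalancing condition can also be checked in isolation. The sketch below assumes a hypothetical four-asset portfolio with equal target weights and made-up position values; it only illustrates how the absolute weight differences are compared against THR.

```python
import numpy as np

THR = 0.2
target_weights = np.array([0.25, 0.25, 0.25, 0.25])

# Hypothetical position values in USD after the first asset rallies.
positions = np.array([500.0, 150.0, 150.0, 200.0])

weights = positions / positions.sum()            # current share of each asset
weights_diff = np.abs(weights - target_weights)  # absolute drift per asset
needs_rebalance = bool(np.any(weights_diff > THR))

print(weights_diff.round(2), needs_rebalance)
```

Here the first asset has drifted from 0.25 to 0.5 of the portfolio, a difference of 0.25, which exceeds the 0.2 threshold and so triggers a rebalance of the whole fund.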

As before, we compute some performance metrics to compare the portfolios.

hodl_value = hodl.sum(axis=1)
rb_value = rb.sum(axis=1)

hodl_perf = hodl_value.iloc[-1] / hodl_value.iloc[0]
rb_perf = rb_value.iloc[-1] / rb_value.iloc[0]

We then plot in the same way as before. We can see that this simple threshold rebalancing strategy outperforms just hodling.

hodlInfo = go.Scatter(
        x = hodl_value.index,
        y = hodl_value.tolist(),
        mode = 'lines',
        name = 'hodl all tokens at  {:.1f}%'.format(hodl_perf * 100))

rebalanceInfo = go.Scatter(
        x = rb_value.index,
        y = rb_value.tolist(),
        mode = 'lines',
        name = 'Threshold rebalance {:.1f}%'.format(rb_perf * 100))

plot_data = [hodlInfo, rebalanceInfo]
plot_data.sort(key = fundSort, reverse = True) 

layout = go.Layout(
    title = "Threshold rebalancing",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.8, y=0.9))

fig = dict(data=plot_data, layout = layout)
iplot(fig)

As we did with the time based rebalancing strategies, it would be ideal if we could predefine a number of portfolios, backtest them as a batch and compare them all at the end. To achieve this we use a similar approach as before, defining a JSON object to store all the funds we want to test. The only difference now is that instead of defining a rebalance period we specify the threshold that is required to perform a rebalance.

funds = [
    {
    'coins': ['eth_price', 'btc_price', 'usd_price'],
    'ratio': [0.4, 0.4, 0.2],
    'threshold': 0.05,
    'name': 'ETH40, BTC40, USD20 @ 0.05'
    },
    {
    'coins': ["btc_price", "eth_price", "xrp_price", "ltc_price", "dash_price", "xmr_price", "doge_price"],
    'ratio': [0.25, 0.25, 0.10, 0.10, 0.10, 0.10, 0.10],
    'threshold': 0.05,
    'name': 'ETH25, BTC25, allOthers10 @ 0.05',
    },
    {
    'coins': ['eth_price', 'btc_price'],
    'ratio': [0.5, 0.5],
    'threshold': 0.10,
    'name': 'ETH50, BTC50 @ 0.10'
    },
    {
    'coins': ['eth_price', 'btc_price'],
    'ratio': [0.5, 0.5],
    'threshold': 0.01,
    'name': 'ETH50, BTC50 @ 0.01'
    },
    {
    'coins': ['eth_price'],
    'ratio': [1],
    'threshold': 0,
    'name': 'ETH100 @ HODL',
    },
    {
    'coins': ['btc_price'],
    'ratio': [1],
    'threshold': 0.,
    'name': 'BTC100 @ HODL',
    }
]

As before, we now iterate over all funds and calculate the key fund values at the beginning of the period. We then use each day's price information to backtest the fund over the whole period, using the rebalancing thresholds to decide when to perform trades.

fund_returns = {}
for index, fund in enumerate(funds):
    
    coin_prices = all_coins_df.loc[:,fund['coins']]
    
    fund_initial_weights = np.array(fund['ratio'])
    
    fund_initial_price = coin_prices.head(1)
    fund_initial_price = np.array(fund_initial_price)[0]
    
    fund_initial_allocation = INVEST * fund_initial_weights / fund_initial_price
    fund_initial_allocation[np.isinf(fund_initial_allocation)] = 0
    
    fund_initial_positions = fund_initial_allocation * fund_initial_price
        
    fund_current_allocation = fund_initial_allocation
    
    fund_returns[fund['name']] = {}
    for i, current_price in coin_prices.iterrows():
        fund_current_positions = fund_current_allocation * current_price
        fund_returns[fund['name']][i] = fund_current_positions

        # rebalance
        current_portfolio_value = fund_current_positions.sum()
        fund_current_weights = fund_current_positions / current_portfolio_value
        weights_diff = np.abs(fund_current_weights - fund_initial_weights)
        if any(weights_diff > fund['threshold']) and current_price.min() > 0:
            fund_current_allocation = current_portfolio_value * fund_initial_weights / current_price
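
A threshold rule trades more often when prices move a lot and rarely when they drift sideways. The sketch below illustrates this with a hypothetical helper, `count_rebalances`, applied to a 50/50 coin and USD portfolio on two made-up price series; it is a simplified stand-in for the loop above, not a replacement for it.

```python
def count_rebalances(prices, threshold):
    """Count threshold-triggered rebalances for a 50/50 coin/USD portfolio."""
    coin_units, usd = 50.0 / prices[0], 50.0
    count = 0
    for price in prices:
        coin_value = coin_units * price
        total = coin_value + usd
        # Rebalance when the coin weight drifts too far from its 0.5 target.
        if abs(coin_value / total - 0.5) > threshold:
            coin_units, usd = 0.5 * total / price, 0.5 * total
            count += 1
    return count


calm = [100, 101, 100, 99, 100, 101]      # sideways market
volatile = [100, 130, 90, 140, 80, 150]   # large swings

print(count_rebalances(calm, 0.05), count_rebalances(volatile, 0.05))  # 0 5
```

With a 0.05 threshold the calm series never triggers a trade, while every swing in the volatile series does.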

It is again useful to look at the positions of one of the funds over time and track how they evolve with the trades performed. We look at ETH40, BTC40, USD20 @ 0.05, which starts off with the same allocations as the fund examined previously but will now only trade when any weight deviates from its target by more than 0.05 (5 percentage points). Looking at the table below, we can see that between 2017-01-06 and 2017-01-07 there was enough market movement to trigger a rebalance, as can be seen from the change in the USD value.

pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 0.05']).T.head(10)

Plotting this portfolio's value over time is done in the same way as before. The plot showcases one of the advantages of a threshold based rebalancing strategy: when there is a lot of market movement the algorithm trades at a much higher frequency, whereas during sideways movement little to no trading is done.

# extract the dataframe for this specific portfolio and plot its components over time
df = pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 0.05']).T    

plot_data = []
for coin in df:
    coin_chart_info = go.Scatter(
        x = df.index,
        y = df[coin].tolist(),
        mode = 'lines',
        name = coin[:-6] + "->\t\t$" + str(round(df[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info)

plot_data.sort(key = fundSort, reverse = True) 

layout = go.Layout(
    title = "ETH40, BTC40, USD20 @ 0.05 Components Value Over Time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))

fig = dict(data=plot_data, layout = layout)
iplot(fig)
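To make the link between market movement and trade frequency concrete, the following sketch records the dates on which a threshold rebalance fires. It is an illustration only: `rebalance_dates` and the toy price path are hypothetical and not part of the notebook, although the drift test mirrors the backtest loop above.

```python
import numpy as np
import pandas as pd

def rebalance_dates(prices, weights, threshold, invest=1000.0):
    """Hypothetical helper: list the dates on which a rebalance is triggered."""
    weights = np.asarray(weights, dtype=float)
    first_price = prices.iloc[0].to_numpy(dtype=float)
    units = invest * weights / first_price      # units held of each asset
    dates = []
    for date, row in prices.iterrows():
        price = row.to_numpy(dtype=float)
        positions = units * price
        total = positions.sum()
        drift = np.abs(positions / total - weights)
        if np.any(drift > threshold) and price.min() > 0:
            units = total * weights / price     # trade back to target weights
            dates.append(date)
    return dates

# Toy two-asset price path (made up): flat, one sharp move, then flat again
idx = pd.date_range("2017-01-01", periods=6, freq="D")
toy_prices = pd.DataFrame(
    {"coin_price": [10.0, 10.0, 10.0, 14.0, 14.0, 14.0],
     "usd_price": [1.0] * 6},
    index=idx)

events = rebalance_dates(toy_prices, [0.5, 0.5], threshold=0.05)
print(events)
```

Only the day of the price jump produces a trade; the flat days before and after leave the weights inside the band. Run against real prices, the length of the returned list gives a rough count of how often a given threshold would trade.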

We will now combine all the daily information to form one line per fund, and then compare the funds against each other over the whole period.

This plot shows that the best performing portfolio out of any of those identified thus far is ETH20%, BTC20% and all other cryptos at 10%, with threshold based rebalancing set to 5%. This generated a return of almost 3000% from the start to the end, which is considerably higher than any of the other strategies. Importantly, it is more than both the BTC and ETH hodl strategies, showing the importance of spreading the allocation across other, less correlated assets.

fund_computed_plot = []
for fund in fund_returns:
    fund_df = pd.DataFrame(fund_returns[fund]).T
    
    fund_value = fund_df.sum(axis=1)
    fund_performance = (fund_value.iloc[-1] / fund_value.iloc[0]) * 100
    fund_performance = round(fund_performance, 3)
    fund_plot = go.Scatter(
        x = fund_value.index,
        y = fund_value.tolist(),
        mode = 'lines',
        name = '{0} -> {1}%'.format(fund, fund_performance))
    fund_computed_plot.append(fund_plot)
    
fund_computed_plot.sort(key = fundSort, reverse = True)

layout = go.Layout(
    title = "Threshold Based Rebalancing Strategies",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.7, y=0.9))

fig = dict(data=fund_computed_plot, layout = layout)
iplot(fig)
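When reading the legend it helps to know exactly what the percentage means. In the code above, `fund_performance` is the final value divided by the initial value, times 100, so it expresses the final value as a percentage of the starting value rather than the gain. The small sketch below shows the two conventions side by side; `performance_summary` and the toy series are hypothetical and only for illustration.

```python
import pandas as pd

def performance_summary(fund_value):
    """Hypothetical helper: return (percent of initial value, percent return)."""
    first, last = fund_value.iloc[0], fund_value.iloc[-1]
    pct_of_initial = float(last / first * 100)    # convention used in the plot legend
    pct_return = float((last / first - 1) * 100)  # gain relative to the start
    return round(pct_of_initial, 3), round(pct_return, 3)

# Toy fund value series (made up numbers)
toy_value = pd.Series(
    [1000.0, 1100.0, 1250.0],
    index=pd.date_range("2017-01-01", periods=3, freq="D"))

print(performance_summary(toy_value))  # (125.0, 25.0)
```

The two figures always differ by exactly 100 percentage points, so either is fine for ranking funds as long as one convention is used consistently.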

Extending this Process To Other Models¶

The process outlined above for each model can be extended to numerous different trading strategies to create a more generic backtester. Fees are important to a final model but were excluded here, as different strategies would require different fee considerations based on volume and on market maker and taker fees. All these considerations will fundamentally change the way the models are designed.
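As a rough idea of how fees could enter the rebalance step, the sketch below charges a flat proportional fee on the value traded. Everything here is an assumption for illustration: the function name `rebalance_with_fee`, the 0.1% rate, and the simplification that the fee is deducted from the portfolio before the target weights are applied. Real exchange fee schedules (maker/taker tiers, volume discounts) are more involved.

```python
import numpy as np

def rebalance_with_fee(positions, target_weights, fee_rate):
    """Hypothetical sketch: rebalance to target weights, paying a flat fee.

    The fee is fee_rate times the total value traded (buys plus sells).
    Simplification: the fee is taken off the portfolio value and the
    remainder is split by the target weights.
    """
    positions = np.asarray(positions, dtype=float)
    target_weights = np.asarray(target_weights, dtype=float)
    total = positions.sum()
    target_positions = total * target_weights
    turnover = np.abs(target_positions - positions).sum()
    fee = fee_rate * turnover
    new_positions = (total - fee) * target_weights
    return new_positions, fee

# Position values from the ETH40, BTC40, USD20 fund on 2017-01-06
positions = [486.189724, 382.284436, 200.0]
new_positions, fee = rebalance_with_fee(positions, [0.4, 0.4, 0.2], fee_rate=0.001)
print(new_positions.round(4), round(fee, 4))
```

Because the fee scales with the value traded, a tighter threshold (more frequent, smaller trades) and a wider one (rarer, larger trades) could end up paying quite different totals, which is one reason fees can change how a strategy is best designed.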

Package Selection:¶

Why use `Cryptory`?¶

It's easy to work with and has lots of data sources. There are tons of code examples as well, which make processing data very easy. The API also contains other market information, like Google search trends and other awesome things; see here.

An alternative, and also awesome, package is cryptocompy from here, which lets you access a whole bunch more APIs from cryptocompare.

Why use `Plotly`?¶

Interactive plots are really nice and look good! It also has a really simple API. It is a bit of a pain to get to work offline though but it's worth it I think.

Why use `pandas` & `numpy`?¶

There are no better packages for manipulating and working with data and maths!

Conclusion¶

The returns here are not meant to indicate trading strategies that should be followed, as numerous complexities have been ignored, such as trading fees, price slippage, exchange front running and tax implications. All of these would need to be accounted for before a portfolio strategy could be seriously considered. This tutorial was more about working with python, jupyter, numpy, pandas and plotly on time series data. It also covered processes for reading and processing data.

	date	open	high	low	close	volume	marketcap
0	2019-05-05	5831.07	5833.86	5708.04	5795.71	14808830723	102494420158
1	2019-05-04	5769.20	5886.89	5645.47	5831.17	17567780766	103112021259
2	2019-05-03	5505.55	5865.88	5490.20	5768.29	18720780006	101986240859
3	2019-05-02	5402.42	5522.26	5394.22	5505.28	14644460907	97330112147
4	2019-05-01	5350.91	5418.00	5347.65	5402.70	13679528236	95501110091

	usd_price	btc_price	eth_price	xrp_price	ltc_price	dash_price	xmr_price	doge_price
date
2017-01-01	1	970.988	8.233	0.00651	4.389	11.356	13.532	0.000224
2017-01-02	1	1010.000	8.182	0.00640	4.539	11.593	14.671	0.000222
2017-01-03	1	1017.000	8.811	0.00632	4.525	12.383	16.125	0.000220
2017-01-04	1	1075.000	10.440	0.00642	4.585	14.748	16.807	0.000226
2017-01-05	1	1045.000	10.479	0.00650	4.404	14.815	16.713	0.000225

	usd_price	btc_price	eth_price	xrp_price	ltc_price	dash_price	xmr_price	doge_price
2017-01-01	125.0	125.000000	125.000000	125.000000	125.000000	125.000000	125.000000	125.000000
2017-01-02	125.0	130.022204	124.225677	122.887865	129.272044	127.608753	135.521357	123.883929
2017-01-03	125.0	130.923348	133.775659	121.351767	128.873320	136.304597	148.952483	122.767857
2017-01-04	125.0	138.389970	158.508442	123.271889	130.582137	162.337091	155.252365	126.116071
2017-01-05	125.0	134.527924	159.100571	124.807988	125.427204	163.074586	154.384053	125.558036
2017-01-06	125.0	119.463886	151.934289	119.623656	114.205969	145.176559	136.916199	123.883929
2017-01-07	125.0	111.405599	148.427062	120.775730	108.282069	131.527386	118.884866	122.767857
2017-01-08	125.0	116.992177	151.600267	122.119816	112.183869	139.078461	123.448123	124.441964
2017-01-09	125.0	114.667740	158.265517	119.815668	113.778765	136.733885	120.787762	123.325893
2017-01-10	125.0	115.744093	157.855581	119.047619	126.480975	136.260567	125.203222	123.883929

	eth_price	btc_price	usd_price
2017-01-01	400.000000	400.000000	200.000000
2017-01-02	397.522167	416.071053	200.000000
2017-01-03	428.082109	418.954714	200.000000
2017-01-04	507.227013	442.847903	200.000000
2017-01-05	509.121827	430.489357	200.000000
2017-01-06	486.189724	382.284436	200.000000
2017-01-07	474.966598	356.497918	200.000000
2017-01-08	421.406432	433.275455	206.292903
2017-01-09	439.933965	424.667005	206.292903
2017-01-10	438.794459	428.653228	206.292903
2017-01-11	423.516629	402.082696	206.292903
2017-01-12	401.064128	372.491865	206.292903
2017-01-13	404.398240	383.888907	206.292903
2017-01-14	398.911506	407.903354	198.916010
2017-01-15	403.935266	403.239730	198.916010

	eth_price	btc_price	usd_price
2017-01-01	400.000000	400.000000	200.000000
2017-01-02	397.522167	416.071053	200.000000
2017-01-03	428.082109	418.954714	200.000000
2017-01-04	507.227013	442.847903	200.000000
2017-01-05	509.121827	430.489357	200.000000
2017-01-06	486.189724	382.284436	200.000000
2017-01-07	417.523869	398.560629	213.694832
2017-01-08	426.450065	418.546967	213.694832
2017-01-09	445.199346	410.231147	213.694832
2017-01-10	444.046201	414.081865	213.694832