pandas and numpy into something that can be used in time series computations to implement different trading strategies. Interactive time series plotting is covered using plotly. This guide provides a useful reference for creating your own crypto backtester, as well as general best practices when dealing with time series data in python.
Portfolio rebalancing refers to the practice of keeping the original proportions of your investments intact regardless of individual asset price fluctuations. There is a plethora of reasons why rebalancing your portfolio is a good idea, most of which revolve around diversifying risk by holding assets that have low correlation with each other.
The way this works in practice is simple. Letʼs suppose you decide to invest $100 in BTC, ETH and NEO. Your portfolio begins with three assets in equal proportions. A few days later, NEO takes off and outperforms the other assets in your portfolio. Now, your basket is disproportionately weighted in NEOʼs favor.
Rebalancing your portfolio, in this case, would entail selling some NEO and redistributing the profits across the rest of your portfolio to bring back the original proportions. Portfolio rebalancing spreads the wealth from your stronger assets, selling them at higher prices, and rebuys other assets in your portfolio at lower prices.
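To make the arithmetic concrete, here is a minimal sketch of the example above in plain Python. The numbers are assumptions chosen for illustration: the $100 is split equally across the three assets and NEO then doubles while the other two stay flat.

```python
# Hypothetical numbers: $100 split equally, then NEO doubles in price.
target_weights = {"BTC": 1 / 3, "ETH": 1 / 3, "NEO": 1 / 3}
values = {coin: 100.0 * w for coin, w in target_weights.items()}  # usd per asset

values["NEO"] *= 2.0  # NEO takes off, BTC and ETH stay flat

total = sum(values.values())
weights_before = {coin: v / total for coin, v in values.items()}

# Rebalance: bring every asset back to its target share of the total value.
rebalanced = {coin: total * w for coin, w in target_weights.items()}
trades = {coin: rebalanced[coin] - values[coin] for coin in values}  # + buy, - sell

print(weights_before)  # NEO now makes up half of the basket
print(trades)          # sell NEO, buy BTC and ETH
```

The trades sum to zero: the usd raised by selling NEO is exactly what is spent on BTC and ETH, so the total portfolio value is unchanged by the rebalance itself.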
This guide will focus less on the economic reasoning and justification and more on the technical implementation details.
There are numerous kinds of rebalancing strategies that could be implemented. This guide will focus on two popular methods, namely time based and threshold based rebalancing. Time based rebalancing will only perform a rebalance if enough time has passed. Threshold rebalancing will only perform a rebalance if the allocations within the fund have deviated past a predefined threshold. There are tons of other options and flavours out there to explore. However, the processes and principles explored with the simple time and threshold rules can be extended to any generic rebalancing strategy you can think of, so these are sufficient for this guide.
From a token perspective there are literally thousands of them that could be included in the models. For simplicity's sake we will limit our exploration to a select few. More complex portfolios could include more tokens but again the basic ideas that are explored here remain the same.
The tokens explored are: ["btc", "eth", "xrp", "ltc", "dash", "xmr", "doge"] as well as USD to represent a stable asset in the fund.
It is recommended that all packages are installed within a python virtual environment if you want to run this notebook locally. This ensures that packages are not installed into your system-wide python environment and makes execution consistent between machines. This can be done as follows:
Install virtualenv if you don't already have it
pip install virtualenv
Create the virtual environment
virtualenv venv -p python3
Activate the virtual environment. Your shell should now say (venv) on the left side.
source venv/bin/activate
Install packages
pip install -r requirments.txt
Ensure kernel can be accessed by Jupyter
python -m ipykernel install --user --name=CryptoBacktester
Start the notebook.
jupyter notebook
Now navigate to http://localhost:8888 where you can find the live notebook. From within the notebook your Kernel should be set to CryptoBacktester in the top right. If it's not, click the Kernel tab and change it to CryptoBacktester.
We begin by loading all the packages that we need. Nothing too out of the ordinary here except for Cryptory, which provides a collection of API endpoints for retrieving historic crypto (and other markets) price information. We also import plotly.offline so that we can plot unlimited plot.ly figures without hitting their limits. We also turn off the checking of ssl certificates because this is required to pull the crypto prices from within the python virtualenv, which does not contain ssl certificates.
# load package
from cryptory import Cryptory
from plotly.offline import init_notebook_mode, iplot #import offline mode
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.offline as pyo
import plotly.graph_objs as go
import ssl
#init notebook for plotly
init_notebook_mode()
#disable ssl for cryptory API & virtualenv
ssl._create_default_https_context = ssl._create_unverified_context
np.set_printoptions(suppress=True)
First we set up the Cryptory object to pull data from 2017-01-01 to the present date, then extract and print some sample data. Note that this means the data always runs up to the most recent time the script was run, which is a strong advantage over using static csv files.
# initialise object
# pull data from start of 2017 to present day
my_cryptory = Cryptory(from_date = "2017-01-01")
# get historical bitcoin prices from coinmarketcap
my_cryptory.extract_coinmarketcap("bitcoin").head()
The next step is to define all the tokens we want to get prices for and capture them within a dataframe. We are using these tokens specifically as they all have a high market cap and price data going back to 2017-01-01.
The process of reading in the data involves iterating over all tokens and calling extract_bitinfocharts on each symbol. This calls the Cryptory package API to get the price and date information for this specific token in USD. This is then joined onto the other sets of price records on the common date field. This can be thought of as a SQL left join on date: every date in the btc data is kept, and any coin without a price for that date gets a nan.
all_coins_df = my_cryptory.extract_bitinfocharts("btc") #start by filling the dataframe(df) with btc
# coins of interest
bitinfocoins = ["btc", "eth", "xrp", "ltc", "dash", "xmr", "doge"] #then fill it with all the others
for coin in bitinfocoins[1:]: # [1:] skips the first item(btc)
    all_coins_df = all_coins_df.merge(my_cryptory.extract_bitinfocharts(coin), on="date", how="left")
We then replace all nan data with zero, make the date column the index so it can be iterated over later, and insert a new column named usd_price with a value of 1 for all rows. This is used later to represent holding usd within the portfolio alongside the cryptos. We then re-order the data so that the top (position 0) is the oldest data. This makes sense because when iterating it is more logical to work forward in time, from the oldest row (position 0) to the newest row (end position).
all_coins_df = all_coins_df.fillna(0) #remove nans
all_coins_df.set_index('date',inplace=True) # make date index
all_coins_df.insert(0,"usd_price",1) #add usd column
all_coins_df = all_coins_df.reindex(index=all_coins_df.index[::-1]) #re-order data
all_coins_df.head() #print sample rows
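The join-then-fill step can be hard to picture, so here is a small sketch of the same idea using plain dictionaries instead of pandas. All prices are made up, and eth deliberately has no entry for the first date. It also shows why the later loops check that the minimum price in a row is above zero before dividing by it.

```python
# Made-up prices keyed by date; eth has no entry for the first date.
btc = {"2017-01-01": 970.0, "2017-01-02": 1010.0, "2017-01-03": 1020.0}
eth = {"2017-01-02": 8.0, "2017-01-03": 9.0}

# Left join on date: keep every btc date and look up eth where it exists.
# A missing price becomes 0, mirroring the fillna(0) step.
table = {
    date: {"usd_price": 1.0, "btc_price": btc_price, "eth_price": eth.get(date, 0.0)}
    for date, btc_price in btc.items()
}

# A row is only safe to divide by when every price in it is positive.
tradable = {date: min(row.values()) > 0 for date, row in table.items()}
print(tradable)
```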
Next, we extract the names of all the tokens from the columns of the dataframe. We can use this later to iterate over.
tokens = all_coins_df.columns.tolist()
tokens
At this point we have created a pandas dataframe that contains daily price data for 7 different cryptocurrencies against the USD. Next we will visualize this data in a simple time series plot.
This process involves iterating over all token names and using each one as the key to extract column data from the pandas dataframe. This information is stored in a go.Scatter object which is appended to a plot_data array which is used in the plotting of the price information. Other information is also included, such as the name of the crypto as well as the type of plot to generate.
plot_data = []
for index, coin in enumerate(tokens):
    coin_chart_info = go.Scatter(
        x = all_coins_df.index, #set all the x's to the index from the dataframe. This was the date
        y = all_coins_df[coin].tolist(), #set all the y's to the current coins magnitude in the dataframe
        mode = 'lines',
        name = coin[:-6]) # the [:-6] is to remove the `_price` part of the name from the dataframe
    plot_data.append(coin_chart_info) #add this coins chart info to the plot_data array
    tokens[index] = coin[:-6]
The iplot function can also take a layout parameter, which we use to define the heading for the plot as well as to move the legend to the top left. We then generate the plot with iplot.
And now we have a beautiful interactive plot! You can zoom in on it by dragging your mouse. Double click to zoom out again. You can also toggle coins from the legend. Try turning btc off by clicking it.
layout = go.Layout(
    title = "Historic Price Over Time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))
fig = dict(data=plot_data, layout = layout)
iplot(fig)
Next we jump right into building a rebalancing strategy. This initial implementation will not consider fees or rebalancing intervals; it will simply take whatever the distribution within the portfolio is at the end of the day and compare this to what it should be. It will then “execute the trades” to bring the portfolio back to equilibrium. After this, we will consider how to define a collection of different portfolio strategies and compare and contrast them on the same plot.
The first thing we do is grab the top row from the all_coins_df dataframe that stores all the historic information. Then, this row from the dataframe is converted to a np.array where each position in the array corresponds to a different column from the dataframe. We will require this information to calculate the initial allocation within the fund.
initial_price = all_coins_df.head(1)
initial_price = np.array(initial_price)[0]
initial_price
For the simplest case we will set all tokens in our dataset to be part of the fund. Each token will get 1/8 of the total value.
initial_weights = np.ones(len(tokens)) / len(tokens)
initial_weights
Next, the amount of capital the fund starts off with is set. We set this to 1000 usd as an arbitrary number.
INVEST = 1000
The initial allocations are now calculated. These allocations are expressed in units of each respective currency. For example the initial allocation of bitcoin is 0.12873486. This is 1/8 of the 1000 usd (125 usd) divided by the starting bitcoin price of 970.988.
initial_allocation = INVEST * initial_weights / initial_price
initial_allocation[np.isinf(initial_allocation)] = 0
initial_allocation
We can also work out the initial position. This is the value of each set of tokens purchased, based off the allocations we defined. This should logically be the total investment $\times$ the fraction per token and works out to 125 per token.
initial_positions = initial_allocation * initial_price
initial_positions
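As a quick sanity check, the bitcoin figures quoted above can be reproduced by hand. This sketch uses only the numbers from the text (1000 usd, 8 assets, a starting btc price of 970.988) and plain floats in place of the numpy arrays.

```python
# Values quoted in the text: 1000 usd over 8 assets, btc starting at 970.988 usd.
INVEST = 1000
n_assets = 8
btc_price = 970.988

weight = 1 / n_assets                          # 0.125 of the fund per asset
btc_allocation = INVEST * weight / btc_price   # units of btc bought
btc_position = btc_allocation * btc_price      # usd value of that holding

print(round(btc_allocation, 8))
print(btc_position)
```

Multiplying the allocation back by the price recovers the 125 usd position, which is why every entry of initial_positions comes out the same.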
At any point in time the value of the portfolio is the sum of a row in the pandas dataframe. Therefore the starting value initial_portfolio_value is simply the sum of the initial_positions row.
initial_portfolio_value = initial_positions.sum().sum()
initial_portfolio_value
We now know all the information we need to start doing some rebalancing simulations. This process will happen as follows:
We iterate over each row of the all_coins_df dataframe and do a series of calculations. Each row corresponds to the prices of all listed cryptos for a given day. These prices are stored in the current_price variable on each loop. This amounts to performing some logic for every day in the data. For each day the following is calculated and stored:
hodl: the value of not trading at all, based on $InitialAllocation \times CurrentPrice$ (the allocation never changes when hodling). For each day this information is stored in a new slot within the hodl variable.
rb: the value of the rebalanced portfolio, based on $CurrentAllocation \times CurrentPrice$. For each day the current value of the rebalancing portfolio is stored in rb.
current_allocation: updated to $CurrentPortfolioValue \times \frac{InitialWeights}{CurrentPrice}$, and used on the next day to perform "trades".
Note that at no point are we considering trading fees.
hodl = {}
rb = {}
current_allocation = initial_allocation
for i, current_price in all_coins_df.iterrows():
    # hodl positions
    hodl[i] = initial_allocation * current_price
    # current positions
    current_positions = current_allocation * current_price
    rb[i] = current_positions
    # rebalance
    current_portfolio_value = current_positions.sum()
    if current_price.min() > 0:
        current_allocation = current_portfolio_value * initial_weights / current_price
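To see the daily loop at work without downloading any data, here is a toy version in plain Python. The two assets ("aaa" and "bbb") and their four days of prices are made up, and dictionaries stand in for the pandas rows. The structure follows the same steps: record the hodl value, record the rebalanced value, then reset the allocation to the target weights.

```python
# Toy data (made up): daily prices for two assets over four days.
prices = [
    {"aaa": 10.0, "bbb": 5.0},
    {"aaa": 20.0, "bbb": 5.0},
    {"aaa": 20.0, "bbb": 10.0},
    {"aaa": 10.0, "bbb": 10.0},
]
INVEST = 1000.0
weights = {"aaa": 0.5, "bbb": 0.5}

# Units of each asset bought on the first day.
initial_allocation = {c: INVEST * weights[c] / prices[0][c] for c in weights}
current_allocation = dict(initial_allocation)

hodl_value, rb_value = [], []
for current_price in prices:
    # Value of the untouched portfolio.
    hodl_value.append(sum(initial_allocation[c] * current_price[c] for c in weights))
    # Value of the rebalanced portfolio before today's rebalance.
    portfolio_value = sum(current_allocation[c] * current_price[c] for c in weights)
    rb_value.append(portfolio_value)
    # Rebalance at the close: reset every asset to its target weight.
    current_allocation = {
        c: portfolio_value * weights[c] / current_price[c] for c in weights
    }

print(hodl_value)  # [1000.0, 1500.0, 2000.0, 1500.0]
print(rb_value)    # [1000.0, 1500.0, 2250.0, 1687.5]
```

In this made-up series the two assets take turns rising, which is the situation where daily rebalancing helps. With a single asset trending upward for the whole period, hodl would come out ahead instead.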
Next we convert the generated dictionaries to pandas DataFrames so we can do computation over them and use them in plotting more easily.
hodl = pd.DataFrame(hodl).T
rb = pd.DataFrame(rb).T
At this point we have two dataframes (hodl and rb) which represent the value of each crypto on each day over the whole period. We will print out the top 10 rows of one of these frames to see what it looks like.
In the table below each row shows the value of each crypto holding over time, each starting from the same $\frac{1}{8}\times1000$. Every day the value of each holding changes with the change in price. As this is the hodl strategy, no balancing is done at all, as can be seen by the fixed usd_price column.
hodl.head(10)
We can also plot the hodl portfolio over time. Remember that at the beginning we had 12.5% of each asset (125 usd). We can see how these have evolved by the end of the period. We will also define a useful function fundSort which we can use to sort the array of plot traces so that the tokens with the highest value at the end of the period come first.
This figure is really interesting as we can see how much 125 usd could have grown to, if invested in each of the cryptos from the beginning of the period. We can see some absolutely crazy numbers here; XRP grew from 125 USD in Jan 2017 to a maximum of ~70k USD about a year later! That's a growth factor of ~583 in about a year.
def fundSort(fund):
    #sort key: the last y value of a trace, i.e. its value at the end of the period
    return fund['y'][-1]
plot_data = []
for index, coin in enumerate(hodl):
    coin_chart_info = go.Scatter(
        x = hodl.index, #set all the x's to the index from the dataframe. This was the date
        y = hodl[coin].tolist(), #set all the y's to the current coins magnitude in the dataframe
        mode = 'lines',
        # the [:-6] is to remove the `_price` part of the name from the dataframe
        name = coin[:-6] + "->\t\t$" + str(round(hodl[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info) #add this coins chart info to the plot_data array
#sort the data to be plotted based on the total value of the holdings after the period
plot_data.sort(key = fundSort, reverse = True)
#specify the layout for the plot
layout = go.Layout(
    title = "Hodl Portfolio asset value over time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))
#display the plot inline
fig = dict(data=plot_data, layout = layout)
iplot(fig)
Next, we can generate a similar plot for the rebalanced fund values over time. These should all follow a similar value, as we've enforced that at all times the fund maintains $\frac{1}{8}$ of its value in each respective crypto.
plot_data = []
for index, coin in enumerate(rb):
    coin_chart_info = go.Scatter(
        x = rb.index,
        y = rb[coin].tolist(),
        mode = 'lines',
        name = coin[:-6] + "->\t\t$" + str(round(rb[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info)
plot_data.sort(key = fundSort, reverse = True)
layout = go.Layout(
    title = "Rebalanced Portfolio asset value over time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))
fig = dict(data=plot_data, layout = layout)
iplot(fig)
Next it would be useful to compare the final values for hodl vs rebalance to get a relative comparison of these two methods. Remember that we are comparing two portfolios that start out with the exact same cryptos in them.
We now don't want to look at the value of each individual crypto within the fund, but rather the total value of all cryptos on each day over the whole duration. To this end we sum with axis=1, which adds up the columns within each row.
Then, the performance of each fund is computed by taking the final value / initial value to get a simple relative change in value for the portfolio.
hodl_value = hodl.sum(axis=1)
rb_value = rb.sum(axis=1)
hodl_perf = hodl_value.iloc[-1] / hodl_value.iloc[0]
rb_perf = rb_value.iloc[-1] / rb_value.iloc[0]
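It is worth being clear about what this ratio means once it is turned into a percentage for the plot legend. The sketch below uses made-up start and end values: a ratio of 2.5 is shown as 250%, meaning the portfolio is worth 2.5 times its starting value, which is a net gain of 150%.

```python
# Hypothetical portfolio values at the start and end of a backtest.
start_value = 1000.0
end_value = 2500.0

perf = end_value / start_value               # same kind of ratio as hodl_perf / rb_perf
label = 'hodl {:.1f}%'.format(perf * 100)    # legend text in the style used for the plots
gain_pct = (perf - 1) * 100                  # net gain over the period, in percent

print(label)
print(gain_pct)
```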
Next we create the data objects for the two plots and sort.
hodlInfo = go.Scatter(
    x = hodl_value.index,
    y = hodl_value.tolist(),
    mode = 'lines',
    name = 'hodl {:.1f}%'.format(hodl_perf * 100))
rebalanceInfo = go.Scatter(
    x = rb_value.index,
    y = rb_value.tolist(),
    mode = 'lines',
    name = 'rebalance {:.1f}%'.format(rb_perf * 100))
plot_data = [hodlInfo, rebalanceInfo]
plot_data.sort(key = fundSort, reverse = True)
layout = go.Layout(
    title = "Portfolio value over time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.8, y=0.9))
fig = dict(data=plot_data, layout = layout)
iplot(fig)
At this point we have looked at implementing a simple portfolio backtesting setup. However it is not well suited to comparing multiple strategies against each other and as a result it makes it hard to find an ideal model. Next we will define a more general process for backtesting funds by defining a JSON object structure that can be used to specify all portfolio characteristics that are to be tested.
Primarily we are concerned with the coins held in the fund, the ratio of the fund allocated to each coin, the rebalance period in days and a name to identify the fund.
All this information for a bunch of different portfolios is encoded as objects within an array below for 6 different funds. The process of backtesting is very similar to the specific process outlined before except we now iterate over all elements within the funds array and create a backtest for each one sequentially.
funds = [
    {
        'coins': ['eth_price', 'btc_price', 'usd_price'],
        'ratio': [0.4, 0.4, 0.2],
        'rebalance': 7,
        'name': 'ETH40, BTC40, USD20 @ 7',
    },
    {
        'coins': ["btc_price", "eth_price", "xrp_price", "ltc_price", "dash_price", "xmr_price", "doge_price"],
        'ratio': [0.25, 0.25, 0.10, 0.10, 0.10, 0.10, 0.10],
        'rebalance': 2,
        'name': 'ETH25, BTC25, OTHERS10 @ 2',
    },
    {
        'coins': ['eth_price', 'btc_price'],
        'ratio': [0.5, 0.5],
        'rebalance': 15,
        'name': 'ETH50, BTC50 @ 15',
    },
    {
        'coins': ['eth_price', 'btc_price'],
        'ratio': [0.5, 0.5],
        'rebalance': 0,
        'name': 'ETH50, BTC50 @ HODL',
    },
    {
        'coins': ['eth_price'],
        'ratio': [1],
        'rebalance': 0,
        'name': 'ETH100 @ HODL',
    },
    {
        'coins': ['btc_price'],
        'ratio': [1],
        'rebalance': 0,
        'name': 'BTC100 @ HODL',
    }
]
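Since the fund definitions are typed by hand, it can be worth checking them before running a backtest. The helper below is a hypothetical addition (check_fund is not part of the notebook) that flags the most likely mistakes: a ratio list that does not match the coin list, or ratios that do not add up to 1.

```python
def check_fund(fund):
    """Return a list of problems found in a fund definition (empty if none)."""
    problems = []
    if len(fund['coins']) != len(fund['ratio']):
        problems.append('coins and ratio have different lengths')
    if abs(sum(fund['ratio']) - 1.0) > 1e-9:
        problems.append('ratios do not sum to 1')
    if len(set(fund['coins'])) != len(fund['coins']):
        problems.append('duplicate coins')
    return problems

good = {'coins': ['eth_price', 'btc_price', 'usd_price'],
        'ratio': [0.4, 0.4, 0.2], 'rebalance': 7,
        'name': 'ETH40, BTC40, USD20 @ 7'}
# A deliberately broken fund, made up for this example.
bad = {'coins': ['eth_price', 'btc_price'],
       'ratio': [0.5, 0.4], 'rebalance': 7,
       'name': 'hypothetical broken fund'}

print(check_fund(good))
print(check_fund(bad))
```

A tolerance is used for the sum because ratios such as 0.1 cannot be represented exactly as floats.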
We now run two nested loops:
First we iterate over all the funds specified in the JSON object and for each one calculate the key metrics for that specific fund, such as the initial allocations, initial weightings and initial prices. The logic and justification for each fund is the same as before, except now generalized.
Next, we iterate over all rows in the coin_prices dataframe storing the pricing information. At each point we calculate the value of the specific fund and store the results.
At the end of this process we have one dictionary fund_returns that has time series information for all 6 funds specified.
fund_returns = {}
for index, fund in enumerate(funds): # for all funds in the list of funds we run a backtest
    # grab only the coin prices for the coins in the fund
    coin_prices = all_coins_df.loc[:,fund['coins']]
    # the initial weights are specified by the fund JSON. read this in and store it
    fund_initial_weights = np.array(fund['ratio'])
    #calculate the initial prices of the coins in the fund
    fund_initial_price = coin_prices.head(1)
    fund_initial_price = np.array(fund_initial_price)[0]
    #calculate the initial allocation based on the weights and starting prices
    fund_initial_allocation = INVEST * fund_initial_weights / fund_initial_price
    fund_initial_allocation[np.isinf(fund_initial_allocation)] = 0
    #the initial position is defined by the initial allocation and initial price
    fund_initial_positions = fund_initial_allocation * fund_initial_price
    #lastly we define the current allocation as the initial allocation as we start at time = 0
    fund_current_allocation = fund_initial_allocation
    day_count = 1
    fund_returns[fund['name']] = {} #create the object to store all the results for this specific fund
    for i, current_price in coin_prices.iterrows(): #for each day of pricing data generate a backtest result
        fund_current_positions = fund_current_allocation * current_price
        fund_returns[fund['name']][i] = fund_current_positions
        # rebalance
        current_portfolio_value = fund_current_positions.sum()
        #a rebalance period of 0 means hodl, so never rebalance in that case
        if fund['rebalance'] > 0 and day_count >= fund['rebalance'] and current_price.min() > 0:
            #update the allocation for each token (perform the rebalance)
            fund_current_allocation = current_portfolio_value * fund_initial_weights / current_price
            #restart counting for next rebalance (0 here because it is incremented straight after)
            day_count = 0
        #increment time
        day_count += 1
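Counter logic like this is easy to get off by one, so it helps to test the schedule on its own, away from the price data. The sketch below uses a hypothetical helper, rebalance_days, which is not part of the notebook. It assumes the intended behaviour is a rebalance every `period` days, with a period of 0 meaning never (hodl).

```python
def rebalance_days(n_days, period):
    """Return the 1-based day numbers on which a rebalance would trigger."""
    triggered = []
    day_count = 1
    for day in range(1, n_days + 1):
        # A period of 0 is treated as "never rebalance" (hodl).
        if period > 0 and day_count >= period:
            triggered.append(day)
            day_count = 0  # reset; incremented back to 1 on the next line
        day_count += 1
    return triggered

print(rebalance_days(21, 7))   # [7, 14, 21]
print(rebalance_days(6, 2))    # [2, 4, 6]
print(rebalance_days(10, 0))   # []
```

Resetting the counter to 1 instead of 0 would shorten every period after the first by one day, and dropping the `period > 0` check would make a period of 0 rebalance every day.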
Let's print the top 15 results from one of the funds to see that it makes sense given the portfolio strategy we specified. Printing ETH40, BTC40, USD20 @ 7 shows 3 different coins (eth, btc & usd) in the specified starting ratio of eth: $40\%\times 1000 = \$400$, btc: $40\%\times 1000 = \$400$ and usd: $20\%\times 1000 = \$200$. At the end of a 7 day period we see a rebalancing where some ether is sold to buy btc and usd. Then again on the 14th (two weeks after start) we see another trade occurring to re-establish the ratios.
pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 7']).T.head(15)
Let's first plot the internal workings of one of the funds to see the evolution of the positions over time. We want to see the act of rebalancing and the effect that it has on the fund's components over time.
The plot that is generated shows the values of eth and btc following each other closely, which is to be expected as they both have 40% of the fund allocated to them. USD has 20% allocated to it and we can see its value following at $\approx \frac{1}{2}$ that of the other two. The jagged steps in the usd plot are the weekly rebalancing intervals.
#extract the dataframe for this specific portfolio and plot its components over time
df = pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 7']).T
plot_data = []
for coin in df:
    coin_chart_info = go.Scatter(
        x = df.index,
        y = df[coin].tolist(),
        mode = 'lines',
        name = coin[:-6] + "->\t\t$" + str(round(df[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info)
plot_data.sort(key = fundSort, reverse = True)
layout = go.Layout(
    title = "ETH40, BTC40, USD20 @ 7 Components Value Over Time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))
fig = dict(data=plot_data, layout = layout)
iplot(fig)
Next we will compare all 6 models defined before in one plot. For each model we calculate the performance metrics in the same way we did before, except we now do it for each and every fund and store the results in an array called fund_computed_plot which is used later on in plotting.
fund_computed_plot = []
for fund in fund_returns:
    fund_df = pd.DataFrame(fund_returns[fund]).T
    fund_value = fund_df.sum(axis = 1)
    fund_performance = (fund_value.iloc[-1] / fund_value.iloc[0]) * 100
    fund_performance = round(fund_performance, 3)
    fund_plot = go.Scatter(
        x = fund_value.index,
        y = fund_value.tolist(),
        mode = 'lines',
        name = '{0} -> {1}%'.format(fund, fund_performance))
    fund_computed_plot.append(fund_plot)
Sort the funds to get the best at the top of the index.
fund_computed_plot.sort(key = fundSort, reverse = True)
And lastly plot them all on one figure
layout = go.Layout(
    title = "Time Based Rebalancing Strategies",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.7, y=0.9))
fig = dict(data=fund_computed_plot, layout = layout)
iplot(fig)
What if, rather than rebalancing based on a time period, we rebalanced based on a deviation from the desired ratios? This rebalancing method is very similar in practice and only slightly modifies the rebalancing condition.
We will start off by defining a threshold that quantifies the deviation required to perform a rebalance. To begin with we set this to 0.2. This means that if the weight of any asset is more than 0.2 (20 percentage points) away from its desired ratio on any day then the portfolio will perform a rebalance.
THR = 0.2
The initial_allocation that we calculated before, based on $\frac{1}{8}$ of the 1000 usd starting value for each asset, will be used again here.
initial_allocation
As before we will loop through all price information in all_coins_df, and for each day we will calculate the current position and then make a decision about rebalancing. The key logic for choosing to rebalance (or not, as the case may be) is the statement if any(weights_diff > THR), where weights_diff is the absolute difference between the desired weight and the actual weight at the end of each period. If any asset in the portfolio exceeds the threshold then the whole fund rebalances to accommodate this.
hodl = {}
rb = {}
current_allocation = initial_allocation
for i, current_price in all_coins_df.iterrows(): #for all days price information, we calculate positions
    # hodl positions
    hodl[i] = initial_allocation * current_price
    # current positions
    current_positions = current_allocation * current_price
    rb[i] = current_positions
    # rebalance
    current_portfolio_value = current_positions.sum()
    current_weights = current_positions / current_portfolio_value
    weights_diff = np.abs(current_weights - initial_weights)
    if any(weights_diff > THR) and current_price.min() > 0:
        current_allocation = current_portfolio_value * initial_weights / current_price
hodl = pd.DataFrame(hodl).T
rb = pd.DataFrame(rb).T
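The threshold condition can be pulled out into a small function and tried on a few hand-picked cases. This is a sketch in plain Python: needs_rebalance is a hypothetical helper and the positions are made up. It shows that the threshold is compared against the absolute difference in weight, so 0.05 means 5 percentage points of the whole fund.

```python
def needs_rebalance(positions, target_weights, threshold):
    """True if any asset's weight is more than `threshold` away from its target."""
    total = sum(positions)
    current_weights = [p / total for p in positions]
    diffs = [abs(c - t) for c, t in zip(current_weights, target_weights)]
    return any(d > threshold for d in diffs)

target = [0.4, 0.4, 0.2]
# Exactly on target: nothing to do.
print(needs_rebalance([400.0, 400.0, 200.0], target, 0.05))
# First asset has grown to 50% of the fund: 10 points over its 40% target.
print(needs_rebalance([600.0, 400.0, 200.0], target, 0.05))
# The same drift is not enough to trigger a looser 0.2 threshold.
print(needs_rebalance([600.0, 400.0, 200.0], target, 0.2))
```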
As before we find some performance metrics to compare the portfolios
hodl_value = hodl.sum(axis=1)
rb_value = rb.sum(axis=1)
hodl_perf = hodl_value.iloc[-1] / hodl_value.iloc[0]
rb_perf = rb_value.iloc[-1] / rb_value.iloc[0]
And then plot in the same way as before. We can see that this simple threshold rebalancing strategy outperforms just hodling.
hodlInfo = go.Scatter(
    x = hodl_value.index,
    y = hodl_value.tolist(),
    mode = 'lines',
    name = 'hodl all tokens at {:.1f}%'.format(hodl_perf * 100))
rebalanceInfo = go.Scatter(
    x = rb_value.index,
    y = rb_value.tolist(),
    mode = 'lines',
    name = 'Threshold rebalance {:.1f}%'.format(rb_perf * 100))
plot_data = [hodlInfo, rebalanceInfo]
plot_data.sort(key = fundSort, reverse = True)
layout = go.Layout(
    title = "Threshold rebalancing",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.8, y=0.9))
fig = dict(data=plot_data, layout = layout)
iplot(fig)
As we did with the time based rebalancing strategies, it would be ideal if we could predefine a number of portfolios, backtest them as a batch and compare them all at the end. To achieve this we implement a similar strategy to the one used before, defining a JSON object to store all the funds we want to test. The only difference now is that instead of defining a rebalance period we specify the threshold that is required to perform a rebalance.
funds = [
    {
        'coins': ['eth_price', 'btc_price', 'usd_price'],
        'ratio': [0.4, 0.4, 0.2],
        'threshold': 0.05,
        'name': 'ETH40, BTC40, USD20 @ 0.05'
    },
    {
        'coins': ["btc_price", "eth_price", "xrp_price", "ltc_price", "dash_price", "xmr_price", "doge_price"],
        'ratio': [0.25, 0.25, 0.10, 0.10, 0.10, 0.10, 0.10],
        'threshold': 0.05,
        'name': 'ETH25, BTC25, allOthers10 @ 0.05', #name matches the 0.25 ratios above
    },
    {
        'coins': ['eth_price', 'btc_price'],
        'ratio': [0.5, 0.5],
        'threshold': 0.10,
        'name': 'ETH50, BTC50 @ 0.10'
    },
    {
        'coins': ['eth_price', 'btc_price'],
        'ratio': [0.5, 0.5],
        'threshold': 0.01,
        'name': 'ETH50, BTC50 @ 0.01'
    },
    {
        'coins': ['eth_price'],
        'ratio': [1],
        'threshold': 0,
        'name': 'ETH100 @ HODL',
    },
    {
        'coins': ['btc_price'],
        'ratio': [1],
        'threshold': 0,
        'name': 'BTC100 @ HODL',
    }
]
As before we now iterate over all funds and calculate the key fund values at the beginning of the period. We then use each day's price information to backtest the fund over the whole period, using the rebalancing thresholds to decide when to perform trades.
fund_returns = {}
for index, fund in enumerate(funds):
    coin_prices = all_coins_df.loc[:,fund['coins']]
    fund_initial_weights = np.array(fund['ratio'])
    fund_initial_price = coin_prices.head(1)
    fund_initial_price = np.array(fund_initial_price)[0]
    fund_initial_allocation = INVEST * fund_initial_weights / fund_initial_price
    fund_initial_allocation[np.isinf(fund_initial_allocation)] = 0
    fund_initial_positions = fund_initial_allocation * fund_initial_price
    fund_current_allocation = fund_initial_allocation
    fund_returns[fund['name']] = {}
    for i, current_price in coin_prices.iterrows():
        fund_current_positions = fund_current_allocation * current_price
        fund_returns[fund['name']][i] = fund_current_positions
        # rebalance
        current_portfolio_value = fund_current_positions.sum()
        fund_current_weights = fund_current_positions / current_portfolio_value
        weights_diff = np.abs(fund_current_weights - fund_initial_weights)
        if any(weights_diff > fund['threshold']) and current_price.min() > 0:
            fund_current_allocation = current_portfolio_value * fund_initial_weights / current_price
It is again useful to look at some portfolio information for one of the funds over time and track how it evolves with the trades performed. Here we look at ETH40, BTC40, USD20 @ 0.05, which starts off at the same allocations as the fund examined previously but will now only trade when there is a 5% deviation from the portfolio allocations. Looking at the table below we can see that between 2017-01-06 and 2017-01-07 there was enough market movement to justify a rebalance, as can be seen by the movement of the USD value.
pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 0.05']).T.head(10)
Plotting this portfolio's value over time is done in the same way as before. This plot showcases one of the advantages of a threshold based rebalancing strategy: when there is a lot of market movement the algorithm trades at a much higher frequency, whereas during sideways movement there is little to no trading done.
#extract the dataframe for this specific portfolio and plot its components over time
df = pd.DataFrame(fund_returns['ETH40, BTC40, USD20 @ 0.05']).T
plot_data = []
for coin in df:
    coin_chart_info = go.Scatter(
        x = df.index,
        y = df[coin].tolist(),
        mode = 'lines',
        name = coin[:-6] + "->\t\t$" + str(round(df[coin].iloc[-1],3)))
    plot_data.append(coin_chart_info)
plot_data.sort(key = fundSort, reverse = True)
layout = go.Layout(
    title = "ETH40, BTC40, USD20 @ 0.05 Components Value Over Time",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.05, y=0.95))
fig = dict(data=plot_data, layout = layout)
iplot(fig)
We will now combine all daily information to form one line per fund and then compare the funds against each other over the whole period.
This plot shows the best performing portfolio out of any of those identified thus far: ETH 25%, BTC 25% and all other cryptos at 10%, with threshold based rebalancing set to 5%. This generated a return of almost 3000% from the start to the end, which is considerably higher than any of the other strategies. Importantly it is more than both the BTC and ETH hodl strategies, showing the importance of spreading the fund across other less correlated assets.
fund_computed_plot = []
for fund in fund_returns:
    fund_df = pd.DataFrame(fund_returns[fund]).T
    fund_value = fund_df.sum(axis=1)
    fund_performance = (fund_value.iloc[-1] / fund_value.iloc[0]) * 100
    fund_performance = round(fund_performance, 3)
    fund_plot = go.Scatter(
        x = fund_value.index,
        y = fund_value.tolist(),
        mode = 'lines',
        name = '{0} -> {1}%'.format(fund, fund_performance))
    fund_computed_plot.append(fund_plot)
fund_computed_plot.sort(key = fundSort, reverse = True)
layout = go.Layout(
    title = "Threshold Based Rebalancing Strategies",
    autosize=True,
    showlegend=True,
    legend=dict(x=0.7, y=0.9))
fig = dict(data=fund_computed_plot, layout = layout)
iplot(fig)
The process outlined above for each model can be extended to numerous different trading strategies to create a more generic back tester. The introduction of fees is important to a final model but was excluded here as different strategies would require different fee considerations based off volume, market maker and taker fees. All these considerations will fundamentally change the way the models are designed.
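As a rough idea of where fees could slot in, the sketch below charges a flat proportional fee on the usd volume traded in a single rebalance. This is an assumption made purely for illustration: rebalance_with_fee and the 0.1% rate are hypothetical, and real exchanges use maker/taker and volume-tiered schedules that this does not model.

```python
def rebalance_with_fee(positions, target_weights, fee_rate):
    """Rebalance usd positions to target weights, charging a fee on turnover.

    Simplification: turnover is taken from a fee-free rebalance, counting
    only the buy side, and the fee is deducted from the total before the
    target positions are set.
    """
    total = sum(positions)
    targets = [total * w for w in target_weights]
    turnover = sum(max(t - p, 0.0) for t, p in zip(targets, positions))
    fee = turnover * fee_rate
    after_fee = total - fee
    return [after_fee * w for w in target_weights], fee

# Made-up positions: 600 usd and 400 usd, rebalanced to 50/50 at a 0.1% fee.
new_positions, fee = rebalance_with_fee([600.0, 400.0], [0.5, 0.5], 0.001)
print(new_positions, fee)
```

Because the fee scales with turnover, a strategy that rebalances more often pays more in fees, which is one reason the fee-free comparisons above should not be read as final.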
Cryptory? It's easy to work with and has lots of data sources. There are tons of code examples as well, which makes processing data very easy. The API also contains other market information like google search trends and other awesome things.
An alternative, also awesome, package is cryptocompy, which lets you access a whole bunch more APIs from cryptocompare.
Plotly? Interactive plots are really nice and look good! It also has a really simple API. It is a bit of a pain to get to work offline, but it's worth it I think.
pandas & numpy? There are no better packages for manipulating and working with data and maths!
The returns here are not meant to indicate trading strategies that should be followed, as there are numerous other complexities that have been ignored here such as trading fees, price slippage, exchange front running and tax implications. All these things would need to be considered before a portfolio can be considered. This tutorial was more about working with python, jupyter, numpy, pandas & plotly, all with time series data. It also included processes for reading and processing data.