His Majesty Back Testing 

  1. Introduction

  2. Simplest verification of your model

  3. Definitions

  4. Example: Spectrum Model

 

Introduction

Back Testing (BT) is devoted to answering the main question of forecasting: "Can we rely on this forecast?". It is not an easy question. Not easy because it has a hidden agenda: there is the answer itself, and there is the mass psychology involved. I have been developing software for financial analysis for more than 10 years, and I often face two opposite opinions regarding this issue. The most popular opinion among beginners is: "Look at these guys, their system provides 80% accurate forecasts and up!". More experienced people often tell me: "I have already spent 20K bucks on books/software/seminars for and about trading, and I am disappointed". I think the truth is somewhere in the middle. A market forecast is possible; however, the percentage of accuracy is not that big. This is a cold wind of reality that we have to accept. You need to decide for yourself what you are looking for - the unremarkable truth or fairy tales about super trading systems with super accurate forecasts. Back Testing at least gives us a basis for such a decision. 

Back Testing is the most advanced module in Timing Solution software (TS). It is based on an approach that is totally different from other software packages: they make a forecast based on Technical Analysis (and there are many programs that can do this), while Timing Solution allows you to perform Back Testing for a projection line. This is a totally different project, and it is much more complicated than Technical Analysis (TA) Back Testing. 
The status quo stated by TA adepts can be formulated in these words: "for financial data series, the forecast is impossible, so our main goal (i.e., the TA goal) is to find a strategy of reacting to stock market movement now". Thus, according to this concept, we need to find the optimal combination of TA indicators to optimize our profit (profit factor, win/loss ratio) and minimize or exclude our risk (drawdown). As an example, we may use buy/sell signals generated by the intersection of fast and slow moving averages to create a trading system. The back test allows us to evaluate the performance of this system. Using other or more advanced TA indicators, we can get another trading system, etc. This is the usual way and one possible approach. Its capabilities and restrictions are described in the book by Robert W. Colby, "The Encyclopedia of Technical Market Indicators". 

Another approach is creating a projection line for the chosen financial instrument. We can create some model of the stock market, adjust this model to the available price history data and observe how this model forecasts the future. Instead of TA indicators, Timing Solution creates forecasting models. The program is able to deal with different types of models: it can be a cyclic model, a model based on astronomical cycles or any astrological phenomena, one of the auto-regression (AR) models, and many others. The benefit of this approach is that it allows us to see the future further than just one price bar ahead. Due to the differences between these two approaches, the technology of verifying a model's performance differs significantly from TA Back Testing.

Simplest verification of your model

Before explaining the advanced technology of the Back Testing procedure, I would like to discuss some basic principles you need to follow while verifying the models that you create with Timing Solution. 

You can easily find on the Internet statements like this one: "I have found a very good forecast model, here it is":  

Is it really so? There are several questions that you always need to ask. The first question is: what price history was used to calculate this curve? Asking this question, we try to find out whether the author of this model keeps some amount of "untouched" price history data invisible to this model. That part of the data can serve as a first test of the model; a good model should at least perform well on that time interval.
The second question is: what will happen to this model if we apply it to another time interval? This is the second test of the model; we try to model the situation when new price history comes in. 
In other words, you should follow two main principles: a) verify the model on "untouched" price history data; b) repeat this procedure on different intervals.

I will demonstrate how it works in practice. I have created some Neural Net model "X", and I would like to know the prognostic ability of this model. This is just some model I am working on now, as of November 14, 2007. See how this model forecasts now (November 2007):

I used all available price data to create this model. There is no "untouched" data to check the model against, so I skip the first question. 

Now let us check this model using another price interval. I do not need to find another data file to do that; I can use the data that I already have and select some part of it. In order to do that, set the Learning Border Cursor (LBC) at some data point. Let it be the end of the year 2004 (do it through a right mouse click):

Now randomize the Neural Network by clicking this button:

Thus we ask the Neural Network to "forget" the knowledge that it has already gained from the first piece of data (which was all available price history). Train the Neural Network once again. It uses the same model for a different data set.

This is the forecast that I got, and it is not so good after the LBC:

When I did the forecast for the first data set (all available price history), I had no data to compare it against. Now I can see how the model performs.

Then I set the LBC at the beginning of the year 2001, randomize the Neural Network and train it again. This is what I got this time:

LBC set at 1998:

I recommend varying the LBC position several times. Do not worry about "bad" looking projection lines (like our first forecast above); an 80% accurate forecast exists in advertising articles only. However, better models show "good" projection lines more often than bad ones. 
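
To show what this manual re-checking boils down to, here is a minimal sketch in Python. The train_and_forecast placeholder and the bar indices are purely illustrative assumptions: they stand in for the "randomize + train" steps of the Neural Net and for the LBC dates (2004, 2001, 1998) used above.

    import numpy as np

    def train_and_forecast(history, horizon):
        """Placeholder for 'randomize the Neural Net, train it again, read the projection line'.
        Here a linear trend fitted to the last 250 bars of the training data is extrapolated."""
        recent = history[-250:]
        slope, intercept = np.polyfit(np.arange(len(recent)), recent, 1)
        return recent[-1] + slope * np.arange(1, horizon + 1)

    price = np.cumsum(np.random.randn(2500)) + 100.0   # illustrative daily price series

    # Re-check the same model at a few different LBC positions
    # (the analogue of setting the LBC to the end of 2004, 2001 and 1998 above).
    for lbc in (2200, 1700, 1200):
        projection = train_and_forecast(price[:lbc], horizon=50)
        real = price[lbc:lbc + 50]
        corr = np.corrcoef(projection, real)[0, 1]
        print(f"LBC at bar {lbc}: correlation with real price = {corr:+.2f}")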

 

Definitions

Before any discussion, let us agree on some definitions.

The price history is divided into two intervals, the Training Interval and the Testing Interval. The border between them is called the Learning Border Cursor, or LBC:

When I run the Neural Net module of Timing Solution, the program calculates the projection line for any model. It looks like this:

To train the Neural Net (NN), the price history from the training interval is used. The program does not use price points from the testing interval. This approach allows us to avoid future leaks (which may occur when we use the same price history data to train the Neural Net and to estimate the model's performance).

Thus the performance of the model on the testing (red) interval allows us to estimate the real workability of the chosen model. It shows how the model will work in real life. In Back Testing theory, the testing interval is called "out of sample".

There are simple rules in the Back Testing procedure regarding the testing interval:

1) Do not use price points from the testing (out of sample) interval to optimize the model;

2) To estimate the model's performance, use the points from the testing (out of sample) interval.

The Timing Solution Back Testing module applies these rules automatically.
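
To make the split and these two rules concrete, here is a minimal sketch in Python; the array names, the illustrative price series and the position of the LBC are assumptions, not Timing Solution internals.

    import numpy as np

    price = np.cumsum(np.random.randn(1000)) + 100.0   # illustrative price history
    lbc = 800                                          # index of the Learning Border Cursor

    training_interval = price[:lbc]    # rule 1: only these points may be used to optimize the model
    testing_interval  = price[lbc:]    # rule 2: these "out of sample" points are used
                                       #         to estimate the model's performance

    # The model must never see the testing interval during optimization;
    # otherwise a "future leak" makes the performance estimate look better than it really is.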

The next thing is the number of samples. It is a very important parameter. The idea is very simple: we can get a very good projection line for one particular LBC position. See this example:

This projection line fits the real performance rather well: it follows the major trend and has just a few inversions. However, applying this model to real trading, we can easily face the fact that the model does not work. Why? The answer is obvious: this is just an occasional coincidence for this particular model and these specific conditions (partially defined by the LBC position). It is very important to understand: if you are looking for a reliable model, it should work under various conditions. Thus, it is necessary to apply this model to different LBC positions. This gives us the scheme of the true Back Testing process for forecasts based on different models. 

The scheme looks like this:

Here is the illustration for some typical example:

Step 1: The program trains the Neural Net using the price points from the yellow region (see the upper picture). After training, it checks the Neural Net performance using the points from the red region. In other words, the program creates the Neural Net projection line using the points from the yellow interval and then observes how well this projection line fits the real price data from the red interval. These two intervals (yellow and red) are independent, so any "future leaks" are excluded.

Step 2: We shift the Learning Border Cursor (LBC) several price bars ahead. Therefore, we shift the yellow and red regions as well and repeat the whole procedure again (the optimization and the performance evaluation).

Step 3: Shift LBC once more and do the same again.

The number of LBC shifts used for Back Testing is called the "number of samples". To be sure that the model is really working, we need to use as many samples as possible - at least 50. If you observe just a few projection lines, there is no guarantee that the model works; it might be just a game of His Majesty Chaos. For example, click here to see three nice projection lines received during the Back Testing procedure. To your surprise, this model does not work! To get an objective picture, you should use at least 50 LBC shifts; usually we use 200-500 shifts. This sample size allows us to make some conclusions about the analyzed model. I agree that this is a very time-consuming task. However, it has to be done, and Timing Solution is the only program that performs this task properly.
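
The loop below is a minimal sketch in Python of this sampling scheme. The fit_model/forecast pair is a trivial stand-in for the Neural Net (an assumption made only to keep the example self-contained); the point is how the LBC is shifted and how one performance number is collected per sample.

    import numpy as np

    def fit_model(history):
        """Stand-in for training the Neural Net on the yellow (training) interval:
        a linear trend fitted to the last 200 bars."""
        recent = history[-200:]
        slope, intercept = np.polyfit(np.arange(len(recent)), recent, 1)
        return slope, recent[-1]

    def forecast(model, horizon):
        """Stand-in for the projection line on the red (testing) interval."""
        slope, last = model
        return last + slope * np.arange(1, horizon + 1)

    price = np.cumsum(np.random.randn(3000)) + 100.0   # illustrative price history

    horizon   = 20      # bars after the LBC used to judge each forecast
    step      = 10      # how far the LBC is shifted between samples
    first_lbc = 1000
    scores = []

    for lbc in range(first_lbc, len(price) - horizon, step):   # each pass = one sample
        model = fit_model(price[:lbc])                          # Step 1: train on the yellow interval
        projection = forecast(model, horizon)
        real = price[lbc:lbc + horizon]                         # red (out of sample) interval
        scores.append(np.corrcoef(projection, real)[0, 1])      # one performance number per sample
        # Steps 2-3: the loop itself shifts the LBC and repeats the procedure

    print(f"number of samples: {len(scores)}, average correlation: {np.mean(scores):.3f}")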

One more important thing is to decide how we will estimate the model's performance. Here I would like to remind you that we need criteria for analyzing the projection line that are different from the criteria used in Technical Analysis. Timing Solution allows you to work with them.

For example, we can calculate the correlation coefficient between the real price (or any price oscillator) and the projection line using 20 price bars after the LBC ("20" is just an example):

 

A bigger correlation means that the projection line fits the price better.
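
As a minimal sketch in Python (Pearson correlation is the standard definition; the 20-bar window, the array names and the function name are illustrative assumptions):

    import numpy as np

    def correlation_criterion(price, projection, lbc, window=20):
        """Pearson correlation between the real price (or a price oscillator)
        and the projection line over `window` bars right after the LBC."""
        real = price[lbc:lbc + window]
        proj = projection[lbc:lbc + window]
        return np.corrcoef(real, proj)[0, 1]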

Or we can adjust our model to predict the next day's movement - Up or Down. Or we can analyze the price movement several days ahead (for example, comparing today's Close with the Close in five days). The program gives you many possibilities here; the choice is yours.
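
Here is a sketch of two such directional criteria in Python; the helper names are illustrative assumptions, and price and projection are assumed to be arrays aligned bar by bar.

    def next_day_hit(price, projection, lbc):
        """Did the projection line point in the direction of the next day's move?"""
        real_up = price[lbc + 1] > price[lbc]
        proj_up = projection[lbc + 1] > projection[lbc]
        return real_up == proj_up

    def n_day_hit(price, projection, lbc, n=5):
        """Compare today's Close with the Close n days ahead (e.g. five days)
        and check whether the projection line agrees on the direction."""
        real_up = price[lbc + n] > price[lbc]
        proj_up = projection[lbc + n] > projection[lbc]
        return real_up == proj_up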

 

 

Example: Spectrum Model

The Spectrum Model is a forecasting model based on fixed cycles. The idea of this approach is pretty simple: we use fixed cycles to create the forecasting model. When we do the analysis, the first question is which cycles to use for our research and which ones are more important than others. The best way to answer it is to use a spectrogram (periodogram) to reveal these cycles. It looks like this:

The peaks here indicate the most powerful cycles for the chosen financial instrument (here it is Euro/USD daily). The program extracts these cycles and uses them as inputs for the Neural Net module. The program is able to perform this procedure automatically. Here is the sequence of operations that will be performed (a simplified sketch in Python follows the list):

1) calculate spectrum;
2) extract the most powerful cycles;
3) define these cycles as inputs for the Neural Net module;
4) train NN to get the projection line based on these cycles;
5) estimate the model's performance using price bars from the testing interval (out of sample);
6) shift LBC and repeat steps 1 to 5.
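
Below is a compact Python sketch of steps 1-5, under strong simplifying assumptions: an ordinary least-squares fit of sine/cosine pairs for the strongest cycles stands in for the Neural Net of step 4, and the FFT periodogram is a generic one, not necessarily the algorithm Timing Solution uses.

    import numpy as np

    def top_cycles(series, n_cycles=5):
        """Steps 1-2: periodogram of the training data, keep the strongest cycle periods."""
        x = np.arange(len(series))
        detrended = series - np.polyval(np.polyfit(x, series, 1), x)   # remove the linear trend
        power = np.abs(np.fft.rfft(detrended)) ** 2
        freqs = np.fft.rfftfreq(len(detrended), d=1.0)                 # cycles per bar
        idx = np.argsort(power[1:])[::-1][:n_cycles] + 1               # strongest non-zero frequencies
        return 1.0 / freqs[idx]                                        # cycle periods in bars

    def cycle_matrix(t, periods):
        """Step 3: sine/cosine inputs built from the chosen fixed cycles."""
        cols = [np.ones_like(t, dtype=float)]
        for p in periods:
            cols += [np.sin(2 * np.pi * t / p), np.cos(2 * np.pi * t / p)]
        return np.column_stack(cols)

    price = np.cumsum(np.random.randn(2000)) + 100.0       # illustrative stand-in for Euro/USD
    lbc, horizon = 1500, 30

    periods = top_cycles(price[:lbc])                      # steps 1-2 (training data only)
    t_train = np.arange(lbc)
    coef, *_ = np.linalg.lstsq(cycle_matrix(t_train, periods), price[:lbc], rcond=None)   # step 4

    t_test = np.arange(lbc, lbc + horizon)
    projection = cycle_matrix(t_test, periods) @ coef      # projection line after the LBC
    score = np.corrcoef(projection, price[lbc:lbc + horizon])[0, 1]                        # step 5
    print(f"out-of-sample correlation: {score:.3f}")
    # Step 6 would wrap this whole block in a loop that shifts the LBC and repeats it.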

Before doing anything, download the price history data and set the LBC somewhere. I have downloaded Euro/USD from the year 2000 up to the year 2007 and set the LBC at the end of the year 2005. Thus we have 2 years of price history to test our model.

Run this:

 

A window will appear where you should set the main parameters for Back Testing:

Let's start filling out this form.

As an output, I use the relative price oscillator with a smoothing period of 50 bars (we will forecast this oscillator):
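
One common way to define such an oscillator (an assumption for illustration; Timing Solution may compute it differently) is the Close relative to its own 50-bar moving average:

    import numpy as np

    def relative_price_oscillator(close, period=50):
        """Close relative to its `period`-bar simple moving average, in percent."""
        kernel = np.ones(period) / period
        sma = np.convolve(close, kernel, mode="valid")       # 50-bar smoothing
        return (close[period - 1:] - sma) / sma * 100.0      # % above/below the average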

To define the forecasting model, click this button:

 

You will get the list of models. For each model, it is necessary to define some parameters, like the length of the training interval, etc. You can also easily create your own models using the Model Editor or through the Neural Net module. 

Here I have chosen the Spectrum NN model:

Under the criteria, specify how you will estimate the performance of this model:

I will calculate the correlation on 10, 20 and 30 bars after the LBC. However, you can define any other criteria.

Here you should define the number of LBC shifts:

For real Back Testing, "16" is certainly not enough; it is used here just to demonstrate the general idea. After some minutes of calculation, you will get this report:

It shows that the average correlation of this model, with the training interval of this length, is 0.22 (which is good!). The correlation was positive 8 times against 2 times when it was negative. If you prefer, you can get a detailed report with pictures illustrating the projection lines. Click here to see it.
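
These report figures are easy to reproduce from the per-sample correlations collected during the LBC shifts; the values below are purely illustrative, not the actual report data.

    import numpy as np

    # Illustrative per-sample correlations (one value per LBC shift).
    scores = np.array([0.41, 0.18, -0.07, 0.35, 0.22, 0.10, -0.15, 0.44, 0.27, 0.31])

    print(f"average correlation: {scores.mean():.2f}")
    print(f"positive: {(scores > 0).sum()}   negative: {(scores < 0).sum()}")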

Just to give you an impression, see how the results change if I use 50 LBC shifts:

 

 

Back Testing is a powerful feature that gives you useful information regarding the model's performance.