Timing Solution: Back Testing Report

Price Bar Patterns (Japanese Candlesticks) model for Dow Jones Industrial Index

 

Executive Summary
Core Idea
Numeric Experiment Description
Artifacts Study 
Parameters to Vary
 

Executive Summary

The Back Testing shows that a forecasting model based on the proportions of price bar elements (Open, High, Low, Close) can predict future price movement. We call this model the Japanese Candlestick model. This result contradicts the Random Walk Theory, which states that "... past movement or direction of the price of a stock cannot be used to predict the future movement."

The statistical report is here. You can also see the forecasting curves produced by this model. They are part of the full report, which contains about 600 pictures and takes 10 Mbytes; the forecasting curves shown were picked at random. The full report is available to anybody on request (contact e-mail: tarassov@rogers.com).

All calculations are made for the Dow Jones Industrial index (the data set covers the period from 1970 to 2005).

This is the summary table with the information regarding the best models:

Model                             Training Interval             Prediction Horizon
Neural Net Model                  5000 price bars (20 years)    7 price bars
Neural Net Model, Linear Model    1000 price bars (4 years)     5 price bars

 

Core Idea

Mathematically, this approach is very close to the Fuzzy Neural Network technique. Its main strength is that it reveals the connection between the last several price bars and future price movements. These connections are revealed during the so-called training procedure. The program compares the structure of these price bars (in this example, we use 15 trading days, the regression order) with future price movements and creates a model. During the training procedure, the program adjusts the parameters of the Neural Net (its neurons) to get the best fit between the model's projection and the real price.
In other words, if there is any relation between the price chart and future price movement, this Neural Net can find it. You can think of this Neural Net as a very ambitious expert who analyzes many price charts, as well as all their possible combinations, trying to see the future. This is what any professional does in any field: first he/she spends time and energy learning the subject, then he/she applies the knowledge and gains experience; only after that does he/she become a professional, a person who can make an executive decision regarding the ongoing process (and its future!) based on knowledge and expertise.

How does this core idea work in the Timing Solution program? To define any price bar, we use a special language, the Universal Language of Events (ULE).
To describe the price bar, we need to consider these 3 parameters:

1) the difference between High and Low (True range);
2) the difference between Close and Open (so called "real body");
3) the difference between Open and Low (so called "lower shadow").

All these values are normalized to exclude the trend effect. To describe these proportions, we use Fuzzy Logic math. In other words, the program deals with events like this: "five days ago the real body was big while the lower shadow was very high". We work not with numbers but with so-called membership functions (e.g., "real body was big").
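As a minimal sketch of the three parameters above, the Python below computes them for each bar. The report does not say how the values are normalized, so dividing by the average High-Low range of the preceding bars is purely an assumption, and the names `Bar` and `bar_features` are illustrative, not part of Timing Solution.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float


def bar_features(bars: List[Bar], window: int = 15) -> List[Dict[str, float]]:
    """Return normalized range, real body and lower shadow for each bar.

    ASSUMPTION: every value is divided by the average High-Low range of the
    preceding `window` bars. The report only states that values are normalized.
    """
    features = []
    for i in range(window, len(bars)):
        recent = bars[i - window:i]
        scale = sum(b.high - b.low for b in recent) / window
        if scale == 0:
            continue  # flat history: proportions are undefined, skip the bar
        b = bars[i]
        features.append({
            "range": (b.high - b.low) / scale,
            "real_body": (b.close - b.open) / scale,
            # Open - Low follows the report's wording; the conventional
            # candlestick definition is min(Open, Close) - Low.
            "lower_shadow": (b.open - b.low) / scale,
        })
    return features
```

Scaling by a recent average range is only one way to make bars from 1970 and 2005 comparable; any normalization that removes the price level would serve the same purpose.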

For example, let us look at the figure called "Shaven bottom":

Here is how it sounds in the language of Fuzzy Logic: "the real body is negative and medium while the lower shadow is zero". This is the event to be considered. This event and other similar events can be described in the Timing Solution program.
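A fuzzy event of this kind can be sketched with simple membership functions. The example below is only an illustration: the triangular shape, the breakpoints, and the use of `min` as the fuzzy "and" are assumptions, and the inputs are assumed to be already normalized values. In the program itself such boundaries are adjusted during training rather than fixed by hand.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def shaven_bottom(real_body: float, lower_shadow: float) -> float:
    """Degree (0..1) to which a bar matches the event
    "real body is negative and medium while the lower shadow is zero".

    The breakpoints below are hypothetical, chosen only for illustration.
    """
    body_negative_medium = triangular(real_body, -0.9, -0.5, -0.1)
    shadow_zero = triangular(lower_shadow, -0.1, 0.0, 0.1)
    return min(body_negative_medium, shadow_zero)  # fuzzy AND


print(shaven_bottom(-0.5, 0.0))   # full match
print(shaven_bottom(-0.3, 0.05))  # partial match
print(shaven_bottom(0.4, 0.0))    # positive body: no match
```

The result is a degree of membership rather than a yes/no answer, which is what lets the Neural Net work with statements like "real body was big" instead of raw numbers.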

Now comes the Object Oriented Neural Net (the "know-how" of the Timing Solution group).
Like any regular Neural Net, this system looks for any possible connections between the price configuration in the past and the future price movement. Apart from that, we provide a very powerful sub-optimization mechanism. In general, it means that during the training procedure the program itself decides what should be taken as a "wide body" or a "medium lower shadow". This procedure helps avoid a very unpleasant effect of many Neural Nets: overtraining.

 

Numeric Experiment Description

These are the steps of the Back Testing procedure:

  1. Data: this example is made for Dow Jones Index daily data for the years 1970 - 2005.

  2. Choosing training/testing intervals: in this data set, we choose a point that divides the price data into two parts: a training interval used for training the Neural Net and, immediately following it, a testing interval used to estimate the model's performance. To exclude "memory leaks", these two intervals share no points. The border between the training and testing intervals is called the Learning Border Cursor (LBC).

  3. Training the Neural Net: we use the Back Propagation procedure to train the Neural Net (at the same time, we create a simple linear model based on the same events; its purpose is explained below). As a target function, we use a detrended price oscillator to exclude the trend and see the short swing movements. The oscillator is calculated as (Close-MA(Close,Period=5))/MA(Close,Period=5), where MA is the exponential moving average of Close with a period of 5 price bars. The Neural Net has 32 hidden units and is trained for 25,000 steps.

  4. Varying Training/Testing Intervals: to train the Neural Net, we use different training intervals: 1000 price bars (4 years), 2500 bars (10 years) and 5000 bars (20 years). To estimate performance, we calculate the correlation coefficient between the price oscillator and the projection line generated by the Neural Net.

     For testing, we use only price bars that were not used in training the Neural Net, so we avoid any kind of future leaks. The forecasting horizon is restricted to 7 bars ahead, because the autoregression order is 15.

  5. Randomizing the Neural Net: we reset the Neural Net to an initial position by setting the weights to small random values. This prepares the Neural Net for the next piece of the data set.

  6. Repeating steps 2, 3, 4, 5 for another LBC: we choose another piece of the data set. This can be done in two ways: shifting the LBC forward by a few price bars or setting it at random. We use the second algorithm because it excludes some artifacts (see Artifacts Study).

  7. Sample Size = 200: we repeat this procedure 200 times (200 LBC x 3 training intervals x 3 testing intervals).

  8. Null Hypothesis: we state that if this system does not produce a real forecast, the average correlation between the projection line and the oscillator should be equal to zero (see Artifacts Study for more details). The numeric experiment rejects this hypothesis.

  9. Statistical Results: the numeric experiment shows that the best projection line is produced by the Neural Net that uses 5000 price bars (20 years of price history) as a training interval. This model provides the forecast for 7 price bars ahead. The average correlation is 0.1125. The correlation was positive 124 times and negative 76 times. The control group (null hypothesis) should give these results: 100 times for a positive correlation and 100 times for a negative one. The Pearson chi-square criterion states that, with a probability of 98%, this result is not accidental.

  10. Detailed Analysis: for a shorter forecast (5 price bars ahead), it is possible to use only 1000 price bars (4 years of price history) to train the Neural Net. In this case, the average correlation is 0.1019 (the correlation is positive 121 times and negative 79 times). You can also use the simple linear model for this configuration (1000 price bar training interval/5 price bar forecast). The linear model provides an average correlation of 0.060 (113 times positive and 87 times negative). The Neural Net therefore appears to provide the better forecast, which means that nonlinear effects are important here. In practice, it means that a combination of price bars has more effect on the future price movement than any price bar by itself. To predict the future price, combinations like the "morning star formation" are more powerful than any single bar.
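The significance figure in the statistical results can be checked with a few lines of Python. The report does not spell out how the chi-square criterion was applied; one reading that comes close to the quoted 98% is a 2x2 comparison of the observed counts (124 positive, 76 negative) with a control group of 100 and 100. That reading is an assumption, and the function names are illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Observed: 124 positive / 76 negative correlations.
# ASSUMED control group: 100 positive / 100 negative.
chi2 = chi_square_2x2(124, 76, 100, 100)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}, confidence = {1 - p:.1%}")
```

Under this reading the statistic is about 5.84 with one degree of freedom, which corresponds to a confidence level of roughly 98%. A one-sample goodness-of-fit test against the expected 100/100 split would give a larger statistic (11.52) and a stronger result, so the 2x2 reading is the more conservative of the two.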

 

Artifacts Study

There are three artifacts to be considered:

    1) Seasonal Cycles: this effect is excluded because we analyze short term price movements (7-10 days). As a target function, we use a very short term oscillator with a period of 5 days. To exclude any questions regarding seasonality, we provide additional Back Testing for the Spectrum model. In other words, we calculate the spectrum using %x price bars before the LBC, extract the strongest cycles and create the projection line based on these cycles. The average correlation is 0.051. This is all that cycle based models can provide for a 7 days ahead forecast. Compare it to the Japanese Candlestick model, which provides an average correlation of 0.112. It means that with probability at least 95% the Japanese Candlestick model reflects more than any fixed cycle model (t-statistic, t=1.8). The statistical report for the spectrum based model is here.

    2) Normalization of the LBC: one more source of artifacts lies in the algorithm of choosing the Learning Border Cursor (LBC). Sometimes the future stock market movement is obvious without any sophisticated math. For example, if the price goes up 4% today (due to some fundamental factor), the stock market will be less active over the next few days. See the efficiency test calculated for DJI (1970-2004 data set) for the event "the price goes up at least 4%":

In other words, the LBC itself contains some information regarding the future price movement. To exclude this effect, we provide a special algorithm for choosing the LBC (it is called "a normalization of the Learning Border Cursor"). However, the numerical experiments show that we can neglect this effect, as both algorithms (with and without the normalization of the LBC) provide very close results.


    3) Why we apply Pearson's correlation as a measure of fit between the price and the forecast: generally speaking, market movement is not normally distributed. But in this particular task we analyze short term price movement. As an input for the Neural Net, we use the short term oscillator with a period of 5 price bars. It is represented by a red curve:

This oscillator: a) reflects the short term price movements; b) avoids the unpleasant effects related to the non-normal distribution.

Here is the histogram for this oscillator:
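For readers who want to reproduce the oscillator and the correlation measure discussed in this report, here is a minimal Python sketch. The oscillator follows the formula (Close-MA)/MA from the Numeric Experiment Description. The smoothing factor 2/(period+1) and seeding the average with the first value are assumptions (a common convention, not confirmed by the report), and the names `ema`, `oscillator` and `pearson` are illustrative.

```python
import math
from typing import List


def ema(values: List[float], period: int = 5) -> List[float]:
    """Exponential moving average.

    ASSUMPTION: alpha = 2 / (period + 1), seeded with the first value.
    """
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def oscillator(closes: List[float], period: int = 5) -> List[float]:
    """Detrended oscillator: (Close - MA) / MA for every bar."""
    return [(c - m) / m for c, m in zip(closes, ema(closes, period))]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In a back test, `pearson` would be applied to the oscillator values on the testing interval and the model's projection line for the same bars; `pearson` is undefined (division by zero) when either series is constant.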

      

 

Parameters to Vary

At present, we recommend varying these parameters:

The auto regression order:

In practice, it sets how long any particular price bar keeps its impact.

Another parameter to vary is the number of hidden neurons:

In practice, a larger number of hidden neurons makes it possible to reveal more complex figures.

 

February 24, 2005

Toronto, Canada

© Timing Solution