GIGO versus Back Testing


Some problems in human thought persist for centuries. The problem that philosophers faced in the 14th century is much the same as the one Hegel described in the early 19th century, and much the same as the one mathematicians face when modeling stock market behavior at the beginning of the 21st century.

What is the problem?

In the 14th century, the philosopher Ockham put it as "Entities should not be multiplied unnecessarily" (the principle known as "Ockham's Razor").

In the early 19th century, Hegel used the term "bad infinity" (schlechte Unendlichkeit).

In the 21st century, neural network specialists and programmers often say "Garbage in, garbage out" (the GIGO principle for short).

We will use the last formulation when modeling stock market behavior. The most frequently asked questions here are:

"Why do we know so much about the stock market and have no luck in forecasting its behavior?"

"Why do we have so many techniques/programs/books about the stock market and still have no luck in forecasting it?"

"Why do the forecasting techniques that worked very well on past data fail to forecast the stock market?"

In my opinion, the root of this problem is that people usually do not distinguish between EXPLANATION and FORECASTING. For some unknown reason, people mix up these two totally different human activities.

The logic of explanation

Put yourself in the shoes of a guy whose job is to explain past stock market behavior from the astrological point of view. He starts with the seven classical planets and five major aspects. That gives about 100 aspects in total; let's call it the "100 aspects theory". After monitoring them for several months, our guy finds that the classical aspects and planetary lines do not explain everything that happened to the stock market during that period. So he adds other aspects from his arsenal: 30, 45, 135 and 150 degrees. That is already about 200 aspects. His life becomes a lot easier; now he can explain practically all major stock market movements. Things go well until some sudden huge drop occurs (like the one in September 2008). Some explanation has to be found, and our guy is compelled to bring the slow planets (Uranus, Neptune and Pluto) and Chiron into consideration. Now there are almost 500 aspects in his arsenal; it is now the "500 aspects theory". From now on, this guy performs his task perfectly: any stock market movement is connected to something in the sky. Among thousands of planetary events, there is always one that explains it.
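
The round numbers above can be checked with a little arithmetic. The sketch below assumes that an "aspect" means one pair of bodies combined with one aspect angle; that reading is my assumption, not something stated in the text, and the function name is made up for illustration.

```python
from math import comb

def aspect_count(num_bodies: int, num_aspect_types: int) -> int:
    """Count (pair of bodies, aspect angle) combinations.

    Assumption: every unordered pair of bodies can form every aspect angle.
    """
    return comb(num_bodies, 2) * num_aspect_types

# 7 classical planets, 5 major aspects
print(aspect_count(7, 5))    # 105, "about 100"
# same planets, plus the 30, 45, 135 and 150 degree aspects (9 angles)
print(aspect_count(7, 9))    # 189, "about 200"
# add Uranus, Neptune, Pluto and Chiron (11 bodies)
print(aspect_count(11, 9))   # 495, "almost 500"
```

Under this reading, four extra angles and four extra bodies are enough to multiply the number of available explanations almost fivefold.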

Everything looks so good. However, a new problem appears. The theory explains too much, so even when the stock market is calm it keeps generating alerts, because too many important aspects are taking place. Instead of being prepared for the real move, our guy has to react to every whisper. Too many signals confuse him.

The logic of forecasting

Now ask yourself which is better: the "100 aspects" theory, which misses many important stock market movements, or the "500 aspects" theory, which explains all important movements but also generates many signals while the stock market is quiet? It depends on what you are looking for. If you are writing a review of past market behavior, i.e. looking back at each important day on the stock market and trying to find the appropriate aspect, the "500 aspects" theory is better. Your main goal is to have an appropriate aspect for each such day, and the "500 aspects" theory performs this task perfectly.

Your approach should be totally different if you really need a forecast. First of all, you have to separate important aspects from unimportant ones. A few unimportant aspects can easily spoil your forecast; the "500 aspects" theory can act as a Pandora's box here. By including unimportant factors, we open the door to Chaos.

Compare these two things: a) by adding one aspect that really works to your model, you INCREASE the forecasting ability of your system by one point; b) by adding an aspect that does not work, you DECREASE the forecasting ability by one point. Thus, if your "500 aspects" model contains 200 working and 300 non-working aspects, the forecasting ability of the system will be negative: minus one hundred points. This is a feature of noisy systems (and the stock market is one of them): the effect of non-working factors is negative; here his majesty Chaos plays his games with us. The more non-working aspects we include in our models, the more our models look like Chaos. This is what philosophers wrote seven hundred years ago, and this is what scientists face in the 21st century. This is the GIGO principle.
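
The point counting above can be written down as a tiny sketch. It only restates the rule of thumb from the text (plus one point per working aspect, minus one per non-working aspect); the function name is hypothetical, and the rule is an illustration rather than a measured property of any real model.

```python
def forecasting_score(working: int, not_working: int) -> int:
    """Rule of thumb from the text: +1 per working aspect, -1 per non-working one."""
    return working - not_working

# The "500 aspects" model: 200 working and 300 non-working aspects
print(forecasting_score(200, 300))  # -100

# The same 200 working aspects with the non-working ones removed
print(forecasting_score(200, 0))    # 200
```

The working aspects are identical in both calls; only the removal of the non-working ones changes the sign of the result.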

So the best way to create a forecast that works is to analyze all 500 aspects one by one and keep only those aspects (let's say 50 instead of 500) that really work. If you plan to apply your knowledge in real life, you should be very picky and take responsibility for your statements.
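
A minimal sketch of such a one-by-one check is shown below. Everything in it is an assumption made for illustration: the aspect names and the history are made up, a "hit" is taken to mean that the expected market move followed the aspect, and the thresholds (a 55% hit rate over at least 30 occurrences) are arbitrary choices, not recommended values.

```python
from typing import Dict, List

def select_working_aspects(
    outcomes: Dict[str, List[bool]],
    min_hit_rate: float = 0.55,   # arbitrary threshold, an assumption
    min_occurrences: int = 30,    # arbitrary threshold, an assumption
) -> List[str]:
    """Keep only aspects with enough occurrences and a high enough hit rate."""
    kept = []
    for name, hits in outcomes.items():
        if len(hits) < min_occurrences:
            continue  # too few cases to judge the aspect either way
        hit_rate = sum(hits) / len(hits)
        if hit_rate >= min_hit_rate:
            kept.append(name)
    return kept

# Made-up history: True means the expected move followed the aspect
history = {
    "aspect_A": [True] * 24 + [False] * 16,  # 40 cases, 60% hits
    "aspect_B": [True] * 20 + [False] * 20,  # 40 cases, 50% hits
    "aspect_C": [True] * 9 + [False] * 1,    # 90% hits, but only 10 cases
}
print(select_working_aspects(history))  # ['aspect_A']
```

Note that aspect_C is rejected despite its impressive hit rate: ten cases are too few to tell a working aspect from a lucky one.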


Practically every week I receive an e-mail pointing to some article or website where somebody proposes a technique for forecasting the stock market. And every week I have to write the same answer (this is the main reason why I am writing this article). So below are my recommendations on how to estimate the real performance of any idea, model or theory (I mean models that provide mid-term and long-term forecasts):

a) First of all, simply ignore any claim that a forecasting system is 80% accurate or better. Treat such claims not as information but as a warning sign. The best professional in this field, Doyne Farmer, claimed only 55% (do not be discouraged: he was referring to automated systems, and common sense and good risk management can raise this percentage); see this article: cracking_wall_street

b) Do not believe any "post factum" explanation. If a guy tells you in 2009 that his model predicted all major turning points of the year 2005, ask him to provide the whole picture: ask him what his model forecasted for the years 2006, 2007 and so on. And the most important thing: ask him to provide a forecast for the next year (or the next 2-3 months), print it and look at it from time to time to see how it works. One, two, three ... ten impressive examples from the past do not mean anything.

c) Apply a technique that rules out future leaks, i.e. one that never lets data from the future reach the model; see these articles
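
One common way to follow recommendations b) and c) is a walk-forward test: at every step the model sees only the data that came before the moment it forecasts. The sketch below is a generic illustration, not the specific technology referred to above; the function names, the toy "repeat the last move" model and the toy data are all assumptions made for the example.

```python
from typing import Callable, Sequence

def walk_forward_hit_rate(
    moves: Sequence[int],                          # +1 = up, -1 = down, oldest first
    fit_predict: Callable[[Sequence[int]], int],   # sees the past, returns +1 or -1
    min_history: int,
) -> float:
    """Share of correct forecasts when each forecast uses past data only."""
    if not 0 < min_history < len(moves):
        raise ValueError("min_history must be between 1 and len(moves) - 1")
    hits = 0
    total = 0
    for t in range(min_history, len(moves)):
        past = moves[:t]                 # nothing from time t or later leaks in
        prediction = fit_predict(past)
        hits += int(prediction == moves[t])
        total += 1
    return hits / total

def repeat_last_move(past: Sequence[int]) -> int:
    """Toy model (an assumption): tomorrow repeats today's direction."""
    return past[-1]

toy_moves = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
print(walk_forward_hit_rate(toy_moves, repeat_last_move, min_history=2))  # 0.5
```

Any model can be plugged in as fit_predict, as long as it uses nothing but the slice it is given; a hit rate obtained this way is much harder to inflate than a list of impressive examples picked after the fact.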


October 24, 2009

Toronto, Canada