Beware of overfitting

Knowledge versus Information


This article presents a dialogue between three people. Each of them has his own interests. Let me introduce them:

User - this guy trades, and he is interested in getting trading techniques that really work;

Software Developer - he makes the software for User; he is interested in selling as many copies of the software as possible;

Mathematician - he has a proper education, makes his money teaching mathematics, and has no personal interest in this situation; his role in the dialogue is that of a fair judge. (My personal position is presented by this person, though I actually do make the software for you.)


User: There is a module in Timing Solution software called Trading Strategy Constructor. It looks for the optimal moving average crossover for any financial instrument. So far it analyzes simple and exponential moving averages. What do you think about improving this module by adding different variations of moving averages? There are at least two dozen variations of moving averages that I have heard described as promising. Thus we could adjust a special moving average for each financial instrument.

Software Developer: This is a very good idea! Right now the system is based on two variations of moving averages, and it allows us to analyze 1,000 moving average crossovers. By adding more moving averages, we can analyze 20,000 combinations of moving average crossovers. Thus in my next advertisement I can write: "...the program analyzes 20,000 different trading strategies and finds the one that fits your financial instrument best."

Mathematician: You should decide for yourself what you are looking for. If you are looking for a system that was able to provide a big profit over some specific time interval, that is one story. If you are looking for a system you can trust to make some profit in the future, that is another story entirely.

Developer: How come? The program analyzes 20,000 strategies and finds the one that gives a very good equity curve for 10 years of price history. It worked extremely well in every one of those 10 years. Why should it not keep working next year as well?

Mathematician: The main issue here is the 20,000 strategies; that is too many. When you increase the number of moving averages, you actually increase the degrees of freedom of your system. In other words, you increase the number of ways to vary your system. Is that good or bad?

It depends on the available price history. In your particular case, we have 10 years of daily price history, i.e. about 2,500 price bars. To model these 2,500 bars you have 20,000 models, or 8 models for every bar. In neural net technology this is called the "memorization effect". It means that among these 20,000 models we will almost certainly find one that describes our 2,500 bars very well. And this is not because the model is good; it is just a coincidence.

We call this phenomenon overfitting.
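The selection effect described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the "price changes" are pure random noise, each "model" is a hypothetical random sequence of long/short positions (not a moving average crossover), and the sizes are scaled down to 250 bars and 2,000 models to keep the same 8-to-1 ratio while running quickly.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

N_BARS = 250       # scaled-down stand-in for the 2,500 bars in the text
N_MODELS = 2_000   # scaled-down stand-in for the 20,000 models (same 8:1 ratio)

# Pure noise: by construction there is nothing to learn from these price changes.
in_sample = [random.gauss(0.0, 1.0) for _ in range(N_BARS)]
out_of_sample = [random.gauss(0.0, 1.0) for _ in range(N_BARS)]


def random_model(n_bars):
    """A hypothetical 'model': a random long (+1) or short (-1) position per bar."""
    return [random.choice((1, -1)) for _ in range(n_bars)]


def profit(positions, changes):
    """Total profit of holding the given positions through the given price changes."""
    return sum(p * c for p, c in zip(positions, changes))


results = []
for _ in range(N_MODELS):
    positions_in = random_model(N_BARS)    # what the model did on the fitted bars
    positions_out = random_model(N_BARS)   # what it does on bars it has never seen
    results.append((profit(positions_in, in_sample),
                    profit(positions_out, out_of_sample)))

# The "winner" is chosen by in-sample profit only, as an optimizer would do.
best_in, best_out = max(results, key=lambda r: r[0])

print("best model, in-sample profit:    ", round(best_in, 2))
print("same model, out-of-sample profit:", round(best_out, 2))
```

None of these models has any skill, yet the best of them shows a clearly positive in-sample profit simply because so many were tried. Its out-of-sample profit is just another random number.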

User: How can we avoid overfitting?

Mathematician: Always watch the overfitting factor. This is the ratio between the number of analyzed models and the number of price bars in your data:

overfitting factor = (number of analyzed models / number of price bars) x 100%

It is better to keep this value below 30%. If this value is higher than 100%, we can face the overfitting effect. In your case it is 20,000 / 2,500 = 800%.
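As a quick check, the ratio can be computed like this. The function names and the "caution" label for the band between 30% and 100% are assumptions made for this sketch; only the two thresholds come from the text.

```python
def overfitting_factor(n_models: int, n_bars: int) -> float:
    """Ratio of analyzed models to price bars, expressed as a percentage."""
    if n_bars <= 0:
        raise ValueError("n_bars must be positive")
    return 100.0 * n_models / n_bars


def verdict(factor_pct: float) -> str:
    """Classify the factor using the thresholds from the text (30% and 100%)."""
    if factor_pct < 30.0:
        return "ok"
    if factor_pct <= 100.0:
        return "caution"  # assumed label; the text does not name this band
    return "overfitting risk"


for models in (1_000, 20_000):
    factor = overfitting_factor(models, 2_500)
    print(models, "models on 2,500 bars:", factor, "% ->", verdict(factor))
```

With 2,500 bars, staying below 30% means analyzing fewer than 750 models.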

Another method is walk forward analysis. Conduct your analysis using only a part of the available price history, e.g. 8 years instead of all 10 years in our case. In other words, find the model that works best from 2000 to 2008. After that, try this model on the next 2 years, i.e. check how it works in 2009 and 2010. The measure of how well the out-of-sample performance holds up against the optimized performance is called Walk Forward Efficiency.

Usually overfitted models work very well on the so-called optimized interval (years 2000-2008 in our example) and do not work on the out-of-sample interval (2009-2010).
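A walk forward check of this kind might look like the sketch below. Several things here are assumptions rather than the way Timing Solution does it: the strategy is a plain simple moving average crossover that is always long or short, profit is measured in price points, Walk Forward Efficiency is taken as out-of-sample profit per bar divided by in-sample profit per bar, and the demo data is a synthetic random walk.

```python
import random


def sma(prices, period):
    """Simple moving average; entries are None until `period` bars are available."""
    out = [None] * len(prices)
    window_sum = 0.0
    for i, p in enumerate(prices):
        window_sum += p
        if i >= period:
            window_sum -= prices[i - period]
        if i >= period - 1:
            out[i] = window_sum / period
    return out


def crossover_profit(prices, fast, slow, start, end):
    """Profit over bars [start, end): long when fast SMA > slow SMA, short otherwise."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    total = 0.0
    for i in range(max(start, slow), end):
        # The position uses the previous bar's averages, so there is no look-ahead.
        position = 1 if fast_ma[i - 1] > slow_ma[i - 1] else -1
        total += position * (prices[i] - prices[i - 1])
    return total


def walk_forward(prices, split, pairs):
    """Pick the best (fast, slow) pair on prices[:split], then test it on the rest."""
    best = max(pairs, key=lambda fs: crossover_profit(prices, fs[0], fs[1], 0, split))
    in_profit = crossover_profit(prices, best[0], best[1], 0, split)
    out_profit = crossover_profit(prices, best[0], best[1], split, len(prices))
    return best, in_profit, out_profit


def walk_forward_efficiency(in_profit, in_bars, out_profit, out_bars):
    """Out-of-sample profit per bar relative to in-sample profit per bar (assumed definition)."""
    if in_profit <= 0 or in_bars <= 0 or out_bars <= 0:
        return None
    return (out_profit / out_bars) / (in_profit / in_bars)


# Demo on a synthetic random walk (an assumption; use real price history in practice).
random.seed(7)
demo_prices = [100.0]
for _ in range(499):
    demo_prices.append(demo_prices[-1] + random.gauss(0.0, 1.0))

demo_pairs = [(f, s) for f in range(2, 11) for s in range(f + 5, 50, 5)]
demo_split = 400  # first 80% optimized, last 20% held out, like 8 years versus 2
best, in_profit, out_profit = walk_forward(demo_prices, demo_split, demo_pairs)
wfe = walk_forward_efficiency(in_profit, demo_split - best[1],
                              out_profit, len(demo_prices) - demo_split)
print("best pair:", best, "in-sample:", round(in_profit, 2),
      "out-of-sample:", round(out_profit, 2), "efficiency:", wfe)
```

An efficiency near 1 would mean the model performs about as well on unseen bars as on the optimized interval; a value near zero or below is the typical signature of an overfitted model.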

This is the weakness of most available trading systems: they provide a lot of strategies, and they keep their promise of having a special strategy for each financial instrument. These strategies might have worked well in the past. The mistake traders make is believing that such a strategy will work with the same efficiency in the future. This is the problem of all complex systems: the creator thinks that he or she is improving the system, while this is nothing but playing with Chaos. A complex system opens the gates to Chaos much wider, and you should be very cautious with that.

Developer: How can we know which moving average works better?

Mathematician: To avoid the mess, it is better not to analyze all these moving averages together in one box. Simply try them one by one. For example, start with simple moving averages: find the 10 best strategies based on the simple moving average crossover, and calculate the average performance of these 10 best strategies. Do the same for every other type of moving average. Then see which type (i.e. simple, exponential, adaptive, ...) gives the better result. You should analyze the 10 best strategies rather than the single best one, to reduce the influence of chance.
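The family-by-family comparison could be sketched as follows. The crossover rule, the parameter grid, the function names, and the synthetic random walk used as demo data are all assumptions for illustration; only the idea of averaging the 10 best strategies per moving average type comes from the text.

```python
import random
from statistics import mean


def sma(prices, period):
    """Simple moving average; entries are None until `period` bars are available."""
    out = [None] * len(prices)
    window_sum = 0.0
    for i, p in enumerate(prices):
        window_sum += p
        if i >= period:
            window_sum -= prices[i - period]
        if i >= period - 1:
            out[i] = window_sum / period
    return out


def ema(prices, period):
    """Exponential moving average with the common smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out


def crossover_profit(prices, average, fast, slow):
    """Profit in price points: long when the fast average is above the slow one, else short."""
    fast_ma, slow_ma = average(prices, fast), average(prices, slow)
    total = 0.0
    for i in range(slow, len(prices)):
        # The position uses the previous bar's averages, so there is no look-ahead.
        position = 1 if fast_ma[i - 1] > slow_ma[i - 1] else -1
        total += position * (prices[i] - prices[i - 1])
    return total


def top_n_average(profits, n=10):
    """Average of the n largest profits (or of all of them if there are fewer than n)."""
    return mean(sorted(profits, reverse=True)[:n])


def rank_families(prices, families, pairs, n=10):
    """Average profit of the n best crossovers, computed separately for each family."""
    return {
        name: top_n_average([crossover_profit(prices, avg, f, s) for f, s in pairs], n)
        for name, avg in families.items()
    }


# Demo on a synthetic random walk (an assumption; use real price history in practice).
random.seed(3)
demo_prices = [100.0]
for _ in range(499):
    demo_prices.append(demo_prices[-1] + random.gauss(0.0, 1.0))

demo_pairs = [(f, s) for f in range(2, 12) for s in range(f + 5, 60, 5)]
scores = rank_families(demo_prices, {"simple": sma, "exponential": ema}, demo_pairs)
for name, score in scores.items():
    print(name, "top-10 average profit:", round(score, 2))
```

Because each family is searched separately with the same small grid, the number of models per run stays low, and the top-10 average is less sensitive to one lucky parameter pair than the single best result would be.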