Beyond Trading System Optimization

As anyone who has ever tried to do optimization on automated trading systems will tell you, there are many problems. The optimal numbers from the past almost certainly won’t be the optimal numbers for the future. In fact, every data set you test will produce different numbers. Why then would you do system optimization?

We believe that the most important output from these efforts is the discovery of relationships between decision factors. It’s not the price you pay when you buy or the price you get when you sell that determines profit; it’s the difference between the two.

That means that the most important relationships we are looking for are the relationships between selection factors and exit factors. We like using response surface graphs to show these because they show the relationships clearly, and the general picture they show differs little when we try it on different data sets. This leads us to believe that the relationships are robust even though the optima are often simply coincidental.
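A response surface of this kind can be sketched as a grid of backtest results, one cell per pair of settings. The sketch below is only an illustration under loud assumptions: `toy_backtest`, its swing-reversal rule, the parameter names `entry_swing` and `exit_fraction`, and the synthetic random-walk prices are all hypothetical stand-ins, not our trading system or our data.

```python
import random

def toy_backtest(prices, entry_swing, exit_fraction):
    """Total profit of a hypothetical long-only swing-reversal rule.

    Enter after price falls entry_swing below the running reference high;
    exit once it recovers exit_fraction * entry_swing, or at the last price.
    """
    profit = 0.0
    ref = prices[0]      # running reference high while flat
    entry = None         # entry price while in a position
    for p in prices[1:]:
        if entry is None:
            ref = max(ref, p)
            if ref - p >= entry_swing:
                entry = p
        elif p - entry >= exit_fraction * entry_swing:
            profit += p - entry
            entry = None
            ref = p
    if entry is not None:
        profit += prices[-1] - entry   # close any open position at the end
    return profit

def response_surface(prices, entry_swings, exit_fractions):
    """Rows follow entry_swings (selection factor); columns follow exit_fractions (exit factor)."""
    return [[toy_backtest(prices, s, f) for f in exit_fractions]
            for s in entry_swings]

# Synthetic random-walk prices (made up; fixed seed so reruns match).
rng = random.Random(0)
prices = [100.0]
for _ in range(2000):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

entry_swings = [2.0, 4.0, 6.0, 8.0]
exit_fractions = [0.25, 0.5, 0.75, 1.0]
surface = response_surface(prices, entry_swings, exit_fractions)

for swing, row in zip(entry_swings, surface):
    print(swing, [round(v, 1) for v in row])
```

Charting `surface` as a contour or 3-D plot, then repeating the run on other data sets, is one way to check whether the general shape holds up even when the location of the optimum moves.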

Response surface graphs also show which settings are clearly dysfunctional. It becomes pretty clear after looking at just a few sets of results that there is a range of settings that might work, and a range that is clearly disastrous. This is true for both the individual factor settings and for the relationships between them. Note that findings of dysfunctional settings tend to be far more robust than findings of optima. Knowing what to avoid doesn’t tell you how to do the job right, but there is clearly value in avoiding known disaster areas.

In addition to interactions between selection factors and exit factors, there are also clearly interactions among the various selection factors and among the various exit factors. For example, an OR rule for selection factors means that you get into more trades, while an AND rule reduces the number of trades. We know that without performing any tests, but more or fewer trades is not the critical answer we need. We need to know whether an OR rule gives us a higher winning percentage and greater profit, and whether an AND rule steers us into the trades that lead to more profit.
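The difference between the two rules can be made concrete with a small sketch. Everything in it is made up for illustration: the two signal lists, the per-day profit figures, and the `summarize` helper are hypothetical, and the numbers say nothing about which rule is better in practice.

```python
# Hypothetical daily entry signals from two selection factors.
signal_a = [True, True, False, False, True, False]
signal_b = [True, False, True, False, True, False]
# Hypothetical profit (in dollars) if a trade is taken on that day.
outcome = [100, -50, -30, 0, 80, 0]

def summarize(rule):
    """Return (trades, winning fraction, total profit) for rule 'OR' or 'AND'."""
    if rule == "OR":
        taken = [pnl for a, b, pnl in zip(signal_a, signal_b, outcome) if a or b]
    elif rule == "AND":
        taken = [pnl for a, b, pnl in zip(signal_a, signal_b, outcome) if a and b]
    else:
        raise ValueError("rule must be 'OR' or 'AND'")
    trades = len(taken)
    wins = sum(1 for pnl in taken if pnl > 0)
    win_fraction = wins / trades if trades else 0.0
    return trades, win_fraction, sum(taken)

for rule in ("OR", "AND"):
    print(rule, summarize(rule))
```

The trade count always moves in the direction we expect, since every day that passes the AND rule also passes the OR rule. The winning fraction and total profit are the parts that only a test can answer.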

The same is true with exit rules. Do they work well together or do they get in each other’s way? Generally speaking, there is a desired exit process and a set of fall-back exit processes. You don’t want the fall-back processes to unduly interfere with the desired process, but you can’t let your primary exit process be so dominant that the backup plans don’t go into effect until it is too late. Response surface graphs show these relationships very clearly.

Not all of the critical interactions are two-factor. There can be three, four, or even more factors working together. The more complex an interaction, the less likely it is to be robust, but many three-factor interactions are robust. Showing three-factor interactions is a minor problem that we currently address with animations. Here is an example.

We show the most powerful factors on the axes. Here these are the critical selection factor and the primary exit factor, and that is generally the case. The third factor, the backup exit strategy, we show with the animation. The animation shows that the third factor changes the degree but not the nature of the interaction between the first two factors. This presents a fairly clear picture of what is going on in the trading system, and we can see that these pictures will look very similar if we apply them to a variety of data sets.

It is important to see whether the charts you get out of the data support or contradict your basic theories. The basic theory here is that a move in one direction is most likely to be reversed if it swings far enough, and most likely to make a solid profit when you are looking for the swing back to be half as large as the initial move. The size of the swings varies considerably, but the relationship between the size of the initial swing and the size of the secondary swing is pretty consistent.

The fallback exit factor is of less importance. By itself, it never determines whether the system will win or lose, but it does affect how much it will win or lose.

What settings should we choose for our working system? It’s still something of a crapshoot, but we have a reasonable idea of the direction to shoot in. The chance of picking the best values is only slightly more than zero, but the chance of picking pretty good values is very high. There may be other response surfaces that you can look at, like days in position or winning percentage, that will push you in one direction or another, but you are still moving from analysis of the past to prediction of the future. The stock market will never be a chemistry problem.


A Brief History of Our Application of DoE to Automated Trading Systems

Although we have been believers in Design of Experiments for nearly twenty years and builders of automated trading systems for nearly ten, we never put the two together until recently. As recently as three years ago, a designed experiment on our trading system would simply have taken years. As a result, we did what everybody else does: we set our system parameters using intuition and experience, and tweaked them each time we reviewed the findings from a test run. We made significant progress that way, but it was plain to see that we could learn more, faster.

We started applying Design of Experiments to automated trading systems about a year ago, but the tests were taking a long time. We were working with intraday data, and that is really dense. It took many months to complete a study that used just two years of market data. Even at that cost, we were able to run a couple of studies and became convinced that the approach was valid. Unfortunately, the systems were proprietary and secret, so we had trouble talking about our approach in any detail without betraying a confidence.

A couple of months ago, Ernest Chan suggested that we test something from the book Short Term Trading Strategies That Work. We needed to talk about something other than our own systems. There were two major surprises.

The first was the speed of the tests. Our tests for the options trading systems that we built took six hours per trial for each expiry (month). An eighteen-month study takes about three months, running ten of these tests at a time in parallel. The strategies in the book used only end-of-day data. As a result, a test on a twenty-year set of data takes one quarter of a second! We had the study completed and an article on it accepted by a publisher in less than a week.

This is mind-boggling because in most experimentation the cost of the trials is the overwhelming majority of the cost of the experiment. The challenge in most of Design of Experiments is to get as much information as possible out of as few trials as possible. One of the problems with our business model when we built Strategy was that there were very few people who needed enough experiment designs to keep us in business. They would get a design, take months or even years to run the experiment, then need our help or software for about two hours to interpret the results. When the trials are free and run this fast, the design of the experiment and the analysis become significantly more difficult and time-consuming. That’s fine with us because we have some unique advantages there.

The response to our efforts was the second surprise. We had never heard of this book, but it turns out to be very popular in the trading community. Many people have read it, and those who haven’t have at least heard of it. Many people had a common basis for understanding our work, and we are starting to have real conversations about applying this technology. We are moving on to models that we couldn’t even consider when the cost of trials was an issue.

A Thousand Trials

Usually, when you run an experiment, the primary cost of the experiment is the cost of running the trials. Designing the experiment is usually fairly simple, and never very expensive. The analysis requires some moderately specialized skills, but you can usually afford the analysis if you can afford the trials. Now, for the first time in our experience, we have discovered a field of investigation where the cost of the trials is almost nothing. This has profound side effects that we did not anticipate.

We should have anticipated them, because the most complex experiments were always run by people who had the least expensive trials. But even testing print cartridges makes a noticeable dent in a budget when you want to run a hundred trials. At the very least, the tests will still take hours, perhaps days, to run.

When we started applying Design of Experiments to financial engineering, we were strictly interested in intraday trading systems. Testing these systems had a moderate cost. For example, if we wanted to run sixteen trials on three years of data, it did not cost us much money, but it did take several months. That has a significant cost when you want to know the answers right now.

We wanted to talk about those experiments, but a number of factors prevented that. We looked around for another system that we could experiment on. Ernest Chan recommended that we look at systems in Short Term Trading Strategies That Work and High Probability ETF Trading, two books by Larry Connors and Cesar Alvarez. As Ernie correctly perceived, Connors and Alvarez described systems that lent themselves very well to Design of Experiments. We came up with lots of ideas for experiments right away, and we picked one to try out.

My partner Ron is very fast at this kind of thing, and he had a simulator ready to run the tests after about a day of work. He had to organize the data because we had never worked with an end-of-day system before. Then he started running the tests. We knew they would go quickly, but it took only about a quarter of a second to run a trial on fifteen years of data. It took far less than half a minute to run 56 trials.

This is as close to instant gratification as an experimenter will ever get! It took a few hours to analyze the results, but it was essentially “same day service”. You usually have months separating the planning of the experiment and the analysis.

Immediately, we began to think of more complex experiments looking at more factors and using higher order models. Because the higher order models require more trials, we had rarely even thought in that direction. The cubic model was the highest order model supported by the software we were using, and we very rarely had the opportunity to use it. Most people are looking for experiment designs with fewer trials, not more. In fact, the point of most experiment design software on the market today is to get the most information out of the smallest number of trials.
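To see why higher order models demand more trials, it helps to count coefficients. The sketch below assumes a full polynomial model in independent factors, where the number of terms (including the intercept) for k factors and order d is the binomial coefficient C(k + d, d). The helper name `n_terms` is ours for illustration, and specific software packages or design types may count terms differently. An experiment needs at least as many distinct trials as there are coefficients to estimate.

```python
from math import comb

def n_terms(factors, order):
    """Number of coefficients in a full polynomial model, intercept included."""
    return comb(factors + order, order)

orders = {"linear": 1, "quadratic": 2, "cubic": 3}

for factors in range(2, 7):
    counts = {name: n_terms(factors, d) for name, d in orders.items()}
    print(factors, "factors:", counts)
```

With three factors, the minimum grows from 4 trials for a linear model to 10 for a quadratic and 20 for a cubic, before adding any extra trials for checking the fit.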

Notice how the goal changes when the cost of the trials approaches zero. The goal in this environment is to get the best possible predictions without regard to the number of trials it requires. Another fifty trials? Another hundred trials? How about another thousand trials? We don’t care when the trials are free.
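A quick back-of-the-envelope calculation shows how little those extra trials cost in time. It uses the quarter-second-per-trial figure quoted above and assumes, as a simplification, that trials run one after another with no overhead. The function name `run_time_seconds` is just for this sketch.

```python
SECONDS_PER_TRIAL = 0.25  # about a quarter of a second per end-of-day trial

def run_time_seconds(trials, seconds_per_trial=SECONDS_PER_TRIAL):
    """Total run time if trials execute sequentially with no overhead."""
    return trials * seconds_per_trial

for trials in (56, 100, 1000):
    seconds = run_time_seconds(trials)
    print(f"{trials} trials: {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under those assumptions even a thousand trials finish in a little over four minutes.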