The primary cost of doing most trading system research is the cost of the time. Time is used to plan an experiment, to run an experiment, and to analyze an experiment. When you need answers in two days, you simply cannot use an experiment that takes six weeks to run.
In an ordinary research environment, designing an experiment takes a tiny fraction of the total time. Analysis takes slightly longer. Setting up and executing the experiment takes far more than ninety percent of the total. When a problem requires an answer in a significantly shorter amount of time, the only hope is to reduce the time it takes to execute the tests. If you need an answer within two days, you need to reduce the execution time for each trial to about a minute or less.
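The arithmetic behind that target is easy to sketch. The function below is only an illustration: the name `max_trials` and the assumptions that trials run one after another and that the whole deadline is available for execution are ours, not part of any particular testing tool.

```python
def max_trials(budget_hours: float, seconds_per_trial: float) -> int:
    """Count how many back-to-back trials fit in an execution budget.

    Assumes trials run sequentially and the whole budget is spent executing.
    """
    if seconds_per_trial <= 0:
        raise ValueError("seconds_per_trial must be positive")
    budget_seconds = budget_hours * 3600
    return int(budget_seconds // seconds_per_trial)


# A two-day deadline with one-minute trials.
two_day_budget = max_trials(budget_hours=48, seconds_per_trial=60)

# The same deadline when a single trial takes six weeks to run.
six_weeks_in_seconds = 6 * 7 * 24 * 3600
slow_budget = max_trials(budget_hours=48, seconds_per_trial=six_weeks_in_seconds)

print(two_day_budget, slow_budget)
```

At a minute per trial, two days leaves room for a few thousand trials. At six weeks per trial, it leaves room for none.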
You can get incremental improvements in the speed of trading system tests by getting faster and faster computers, but the real secret of speeding up trials is to consider less data. Each piece of data requires processing, even if the process is simply to ignore that piece of information. Do you need or even want ten years of data if you are testing a system that will trade a thousand times a day? Do you need six months of test data? Can you find what you are looking for in the last week’s data?
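As a minimal sketch of what considering less data can look like, the snippet below keeps only the most recent week of a tick history before any test is run. The `Tick` record, the function name `most_recent_window`, and the synthetic one-tick-per-day history are all hypothetical and exist only for illustration. A real tick store would have its own format.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical minimal tick record: (timestamp, price).
Tick = Tuple[datetime, float]


def most_recent_window(ticks: List[Tick], days: int = 7) -> List[Tick]:
    """Keep only the ticks from the last `days` days of the history."""
    if not ticks:
        return []
    # Measure the window back from the newest tick, not from today's date.
    cutoff = max(stamp for stamp, _ in ticks) - timedelta(days=days)
    return [tick for tick in ticks if tick[0] > cutoff]


# Synthetic history for illustration only: one made-up tick per day for 30 days.
start = datetime(2020, 1, 1)
history = [(start + timedelta(days=i), 100.0 + i) for i in range(30)]

recent = most_recent_window(history, days=7)
print(len(history), len(recent))
```

Measuring the window back from the newest tick rather than from the current date keeps the result reproducible when the same history is tested again later.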
A week is the shortest time frame that really makes sense in the context of the problems we have looked at, but your problem might allow even shorter time frames. If you are talking about tick data, even a common PC can process a week of tick data in less than a minute. While you might be able to find your answers with even less data than this, it is hard to see why you might want to.
Once you get the total execution time of an experiment down in the neighborhood of a single day, the dynamics change. If you are taking a couple of days to design the experiment and a week to analyze it, that becomes the place to save time. An experiment that takes six months to run demands two days of design and at least a week of analysis.
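The shift is easy to see by working out each phase's share of the total. The sketch below uses the figures from the paragraph above and treats six months as roughly 182 days, which is our approximation. The function name `phase_shares` is hypothetical.

```python
from typing import Dict


def phase_shares(design_days: float, execution_days: float,
                 analysis_days: float) -> Dict[str, float]:
    """Return each phase's fraction of the total elapsed time."""
    total = design_days + execution_days + analysis_days
    if total <= 0:
        raise ValueError("total time must be positive")
    return {
        "design": design_days / total,
        "execution": execution_days / total,
        "analysis": analysis_days / total,
    }


# Two days of design, a week of analysis, six months (about 182 days) to run.
slow = phase_shares(design_days=2, execution_days=182, analysis_days=7)

# The same design and analysis effort around a one-day run.
fast = phase_shares(design_days=2, execution_days=1, analysis_days=7)

for label, shares in (("six-month run", slow), ("one-day run", fast)):
    print(label, {phase: round(share, 2) for phase, share in shares.items()})
```

With a six-month run, execution is well over ninety percent of the elapsed time. With a one-day run, it drops to a tenth, and analysis becomes the largest share.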
Any environment, like high frequency trading, that has time demands severe enough to drive the setup and execution time for experiments down to a scale of hours will also demand rapid design and analysis.
More rapid design is fairly easy to achieve. The painful and tedious cost of most experiment design is deciding on the compromises. You need to compromise because of the cost of executing each trial of an experiment. For a trading system, the cost of executing these trials is strictly the cost of the time. Once that problem is solved, additional trials do not impose additional costs. Actually generating a design once you have decided on the goals of the experiment only takes a few minutes with today’s tools.
Some research organizations devote weeks or months to analysis. This is entirely justified when an experiment takes six months to run because you are obligated to wring every possible bit of information from each experiment. Experiments using trials that have little or no costs to them generate far more data than experiment
However, everyone who has ever analyzed an experiment will tell you that some results are obvious at a glance and other results are only obvious after a particular perspective has been imposed on the data. Finding an answer to a particular question is usually not that time consuming. Primary analysis does not have to take days or weeks. Even the mountain of data generated by a high speed experiment can be parsed rapidly if that is done with a narrow focus.
Secondary analysis still can and should be done. Unexpected results are often only found in the secondary analysis, and discovery is all about unexpected results.