How To Develop An AI Trading Strategy System


29th March 2018

Background

There was a time when trading strategies were abundant and easy to find, a time when a lone trader could come up with a strategy, trade it on the same day and make money from the comfort of their bedroom.

Unfortunately, times have changed, and with them the opportunity landscape. Consider the oil rush in the United States in the late 1800s: back then, with an abundance of oil in the ground, anybody could put together a few simple pieces of equipment and start drilling for oil. But as the easy-to-reach oil was consumed, extracting the same yield became more difficult. Drilling for oil required ever more complicated and expensive equipment and a tight production process to minimise waste. Skip forward to today's world and the difficulty of oil drilling has grown exponentially. These days the wells are so hard to find and access, and the margins so tight, that you need a professional production process backed by hundreds of millions of dollars in equipment, research, knowledge and development; it would be laughable for an inexperienced two-man company with a 50ft drill to try to find oil in today's landscape.

The above scenario, however, is a stark echo of what has happened in the investment alpha-generation world over the last 20 years. Twenty years ago it was reasonably simple for individuals or small teams with no budget to find alpha-generating strategies using simple statistics or old-fashioned econometrics. Today these opportunities are far more granular, their yield considerably lower, and the process required to find and implement them considerably more rigorous.

We are however finding that many people are still looking to go down the old-fashioned routes of alpha generation, believing that with a good investment idea or a single clever algorithm they will be able to easily generate profits.

This article explains how, as with the process of drilling for oil, a dedicated and robust framework is required to harness alpha from today's markets. This framework is the ecosystem required to find, develop, test and execute trading strategies in a profitable manner. We will specifically focus on the creation of artificial intelligence/machine learning trading strategies, as these are the ones we believe have the power to be future-proofed.

The Process Chain

Let us address the first and biggest difference we've seen between amateur investors and professional investment companies (asset managers, hedge funds and proprietary traders):

Amateurs seek to find and profit from that one big idea, the one secret formula for profit. Professionals, on the other hand, do not spend their time hunting for single ground-breaking strategies; instead they spend their time hunting for methods to mass-produce investment strategies.

Let us say that again, because it is the single most important piece of advice one can heed when looking for alpha: successful professionals do not develop individual strategies, they develop the methods for mass-producing individual strategies. This is an incredibly powerful concept, and one that can be found in many successful businesses as well as in successful natural processes that have survived and evolved over millennia.

The concept of having a single individual point of success is something that naturally appeals to the human brain; a singular source of success strikes one as being simple to reach as well as easy to measure. Unfortunately, the reality of success is orthogonal to this approach: very few achievements can be attributed purely to a single ground-breaking idea or strategy. Google's PageRank algorithm, developed in 1996 before Google was even incorporated, was the first of its kind and is what many people instinctively credit for Google's initial and ultimate success. What people fail to grasp is that Google's initial success was only partially down to the work on PageRank; in fact, the surrounding process chain is what ultimately allowed PageRank to make such a vast impact in the world of search engines.

Nature offers many examples of the kind of process chain we are talking about here, from evolution at a grand scale down to the humble dandelion. The dandelion flower is a great example. It could have evolved to spend most of its life creating one perfect seed, the holy grail of seeds, a seed that germinates in all conditions and copes with every adversary. However, all it takes is for that one seed to land in the wrong place, or to be defective just once, and the flower's line ceases. Even if that single seed planted and germinated every time, there would only ever be one flower per seed, so the population of dandelions would never increase. Now consider what nature has actually done with our humble dandelion. Instead of placing its bets on one seed, a single dandelion plant produces around 2,000 seeds! The laws of probability then give a far greater chance of some seeds germinating: even if only 1% of seeds germinate, that still produces 20 plants. And exponential growth means the original plant that produced 20 new plants now has those plants producing new plants of their own. Assuming that modest 1% germination rate, this already gives 20 plants by the 1st iteration, 400 by the 2nd, 160 thousand by the 4th and 64 million by the 6th iteration of this chain!
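To make the compounding in that analogy concrete, here is a toy calculation in Python, using the same assumed figures of 2,000 seeds per plant and a 1% germination rate as above:

```python
# Toy calculation of the dandelion compounding: 2,000 seeds per plant at a
# 1% germination rate gives 20 new plants per plant, so the population
# multiplies by roughly 20 each generation.
seeds_per_plant, germination_rate = 2000, 0.01
plants_per_plant = seeds_per_plant * germination_rate  # 20
for generation in (1, 2, 4, 6):
    print(generation, int(plants_per_plant ** generation))  # 20, 400, 160000, 64000000
```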

Clearly, then, one should not pour one's efforts into a single individual strategy, but instead seek to mass-produce strategies within a robust framework. This requires an investor or investment company to utilise a repeatable, production-line-style process in which ideas can be consistently generated, developed, tested and executed in a uniform and hence dependable way.

The sections below outline each stage of this production process with reference to creating successful quantitative trading strategies. Each stage can be thought of as a necessary, independent function. A single person in a small company might perform several of these functions at once, or each function might be a team of 50+ people, as can be the case in larger hedge funds. The key is that these functions, whilst forming part of the overall process chain, are independent in their own right.

Data Gathering

The basis for any machine learning strategy is data. This is often overlooked by novices who have an idea for a great machine learning algorithm but do not consider the data it requires, in terms of both volume and quality. This function is responsible for gathering, cleaning, transforming and storing large amounts of relevant data. It is the first important stage in the process chain, and a mistake here can echo false or misleading results up the chain, so the team at this stage must be expert in understanding the data and the data sources (markets, vendors, exotic data sources etc.). Understanding how the data is derived is as important as gathering it. For example: equity prices can be affected by splits that are not obvious in the raw series, futures and options are subject to quarterly rolling, and each vendor has its own slightly different way of calculating fundamental factors, all of which must be taken into account and communicated up the chain as it will influence the outcome of the strategy process.
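As a rough illustration of the kind of cleaning this stage performs, the sketch below back-adjusts daily equity closes for stock splits. The DataFrame layouts, column names and the back_adjust_for_splits helper are hypothetical, not a reference to any particular vendor feed or in-house system.

```python
# Minimal sketch of one data-cleaning step: back-adjusting daily equity
# closes for stock splits so that prices are comparable across split dates.
import pandas as pd

def back_adjust_for_splits(prices: pd.DataFrame, splits: pd.DataFrame) -> pd.DataFrame:
    """prices: columns ['date', 'close']; splits: columns ['date', 'ratio']
    (e.g. ratio=2.0 for a 2-for-1 split). Adds an 'adj_close' column."""
    prices = prices.sort_values("date").copy()
    prices["factor"] = 1.0
    for _, s in splits.iterrows():
        # Every close before the split date is divided by the split ratio
        prices.loc[prices["date"] < s["date"], "factor"] /= s["ratio"]
    prices["adj_close"] = prices["close"] * prices["factor"]
    return prices
```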

The team members in this function will usually be experts (BSc/BEng, MSc/MEng level) in dealing with raw data and storage solutions and will usually be from the fields of computer science, physics and IT.

Feature Detection

This is where the first part of machine learning comes in. Feature detection takes the data from the first stage (which should now be in a reasonably clean and accessible format) and manipulates it to extract hidden information in the form of features or signals. Feature detection can either be a black-box model (done by a machine learning algorithm itself) or a manual/semi-manual process run by a team of signal processing experts.

The output of the feature detection stage is either stored feature-tagged data, or a process for extracting features from data on the fly. The features extracted at this stage are further refined and used further up the chain to feed the investment strategies/algorithms as signals for decision making. It is important to note that features/signals on their own have no value until placed into a strategy and successfully executed; at this point no strategies, models or algorithms have yet been formed.
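As a sketch of the manual/semi-manual route, the snippet below derives a handful of common price-based features from the cleaned series produced by the previous stage. The feature names, lookback windows and the assumption of a daily 'adj_close' column are illustrative choices only.

```python
# A minimal sketch of hand-crafted feature extraction from a cleaned
# daily price series. Windows and feature choices are for illustration.
import numpy as np
import pandas as pd

def extract_features(prices: pd.DataFrame) -> pd.DataFrame:
    f = pd.DataFrame(index=prices.index)
    ret = prices["adj_close"].pct_change()
    f["ret_1d"] = ret                                            # daily return
    f["mom_20d"] = prices["adj_close"].pct_change(20)            # 20-day momentum
    f["vol_20d"] = ret.rolling(20).std() * np.sqrt(252)          # annualised volatility
    f["zscore_20d"] = (prices["adj_close"] - prices["adj_close"].rolling(20).mean()) \
                      / prices["adj_close"].rolling(20).std()    # mean-reversion signal
    return f.dropna()
```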

Team members in this function will consist of experts (MSc/MEng, PhD level) in signal processing, information theory, machine learning and capital markets and will usually be from the fields of computer science, electronic engineering, physics, maths and statistics.

Strategy Creation

This function is what most people think of when they envisage creating a trading strategy. With the data collected, cleaned and normalised, and the features analysed, extracted and stored in an accessible way, the creation of a strategy hypothesis can begin. That is what this function does: formulate hypothesis models using the available features combined with the team members' intrinsic knowledge of the market the hypothesis is being formulated for.

The features collected in the previous stage usually indicate some kind of signal and effect. The goal of strategy creation is both to explain the effects observed by formulating a general theory and to create a methodical way of monetising them. In the world of AI/ML trading strategies this can mean training a model on a large dataset of features (which themselves could have been extracted by an AI/ML model) so that it outputs a result that can be interpreted by an execution layer: either a human trading on behalf of the machine (the quantamental approach), or a separate automated execution engine placing trades directly (the algo trading approach).

The results of this model could be as simple as showing the relative strength probabilities of a basket of assets, issuing alerts for particular asset events, or issuing a trade directly. This function can itself have several layers, ranging from a pure model on its own to a self-contained system in which model, interpreter, execution layer and position manager are bundled together, depending on how automated the strategy will be.
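A minimal sketch of such a hypothesis model is shown below: a classifier trained on stored features to estimate next-day direction, emitting a simple long/flat signal for whichever execution layer sits above it. The use of scikit-learn's GradientBoostingClassifier, the label definition and the 0.55 probability threshold are assumptions for illustration, not a prescribed method.

```python
# Sketch of a hypothesis model: predict next-day direction from features
# and turn the prediction into a simple long/flat signal.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

def train_direction_model(features: pd.DataFrame) -> GradientBoostingClassifier:
    # Label: 1 if the next day's return is positive, else 0 (assumes a 'ret_1d' column)
    y = (features["ret_1d"].shift(-1) > 0).astype(int).iloc[:-1]
    X = features.iloc[:-1]
    model = GradientBoostingClassifier(random_state=0)
    model.fit(X, y)
    return model

def generate_signal(model, latest_features: pd.DataFrame) -> int:
    # 1 = long, 0 = flat; a real strategy would add sizing, hedging and risk checks
    prob_up = model.predict_proba(latest_features)[:, 1][-1]
    return int(prob_up > 0.55)
```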

Team members here will be experts (MSc/MEng, PhD, Prof. level) in data science, economics, computer science, maths, statistics and capital markets and will be from the fields of economics, maths, statistics, computer science with relevant backgrounds in the markets.

Backtesting

Once a strategy hypothesis is created that is deemed worthy of further testing and deployment, it is built as a proof-of-concept prototype and passed on to the backtesting function. This function takes the prototype and tests it using various methods and data to get a performance indication of the strategy and, ultimately, an indication of how profitable it might be.

One common misconception is that backtesting analyses only the strategy's performance in a historical setting, using historical prices to see how it would have performed had it been deployed in the past. Whilst this is certainly a large part of backtesting, expert backtesters know that historical performance is not necessarily an indication of future performance, and will therefore also test the strategy under more complex assumptions and scenarios to find its strengths, weaknesses and limitations.
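The sketch below shows the historical-prices part of this in its simplest vectorised form: positions derived from a long/flat signal are applied to daily returns with a flat transaction cost, and a few summary statistics are produced. The cost figure and the choice of metrics are illustrative assumptions; scenario and stress testing would sit on top of this.

```python
# Minimal vectorised historical backtest for a daily long/flat signal.
import numpy as np
import pandas as pd

def backtest(signal: pd.Series, returns: pd.Series, cost_per_trade: float = 0.0005) -> dict:
    position = signal.shift(1).fillna(0)         # act on yesterday's signal: no look-ahead
    trades = position.diff().abs().fillna(0)     # charge a cost each time the position changes
    pnl = position * returns - trades * cost_per_trade
    equity = (1 + pnl).cumprod()
    sharpe = np.sqrt(252) * pnl.mean() / pnl.std()
    max_dd = (equity / equity.cummax() - 1).min()
    return {"sharpe": float(sharpe),
            "max_drawdown": float(max_dd),
            "total_return": float(equity.iloc[-1] - 1)}
```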

The results of the backtest are analysed, usually by senior management or the head portfolio manager, to check performance, guard against overfitting and catch any other inconsistencies; this drives the decision of whether to take the strategy further or scrap it.

The team members in a backtest function consist of experts (MSc/MEng, PhD level) in economics, data science, applied maths, statistics, computer science and physics and will be from the fields of operational research, maths, statistics, economics, computer science and physics.

Productionisation

Once a trading strategy is approved for productionisation it has to be refined into a robust system. The proof of concept developed by the strategy creation function might have held together in backtesting on clean and well-refined data, but when real capital is involved the system has to stand up to adverse market events, bad/incorrect data, speed requirements, memory requirements and other real-world challenges. This is where the prototype model, held together by duct tape so to speak, gets turned into a shiny carbon-fibre workhorse. This usually involves extensive mathematical optimisation, top-of-the-line programming principles and heavy development work, followed by careful integration into the current trading infrastructure (be that co-located trading servers, or an external production system already hosting other running trading strategies).

With AI/ML trading strategies in particular, this could also involve setting up high-performance systems to periodically re-train the models with the latest market data so that the models are constantly improving and learning from new market conditions. As re-training is computationally heavy and expensive, it is usually piped to a separate system, either internal or a cloud-outsourced API such as the Altum Intelligence Platform or the more complex Google Machine Learning offering, to train models and serve real-time predictions from them.
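As a rough sketch of that re-training loop: a scheduled job pulls the latest feature set, re-fits the model and publishes a versioned artefact for the serving layer to pick up. The load_latest_features helper and the pickle-on-disk "registry" are hypothetical stand-ins; in practice this would be delegated to an internal platform or a cloud ML service such as those mentioned above.

```python
# Sketch of a periodic re-training job. load_latest_features() and
# train_direction_model() are the hypothetical helpers from earlier sketches.
import datetime as dt
import pickle

def retrain_and_publish(model_dir: str = "models") -> str:
    features = load_latest_features()         # hypothetical data-access helper
    model = train_direction_model(features)   # re-fit on the freshest data
    stamp = dt.datetime.utcnow().strftime("%Y%m%d")
    path = f"{model_dir}/direction_model_{stamp}.pkl"
    with open(path, "wb") as fh:
        pickle.dump(model, fh)                # a real system would validate before promoting
    return path
```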

Team members in this function consist of experts (BSc/BEng, MSc/MEng, PhD level) in computer science, software engineering and maths from those respective fields.

Portfolio Management

Finally the trading strategy is ready to be deployed in all its glory. But despite all the backtesting and careful development there is still a risk the strategy might perform badly in real-world conditions. This is why the portfolio management function employs a lifecycle approach to running a strategy. The lifecycle looks roughly like this:

Demo Trading

The strategy is deployed in the real world, but none of the trades it places commits real capital. The strategy itself doesn't know any different; its trades are not submitted to the market but are monitored internally for performance.

Staging

Once satisfied with the demo trades, a strategy is given a staging allocation of capital. This is a minimal capital allocation which is closely monitored for performance. If at this point the strategy turns out to perform badly in the real market with real capital, it is pulled with minimal capital loss.

Rebalancing

Over time, if the performance of a strategy is within reason, it is allocated more capital until the portfolio manager's capital balance requirements are satisfied (or until the strategy is capital-saturated to the point where more capital would start to affect its performance).

Performance Monitoring

The portfolio manager now monitors the strategy, alongside their basket of other strategies, for constant performance feedback. Over time the performance of the strategy will vary and eventually start to decline. As this happens the portfolio manager will continually rebalance the capital allocated to the strategy in line with its performance.

Termination

Eventually the performance or capital of the strategy will decline below acceptable limits for a sustained period of time, and this will signal the strategy's end of life. At this point it will be terminated and shelved. All strategies eventually end up at this stage due to the constantly evolving dynamics of the real world, be that market behaviour changing so that it no longer matches the strategy model, regulatory changes, advances in technology making the strategy obsolete, or competitors adopting similar and better strategies.
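To tie the lifecycle together, the sketch below expresses the stages above as a simple state machine in which a strategy is promoted, rebalanced or terminated based on its trailing Sharpe ratio. The stage names mirror the lifecycle described here; the thresholds and capital fractions are entirely hypothetical, and real allocation rules are far richer.

```python
# Sketch of the strategy lifecycle as a capital-allocation state machine.
from dataclasses import dataclass

@dataclass
class StrategyState:
    stage: str = "demo"      # demo -> staging -> live -> terminated
    capital: float = 0.0

def review(state: StrategyState, trailing_sharpe: float, max_capital: float) -> StrategyState:
    """One periodic review step; thresholds are hypothetical placeholders."""
    if state.stage == "demo":
        # Promote to a small staging allocation once demo performance is acceptable
        return StrategyState("staging", 0.05 * max_capital) if trailing_sharpe > 1.0 else state
    if state.stage == "staging":
        if trailing_sharpe < 0.0:
            return StrategyState("terminated", 0.0)   # pulled with minimal capital loss
        return StrategyState("live", 0.25 * max_capital) if trailing_sharpe > 1.0 else state
    if state.stage == "live":
        if trailing_sharpe < 0.0:
            return StrategyState("terminated", 0.0)   # sustained decline: end of life
        # Rebalance: scale the allocation with performance, capped at saturation
        return StrategyState("live", min(max_capital, state.capital * (1 + 0.1 * trailing_sharpe)))
    return state
```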


[Figure: The process chain for a trading strategy lifecycle]

Conclusion

By now you should have a basic, high-level grasp of what it takes to create a trading strategy in today's landscape. Whereas several decades ago many of the above stages were not required, or required only to a minimal degree, these days each stage is a vitally important component in the creation of a trading strategy system.

Furthermore, as mentioned at the outset, this process is set up to be a constant production line of trading strategies. Trying to craft that one miracle trading strategy will prove fruitless: even if a successful strategy is found and implemented, it will eventually succumb to the strategy lifecycle and its performance will decrease over time. Hence new trading strategies should be in various stages of creation at all times. This is exactly what the process chain allows; rather than the production of one strategy, it provides the tools and framework to help automate the creation of quantitative and systematic trading strategies.

A final note: what is covered here is just a small, high-level subset of everything that goes into creating a successful investment company, the core strategy process one might say. What is out of scope in this article is the wider ecosystem around managing a successful investment company, from the risk layer, to individual IT systems, to fund accounting and the other processes that make up an investment manager as a whole.

We hope this article helps fill the void between what non-professionals believe makes up a trading strategy system versus what is actually required for success.