Friday, November 25, 2016

Autonomous Trading: Language Wars

In my 7th blog post, I talked about a man named William Mok, who created an algorithmic trading system unlike any other (or so he claimed). Today, though, I want to talk about how a beginner would approach picking a platform for actual implementation. With a plethora of languages to choose from, whether the cult classics (Python, R, Matlab, SAS) or newer languages with a lot of potential (Julia, Scala, OCaml), it can be difficult to pick a platform for both analysis and implementation. It is important to first think about which part of the trading development process we are concerned with: the data pipeline or the implementation/brokerage API.

Today the focus will be on the data pipeline: how are we going to bring in, clean and analyse data? Languages such as R, Python and Julia are well suited to bringing in data for analysis. R also has many libraries built specifically for finance, such as quantmod (data retrieval, charting and technical analysis tools) or RQuantLib (an interface to the QuantLib library's option pricing functions).
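To make this concrete, here is a minimal sketch of such a data pipeline in Python using pandas (one of the general-purpose options above). The file prices.csv and its date/close columns are hypothetical placeholders; in practice the data would come from a market data vendor or a brokerage API.

    # Minimal data-pipeline sketch: load daily prices, clean them, analyse returns.
    # "prices.csv" and its "date"/"close" columns are hypothetical placeholders.
    import pandas as pd

    # Bring in: read a CSV of daily closing prices indexed by date
    prices = pd.read_csv("prices.csv", parse_dates=["date"], index_col="date")

    # Clean: sort by date, drop duplicate days, forward-fill missing closes
    prices = prices.sort_index()
    prices = prices[~prices.index.duplicated(keep="first")]
    prices = prices.ffill()

    # Analyse: daily returns and a 21-day rolling volatility
    returns = prices["close"].pct_change().dropna()
    rolling_vol = returns.rolling(window=21).std()

    print(returns.describe())
    print(rolling_vol.tail())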

The issue with many of these languages (R included) is scalability. For commercial use of trading algorithms (think billion-dollar quantitative hedge funds and designated financial market makers), these languages often cannot deliver the necessary latency, a common problem with higher-level languages. One reason is their garbage collectors (a feature many programming languages have that automatically reclaims memory no longer used by a program), whose pauses can interrupt time-critical code at unpredictable moments. The usual solution to the low-latency problem is to use C++ instead, but it is by no means an easy language to learn, and it is even harder to build rapid prototypes in (important when developing financial systems).
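As a rough illustration of the garbage collection issue, the sketch below (in Python, one of the higher-level languages above) pauses automatic collection around a latency-sensitive loop and triggers it afterwards. This is only a toy illustration of the idea, not how production trading systems are actually built.

    # Sketch: keep the garbage collector from pausing a latency-sensitive loop,
    # then pay the collection cost at a moment we choose.
    import gc
    import time

    def handle_tick(tick):
        # placeholder for latency-sensitive work (pricing, order decisions, ...)
        return tick * 1.0001

    gc.disable()                    # no automatic collection inside the hot loop
    start = time.perf_counter()
    for tick in range(1_000_000):
        handle_tick(tick)
    elapsed = time.perf_counter() - start
    gc.enable()
    gc.collect()                    # run the deferred collection now

    print(f"hot loop took {elapsed:.3f}s")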

My personal favourite right now is also gaining traction as a data analysis language. Julia is relatively new (it first appeared only 4 years ago) but claims to combine the speed of C with the data analysis capabilities of popular languages such as R or Python. One reason this is possible is that Julia code is just-in-time compiled to native machine code (via LLVM), and its core runtime is itself written in C and C++. Below is a table of its performance relative to other languages (where C's speed is the benchmark at a value of 1.0 and smaller is better).


Benchmark        Fortran    Julia   Python        R    Matlab    Octave   Mathematica   JavaScript      Go   LuaJIT     Java
fib                 0.70     2.11    77.76   533.52     26.89   9324.35        118.53         3.36    1.86     1.71     1.21
parse_int           5.05     1.45    17.02    45.73    802.52   9581.44         15.02         6.06    1.20     5.77     3.35
quicksort           1.31     1.15    32.89   264.54      4.92   1866.01         43.23         2.70    1.29     2.03     2.60
mandel              0.81     0.79    15.32    53.16      7.58    451.81          5.13         0.66    1.11     0.67     1.35
pi_sum              1.00     1.00    21.99     9.56      1.00    299.31          1.69         1.01    1.00     1.00     1.00
rand_mat_stat       1.45     1.66    17.93    14.56     14.52     30.93          5.95         2.30    2.96     3.27     3.92
rand_mat_mul        3.48     1.02     1.14     1.57      1.12      1.12          1.30        15.07    1.42     1.16     2.36

Table: benchmark times relative to C (smaller is better, C performance = 1.0).
Versions tested: Fortran (gcc 5.1.1), Julia 0.4.0, Python 3.4.3, R 3.2.2, Matlab R2015b, Octave 4.0.0, Mathematica 10.2.0, JavaScript (V8 3.28.71.19), Go (go1.5), LuaJIT (gsl-shell 2.3.1), Java 1.8.0_45.

More details regarding the specifics of these performance tests can be found in my image reference.
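To give a sense of what a micro-benchmark like pi_sum actually measures, here is my own rough Python re-creation of that kernel (an approximation based on its name and published descriptions, not a copy of the official julialang.org benchmark code), timed with the standard timeit module.

    # Rough approximation of the "pi_sum" micro-benchmark: repeatedly sum
    # 1/k^2 for k = 1..10000 (the series converges to pi^2/6).
    import timeit

    def pi_sum():
        total = 0.0
        for _ in range(500):
            total = 0.0
            for k in range(1, 10001):
                total += 1.0 / (k * k)
        return total

    # Report the best of three runs, as benchmark suites typically do
    best = min(timeit.repeat(pi_sum, number=1, repeat=3))
    print(f"pi_sum best of 3: {best:.3f}s in pure Python")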

Writing References:
http://www.kdnuggets.com/2014/08/four-main-languages-analytics-data-mining-data-science.html

Image References:
http://julialang.org





Friday, November 18, 2016

Machine Learning: Deep Learning

Another field of research that has grown increasingly popular in ML is deep learning. It is most often used as a supervised learning technique, although it has unsupervised variants as well. It is particularly useful for tasks where a single observation or basic unit has little meaning in and of itself, but a collection or particular combination of these units carries very useful meaning; this power comes from its ability to learn representations from the data that is passed to it.

Deep learning is inspired by the structure of the brain, much like a neural network: each node, or neuron, has a set of inputs with specific weights and computes some function of them (a weighted sum followed by a nonlinearity is typical, but it can be almost anything we want). The network is created when we connect neurons to each other, to the input data, and to the outputs (where we store the analysed information, the "answer" to our problem). A deep network has several layers of neurons, where each subsequent layer learns a more abstract representation of the data from the previous one. Take the example of a deep neural network for facial recognition (see the image reference below): early layers pick up edges, later layers combine them into facial features, and the final layers recognise whole faces.

The figure referenced below shows what a simple neural network would look like: we have the input nodes (or neurons), a hidden layer of neurons that process the data, and finally the output nodes that give us some sort of solution.
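To make that structure concrete, here is a minimal sketch in Python/numpy of a single forward pass through such a network (2 input nodes, a hidden layer of 3 neurons, 1 output node). The weights here are random placeholders rather than learned values; training them is a separate step.

    # Minimal forward pass: 2 input nodes -> 3 hidden neurons -> 1 output node.
    import numpy as np

    rng = np.random.default_rng(0)

    # Each neuron has one weight per input plus a bias (placeholder values here)
    W1 = rng.normal(size=(3, 2))    # hidden-layer weights
    b1 = np.zeros(3)
    W2 = rng.normal(size=(1, 3))    # output-layer weights
    b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x):
        hidden = sigmoid(W1 @ x + b1)       # weighted sum + nonlinearity at each hidden neuron
        output = sigmoid(W2 @ hidden + b2)  # output node combines the hidden activations
        return output

    x = np.array([0.5, -1.2])               # an example input
    print(forward(x))                       # the network's "answer" for this input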


Image Reference:
http://stats.stackexchange.com/questions/114385/what-is-the-difference-between-convolutional-neural-networks-restricted-boltzman



Writing Reference:
http://www.kdnuggets.com/2015/01/deep-learning-explanation-what-how-why.html

Friday, November 11, 2016

Machine Learning: Bayesian Networks

Last week, I introduced the topic of machine learning and how it is possible to train algorithms to learn from a set of data. A common technique used for supervised learning is the Bayesian network: simply put, a tool for representing a large multivariate probability model so that the dependence relationships between its variables can be discerned. What is most interesting about Bayesian networks is that they combine quantitative analysis with user intuition to "learn". They are usually drawn as graphical representations such as the network seen below.

Although it is not often accurate to model stock price behaviour this way, for the sake of simplicity let us consider the network below, where each node represents a random variable. The arrows pointing from Stock1 to Stock2 and Stock4 essentially mean that Stock2 and Stock4 are conditionally independent given Stock1: once we know how the price of Stock1 moved (an increase or decrease, for example), observing a move in the price of Stock2 tells us nothing further about how the price of Stock4 moves. It is easy to see how extending the model to even just 5 stocks starts to create a fairly complex network.
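Here is a minimal numerical sketch of that structure in Python/numpy: three binary variables (up/down moves) in which Stock2 and Stock4 each depend only on Stock1, so the joint distribution factorises as P(S1) P(S2|S1) P(S4|S1). The probabilities are made-up illustrative numbers, not estimates from market data.

    # Tiny Bayesian network: Stock1 -> Stock2, Stock1 -> Stock4 (binary up/down moves).
    # All probabilities below are made up purely for illustration.
    import numpy as np

    p_s1 = np.array([0.6, 0.4])                  # P(S1): up, down
    p_s2_given_s1 = np.array([[0.7, 0.3],        # P(S2 | S1 = up)
                              [0.2, 0.8]])       # P(S2 | S1 = down)
    p_s4_given_s1 = np.array([[0.65, 0.35],      # P(S4 | S1 = up)
                              [0.25, 0.75]])     # P(S4 | S1 = down)

    # The network's factorisation: P(S1, S2, S4) = P(S1) * P(S2|S1) * P(S4|S1)
    joint = (p_s1[:, None, None]
             * p_s2_given_s1[:, :, None]
             * p_s4_given_s1[:, None, :])

    # Conditional independence check: P(S4 | S1, S2) should not depend on S2
    p_s4_given_s1_s2 = joint / joint.sum(axis=2, keepdims=True)
    print(np.allclose(p_s4_given_s1_s2[:, 0, :], p_s4_given_s1_s2[:, 1, :]))  # True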



In recent years, the quantitative hedge fund industry has shifted away from traditional predictive tools in favour of more complex models such as Bayesian networks and other machine learning algorithms, in the hope of achieving a deeper understanding of inter-market relationships. These patterns are then used to trade in inefficient financial markets. The main goal of algorithmic trading is to find statistical anomalies in the data and determine profitable ways of exploiting them. This has become increasingly difficult in the past decade, however, as the rise in the number of algorithmic trading participants has removed many of these market anomalies, rendering many such strategies unprofitable.

The potential of ML algorithms such as Bayesian networks in the finance industry extends well beyond this particular example. They can be used to detect fraud, assist in risk management and even build models to price insurance policies.

Writing References:
https://kuscholarworks.ku.edu/bitstream/handle/1808/161/CF99.pdf;jsessionid=8EE7C78BF26BD0F349409F5AEFFE91EE?sequence=1
https://www.ics.uci.edu/~rickl/courses/cs-171/cs171-lecture-slides/cs-171-17-BayesianNetworks.pdf

Image Reference:
http://www.qminitiative.org/UserFiles/files/S_Clémençon_ML.pdf

Friday, November 4, 2016

Machine learning (A brief Introduction)

Machine learning is what we call the process by which algorithms learn from large amounts of data without being explicitly programmed to do so (as in a rules-based framework). One of the reasons why I'll be devoting a lot of time to this concept is that the field of artificial intelligence has become much more of a reality with recent innovations in computing hardware like the GPU (graphics processing unit, known mainly for making parallel computing faster, more efficient and more powerful).

In my last post, I talked about Tech Trader Fund, a trading system built on multiple layers of artificial intelligence (one of which was a form of machine learning). Professor Tom Mitchell's definition of machine learning best encapsulates the process:

"A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E."
--Tom Mitchell, Carnegie Mellon University

The two broad fields within machine learning are supervised learning (the program is trained on a pre-defined, labelled set of data) and unsupervised learning (the program is given data and must find relationships with little to no guidance). Machine learning's usefulness is primarily seen in analysing very large data sets (with millions of observations and variables whose relationships one might want to study), where traditional computational methods are no longer feasible.
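As a concrete contrast between the two, here is a minimal sketch using scikit-learn (a Python library not mentioned in this post): the supervised model is handed labelled examples, while the unsupervised one is given the same features with no labels at all and has to find the structure itself.

    # Supervised vs. unsupervised learning on a toy two-feature data set.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),   # group A
                   rng.normal(4.0, 1.0, size=(50, 2))])  # group B
    y = np.array([0] * 50 + [1] * 50)                    # labels, used only by the supervised model

    # Supervised: learn from (features, label) pairs, then predict labels for new points
    clf = LogisticRegression().fit(X, y)
    print("supervised predictions:", clf.predict([[0.5, 0.2], [3.8, 4.1]]))

    # Unsupervised: only the features are given; the algorithm clusters them itself
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print("unsupervised cluster labels:", clusters[:5], clusters[-5:])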

Some extensions of machine learning include decision trees, inductive programming, Bayesian networks and reinforcement learning, all of which help the system learn from the data passed to it. But these frameworks do have their drawbacks. Even today, they are still prone to a wide variety of errors, which can take a significant amount of computing power (and care) to overcome. For example, financial markets tend to be very noisy, especially on a day-to-day timescale where volatility is many multiples of the average return. In such a high-noise environment, a model complicated enough to find relationships between asset prices risks fitting the noise itself (overfitting), while a model that is too simple may just give us plain wrong or misleading results.
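To illustrate that trade-off, the short Python/numpy sketch below fits a simulated, very noisy linear relationship with a simple model and with an overly complex one; the complex model matches the training data far more closely but will usually do worse on fresh data drawn from the same process. The data is simulated, not real market data.

    # Overfitting a noisy (simulated) relationship with an overly complex model.
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(-1, 1, 15)
    y = 0.5 * x + rng.normal(0, 0.5, x.size)         # weak signal, noise several times larger

    x_new = np.linspace(-0.95, 0.95, 15)             # fresh draws from the same process
    y_new = 0.5 * x_new + rng.normal(0, 0.5, x_new.size)

    for degree in (1, 9):                            # simple model vs. overly complex one
        coeffs = np.polyfit(x, y, degree)
        train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
        new_err = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
        print(f"degree {degree}: train error {train_err:.3f}, new-data error {new_err:.3f}")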

Next week we will dive into how traders in financial markets can use ML algorithms more specifically to create an edge for themselves in what many are calling a crowded space.

Writing References:
https://www.toptal.com/machine-learning/machine-learning-theory-an-introductory-primer

Image References:
http://www.nsightfortravel.com/wp-content/uploads/LEARNING-MACHINE.jpg