Terms like machine learning and AI are hot topics today. On one hand, we are generating data sets larger than our imaginations can grasp, and the only way to use them properly surely involves supercomputing power. On the other hand, we read headlines about Facebook turning off its AI bots because they developed a language of their own that humans could not decipher. While bots that learn to play video games on their own are still a novelty, machine learning methods are being used with increasing success in many industries that generate large amounts of data, such as FinTech.
Many large hedge funds are several years into using machine learning algorithms to model and predict market behavior. But can these same types of methods be used in a different kind of market, such as cryptocurrency?
Cryptocurrency markets are generally considered to be more volatile than traditional financial markets. Large institutional investors are still hesitant to move into cryptocurrency, so relatively small trading volumes can move prices significantly. Trying to list every factor that could help predict the movement of such a volatile market quickly becomes overwhelming. What types of news headlines could move the price of Bitcoin up or down? Is there any overlap between traditional US markets and the patterns of the Bitcoin market? Let’s get wild: could natural disasters, or even weather patterns, have a measurable impact?
Our team spent several months building a machine learning system that collects data from a huge array of sources across the globe, far more than any human could reasonably ingest and process, to see whether any specific types of data were strongly correlated with the price of Bitcoin. Our hope was that we could use this information to predict upcoming movements in the market and capitalize on them.
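To make "correlated with the price of Bitcoin" concrete, here is a minimal sketch of one way to rank candidate factors: compute the Pearson correlation between each factor and the price, then sort by absolute value. The function names (`pearson`, `rank_factors`), the factor names, and all of the numbers are made up for illustration; this is not the project's code or data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_factors(factors, price):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scored = [(name, pearson(series, price)) for name, series in factors.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Made-up numbers purely for illustration; not data from the project.
price = [100.0, 102.0, 101.0, 105.0, 107.0]
factors = {
    "headline_sentiment": [0.50, 0.60, 0.55, 0.70, 0.80],
    "disaster_count": [3, 1, 4, 1, 5],
}
for name, r in rank_factors(factors, price):
    print(f"{name}: {r:+.2f}")
```

Pearson correlation only captures linear association, so a ranking like this is a first filter rather than a prediction method in itself.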
The first phase was to determine exactly what types of data to collect, and from where. The final set of data resulted in 924 unique data factors, and included data from the following types of sources:
- News Headlines: we performed sentiment analysis of news headlines from outlets around the world, including Al Jazeera, BBC, CCTV, CNBC, CNN, DW, EuroNews, France24, JapanTimes, JPostNews, NYT, RussiaToday, SkyNews, and TeleSur. We searched for terms such as “bitcoin” and “cryptocurrency”, ingested the headlines, and ran them through a program that assigns each one a score between 0 and 1, representing how negative or positive it is.
- Natural Disaster Data: we collected raw data from EMDAT.
- Conflict Data: we collected data regarding conflicts around the world, including factors like which country, number of military deaths, number of civilian deaths, etc.
- Political Data: we grabbed data from across the globe about elections, at the country level, and at individual local levels.
- Financial Market Data: probably the most obvious source. We ingested data from traditional stock markets all over the world and, of course, Bitcoin. For those familiar with stock market analysis, this data yields hundreds, if not thousands, of possible derived factors, such as moving averages over different periods of time.
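Two of the transformations mentioned in the list above are easy to sketch: scoring a headline between 0 and 1, and deriving moving averages from a price series. The code below is a toy illustration under stated assumptions. The word lists are hypothetical and tiny, the word-counting scorer is a stand-in for whatever sentiment program was actually used, and the function names are ours.

```python
import string

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"surges", "rally", "record", "gains", "adopts"}
NEGATIVE = {"crash", "hack", "ban", "plunges", "fraud"}


def headline_score(headline):
    """Score a headline from 0 (negative) to 1 (positive); 0.5 is neutral."""
    words = [w.strip(string.punctuation).lower() for w in headline.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.5  # no sentiment-bearing words found
    return pos / (pos + neg)


def moving_average(prices, window):
    """Simple moving average; one value per full window of `window` prices."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


print(headline_score("Bitcoin surges to record high"))

# One price series becomes several derived factors, one per window length.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
derived = {f"sma_{w}": moving_average(closes, w) for w in (2, 3)}
print(derived)
```

Varying the window length is what turns a single price series into many factors, which is how the count of derived factors grows so quickly.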
What a machine learning algorithm does (and this is oversimplifying) is take in all of this data, learn which data factors are the most important, and then use those factors to produce a guess at the price of Bitcoin. It can shift its understanding of the most important factors over time, adapting to changing conditions.
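As a rough illustration of "learning which factors matter and adapting over time", the sketch below uses a simple online linear model (a least-mean-squares update). This is a swapped-in textbook technique chosen for brevity, not a description of the algorithm we built; the class name, the synthetic data, and the learning rate are all assumptions.

```python
import random


class OnlineLinearModel:
    """Linear predictor updated one observation at a time (least mean squares)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, target):
        """Nudge each weight in the direction that shrinks the latest error."""
        error = target - self.predict(features)
        self.weights = [
            w + self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        return error


def train(model, relationship, steps, rng):
    """Feed `steps` synthetic observations whose target follows `relationship`."""
    for _ in range(steps):
        features = [rng.uniform(-1, 1) for _ in model.weights]
        model.update(features, relationship(features))


rng = random.Random(0)
model = OnlineLinearModel(n_features=2)

# Phase 1: only the first factor drives the target, so its weight grows.
train(model, lambda f: 2.0 * f[0], steps=2000, rng=rng)
print([round(w, 3) for w in model.weights])

# Phase 2: the relationship shifts to the second factor, and the weights follow.
train(model, lambda f: 2.0 * f[1], steps=2000, rng=rng)
print([round(w, 3) for w in model.weights])
```

In this toy setup the size of each weight acts as a crude importance score, which only makes sense when the factors are on comparable scales.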
When we built models using all of this data and had the algorithm predict the price of Bitcoin each day at the traditional closing time of the US markets (5pm ET), its predictions were 99.58% accurate.
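One common way to express accuracy for a price prediction is 100% minus the mean absolute percentage error (MAPE). The sketch below assumes that definition, which may differ from the one behind the 99.58% figure; the function name and the sample prices are made up for illustration.

```python
def accuracy_from_mape(actual, predicted):
    """Return 100 minus the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("percentage error is undefined for an actual value of 0")
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    mape = 100.0 * sum(errors) / len(errors)
    return 100.0 - mape


# Made-up closing prices and predictions, for illustration only.
actual = [100.0, 200.0]
predicted = [99.0, 202.0]
print(round(accuracy_from_mape(actual, predicted), 2))  # 99.0
```

Under this definition, each prediction in the example misses by 1%, so the accuracy comes out at 99%.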
So, can machine learning predict the price of Bitcoin? It looks like it. In future articles, we will explore this in more depth and examine the advantages and limitations of this approach.