{"id":1631,"date":"2019-01-16T20:50:09","date_gmt":"2019-01-16T20:50:09","guid":{"rendered":"https:\/\/www.codeastar.com\/?p=1631"},"modified":"2019-01-17T02:59:31","modified_gmt":"2019-01-17T02:59:31","slug":"get-rich-stock-trading-machine-learning","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/get-rich-stock-trading-machine-learning\/","title":{"rendered":"Stock Trading with Machine Learning and Get Rich"},"content":{"rendered":"\n

Okay, I admit it, this looks like a clickbait headline :]] (yes, we did a similar thing<\/a> before :]] ). But it is not clickbait at all, as we are actually discussing this topic this time. There is a Kaggle challenge on predicting stock trading trends<\/a>, which is a good fit for our topic. So we use this challenge to start our journey to get rich! (it always feels good to use an “encouraging” line :]] )<\/p>\n\n\n\n\n\n\n\n

Stock Trading Datasets<\/h3>\n\n\n\n

Like all our previous machine learning projects, we start our journey by getting the related datasets. This time, however, we have run into a restriction. In this stock trading challenge, we can only use the APIs and kernels provided by Kaggle, i.e. we can only load the datasets through Kaggle’s API. For the usage of this specific API, we can take a look at Kaggle’s stock trading challenge official getting started kernel<\/a>.<\/p>\n\n\n\n

Once we have loaded the datasets, “market_train_df<\/em>” and “news_train_df<\/em>”, with Kaggle’s API, we can take a look at their content:<\/p>\n\n\n\n

\"Market
market_train_df<\/em><\/figcaption><\/figure>\n\n\n\n
\"News
news_train_df<\/em><\/figcaption><\/figure>\n\n\n\n

“market_train_df<\/em>” is a dataframe that contains market information such as stock code, open price, close price, trading volume, etc., while “news_train_df<\/em>” is a dataframe that stores stock-related news information, such as headline, tag, word counts, the probability of whether the news is positive or negative, etc.<\/p>\n\n\n\n

Every data science challenge comes with a target to solve. What is the target in this challenge then? Since this challenge is about stock trading, in order to get rich, we have to predict the stock price in the future. In this challenge, we are going to predict the probability of a stock going up or down in the next 10 days.<\/p>\n\n\n\n
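To make the idea of an up-or-down target over a 10-day horizon more concrete, here is a minimal sketch on made-up prices. Everything in it is an assumption for illustration: the toy dataframe, the asset code TOY.N, the column names fwd_return_10d and went_up, and the choice to treat each row as one trading day. It is a simplified stand-in, not the target definition used by the challenge itself.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 30 business days of closing prices for one made-up asset.
dates = pd.date_range('2016-01-04', periods=30, freq='B')
rng = np.random.default_rng(0)
toy_df = pd.DataFrame({
    'time': dates,
    'assetCode': 'TOY.N',
    'close': 100 + rng.normal(0, 1, size=30).cumsum(),
})

HORIZON = 10  # look-ahead window, assuming one row per trading day

# Forward return: (close HORIZON rows later / close today) - 1, computed per asset.
future_close = toy_df.groupby('assetCode')['close'].shift(-HORIZON)
toy_df['fwd_return_10d'] = future_close / toy_df['close'] - 1

# Binary label: 1.0 if the price is higher HORIZON rows later, 0.0 otherwise.
toy_df['went_up'] = (toy_df['fwd_return_10d'] > 0).astype(float)

# The last HORIZON rows have no future price, so their label is left missing.
toy_df.loc[toy_df['fwd_return_10d'].isna(), 'went_up'] = np.nan

print(toy_df[['time', 'close', 'fwd_return_10d', 'went_up']].head())
```

A model would then be trained to output how likely went_up is to be 1 for each row, rather than the future price itself.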

EDA on Stock Trading <\/h3>\n\n\n\n

“A picture is worth a thousand words”. That is why we always use EDA (Exploratory Data Analysis) to visualize our findings. First, let’s take a look at our market training dataset. Since there are about 3,800 stock codes in the 2007 to 2016 date range, we pick 5 stocks that we have mentioned previously on the CodeAStar blog for our EDA. So we have: Alphabet<\/em> (Google<\/em>), Amazon<\/em>, Apple<\/em>, eBay<\/em> and Microsoft<\/em>.<\/p>\n\n\n\n
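Before plotting, it can help to check that the chosen names really match rows in the data. The sketch below shows one way to do that with pandas on a tiny stand-in dataframe. The dataframe toy_market_df and its prices are invented for illustration; only the assetName, time and close column names and the five stock names are taken from this post.

```python
import pandas as pd

# Hypothetical stand-in for market_train_df with only the columns used here.
# The prices are made up and carry no meaning.
toy_market_df = pd.DataFrame({
    'time': pd.to_datetime(['2016-01-04', '2016-01-04', '2016-01-04', '2016-01-05']),
    'assetName': ['Apple Inc', 'eBay Inc', 'Some Other Corp', 'Apple Inc'],
    'close': [105.0, 26.0, 10.0, 103.0],
})

stock_name_arr = ['Microsoft Corp', 'Amazon.com Inc', 'Alphabet Inc', 'Apple Inc', 'eBay Inc']

# Keep only the rows whose assetName is on our shortlist.
selected_df = toy_market_df[toy_market_df['assetName'].isin(stock_name_arr)]

# How many distinct assets exist overall, and how many rows each selected asset has.
print(toy_market_df['assetName'].nunique())
print(selected_df['assetName'].value_counts())

# Names on the shortlist that do not appear in the data at all.
missing_names = sorted(set(stock_name_arr) - set(selected_df['assetName']))
print(missing_names)
```

On the real dataset, an empty missing_names list would suggest that every name on the shortlist is spelled the way the data spells it.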

import plotly.graph_objs as go\nimport plotly.offline as py\npy.init_notebook_mode(connected=True)\n\ndata = []\nstock_name_arr = ['Microsoft Corp', 'Amazon.com Inc', 'Alphabet Inc', 'Apple Inc', 'eBay Inc']\nfor stock_name in stock_name_arr:\n    trace = go.Scatter(\n            x = market_train_df.loc[market_train_df['assetName'] == stock_name]['time'].dt.strftime(date_format='%Y-%m-%d').values,\n            y = market_train_df.loc[market_train_df['assetName'] == stock_name]['close'].values,\n            name=stock_name\n        )\n    data.append(trace)\n\nlayout = go.Layout(\n                title = \"Stocks Price Chart\",\n                xaxis=dict(\n                  title='Date',\n                  rangeslider=dict(visible=True),\n                  type='date'\n                ),\n                yaxis=dict(\n                  title=\"Price (USD)\",\n                  type='log',\n                  autorange=True\n                )               \n             )\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)\n<\/pre>\n\n\n\n

Here we go:<\/p>\n\n\n\n