{"id":282,"date":"2017-07-15T09:11:40","date_gmt":"2017-07-15T09:11:40","guid":{"rendered":"http:\/\/www.codeastar.com\/?p=282"},"modified":"2017-07-16T10:57:52","modified_gmt":"2017-07-16T10:57:52","slug":"choose-machine-learning-models-python","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/","title":{"rendered":"How to choose a machine learning model in Python?"},"content":{"rendered":"
\"Machine
Machine Learning model selection technique : K-Fold Cross Validation<\/figcaption><\/figure>\n

We tried our <\/span>first-ever Data Science project<\/a> in the last post.<\/span><\/p>\n

Do you feel excited? Yeah, we should! But we also skipped several details of the Data Science Life Cycle<\/a>. Do you remember why I picked Support Vector Machine as our machine learning model last time? I’ll give you 3 seconds to answer.<\/p>\n

3……<\/p>\n

2…..<\/p>\n

1…<\/p>\n

<\/p>\n

I chose Support Vector Machine,\u00a0SVC()<\/em>, as our model because it is shorter to type for our tutorial ( :]] ).<\/p><\/blockquote>\n

Yes, it is for tutorial purposes only.\u00a0If you answer like that to your employer or client in real life, well, you might be fired or kicked by them. (Again, don’t try this at work!)<\/p>\n

The proper way to pick a model<\/h3>\n

A good model in data science is one that gives more accurate predictions. To find an accurate model, one of the most popular techniques is k-fold cross validation<\/strong>.<\/p>\n

K-fold cross validation splits our sample data into K parts of roughly equal size, called folds. Together, the K folds cover all samples in our data. The validation process runs K times. Each time, one fold is held out as the testing set and the model is trained on the remaining K-1 folds. Thus all data is used for both training and testing, and each sample is tested exactly once.<\/p>\n
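To make the splitting idea concrete, here is a minimal sketch in plain Python. The helper name `k_fold_indices` is made up for this illustration and is not part of any library. It assumes contiguous folds without shuffling, and gives the first few folds one extra sample when the data does not divide evenly.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for each of k contiguous folds."""
    # When n_samples is not divisible by k, the first (n_samples % k) folds
    # get one extra sample so that every sample lands in exactly one fold.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        # the held-out fold is the testing set
        test_idx = list(range(start, start + size))
        # everything before and after the held-out fold is the training set
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# A tiny example: 10 samples split into 5 folds
folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each index shows up in a testing set once and in a training set K-1 times, which is the property described above.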

K-fold cross validation in action<\/h3>\n

Let’s use our Iris data set as an example:<\/p>\n

import pandas as pd\r\nimport numpy as np\r\nfrom sklearn import model_selection\r\n\r\ndf = pd.read_csv(\"http:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/iris\/bezdekIris.data\",\r\nnames = [\"Sepal Length\", \"Sepal Width\", \"Petal Length\", \"Petal Width\", \"Class\"])\r\n\r\n#shuffle our data and we use 121 out of 150 as training data\r\ndata_array = df.values\r\nnp.random.shuffle(data_array)\r\nX_learning = data_array[:121][:,0:4]\r\nY_learning = data_array[:121][:,4]\r\n\r\n#split our data in 10 folds\r\nkfold = model_selection.KFold(n_splits=10)\r\n<\/pre>\n

In Python, K-fold\u00a0cross validation can be done using model_selection.KFold()<\/em> from sklearn<\/em>. We take 121 records as our sample data and split them into 10 folds as kfold<\/em>.<\/p>\n

So what is inside the kfold? We can examine the kfold content by typing:<\/p>\n

for train_index, test_index in kfold.split(X_learning):\r\n  print(\"Train Index:\")\r\n  print(train_index)\r\n  print(\"Test Index:\")\r\n  print(test_index)\r\n<\/pre>\n

Below are the first 2 folds of the kfold:
\n\"\"<\/p>\n

We can see that on each fold the testing set is excluded from the training set. On the next fold, the previous testing set is put back into the training set and a new testing set is held out. At the end of 10 folds, every sample has been used for training 9 times and for testing exactly once.<\/p>\n
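If you would rather check this than take my word for it, here is a small sketch. It assumes a placeholder array of 121 rows (named `X_demo` here, filled with zeros, since only the row count matters for the split) instead of the real Iris values, and counts how often each row lands in a testing set.

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 121  # same number of records as our sample data
X_demo = np.zeros((n_samples, 4))  # placeholder features, only the row count matters

kfold_demo = KFold(n_splits=10)
test_counts = np.zeros(n_samples, dtype=int)
fold_sizes = []

for train_index, test_index in kfold_demo.split(X_demo):
    # a row must never be in the training and testing set of the same fold
    assert len(np.intersect1d(train_index, test_index)) == 0
    test_counts[test_index] += 1
    fold_sizes.append(len(test_index))

print("Testing set sizes:", fold_sizes)
print("Times each row was tested (min, max):", test_counts.min(), test_counts.max())
```

Since 121 does not divide evenly by 10, one fold ends up with 13 testing rows and the others with 12, but every row is still tested exactly once.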

Now we have our k folds of data (10 folds, to be exact). It is time to test our models. But then another problem arises.<\/p>\n

Which machine learning models should we choose?<\/h3>\n

Luckily there is a cheat sheet from Scikit Learn to save our day:<\/p>\n

\"Machine<\/p>\n

(source: http:\/\/scikit-learn.org\/stable\/tutorial\/machine_learning_map\/<\/a>)<\/p>\n

For our Iris classification project, we teach the computer with labeled examples, that is, flower measurements paired with their known classes. The computer then makes decisions based on what it has been taught. We call this supervised learning.<\/p>\n

From the Scikit Learn library, we pick\u00a0some typical models from its supervised learning<\/a> list:<\/p>\n

    \n
  1. Logistic Regression (LoR)<\/li>\n
  2. Linear Discriminant Analysis (LDA)<\/li>\n
  3. Quadratic Discriminant Analysis (QDA)<\/li>\n
  4. Support Vector Classification (SVC)<\/li>\n
  5. Linear SVC (LSVC)<\/li>\n
  6. Stochastic Gradient Descent (SGD)<\/li>\n
  7. K-Nearest Neighbors Classifier (KNN)<\/li>\n
  8. Gaussian Naive Bayes (GNB)<\/li>\n
  9. Decision Tree Classifier (DT)<\/li>\n
  10. Random Forest Classifier (RF)<\/li>\n<\/ol>\n
    from sklearn.linear_model import LogisticRegression\r\nfrom sklearn.naive_bayes import GaussianNB\r\nfrom sklearn.discriminant_analysis import LinearDiscriminantAnalysis\r\nfrom sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis\r\nfrom sklearn.svm import SVC\r\nfrom sklearn.svm import LinearSVC\r\nfrom sklearn.linear_model import SGDClassifier\r\nfrom sklearn.neighbors import KNeighborsClassifier\r\nfrom sklearn.tree import DecisionTreeClassifier\r\nfrom sklearn.ensemble import RandomForestClassifier\r\n<\/pre>\n

    Put them in our model list.<\/p>\n

    models = []\r\nmodels.append((\"LoR\", LogisticRegression()) )\r\nmodels.append((\"LDA\", LinearDiscriminantAnalysis()) )\r\nmodels.append((\"QDA\", QuadraticDiscriminantAnalysis()) )\r\nmodels.append((\"SVC\", SVC()) )\r\nmodels.append((\"LSVC\", LinearSVC()) )\r\nmodels.append((\"SGD\", SGDClassifier()) )\r\nmodels.append((\"KNN\", KNeighborsClassifier()) )\r\nmodels.append((\"GNB\", GaussianNB() ))\r\nmodels.append((\"DT\", DecisionTreeClassifier()) )\r\nmodels.append((\"RF\", RandomForestClassifier()) )\r\n<\/pre>\n

    And let the computer calculate the k-fold cross validation score for each model.<\/p>\n

    #lists to collect each model's name, mean score and standard deviation for plotting\r\nmodel_names = []\r\nmeans = []\r\nstds = []\r\n\r\nfor name, model in models:\r\n     #cross validation among models, score based on accuracy\r\n     cv_results = model_selection.cross_val_score(model, X_learning, Y_learning, scoring='accuracy', cv=kfold )\r\n     print(\"\\n\"+name)\r\n     model_names.append(name)\r\n     print(\"Result: \"+str(cv_results))\r\n     print(\"Mean: \" + str(cv_results.mean()))\r\n     print(\"Standard Deviation: \" + str(cv_results.std()))\r\n     means.append(cv_results.mean())\r\n     stds.append(cv_results.std())\r\n<\/pre>\n

    A result list like the one below will be shown, with each model’s results, mean and standard deviation.
    \n\"\"<\/p>\n

    Since we have matplotlib, we can use it to visualize the k-fold cross validation results.<\/p>\n

    import matplotlib.pyplot as plt\r\n\r\nx_loc = np.arange(len(models))\r\nwidth = 0.5\r\n#one bar per model, with the standard deviation as error bar\r\nmodels_graph = plt.bar(x_loc, means, width, yerr=stds)\r\nplt.ylabel('Accuracy')\r\nplt.title('Scores by models')\r\nplt.xticks(x_loc, model_names) # models name on x-axis\r\n\r\n#add value on the top of every bar\r\ndef addLabel(rects):\r\n    for rect in rects:\r\n        height = rect.get_height()\r\n        plt.text(rect.get_x() + rect.get_width()\/2., 1.05*height,\r\n                 '%f' % height, ha='center',\r\n                 va='bottom')\r\n\r\naddLabel(models_graph)\r\n\r\nplt.show()\r\n<\/pre>\n

    Then we can get a graph like the one below:
    \n\"\"<\/p>\n

    I blindly picked SVC as our learning model last time, but according to the scoring, it is not bad for the Iris classification actually.<\/h4>\n

    Finding the model<\/h3>\n

    Model scores will differ with the sample size and the number of features in your data. So the point of k-fold cross validation is not to find an all-time ultimate model, but a suitable model for your current data set.<\/p>\n
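As a rough illustration of picking a suitable model for the data at hand, the sketch below compares just two of the models from our list and keeps the one with the higher mean accuracy. It is a simplified stand-in, not the tutorial's full script: it assumes the Iris copy bundled with scikit-learn (`load_iris`) rather than the CSV download, and lets `KFold` do the shuffling with a fixed `random_state` so that the folds are repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: use the Iris data shipped with scikit-learn, not the CSV from the post
X, y = load_iris(return_X_y=True)

# shuffle inside KFold, because the bundled data is sorted by class
kfold = KFold(n_splits=10, shuffle=True, random_state=0)

candidates = {
    "KNN": KNeighborsClassifier(),
    "DT": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, scoring="accuracy", cv=kfold)
    mean_scores[name] = scores.mean()
    print(name, "mean:", scores.mean(), "std:", scores.std())

# keep the candidate with the highest mean accuracy on this data set
best_name = max(mean_scores, key=mean_scores.get)
print("Most suitable for this data:", best_name)
```

The winner here only tells us which candidate suits this data set under these folds. When two means are close, the standard deviation is worth a look before declaring one model better than the other.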

     <\/p>\n

    The complete source can be found at\u00a0https:\/\/github.com\/codeastar\/python_data_science_tutorial<\/a>\u00a0.<\/p>\n

     <\/p>\n","protected":false},"excerpt":{"rendered":"

    We have tried our first ever Data Science project from last post. Do you feel excited? Yeah, we should! But we have also omitted several details on the Data Science Life Cycle. Do you remember why I picked Support Vector Machine as our machine learning model last time? I give you 3 seconds to answer. […]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[18],"tags":[19,26,27,8],"jetpack_publicize_connections":[],"yoast_head":"\nHow to choose a machine learning model in Python? 
⋆ Code A Star<\/title>\n<meta name=\"description\" content=\"In order to find a suitable machine learning model for our data, the most popular technique is using k-fold cross validation.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to choose a machine learning model in Python? ⋆ Code A Star\" \/>\n<meta property=\"og:description\" content=\"In order to find a suitable machine learning model for our data, the most popular technique is using k-fold cross validation.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\" \/>\n<meta property=\"og:site_name\" content=\"Code A Star\" \/>\n<meta property=\"article:publisher\" content=\"codeastar\" \/>\n<meta property=\"article:author\" content=\"codeastar\" \/>\n<meta property=\"article:published_time\" content=\"2017-07-15T09:11:40+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2017-07-16T10:57:52+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.codeastar.com\/wp-content\/uploads\/2017\/07\/kfold_5.png\" \/>\n<meta name=\"author\" content=\"Raven Hon\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@codeastar\" \/>\n<meta name=\"twitter:site\" content=\"@codeastar\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Raven Hon\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\"},\"author\":{\"name\":\"Raven Hon\",\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"headline\":\"How to choose a machine learning model in Python?\",\"datePublished\":\"2017-07-15T09:11:40+00:00\",\"dateModified\":\"2017-07-16T10:57:52+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\"},\"wordCount\":637,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"keywords\":[\"Data Science\",\"k-fold cross validation\",\"learning algorithm\",\"Python\"],\"articleSection\":[\"Learn Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\",\"url\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\",\"name\":\"How to choose a machine learning model in Python? 
⋆ Code A Star\",\"isPartOf\":{\"@id\":\"https:\/\/www.codeastar.com\/#website\"},\"datePublished\":\"2017-07-15T09:11:40+00:00\",\"dateModified\":\"2017-07-16T10:57:52+00:00\",\"description\":\"In order to find a suitable machine learning model for our data, the most popular technique is using k-fold cross validation.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.codeastar.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to choose a machine learning model in Python?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.codeastar.com\/#website\",\"url\":\"https:\/\/www.codeastar.com\/\",\"name\":\"Code A Star\",\"description\":\"We don't wish upon a star, we code a star\",\"publisher\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.codeastar.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\",\"name\":\"Raven 
Hon\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1\",\"width\":70,\"height\":70,\"caption\":\"Raven Hon\"},\"logo\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/\"},\"description\":\"Raven Hon is\u00a0a 20 years+ veteran in information technology industry who has worked on various projects from console, web, game, banking and mobile applications in different sized companies.\",\"sameAs\":[\"https:\/\/www.codeastar.com\",\"codeastar\",\"https:\/\/twitter.com\/codeastar\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"How to choose a machine learning model in Python? ⋆ Code A Star","description":"In order to find a suitable machine learning model for our data, the most popular technique is using k-fold cross validation.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/","og_locale":"en_US","og_type":"article","og_title":"How to choose a machine learning model in Python? 
⋆ Code A Star","og_description":"In order to find a suitable machine learning model for our data, the most popular technique is using k-fold cross validation.","og_url":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/","og_site_name":"Code A Star","article_publisher":"codeastar","article_author":"codeastar","article_published_time":"2017-07-15T09:11:40+00:00","article_modified_time":"2017-07-16T10:57:52+00:00","og_image":[{"url":"https:\/\/www.codeastar.com\/wp-content\/uploads\/2017\/07\/kfold_5.png"}],"author":"Raven Hon","twitter_card":"summary_large_image","twitter_creator":"@codeastar","twitter_site":"@codeastar","twitter_misc":{"Written by":"Raven Hon","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#article","isPartOf":{"@id":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/"},"author":{"name":"Raven Hon","@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"headline":"How to choose a machine learning model in Python?","datePublished":"2017-07-15T09:11:40+00:00","dateModified":"2017-07-16T10:57:52+00:00","mainEntityOfPage":{"@id":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/"},"wordCount":637,"commentCount":0,"publisher":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"keywords":["Data Science","k-fold cross validation","learning algorithm","Python"],"articleSection":["Learn Machine Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/","url":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/","name":"How to choose a machine learning model in 
Python? ⋆ Code A Star","isPartOf":{"@id":"https:\/\/www.codeastar.com\/#website"},"datePublished":"2017-07-15T09:11:40+00:00","dateModified":"2017-07-16T10:57:52+00:00","description":"In order to find a suitable machine learning model for our data, the most popular technique is using k-fold cross validation.","breadcrumb":{"@id":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.codeastar.com\/choose-machine-learning-models-python\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.codeastar.com\/"},{"@type":"ListItem","position":2,"name":"How to choose a machine learning model in Python?"}]},{"@type":"WebSite","@id":"https:\/\/www.codeastar.com\/#website","url":"https:\/\/www.codeastar.com\/","name":"Code A Star","description":"We don't wish upon a star, we code a star","publisher":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.codeastar.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd","name":"Raven Hon","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/","url":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1","contentUrl":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1","width":70,"height":70,"caption":"Raven Hon"},"logo":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/"},"description":"Raven 
Hon is\u00a0a 20 years+ veteran in information technology industry who has worked on various projects from console, web, game, banking and mobile applications in different sized companies.","sameAs":["https:\/\/www.codeastar.com","codeastar","https:\/\/twitter.com\/codeastar"]}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8PcRO-4y","jetpack-related-posts":[{"id":1066,"url":"https:\/\/www.codeastar.com\/bartener-machine-learning\/","url_meta":{"origin":282,"position":0},"title":"“Do you have a dog?” explained in Machine Learning","author":"Raven Hon","date":"May 19, 2018","format":false,"excerpt":"You have probably read the above comic in 9gag or imgur before. It is a funny joke, but on the other hand, it is also a material for our Machine Learning topic. It sounds weird? Oh yeah, sometimes knowledge comes from strange ideas. The Comic Here is the comic, for\u2026","rel":"","context":"In "Learn Machine Learning"","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"\"Do you have a dog?\" in Machine Learning","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/05\/dyhad.png?fit=377%2C221&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1941,"url":"https:\/\/www.codeastar.com\/recurrent-neural-network-rnn-in-nlp-and-python-part-2\/","url_meta":{"origin":282,"position":1},"title":"RNN (Recurrent Neural Network) in NLP and Python – Part 2","author":"Raven Hon","date":"May 15, 2019","format":false,"excerpt":"From our Part 1 of NLP and Python topic, we talked about word pre-processing for a machine to handle words. This time, we are going to talk about building a model for a machine to classify words. We learned to use CNN to classify images in past. 
Then we use\u2026","rel":"","context":"In "Learn Machine Learning"","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Recurrent Neural Network","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":203,"url":"https:\/\/www.codeastar.com\/beginner-data-science-tutorial\/","url_meta":{"origin":282,"position":2},"title":"Data Science Tutorial for Absolutely Python Beginners","author":"Raven Hon","date":"July 9, 2017","format":false,"excerpt":"My anaconda don't,\u00a0My anaconda don't,\u00a0My anaconda don't want none, unless you've got.... Yes, you are still reading Code A Star blog. In this post we are going to try our Data Science tutorial in Python. 
Since we are targeting Python beginners for this hands-on, I would like to introduce Anaconda\u2026","rel":"","context":"In "We code therefore we are"","block_context":{"text":"We code therefore we are","link":"https:\/\/www.codeastar.com\/category\/we-code-therefore-we-are\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/07\/giphy.gif?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":548,"url":"https:\/\/www.codeastar.com\/data-science-ensemble-modeling\/","url_meta":{"origin":282,"position":3},"title":"To win big in real estate market using data science \u2013 Part 3: Ensemble Modeling","author":"Raven Hon","date":"December 5, 2017","format":false,"excerpt":"Previously on CodeAStar: The data alchemist wannabe opened the first door to \"the room for improvement\", where he made better prediction potions. His hunger for the ultimate potions became more and more. He then discovered another door inside the room. The label on the door said, \"Ensemble Modeling\". 
This is\u2026","rel":"","context":"In "Learn Machine Learning"","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Data Science Technique: Ensemble Modeling","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/12\/ensembling.png?fit=727%2C567&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/12\/ensembling.png?fit=727%2C567&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/12\/ensembling.png?fit=727%2C567&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/12\/ensembling.png?fit=727%2C567&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":469,"url":"https:\/\/www.codeastar.com\/win-big-real-estate-market-data-science\/","url_meta":{"origin":282,"position":4},"title":"To win big in real estate market using data science – Part 1","author":"Raven Hon","date":"November 7, 2017","format":false,"excerpt":"Okay, yes, once again, it is a catchy topic. BUT, this post is indeed trying to help people (including me) to gain an upper hand in real estate market, using data science. 
From our last post (2 months ago, I will try to update this blog more frequently :]] ),\u2026","rel":"","context":"In "Learn Machine Learning"","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"To win big in real estate market using data science","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/11\/cas_regression_model.png?fit=744%2C524&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/11\/cas_regression_model.png?fit=744%2C524&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/11\/cas_regression_model.png?fit=744%2C524&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/11\/cas_regression_model.png?fit=744%2C524&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":764,"url":"https:\/\/www.codeastar.com\/convolutional-neural-network-python\/","url_meta":{"origin":282,"position":5},"title":"Python Image Recognizer with Convolutional Neural Network","author":"Raven Hon","date":"February 11, 2018","format":false,"excerpt":"On our data science journey, we have solved classification and regression problems. What's next? There is one popular machine learning territory we have not set feet on yet --- the image recognition. 
But now the wait is over, in this post we are going to teach our machine to recognize\u2026","rel":"","context":"In "Learn Machine Learning"","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Teach our machine with Convolutional Neural Network","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/02\/learning.png?fit=1052%2C744&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/02\/learning.png?fit=1052%2C744&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/02\/learning.png?fit=1052%2C744&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/02\/learning.png?fit=1052%2C744&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/02\/learning.png?fit=1052%2C744&ssl=1&resize=1050%2C600 3x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/282"}],"collection":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/comments?post=282"}],"version-history":[{"count":27,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/282\/revisions"}],"predecessor-version":[{"id":316,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/282\/revisions\/316"}],"wp:attachment":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/media?parent=282"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/categories?post=282"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\
/v2\/tags?post=282"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}