{"id":548,"date":"2017-12-05T11:21:11","date_gmt":"2017-12-05T11:21:11","guid":{"rendered":"http:\/\/www.codeastar.com\/?p=548"},"modified":"2018-05-09T09:34:57","modified_gmt":"2018-05-09T09:34:57","slug":"data-science-ensemble-modeling","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/data-science-ensemble-modeling\/","title":{"rendered":"To win big in real estate market using data science \u2013 Part 3: Ensemble Modeling"},"content":{"rendered":"
Previously on CodeAStar: the data alchemist wannabe opened the first door to "the room for improvement", where he brewed better prediction potions. His hunger for the ultimate potion kept growing, and he then discovered another door inside the room. The label on the door read, "Ensemble Modeling".

This is the last chapter of the "win big in real estate market with data science" series; you can find the previous chapters in Part 1 and Part 2. Last time we made better predictions by tuning model parameters. Now we want something "better than better", and the ensemble technique is exactly what we are looking for.

What is Ensemble Modeling?

Let's start from the very beginning: what is ensemble modeling? In machine learning, an ensemble is a method that runs several different models and then synthesizes their outputs into a single, more accurate result. There are several types of ensemble modeling, such as bagging, boosting and stacking, and some models, like Random Forest and AdaBoost, are already ensembles in themselves. In this post, we will focus on stacking.

Why does Ensemble Modeling matter?

From our last "episodes", we got improved results from several tuned models:
[Ridge Tuned Rid_t]      Mean: 0.11255269  Std. Dev.: 0.012144
[Lasso Tuned Las_t]      Mean: 0.11238117  Std. Dev.: 0.011936
[ELasticNet Tuned ElN_t] Mean: 0.11233786  Std. Dev.: 0.011963
[LassoLars Tuned LaLa_t] Mean: 0.11231273  Std. Dev.: 0.012701
[XGBoost Tuned XGB_t]    Mean: 0.11190687  Std. Dev.: 0.015171
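As a quick reminder of how such a score is produced, here is a minimal sketch of K-fold scoring for one of the tuned models. It assumes the X_learning and Y_learning training data prepared in the earlier parts of the series, and it uses 10 folds to match the stacking step later in this post.

import numpy as np
from sklearn.linear_model import LassoLars
from sklearn.model_selection import KFold, cross_val_score

lalaM = LassoLars(alpha=0.000037)        # the tuned model from Part 2
kfold = KFold(n_splits=10, shuffle=True)

# cross_val_score returns negated MSE, so flip the sign and take the square root
scores = np.sqrt(-cross_val_score(lalaM, X_learning, Y_learning,
                                  scoring="neg_mean_squared_error", cv=kfold))
print("[LassoLars Tuned LaLa_t] Mean: %.8f Std. Dev.: %.6f" % (scores.mean(), scores.std()))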
All of these tuned models claim to be the best model you will ever have. In the past, we would run a K-fold Cross Validation and pick the best model among them. It is like running a battle royale: you let your finest warriors fight each other until a sole survivor is left standing in the ring. But what if you hired your finest warriors as trainers and let them train a new warrior instead? The new warrior could learn the best parts from each trainer and become an even better warrior.

Does that sound like a good idea? Let's find out with the following demonstration.
The ground truth is: [100, 123, 150]
We have our finest warriors, I mean models, and here are their predictions and RMSDs (root-mean-square deviations from the ground truth):

Model A's prediction: [98, 110, 138]
RMSD: 10.27943

Model B's prediction: [99, 120, 140]
RMSD: 6.0553

Model C's prediction: [107, 123, 153]
RMSD: 4.39697

Model D's prediction: [104, 130, 158]
RMSD: 6.55744

When we do it the old K-fold Cross Validation way, we would declare model C our champion, as it got the best RMSD score of 4.39697.

But now we use all four models to train a new model by averaging their predictions, and we get:

New model's prediction: [102, 120.75, 147.25]
RMSD: 2.35407

After ensembling the 4 models, the new model gets a much better RMSD score of 2.35407!
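If you want to check the arithmetic yourself, here is a small sketch that reproduces the numbers of this toy demonstration (plain NumPy; nothing here touches the actual house price data):

import numpy as np

truth = np.array([100, 123, 150])
predictions = {
    "Model A": np.array([98, 110, 138]),
    "Model B": np.array([99, 120, 140]),
    "Model C": np.array([107, 123, 153]),
    "Model D": np.array([104, 130, 158]),
}

def rmsd(pred, truth):
    # root-mean-square deviation between a prediction and the ground truth
    return np.sqrt(np.mean((pred - truth) ** 2))

for name, pred in predictions.items():
    print(name, round(rmsd(pred, truth), 5))   # 10.27943, 6.0553, 4.39697, 6.55744

averaged = np.mean(list(predictions.values()), axis=0)
print("Averaged", averaged, round(rmsd(averaged, truth), 5))   # [102, 120.75, 147.25] -> 2.35407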

Stacking in action
Stacking is a type of ensemble: we stack the predicted results from different models to form a new data set, then we use another model to learn from that new data set and make its own prediction. To illustrate the stacking process: each base model is trained on K-fold splits of the training data, its out-of-fold predictions become one column of a new data set, a meta model learns from that data set, and finally the meta model predicts from the base models' predictions on the test data.

Talk is cheap, so let's put our models into stacking action. Last time, we tuned our models' parameters to get better results. This time, we stack those tuned models and create a new data set.
# our tuned models (parameters from Part 2)
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet, LassoLars
import xgboost as xgb

linearM = LinearRegression()
ridM = Ridge(alpha=0.01)
lasM = Lasso(alpha=0.00001)
elnM = ElasticNet(l1_ratio=0.8, alpha=0.00001)
lalaM = LassoLars(alpha=0.000037)
xgbM = xgb.XGBRegressor(n_estimators=470, max_depth=3, min_child_weight=3,
                        learning_rate=0.042, subsample=0.5, reg_alpha=0.5, reg_lambda=0.8)
We select 5 of the 6 tuned models as our base models, and we pick the tuned LassoLars model as our meta learning model, i.e. the model that learns from the 5 base models; LassoLars is chosen for its strong CV score.

base_models = []
base_models.append(lasM)
base_models.append(ridM)
base_models.append(xgbM)
base_models.append(elnM)
base_models.append(linearM)

meta_model = lalaM

First, we train our base models on 10 folds of the training data, collect their out-of-fold predictions, and use those predictions as the training data for our meta model.
import numpy as np
from sklearn.model_selection import KFold

stack_kfold = KFold(n_splits=10, shuffle=True)
# placeholder for the out-of-fold predictions: one column per base model
kf_predictions = np.zeros((X_learning.shape[0], len(base_models)))

# get the X, Y values
X_values = X_learning.values
Y_values = Y_learning.values

for i, model in enumerate(base_models):
    for train_index, test_index in stack_kfold.split(X_values):
        model.fit(X_values[train_index], Y_values[train_index])
        model_pred = model.predict(X_values[test_index])
        kf_predictions[test_index, i] = model_pred

# teach the meta model with the base models' out-of-fold predictions
meta_model.fit(kf_predictions, Y_values)
After the meta model has been trained, we let the base models predict on the testing data, then use the meta model to predict from the base models' predictions.

preds = []
for model in base_models:
    model.fit(X_learning, Y_learning)
    pred = model.predict(X_test)
    preds.append(pred)

# stack the test predictions column by column, matching the layout the meta model was trained on
base_predictions = np.column_stack(preds)

stacked_predict = meta_model.predict(base_predictions)

Now we have the prediction from the meta learning model, stacked_predict.

One Step Further

The new meta learning kid, trained by our top warrior models, can predict better than his masters. What if we want even better performance? Let's have the kid and his masters join forces and predict together. We go one step further and ensemble the meta learner's result with our other top models' results, with proportions based on their CV performances.
# xgb_pred, eln_pred, rid_pred: test-set predictions of the tuned XGBoost, ElasticNet and Ridge models
stack_n_trainers_prediction = stacked_predict * 0.5 + xgb_pred * 0.3 + eln_pred * 0.1 + rid_pred * 0.1

Then we submit the stack_n_trainers_prediction result to Kaggle. Ding! We got an RMSD of 0.11847 on the official Kaggle Leaderboard, not bad.
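If you need the final step, here is a sketch of turning that blended prediction into a Kaggle submission file. Everything in it is an assumption about the earlier parts of the series: df_test stands for the raw test set (whatever it was actually named), the Id and SalePrice columns follow the House Prices competition format, and np.expm1 is only needed if SalePrice was log1p-transformed during data preparation.

import numpy as np
import pandas as pd

# df_test: the raw Kaggle test set loaded in the earlier parts (assumed name);
# np.expm1 undoes a log1p transform on SalePrice, if one was applied
submission = pd.DataFrame({
    "Id": df_test["Id"],
    "SalePrice": np.expm1(stack_n_trainers_prediction),
})
submission.to_csv("house_prices_stacked_submission.csv", index=False)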

Future of Data Science

Do you remember that in the last post we talked about tuning model parameters without explaining the details? A similar thing happened with this ensembling topic (although we did explain a bit, just not in depth). The reason is simple: we don't really need to.

GridSearchCV and ensemble modeling are both repetitive processes that let the machine find a better match. Since there is "no free lunch", no single set of model parameters or model combination is "best for science"; we simply keep validating different parameters and ensembles in search of a better solution. For such trial-and-error work, it is more productive to let the machine handle it. As computing keeps getting faster and cheaper, we can assume machines will take over these repetitive tasks. Would data science become more like a One-Punch Man story in the future, where all of the work is finished with one punch... I mean, one click? We pass a data set to a machine, click a button, sit back, let the machine do those trial-and-error tasks, and get the result.
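To make the "click a button" idea concrete, here is a minimal GridSearchCV sketch; the parameter grid is purely illustrative, and X_learning / Y_learning are again the prepared training data from the earlier parts.

from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# an illustrative (not exhaustive) grid of parameters to try
param_grid = {"alpha": [0.001, 0.01, 0.1, 1.0, 10.0]}

grid = GridSearchCV(Ridge(), param_grid,
                    scoring="neg_mean_squared_error", cv=10)
grid.fit(X_learning, Y_learning)   # the machine grinds through the trial and error

print(grid.best_params_, grid.best_score_)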
It sounds attractive, but it is probably not what we will see. We expect the machine to help us find a better solution through sophisticated models and many tries, yet that only works when we already have a solution for the machine to try against and correct its predictions. Remember, there is no free lunch in optimization. So, fellow data alchemists, let's keep brewing harder!

What have we learnt in this post?