K-Fold Cross Validation.

We can reuse the cross validation code from our previous post and add a list of alpha values for Ridge:

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold, cross_val_score

    # X_learning and Y_learning are assumed to be defined as in the previous post
    kfold = KFold(n_splits=10)

    def getCVResult(models, X_learning, Y_learning):
        rmsds = []

        for name, model in models:
            # scikit-learn reports negative MSE, so flip the sign before taking the root
            cv_results = cross_val_score(model, X_learning, Y_learning,
                                         scoring='neg_mean_squared_error', cv=kfold)
            rmsd_scores = np.sqrt(-cv_results)
            print("\n[%s] Mean: %.8f Std. Dev.: %8f" % (name, rmsd_scores.mean(), rmsd_scores.std()))
            rmsds.append(rmsd_scores.mean())
        return rmsds

    alphas = [0.00001, 0.00005, 0.0001, 0.0005, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 1.5]
    models_R = []

    for alpha in alphas:
        models_R.append(("Rid_" + str(alpha), Ridge(alpha=alpha)))

    rmsds = getCVResult(models_R, X_learning, Y_learning)

And we get the following results:

    [Rid_1e-05] Mean: 0.11348263 Std. Dev.: 0.012197
    [Rid_5e-05] Mean: 0.11336354 Std. Dev.: 0.012281
    [Rid_0.0001] Mean: 0.11333158 Std. Dev.: 0.012282
    [Rid_0.0005] Mean: 0.11320434 Std. Dev.: 0.012232
    [Rid_0.005] Mean: 0.11268473 Std. Dev.: 0.012124
    [Rid_0.01] Mean: 0.11255269 Std. Dev.: 0.012144 <---
    [Rid_0.05] Mean: 0.11313334 Std. Dev.: 0.012147
    [Rid_0.1] Mean: 0.11388355 Std. Dev.: 0.012075
    [Rid_0.5] Mean: 0.11617453 Std. Dev.: 0.011926
    [Rid_1] Mean: 0.11696996 Std. Dev.: 0.011962
    [Rid_1.5] Mean: 0.11732840 Std. Dev.: 0.012024

Let’s put our results in a data frame.

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    df_ridge = pd.DataFrame(alphas, columns=['alpha'])
    df_ridge['rmsd'] = rmsds
    sns.pointplot(x="alpha", y="rmsd", data=df_ridge)
    plt.show()

A picture is worth a thousand words (although I rarely post more than 1000 words here):

[Figure: point plot of RMSD against alpha for the Ridge models]

Let’s recap: a larger alpha leads to underfitting and a smaller one leads to overfitting. By applying K-Fold CV, we can find the most suitable value in between.

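The selection step can also be done in code rather than by eye. This is a minimal sketch using the Ridge alphas and mean RMSD values reported above, copied here by hand (in the post's own code they would live in the alphas and rmsds lists):

```python
# Alpha values and mean RMSDs copied from the Ridge CV output above.
alphas = [0.00001, 0.00005, 0.0001, 0.0005, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 1.5]
rmsds = [0.11348263, 0.11336354, 0.11333158, 0.11320434, 0.11268473,
         0.11255269, 0.11313334, 0.11388355, 0.11617453, 0.11696996, 0.11732840]

# Pair each alpha with its score and keep the pair with the lowest RMSD.
best_alpha, best_rmsd = min(zip(alphas, rmsds), key=lambda pair: pair[1])

print("best alpha: %s, RMSD: %.8f" % (best_alpha, best_rmsd))
```

The scores fall until alpha reaches 0.01 and rise again afterwards, which matches the underfitting and overfitting trade-off described above.
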
The Lasso of Truth

Wonder Woman’s golden lasso can make people confess and tell the truth. There is a “Lasso” model in data science, but it has nothing to do with Wonder Woman’s weapon. Although it is not the Lasso of Truth, it does help us get better predictions on our subjects.

Like the Ridge model, the Lasso model is a regression model with regularization. But this time it is L1 regularization, i.e. the penalty is the sum of the absolute values of the coefficients. In theory, Lasso has an edge because it performs feature selection: the L1 penalty can shrink some coefficients to exactly zero, so those features are ignored, which helps prevent overfitting. But we don’t have millions of features for Lasso to select from, so it makes little difference whether we use L1 or L2 regularization, at least on the current data set.

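To see why L1 regularization can drop features while L2 only shrinks them, here is a small sketch. It assumes the textbook special case of a single standardized feature, where the L1-penalized solution is a soft-thresholding of the unregularized coefficient and the L2-penalized solution is a proportional shrink. The function names, the coefficients and the penalty value are made up for illustration, and this is not how scikit-learn implements either model.

```python
def lasso_shrink(z, lam):
    """Minimizer of 0.5*(z - w)**2 + lam*abs(w): soft-thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # coefficients within [-lam, lam] are set to exactly zero


def ridge_shrink(z, lam):
    """Minimizer of 0.5*(z - w)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return z / (1.0 + lam)


# Hypothetical unregularized coefficients and penalty strength.
unregularized = [2.0, -0.8, 0.3, -0.1]
lam = 0.5

lasso_coefs = [lasso_shrink(z, lam) for z in unregularized]
ridge_coefs = [ridge_shrink(z, lam) for z in unregularized]

print("lasso:", lasso_coefs)
print("ridge:", ridge_coefs)
```

With these toy numbers, the two smallest coefficients become exactly zero under the L1 penalty, while the L2 penalty makes every coefficient smaller but leaves all of them non-zero.
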
We then follow the same routine as with the Ridge model, applying a set of alpha values and letting CV handle the rest:

    from sklearn.linear_model import Lasso

    alphas = [0.000001, 0.000005, 0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1]
    models_las = []

    for alpha in alphas:
        models_las.append(("Las_" + str(alpha), Lasso(alpha=alpha)))

    # Score the candidates with the same helper used for Ridge
    rmsds = getCVResult(models_las, X_learning, Y_learning)

Here come the scores:

    [Las_1e-06] Mean: 0.11310858 Std. Dev.: 0.012213
    [Las_5e-06] Mean: 0.11258957 Std. Dev.: 0.011937
    [Las_1e-05] Mean: 0.11238117 Std. Dev.: 0.011936 <---
    [Las_5e-05] Mean: 0.11334478 Std. Dev.: 0.012809
    [Las_0.0001] Mean: 0.11526842 Std. Dev.: 0.012386
    [Las_0.0005] Mean: 0.11833705 Std. Dev.: 0.012650
    [Las_0.001] Mean: 0.11961042 Std. Dev.: 0.012739
    [Las_0.005] Mean: 0.12906609 Std. Dev.: 0.012915
    [Las_0.01] Mean: 0.13737188 Std. Dev.: 0.011742
    [Las_0.05] Mean: 0.17137819 Std. Dev.: 0.008359
    [Las_0.1] Mean: 0.19586111 Std. Dev.: 0.009045

Let’s visualize the output again:

[Figure: point plot of RMSD against alpha for the Lasso models]

We then apply the same routine to the ElasticNet and LassoLars models to find their best parameters:

    #ElasticNet with alpha = 0.00001 and L1 ratio = 0.8
    [ELN_L1_0.8] Mean: 0.11219824 Std. Dev.: 0.012191

    #LassoLars with alpha = 0.000037
    [LaLa_3.7e-05] Mean: 0.11207374 Std. Dev.: 0.012852

Cross validation, as used above, checks a model’s parameters one at a time. What if we want to tune more than one parameter at a time? No problem, we can use grid search to find the best parameter combination.

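Conceptually, a grid search just tries every combination in the grid and keeps the best one. The sketch below illustrates that idea with the standard library only. Here toy_score is a hypothetical stand-in for a cross-validated RMSD (it is not a real model score), and the grid values are chosen only for illustration.

```python
from itertools import product

# Candidate values for two parameters (illustrative only).
param_grid = {
    'max_depth': [3, 4, 5, 7],
    'min_child_weight': [2, 3, 4],
}


def toy_score(params):
    # Hypothetical error surface, lowest at max_depth=4 and min_child_weight=3.
    # A real search would run cross validation on a model here instead.
    return (params['max_depth'] - 4) ** 2 + (params['min_child_weight'] - 3) ** 2


# Build every combination of the candidate values.
names = list(param_grid)
combos = [dict(zip(names, values))
          for values in product(*(param_grid[name] for name in names))]

# Keep the combination with the lowest (toy) error.
best_params = min(combos, key=toy_score)

print(len(combos), "combinations tried")
print("best:", best_params)
```

Because every combination is scored on every fold, the cost grows multiplicatively with each parameter added to the grid, which is one reason to tune a few parameters at a time.
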
Grid Search Tuning

Let’s start our grid search tuning with the XGBoost model. First, we take the number of estimators found with the cross validation method, i.e. n_estimators = 470. Then we try to find the best max_depth and min_child_weight values using GridSearchCV().

    import xgboost as xgb
    from sklearn.model_selection import cross_val_score, GridSearchCV

    # The opening brace must sit on the same line as the assignment
    param_test = {
        'max_depth': [3, 4, 5, 7],
        'min_child_weight': [2, 3, 4]
    }
    gsearch = GridSearchCV(estimator=xgb.XGBRegressor(n_estimators=470),
                           param_grid=param_test, scoring='neg_mean_squared_error', cv=kfold)
    gsearch.fit(X_learning, Y_learning)

    print(gsearch.best_params_)
    print(np.sqrt(-gsearch.best_score_))

We put the tuning parameters into the param_test dictionary and let GridSearchCV() do the validation job. The program then prints out the best parameter combination and the best RMSD score.

    {'max_depth': 3, 'min_child_weight': 3}
    0.115714387592

Now we put n_estimators, max_depth and min_child_weight into XGBRegressor, and run CV to find the best gamma value.

    gammas = [0.0002, 0.0003, 0.00035, 0.0004, 0.0005]
    models_xgb_gamma = []

    for gamma in gammas:
        models_xgb_gamma.append(("XGB_" + str(gamma),
                                 xgb.XGBRegressor(n_estimators=470, max_depth=3,
                                                  min_child_weight=3, gamma=gamma)))

    getCVResult(models_xgb_gamma, X_learning, Y_learning)

We pick the best result from CV:

    [XGB_0.0003] Mean: 0.11366855 Std. Dev.: 0.012560

After that, we keep running GridSearchCV() and CV with other parameters: subsample, learning_rate, reg_alpha and reg_lambda. In this way, we can find the best parameter combination for the XGBRegressor model.

It’s Show Time

We have tuned our models, so it is time to see how much the tuning improves their performance.

    from sklearn.linear_model import ElasticNet, LassoLars

    tuned_models = []
    tuned_models.append(("Rid_t", Ridge(alpha=0.01)))
    tuned_models.append(("Las_t", Lasso(alpha=0.00001)))
    tuned_models.append(("ElN_t", ElasticNet(l1_ratio=0.8, alpha=0.00001)))
    tuned_models.append(("LaLa_t", LassoLars(alpha=0.000037)))
    tuned_models.append(("XGB_t", xgb.XGBRegressor(n_estimators=470, max_depth=3, min_child_weight=3,
                                                   learning_rate=0.042, subsample=0.5,
                                                   reg_alpha=0.5, reg_lambda=0.8)))

    getCVResult(tuned_models, X_learning, Y_learning)

Here they come:

    [Ridge Tuned Rid_t] Mean: 0.11255269 Std. Dev.: 0.012144
    [Lasso Tuned Las_t] Mean: 0.11238117 Std. Dev.: 0.011936
    [ELasticNet Tuned ElN_t] Mean: 0.11233786 Std. Dev.: 0.011963
    [LassoLars Tuned LaLa_t] Mean: 0.11231273 Std. Dev.: 0.012701
    [XGBoost Tuned XGB_t] Mean: 0.11190687 Std. Dev.: 0.015171

And here is the new chart:

[Figure: chart of the tuned models’ RMSD scores]

We find that all the tuned models perform better than before! We are getting closer and closer to the “room” for improvement.

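For a quick side-by-side view of the tuned scores, here is a small sketch that sorts the mean RMSD values reported above. The numbers are copied by hand from the output, and the dictionary name is only for illustration.

```python
# Mean RMSDs of the tuned models, copied from the output above.
tuned_rmsd = {
    "Rid_t": 0.11255269,
    "Las_t": 0.11238117,
    "ElN_t": 0.11233786,
    "LaLa_t": 0.11231273,
    "XGB_t": 0.11190687,
}

# Sort from lowest (best) to highest RMSD.
ranking = sorted(tuned_rmsd.items(), key=lambda item: item[1])

for rank, (name, score) in enumerate(ranking, start=1):
    print("%d. %-6s %.8f" % (rank, name, score))

# Gap between the best and the worst tuned model.
spread = ranking[-1][1] - ranking[0][1]
print("spread: %.8f" % spread)
```

The gap between the best and the worst tuned model is well below the reported fold-to-fold standard deviations of about 0.012, so the ordering among these models should be read with some caution.
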
But there is something missing in our post. We have used CV and GridSearchCV to get the best parameters; however, other than the alpha parameter, the details of each parameter are omitted. What is going on here? Well, we will leave this topic to the next post, the final chapter of our housing regression model trilogy :]] .

What have we learnt in this post?

1. Tuning model parameters can give better predictions
2. Use cross validation to find the best value of a single parameter in a model
3. Use the GridSearchCV() method to find the best combination of parameters