{"id":418,"date":"2017-08-14T11:01:14","date_gmt":"2017-08-14T11:01:14","guid":{"rendered":"http:\/\/www.codeastar.com\/?p=418"},"modified":"2018-01-24T13:32:16","modified_gmt":"2018-01-24T13:32:16","slug":"regression-model-rmsd","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/regression-model-rmsd\/","title":{"rendered":"What are Regression model and RMSD?"},"content":{"rendered":"

We have learnt how to use machine learning to find an object’s status, like identifying an iris species<\/a> or a Titanic passenger’s condition<\/a>. This is called classification in machine learning. If we want to use machine learning to predict a trend, like a stock price<\/b>, what should we do? We go for regression.<\/p>\n

<\/p>\n

What is regression?<\/h3>\n

Regression is a technique to find the relationship between an output (the dependent variable) and one or more independent variables. I find that visual learning works well for statistics topics, so let’s visualize regression using seaborn in Python.<\/p>\n

First, we need to import required modules:<\/p>\n

import seaborn as sns\r\nimport matplotlib.pyplot as plt<\/pre>\n

Get the bundled data set<\/a>, “tips”, from seaborn and take a look at its content.<\/p>\n

df_tips = sns.load_dataset(\"tips\")\r\ndf_tips.head(5)<\/pre>\n
      total_bill tip\tsex\tsmoker\tday\ttime\tsize\r\n0\t16.99\t 1.01\tFemale\tNo\tSun\tDinner\t2\r\n1\t10.34\t 1.66\tMale\tNo\tSun\tDinner\t3\r\n2\t21.01\t 3.50\tMale\tNo\tSun\tDinner\t3\r\n3\t23.68\t 3.31\tMale\tNo\tSun\tDinner\t2\r\n4\t24.59\t 3.61\tFemale\tNo\tSun\tDinner\t4<\/pre>\n

Then we plot a regression graph to display the relationship of tip<\/i> and total_bill<\/i>:<\/p>\n

sns.regplot(x=\"total_bill\", y=\"tip\", data=df_tips)\r\nplt.show()<\/pre>\n

[Figure: regression plot of tip against total_bill]<\/p>\n

From the distribution of the Tip<\/i> and Total Bill<\/i> points, we can see a roughly linear relationship between the two values. Thus we can predict the amount of tip (the output, or dependent variable) based on the total bill that customers have paid (the independent variable).<\/p>\n
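
To make the fitted line concrete, here is a minimal sketch of a least-squares straight-line fit in plain Python. It uses only the five rows printed by df_tips.head(5) above, so the slope and intercept are illustrative and will differ from a fit on the full data set. The helper name fit_line and the 20-dollar bill are assumptions made for this example, not part of seaborn.

```python
# Five (total_bill, tip) pairs copied from the df_tips.head(5) output above.
total_bill = [16.99, 10.34, 21.01, 23.68, 24.59]
tip = [1.01, 1.66, 3.50, 3.31, 3.61]


def fit_line(xs, ys):
    '''Return (slope, intercept) of the least-squares line through the points.'''
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of x and y divided by the variation of x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(total_bill, tip)
# Predict the tip for a hypothetical 20-dollar bill.
predicted_tip = slope * 20 + intercept
print(slope, intercept, predicted_tip)
```

A positive slope means the predicted tip rises with the total bill, which is the upward trend visible in the plot.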

How good is our regression?<\/h3>\n

We have made a regression model, but how good is the model? Here comes the Root-Mean-Square Deviation (RMSD)<\/b>\u00a0[or Root-Mean-Square Error (RMSE)<\/b>]. The RMSD measures the difference between predicted and actual values. It is calculated by:<\/p>\n

RMSD = sqrt( (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)^2 )<\/p>\n

where ŷ_i is our predicted value and y_i is the actual value in observation i<\/i>, and n<\/i> is the number of observations.<\/p>\n

So a perfect model has an RMSD of 0, and a less effective model has a larger RMSD.<\/p>\n
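
As a small worked example of the RMSD definition, the sketch below computes it by hand in plain Python. The numbers are made up for illustration and are not taken from the tips data set, and the function name rmsd is just a label chosen here.

```python
from math import sqrt


def rmsd(predicted, actual):
    '''Root-mean-square deviation between two equal-length sequences.'''
    n = len(actual)
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / n)


# Made-up values for illustration only.
actual = [3.0, 2.0, 4.0, 5.0]
perfect = [3.0, 2.0, 4.0, 5.0]      # every prediction matches
off_by_one = [4.0, 1.0, 5.0, 4.0]   # every prediction misses by exactly 1

print(rmsd(perfect, actual))     # 0.0
print(rmsd(off_by_one, actual))  # 1.0
```

Because the errors are squared before averaging, it does not matter whether a prediction is too high or too low; only the size of the miss counts.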

Again, let’s try to understand RMSD in a visual learning way.<\/p>\n

RMSD in action<\/h3>\n

We keep using the “tips” data set from the section above, taking the first 200 records as learning values and the last 44 records as testing values.<\/p>\n

learning_x = df_tips[['total_bill', 'size']].values[:200]\r\nlearning_y = df_tips['tip'].values[:200]\r\ntesting_x = df_tips[['total_bill', 'size']].values[-44:]\r\ntesting_y = df_tips['tip'].values[-44:]<\/pre>\n
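
The slicing above can be checked with a small stand-in list. This sketch assumes the data set has exactly 244 rows, which is what a 200 + 44 split without overlap implies; the list records is a placeholder for the real rows.

```python
# Stand-in for the rows of the data set: 244 row indices (an assumption).
records = list(range(244))

learning = records[:200]   # first 200 records
testing = records[-44:]    # last 44 records

# The two slices do not overlap and together cover every record.
print(len(learning), len(testing))
print(set(learning).isdisjoint(testing))
print(learning + testing == records)
```

Keeping the testing records out of the learning set matters, because the RMSD should measure how the model does on data it has not seen.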

Then we bring in a machine learning model. Since we are doing regression, we use the Linear Regression<\/i> model.<\/p>\n

from sklearn.linear_model import LinearRegression\r\nlreg = LinearRegression()\r\nlreg.fit(learning_x, learning_y)\r\nprediction = lreg.predict(testing_x)<\/pre>\n

Now that we have our predicted output, let’s compare it with the actual output.<\/p>\n

import pandas as pd\r\ndf_output_compare = pd.DataFrame({'predicted':prediction, 'actual':testing_y})\r\nsns.regplot(x=\"actual\", y=\"predicted\", data=df_output_compare)\r\nplt.show()<\/pre>\n

[Figure: regression plot of predicted against actual tips, for the model trained on 200 records]<\/p>\n

Because we use only 2 features to predict the tip, the predicted values correlate only loosely with the actual ones. We can observe this from the graph above, but how far apart are the two sets of values in terms of figures? The sklearn library provides a mean_squared_error<\/i> function which gives us the mean squared error (MSE), the quantity under the square root in the RMSE (RMSD). Then we apply a square root to the MSE to get our RMSD.<\/p>\n

from sklearn.metrics import mean_squared_error\r\nfrom math import sqrt\r\nrmsd = sqrt(mean_squared_error(testing_y, prediction))\r\nprint(rmsd)<\/pre>\n
1.189772487870686<\/pre>\n

Now we use another learning set, only the first 5 records from the “tips” data set.<\/p>\n

learning_x_5 = df_tips[['total_bill', 'size']].values[:5]\r\nlearning_y_5 = df_tips['tip'].values[:5]<\/pre>\n

And get our new prediction:<\/p>\n

lreg.fit(learning_x_5, learning_y_5)\r\nprediction_5 = lreg.predict(testing_x)<\/pre>\n

We compare our new prediction with the actual output.<\/p>\n

df_output_compare_5 = pd.DataFrame({'predicted':prediction_5, 'actual':testing_y})\r\nsns.regplot(x=\"actual\", y=\"predicted\", data=df_output_compare_5, color=\"g\")\r\nplt.show()<\/pre>\n

[Figure: regression plot of predicted against actual tips, for the model trained on 5 records, drawn in green]
\nThen calculate the new RMSD:<\/p>\n

rmsd_5 = sqrt(mean_squared_error(testing_y, prediction_5))\r\nprint(rmsd_5)<\/pre>\n
1.3766746955609788<\/pre>\n

As expected, there is a larger RMSD for a less effective model.<\/p>\n
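
To put the two figures side by side, this short sketch computes the relative increase in RMSD from the values printed earlier in this post. The variable names are chosen here for illustration. Since RMSD is in the same unit as the tip, the gap between the two models is a little under 0.19 per prediction.

```python
# RMSD values printed earlier in this post.
rmsd_200 = 1.189772487870686   # model trained on the first 200 records
rmsd_5 = 1.3766746955609788    # model trained on the first 5 records

# Relative increase in RMSD when training on only 5 records.
increase_pct = (rmsd_5 - rmsd_200) / rmsd_200 * 100
print(round(increase_pct, 1))  # 15.7
```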

Congratulations! Now you can judge the effectiveness of a regression model from graphs and figures.<\/p>\n

 <\/p>\n

What have we learnt in this post?<\/h3>\n
    \n
  1. the use of regression<\/li>\n
  2. how to rate a model from a scatter chart<\/li>\n
  3. the meaning of RMSD \/ RMSE<\/li>\n
  4. how to rate a model from its RMSD<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"

    We have learnt how to use machine learning to find an object’ status, like identifying an iris specie or a Titanic passenger’s condition. It is called classification in machine learning. If we want to use machine learning to predict a trend, like a stock price, then what should we do? We go for regression in […]<\/p>\n","protected":false},"author":1,"featured_media":443,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[18],"tags":[19,36,34,35],"jetpack_publicize_connections":[],"yoast_head":"\nWhat are Regression model and RMSD? 
⋆ Code A Star<\/title>\n<meta name=\"description\" content=\"In machine learning, we can predict a status in classification. If we want to predict a trend, like a stock price, then regression is our answer.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What are Regression model and RMSD? ⋆ Code A Star\" \/>\n<meta property=\"og:description\" content=\"In machine learning, we can predict a status in classification. If we want to predict a trend, like a stock price, then regression is our answer.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\" \/>\n<meta property=\"og:site_name\" content=\"Code A Star\" \/>\n<meta property=\"article:publisher\" content=\"codeastar\" \/>\n<meta property=\"article:author\" content=\"codeastar\" \/>\n<meta property=\"article:published_time\" content=\"2017-08-14T11:01:14+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-01-24T13:32:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/08\/teach.png?fit=700%2C537&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"700\" \/>\n\t<meta property=\"og:image:height\" content=\"537\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Raven Hon\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@codeastar\" \/>\n<meta name=\"twitter:site\" content=\"@codeastar\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Raven Hon\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\"},\"author\":{\"name\":\"Raven Hon\",\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"headline\":\"What are Regression model and RMSD?\",\"datePublished\":\"2017-08-14T11:01:14+00:00\",\"dateModified\":\"2018-01-24T13:32:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\"},\"wordCount\":501,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"keywords\":[\"Data Science\",\"Regression\",\"RMSD\",\"RMSE\"],\"articleSection\":[\"Learn Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.codeastar.com\/regression-model-rmsd\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\",\"url\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\",\"name\":\"What are Regression model and RMSD? ⋆ Code A Star\",\"isPartOf\":{\"@id\":\"https:\/\/www.codeastar.com\/#website\"},\"datePublished\":\"2017-08-14T11:01:14+00:00\",\"dateModified\":\"2018-01-24T13:32:16+00:00\",\"description\":\"In machine learning, we can predict a status in classification. 
If we want to predict a trend, like a stock price, then regression is our answer.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.codeastar.com\/regression-model-rmsd\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.codeastar.com\/regression-model-rmsd\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.codeastar.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"What are Regression model and RMSD?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.codeastar.com\/#website\",\"url\":\"https:\/\/www.codeastar.com\/\",\"name\":\"Code A Star\",\"description\":\"We don't wish upon a star, we code a star\",\"publisher\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.codeastar.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\",\"name\":\"Raven Hon\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1\",\"width\":70,\"height\":70,\"caption\":\"Raven Hon\"},\"logo\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/\"},\"description\":\"Raven Hon is\u00a0a 20 years+ veteran in information technology industry who has worked on various projects from console, web, game, 
banking and mobile applications in different sized companies.\",\"sameAs\":[\"https:\/\/www.codeastar.com\",\"codeastar\",\"https:\/\/twitter.com\/codeastar\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"What are Regression model and RMSD? ⋆ Code A Star","description":"In machine learning, we can predict a status in classification. If we want to predict a trend, like a stock price, then regression is our answer.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.codeastar.com\/regression-model-rmsd\/","og_locale":"en_US","og_type":"article","og_title":"What are Regression model and RMSD? ⋆ Code A Star","og_description":"In machine learning, we can predict a status in classification. If we want to predict a trend, like a stock price, then regression is our answer.","og_url":"https:\/\/www.codeastar.com\/regression-model-rmsd\/","og_site_name":"Code A Star","article_publisher":"codeastar","article_author":"codeastar","article_published_time":"2017-08-14T11:01:14+00:00","article_modified_time":"2018-01-24T13:32:16+00:00","og_image":[{"width":700,"height":537,"url":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/08\/teach.png?fit=700%2C537&ssl=1","type":"image\/png"}],"author":"Raven Hon","twitter_card":"summary_large_image","twitter_creator":"@codeastar","twitter_site":"@codeastar","twitter_misc":{"Written by":"Raven Hon","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.codeastar.com\/regression-model-rmsd\/#article","isPartOf":{"@id":"https:\/\/www.codeastar.com\/regression-model-rmsd\/"},"author":{"name":"Raven Hon","@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"headline":"What are Regression model and RMSD?","datePublished":"2017-08-14T11:01:14+00:00","dateModified":"2018-01-24T13:32:16+00:00","mainEntityOfPage":{"@id":"https:\/\/www.codeastar.com\/regression-model-rmsd\/"},"wordCount":501,"commentCount":0,"publisher":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"keywords":["Data Science","Regression","RMSD","RMSE"],"articleSection":["Learn Machine Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.codeastar.com\/regression-model-rmsd\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.codeastar.com\/regression-model-rmsd\/","url":"https:\/\/www.codeastar.com\/regression-model-rmsd\/","name":"What are Regression model and RMSD? ⋆ Code A Star","isPartOf":{"@id":"https:\/\/www.codeastar.com\/#website"},"datePublished":"2017-08-14T11:01:14+00:00","dateModified":"2018-01-24T13:32:16+00:00","description":"In machine learning, we can predict a status in classification. 
If we want to predict a trend, like a stock price, then regression is our answer.","breadcrumb":{"@id":"https:\/\/www.codeastar.com\/regression-model-rmsd\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.codeastar.com\/regression-model-rmsd\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.codeastar.com\/regression-model-rmsd\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.codeastar.com\/"},{"@type":"ListItem","position":2,"name":"What are Regression model and RMSD?"}]},{"@type":"WebSite","@id":"https:\/\/www.codeastar.com\/#website","url":"https:\/\/www.codeastar.com\/","name":"Code A Star","description":"We don't wish upon a star, we code a star","publisher":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.codeastar.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd","name":"Raven Hon","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/","url":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1","contentUrl":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1","width":70,"height":70,"caption":"Raven Hon"},"logo":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/"},"description":"Raven Hon is\u00a0a 20 years+ veteran in information technology industry who has worked on various projects from console, web, game, banking and mobile applications in different sized 
companies.","sameAs":["https:\/\/www.codeastar.com","codeastar","https:\/\/twitter.com\/codeastar"]}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2017\/08\/teach.png?fit=700%2C537&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8PcRO-6K","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/418"}],"collection":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/comments?post=418"}],"version-history":[{"count":25,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/418\/revisions"}],"predecessor-version":[{"id":761,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/418\/revisions\/761"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/media\/443"}],"wp:attachment":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/media?parent=418"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/categories?post=418"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/tags?post=418"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}