{"id":203,"date":"2017-07-09T09:08:16","date_gmt":"2017-07-09T09:08:16","guid":{"rendered":"http:\/\/www.codeastar.com\/?p=203"},"modified":"2017-07-10T06:49:19","modified_gmt":"2017-07-10T06:49:19","slug":"beginner-data-science-tutorial","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/beginner-data-science-tutorial\/","title":{"rendered":"Data Science Tutorial for Absolutely Python Beginners"},"content":{"rendered":"
<\/p>\n
My anaconda don’t,\u00a0My anaconda don’t,\u00a0My anaconda don’t want none, unless you’ve got….<\/p><\/blockquote>\n
Yes, you are still reading Code A Star<\/a> blog.<\/p>\n
In this post we are going to try our Data Science tutorial in Python. Since we are targeting Python beginners for this hands-on, I would like to introduce Anaconda<\/a> to all of you.<\/p>\n
<\/p>\n
All-in-one starting place<\/h3>\n
Our Data Science tutorial does not actually need many lines of code. But we do have to spend time understanding the basic concepts<\/a>, modules and functions used in our program. And we have to install several libraries for Python to do the science, such as:<\/p>\n
\n
- Matplotlib – a plotting library to make histograms, bar charts, scatter plots and other graphs<\/li>\n
- NumPy – a fast library for handling n-dimensional arrays<\/li>\n
- Pandas – a set of data structure and analysis tools<\/li>\n
- Scikit Learn – a machine learning library that we use to teach our computer and make predictions<\/li>\n<\/ul>\n
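To give you a feel for the first two libraries before we start, here is a minimal sketch (the sample values and column names are made up for illustration; this is not part of the tutorial's own code):

```python
import numpy as np
import pandas as pd

# NumPy: fast n-dimensional arrays with vectorized math
lengths = np.array([5.1, 4.9, 4.7, 4.6])
print(lengths.mean())  # average of the sample: 4.825

# Pandas: tabular data structures built on top of NumPy
df = pd.DataFrame({"Sepal Length": lengths,
                   "Class": ["setosa"] * 4})
print(df.describe())   # summary statistics of the numeric columns
```

Matplotlib and Scikit Learn will show up later, when we plot the data and train a model.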
You can install the above libraries using the friendly Python package manager, pip<\/a>. But do you remember what we have
sung<\/del> said in the first paragraph? Yes, Anaconda. It is a Python environment bundled with all the essential data science libraries. That means you can simply use Anaconda to start a data science project, instead of pip’ing those libraries one by one.<\/p>\nMy Anaconda does<\/h3>\n
Once you open Anaconda, you will see an interface like the one below:<\/p>\n
<\/p>\n
Click “Environment” on your left and there are tons of Python libraries installed in the environment, including those data science libraries we have mentioned:<\/p>\n
<\/p>\n
Next, we start our project by clicking the green arrow button and selecting the “Open with Jupyter Notebook” option:<\/p>\n
<\/p>\n
Jupyter Notebook is a web application for users to create and share (not only) Data Science projects in (not only, again) Python. We can click the “New” button in the upper right corner and select “Python 3”:<\/p>\n
<\/p>\n
A Python development UI is then launched. Okay, here we go, our science starts here:<\/p>\n
<\/p>\n
Do you remember the Data Science Life Cycle?<\/h3>\n
You can click here<\/a> to recall your memory. We are going to do the “Hello World” of Data Science: the Iris Classification. The first step of our project is to define a problem.<\/p>\n
The Iris data set contains 150 iris plants categorized into 3 classes. Our problem for this project is: when we have some iris plants, which class should they belong to?<\/p>\n
We move to step 2 of the Data Science Life Cycle: collect data. Since the Iris data set is a famous pattern\u00a0recognition resource, we can simply download it from\u00a0the web<\/a> (yeah, that is why it is the “Hello World” of Data Science).<\/p>\n
Now, let’s put our code into Jupyter Notebook. First, we import the required Data Science modules (the ones we mentioned above):<\/p>\n
import pandas as pd\r\nimport numpy as np\r\nimport matplotlib.pyplot as plt\r\nfrom sklearn.svm import SVC\r\nfrom sklearn.metrics import accuracy_score\r\nfrom sklearn.metrics import classification_report\r\nfrom sklearn.metrics import confusion_matrix\r\n<\/pre>\nSecond, we load our data set into a variable (df<\/em>, short for dataframe) using Pandas’<\/em> read_csv function:<\/p>\n
df = pd.read_csv(\"http:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/iris\/bezdekIris.data\",\r\nnames = [\"Sepal Length\", \"Sepal Width\", \"Petal Length\", \"Petal Width\", \"Class\"])<\/pre>\n
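Once the data is loaded, it is worth taking a quick look at it before doing anything else. As a minimal sketch, here is one way to inspect the same 150-sample Iris data; to keep it runnable without a network connection, this sketch loads the copy bundled with scikit-learn and renames its columns to match the ones above (the renaming is my own addition, not part of the tutorial):

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the bundled Iris data as a pandas DataFrame (offline alternative
# to downloading bezdekIris.data from the UCI repository)
iris = load_iris(as_frame=True)
df = iris.frame.rename(columns={
    "sepal length (cm)": "Sepal Length",
    "sepal width (cm)": "Sepal Width",
    "petal length (cm)": "Petal Length",
    "petal width (cm)": "Petal Width",
    "target": "Class",
})
# Map the numeric class labels (0, 1, 2) to the species names
df["Class"] = df["Class"].map(dict(enumerate(iris.target_names)))

print(df.shape)                    # (150, 5): 150 plants, 4 features + class
print(df["Class"].value_counts())  # 50 plants in each of the 3 classes
print(df.head())                   # the first few rows of the data
```

A balanced data set like this (50 plants per class) makes the classification task a friendly one for beginners.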