{"id":1487,"date":"2018-11-21T20:35:55","date_gmt":"2018-11-21T20:35:55","guid":{"rendered":"https:\/\/www.codeastar.com\/?p=1487"},"modified":"2020-06-02T01:28:06","modified_gmt":"2020-06-02T01:28:06","slug":"revenue-prediction-google-store","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/revenue-prediction-google-store\/","title":{"rendered":"Revenue Prediction in Google Store"},"content":{"rendered":"

Every business owner wants to make revenue predictions, so that he or she can make better marketing decisions. On Kaggle<\/a>, the data science community site, there is a challenge on making a store’s revenue prediction<\/a>. And that is the topic we are looking at this time. The store in this challenge is none other than the Google Merchandise Store. (It seems Google did not spend enough on Google Plus’s revenue prediction; in the end they just lost direction and decided to close it<\/a>. :]] )<\/p>\n


When Big Data is really Big<\/h3>\n

We handled the TalkingData Click Fraud<\/a> challenge with a big training dataset in the past. That was a dataset with 200 million records in a 1.2GB file. This time, we handle a dataset with only 1.7 million records, but, well, in a 23.7GB<\/strong> file. Once again, it is impossible to load the whole file directly inside Kaggle's 17GB kernel, and it would be just as impossible on a machine with 32GB of RAM. But by using the nrows<\/em> trick we learnt from the TalkingData challenge, we can load just a small part (20 rows) of the dataset file first.<\/p>\n

import pandas as pd\n\ndf = pd.read_csv('..\/input\/train_v2.csv', nrows=20)\ndf.head()\n<\/pre>\n

Then we find the reason why the dataset file is so big:<\/p>\n

(Screenshot: the first rows of the raw dataset, showing several JSON columns)<\/p>\n

There are several JSON columns stored inside the file; each row carries multiple nested objects in those columns, and that is what inflates the file size.<\/p>\n

So now we have to face two issues with the training dataset file:<\/p>\n

    \n
  1. Loading a huge file<\/li>\n
  2. Handling records with JSON objects<\/li>\n<\/ol>\n

    For the first issue, we can load the 1.7 million plus records in 18 rounds, handling 100 thousand records at a time. For the second issue, it is good that pandas has a json_normalize<\/a> API for us. We can simply use it to normalize JSON objects into a flat table structure, i.e. it transforms nested JSON objects into separate columns in a dataframe.<\/p>\n
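
    To get a feel for what json_normalize does, here is a minimal sketch with a made-up nested record (the field names are illustrative only and not taken from the actual dataset):<\/p>\n

    from pandas.io.json import json_normalize\n\n# a made-up nested record, similar in spirit to the JSON columns in the dataset\nsample = [{'device': {'browser': 'Chrome', 'isMobile': False},\n           'totals': {'hits': 4}}]\n\nflat_df = json_normalize(sample)\nprint(flat_df.columns.tolist())\n# ['device.browser', 'device.isMobile', 'totals.hits']\n# the load_df helper shown later renames these to underscore-joined names like device_browser\n<\/pre>\n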

    Does everything look good?<\/p>\n

    Not really. By using json_normalize, we can retrieve normalized columns from the dataset. However, it turns out that we get far more columns than we actually need.<\/p>\n

    (Screenshot: the flattened dataframe after json_normalize, showing the large number of columns)<\/p>\n

    We find that several columns contain the same content in every row.<\/p>\n

    cols_w_same_content = [col for col in df.columns if df[col].nunique() <= 1]\nprint('Columns with same content: ', cols_w_same_content)\n<\/pre>\n

    And some of them duplicate each other’s content, like “customDimensions” and “geoNetwork_continent”. So we have to filter out those problematic columns and combine that filtering with the json_normalize method:<\/p>\n

    import json,time, gc\nfrom pandas.io.json import json_normalize\n\ndef load_df(csv_path, chunksize=100000):\n    #use only reasonable columns \n    features = ['channelGrouping', 'date', 'fullVisitorId', 'visitId',\n                'visitNumber', 'visitStartTime', 'device_browser',\n                'device_deviceCategory', 'device_isMobile', 'device_operatingSystem',\n                'geoNetwork_city', 'geoNetwork_continent', 'geoNetwork_country',\n                'geoNetwork_metro', 'geoNetwork_networkDomain', 'geoNetwork_region',\n                'geoNetwork_subContinent', 'totals_bounces', 'totals_hits',\n                'totals_newVisits', 'totals_pageviews', 'totals_transactionRevenue',\n                'trafficSource_adContent', 'trafficSource_campaign',\n                'trafficSource_isTrueDirect', 'trafficSource_keyword',\n                'trafficSource_medium', 'trafficSource_referralPath',\n                'trafficSource_source']\n\n    #columns with JSON objects to normalize\n    JSON_COLS = ['device', 'geoNetwork', 'totals', 'trafficSource']\n    print('Load {}'.format(csv_path))\n    df_reader = pd.read_csv(csv_path,\n                            converters={ column: json.loads for column in JSON_COLS },\n                            dtype={ 'date': str, 'fullVisitorId': str, 'sessionId': str, \n                                  'totals_transactionRevenue' : 'uint64', 'visitId': 'uint64', 'visitNumber': 'uint8', \n                                  'visitStartTime': 'uint64', 'totals_hits': 'uint8'},\n                            chunksize=chunksize)\n    res = pd.DataFrame()\n    for cidx, df in enumerate(df_reader):\n        df.reset_index(drop=True, inplace=True)\n        for col in JSON_COLS:\n            col_as_df = json_normalize(df[col])\n            col_as_df.columns = ['{}_{}'.format(col, subcol) for subcol in col_as_df.columns]\n            df = df.drop(col, axis=1).merge(col_as_df, right_index=True, left_index=True)\n        res = pd.concat([res, df[features]], axis=0).reset_index(drop=True)\n        del df\n        gc.collect()\n        print('Round {}: DF shape {}'.format(cidx + 1, res.shape))\n    return res\n\nstart_time = time.time()\ntrain_df = load_df('..\/input\/train_v2.csv')\nprint (\"Time used: {} sec\".format(time.time()-start_time))\n<\/pre>\n

    Then we can load all 1.7 million records inside Kaggle’s 17GB kernel in 10 minutes.<\/p>\n
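
    Since the JSON parsing takes a while, one option is to cache the flattened dataframe once it is built, so later runs can skip the 10-minute parse. A small sketch (the pickle file name is just an assumption, not part of the original notebook):<\/p>\n

    # cache the flattened dataframe once it is built (assumed file name)\ntrain_df.to_pickle('train_flat.pkl')\n\n# on a later run, reload the cached copy instead of re-parsing the 23.7GB CSV\n# train_df = pd.read_pickle('train_flat.pkl')\n<\/pre>\n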

    Revenue Prediction for the Future<\/h3>\n

    In this challenge, we are predicting each customer’s total transaction revenue for December 1st, 2018 to January 31st, 2019<\/strong>, a future date range. First, let’s see what we have in our training and testing datasets. We can use the interactive charts that we learnt from the Avito Demand Prediction Challenge<\/a> for data analysis.<\/p>\n
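
    The chart code below also refers to a test_df dataframe, which is assumed to be built with the same load_df helper on the competition’s test file:<\/p>\n

    # load the test set the same way as the training set (assumed path)\ntest_df = load_df('..\/input\/test_v2.csv')\n<\/pre>\n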

    import plotly.graph_objs as go\nimport plotly.offline as py\npy.init_notebook_mode(connected=True)\n\ndf2 = train_df.groupby('date')['totals_transactionRevenue'].sum().reset_index()\ndf3 = test_df.groupby('date')['totals_transactionRevenue'].sum().reset_index()\n\ntrace = go.Scatter(\n            x = pd.to_datetime(df2.date),\n            y = df2.totals_transactionRevenue,\n            name=\"Train df\"\n        )\n\ntrace2 = go.Scatter(\n            x = pd.to_datetime(df3.date),\n            y = df3.totals_transactionRevenue,\n            name=\"Test df\"\n        )\n\nlayout = go.Layout(\n             title = \"Volume of Transaction Revenue among Train and Test datasets\",\n                xaxis=dict(\n                  title='Date',\n                  rangeslider=dict(visible=True),\n                  type='date'\n                ),\n                yaxis=dict(\n                  title=\"Volume of Transaction Revenue\",\n                  type='log',\n                  autorange=True\n                )\n             )\ndata = [trace, trace2]\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)\ndel df2, df3;gc.collect()\n<\/pre>\n