{"id":2134,"date":"2019-11-10T16:08:01","date_gmt":"2019-11-10T16:08:01","guid":{"rendered":"https:\/\/www.codeastar.com\/?p=2134"},"modified":"2020-01-01T17:24:15","modified_gmt":"2020-01-01T17:24:15","slug":"nmt-make-an-easy-neural-machine-translator","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/nmt-make-an-easy-neural-machine-translator\/","title":{"rendered":"NMT – make an easy Neural Machine Translator"},"content":{"rendered":"\n

I haven’t updated this blog for a few months, as something big happened in my hometown. But life must go on, so let’s come back and learn a new topic: NMT (Neural Machine Translation). You may have tried a translation service before (if not, try Google Translate); it is time to find out how natural language processing works in translation.

What is NMT?

Let’s start from the very beginning: what is NMT? It is a machine translation process handled by a neural network. We have used neural networks to solve different kinds of problems on this blog, for example handwriting recognition and toxic comment classification. This time, we use a neural network to solve the translation problem.

When we started our machine learning journey in the early posts, we mentioned that machine learning is a series of actions on data analysis. In the case of translation, the machine learns word relationships by reading bilingual corpora. Translating individual words this way is no problem, but that is not the whole picture of machine translation. We want a machine that can translate not only words but whole sentences, like a human does. That is why we apply a neural network to machine translation.
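To make the idea of learning from bilingual corpora a bit more concrete, here is a minimal sketch of how sentence pairs can be turned into the integer sequences a network consumes. The tiny English/French pairs and the Keras Tokenizer setup below are illustrative assumptions, not the actual dataset of this post:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# A tiny, made-up bilingual corpus (English -> French) for illustration only
eng_sentences = ["i am a student", "he is a teacher"]
fra_sentences = ["je suis etudiant", "il est professeur"]

# One tokenizer per language, so each language keeps its own vocabulary
eng_tokenizer = Tokenizer()
eng_tokenizer.fit_on_texts(eng_sentences)
fra_tokenizer = Tokenizer()
fra_tokenizer.fit_on_texts(fra_sentences)

# Turn words into integer sequences, then pad them to a fixed length
eng_seq = pad_sequences(eng_tokenizer.texts_to_sequences(eng_sentences), padding="post")
fra_seq = pad_sequences(fra_tokenizer.texts_to_sequences(fra_sentences), padding="post")

print(eng_seq)  # integer indices; exact values depend on word frequency
```

Each English sequence is now paired with a French sequence, which is exactly the word-relationship data the model will learn from.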

In our previous comment classification exercise, we used a neural network to let the machine learn several words in a sentence, so it could classify whether a sentence was “toxic” or not. This time, instead of classifying a “toxic” sentence, we map the words of a sentence from language A to language B. How do we do that? Here it comes: the Encoder-Decoder Model.

Encoder-Decoder Model

Although it sounds like a new term, we have in fact tasted part of it before. Do you remember word embedding in NLP? Yes, that was actually an encoder. Last time, we encoded comments into sequences; this time, we encode phrases of one language into sequences. The decoder works like an encoder in reverse: it decodes sequences into another language. So the whole concept looks like this:

\"NMT<\/figure>\n\n\n\n