Artificial Intelligence: What is the SEQ2SEQ Training Model?

Nivan Gujral · Published in Level Up Coding · Apr 1, 2020

When we say “Hello”, we understand what it means, but to a computer it is just a string of bits like 11110001111. Understanding human language, something we do without a thought, is hard for a machine. Natural Language Processing (NLP) is the field in which computers learn to interpret human text or speech using AI algorithms, and the SEQ2SEQ Training Model is one such algorithm. Recently I created a chatbot using this training model; it can hold a positive conversation with the user, asking questions and answering the user’s questions. So how does the SEQ2SEQ Training Model work?

The Structure of the SEQ2SEQ Training Model

The SEQ2SEQ Training Model has two main parts: the encoder and the decoder. The encoder acts as the input side of the Training Model, while the decoder acts as the output side. In a typical SEQ2SEQ diagram, each block represents an individual Neural Network: the green blocks are the Encoder’s Neural Networks and the red blocks are the Decoder’s Neural Networks. Each of the Encoder’s Neural Networks has one input and one output, while each of the Decoder’s Neural Networks has one input and two outputs.
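To make the structure concrete, here is a minimal sketch in PyTorch (my choice of framework; the article does not name one). The class names and layer sizes are assumptions for illustration, and the recurrent layer is a GRU, one common choice for the blocks in a SEQ2SEQ model.

```python
# Minimal encoder/decoder sketch in PyTorch (illustrative only).
# vocab_size, embed_dim, and hidden_dim are arbitrary assumed values.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len); the final hidden state summarizes the input
        embedded = self.embed(token_ids)
        _, hidden = self.rnn(embedded)
        return hidden  # the summary handed over to the decoder

class Decoder(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, hidden):
        # One input (the previous word) and two outputs per step:
        # a score over the vocabulary, and the updated hidden state.
        embedded = self.embed(token_ids)
        output, hidden = self.rnn(embedded, hidden)
        return self.out(output), hidden
```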

Another important part of the SEQ2SEQ Training Model is the EOS token, which stands for ‘End of Sentence’ or ‘End of String’. The EOS marks where the sentence or string ends. Once the encoder reads the EOS, its final state, which summarizes everything it has read, is handed over to the Decoder.
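As a quick sketch (the exact token string ‘&lt;EOS&gt;’ is just a common convention, not something the article prescribes), marking the end of an input can be as simple as:

```python
# Illustrative only: append an EOS marker so the model knows where the input stops.
EOS = "<EOS>"  # the exact token string is an arbitrary convention

tokens = ["Hello", "!"]
tokens.append(EOS)
print(tokens)  # ['Hello', '!', '<EOS>'] : the encoder stops reading here
```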

The last part of the SEQ2SEQ Training Model is the Synapse: the connections that link the Neural Networks together. Synapses carry signals from one Neural Network to another, and each synapse has a weight that determines how strongly its signal influences the next network. Imagine that the input and the hidden layer are train stations, the Synapses are train tracks, and the signals are trains. A train travels along the tracks to get from one station to another. When too many trains crowd the track, a backup occurs; the weights act like a priority system, letting the important trains through first while the less important ones wait.
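To make the weight idea concrete, here is a toy Python calculation (the numbers are invented for illustration): the weighted sum lets high-weight signals dominate what the next network receives.

```python
# Illustrative only: each connection scales its signal by a learned weight.
signals = [0.2, 0.9, 0.4]  # outputs from three upstream neurons
weights = [0.1, 0.8, 0.3]  # one weight per synapse (assumed values)

# The downstream neuron receives a weighted sum: strong weights let
# "important" signals dominate, like priority trains on a busy track.
combined = sum(s * w for s, w in zip(signals, weights))
print(combined)  # 0.86
```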

How does the SEQ2SEQ Training Model work?

To give a response, the SEQ2SEQ Training Model first takes in the text it is going to respond to. Let us take the sentence “Bob ran to the store to buy milk.” The model splits the text into its individual words and punctuation marks, so the sentence becomes ‘Bob’, ‘ran’, ‘to’, ‘the’, ‘store’, ‘to’, ‘buy’, ‘milk’, ‘.’.
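A minimal sketch of this splitting step in Python (the regular expression is one simple way to do it; the article does not specify how the split is implemented):

```python
# Illustrative only: split a sentence into words and punctuation.
import re

sentence = "Bob ran to the store to buy milk."
tokens = re.findall(r"\w+|[^\w\s]", sentence)
print(tokens)
# ['Bob', 'ran', 'to', 'the', 'store', 'to', 'buy', 'milk', '.']
```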

The SEQ2SEQ Training Model then feeds each word into the input of one of the Encoder’s Neural Networks, so that each network is attached to its own word. The networks are chained to one another so the Training Model knows the words belong to the same sentence, and all of these signals flow forward until they reach the EOS.
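Here is a hedged sketch of that step, again in PyTorch with invented sizes: each word’s embedding enters the recurrent encoder in order, and the hidden state that chains the steps together is what ultimately reaches the EOS.

```python
# Illustrative only: feed each word's id through an encoder RNN in order,
# so every step sees its own word plus the summary of the words before it.
import torch
import torch.nn as nn

tokens = ["Bob", "ran", "to", "the", "store", "to", "buy", "milk", ".", "<EOS>"]
vocab = {word: i for i, word in enumerate(dict.fromkeys(tokens))}
token_ids = torch.tensor([[vocab[t] for t in tokens]])  # shape (1, 10)

embed = nn.Embedding(len(vocab), 32)
encoder = nn.GRU(32, 64, batch_first=True)

# hidden is the running summary that links the word-level networks together
outputs, hidden = encoder(embed(token_ids))
print(hidden.shape)  # torch.Size([1, 1, 64]) : one summary vector per sentence
```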

After all of the signals reach the EOS, they are transferred to the decoder. The decoder looks at the summary of the sentence and predicts the most likely word to follow the input. That word is then fed into the next Neural Network’s input, which keeps the words in the response consistent with one another. The probability of each word following another is determined by the weights, and a fresh SEQ2SEQ Training Model does not yet know the right weights; it has to learn them.
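A sketch of that decoding loop (the SOS/EOS ids and the sizes are assumptions; with untrained weights the output is gibberish, which is exactly why the model needs the training described next):

```python
# Illustrative only: a greedy decoding loop. At each step the decoder takes
# the previously chosen word and predicts the most likely next word.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 50, 32, 64
embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
out = nn.Linear(hidden_dim, vocab_size)

SOS_ID, EOS_ID = 0, 1                   # assumed special-token ids
hidden = torch.zeros(1, 1, hidden_dim)  # stands in for the encoder's summary

token = torch.tensor([[SOS_ID]])
for _ in range(10):                      # cap the reply length
    output, hidden = rnn(embed(token), hidden)
    token = out(output).argmax(dim=-1)   # pick the highest-scoring word
    if token.item() == EOS_ID:
        break                            # the decoder decided the sentence is over
```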

How to train the SEQ2SEQ Training Model?

The SEQ2SEQ Training Model trains on sample inputs and outputs so it can learn from them. For my chatbot, I trained the model on lines of movie dialogue. Take one exchange as an example: the input is “She okay?” and the output is “I hope so.”

The SEQ2SEQ Training Model first splits the input into its individual words and punctuation marks. It then places each word into its own Neural Network in the Encoder, and all of the values flow to the EOS and are transferred to the decoder.
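As a sketch, turning that training pair into token sequences might look like this (the regex split and the ‘&lt;EOS&gt;’ marker mirror the earlier examples and are my assumptions):

```python
# Illustrative only: turn one training pair into sequences the model can use.
import re

def tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text) + ["<EOS>"]

pair = ("She okay?", "I hope so.")
src, tgt = tokenize(pair[0]), tokenize(pair[1])
print(src)  # ['She', 'okay', '?', '<EOS>']
print(tgt)  # ['I', 'hope', 'so', '.', '<EOS>']
```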

In the decoder, the training model tries to predict what the answer should be. It then compares its prediction with the correct answer from the data and computes the cost, a measure of how large the error was. If the cost is zero, the weights of the SEQ2SEQ Training Model are already correct; if not, the error is sent back through the model and the weights are updated. The process repeats until the prediction matches the correct answer.
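Here is a minimal sketch of one predict-compare-update cycle in PyTorch (the token ids, sizes, and the zero “context” standing in for the encoder are all assumptions): cross-entropy plays the role of the cost, and loss.backward() sends the error back through the model.

```python
# Illustrative only: one training step with a toy decoder.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 50, 32, 64
embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
out = nn.Linear(hidden_dim, vocab_size)

params = list(embed.parameters()) + list(rnn.parameters()) + list(out.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

target = torch.tensor([[4, 7, 9, 1]])         # assumed ids for "I hope so. <EOS>"
decoder_input = torch.tensor([[0, 4, 7, 9]])  # shifted right, starting with <SOS>
context = torch.zeros(1, 1, hidden_dim)       # stands in for the encoder summary

states, _ = rnn(embed(decoder_input), context)
scores = out(states)                          # (1, 4, vocab_size)

# The cost: how far the predicted words are from the correct ones.
loss = loss_fn(scores.view(-1, vocab_size), target.view(-1))

optimizer.zero_grad()
loss.backward()   # send the error back through the model
optimizer.step()  # update the weights
```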

Let’s imagine a person learning the capital of the United States. When asked, they answer, “It is New York.” The questioner replies, “No, it is Washington DC.” The person then figures out what mistake they made and learns from it, so the next time they are asked, they should get it right.

SEQ2SEQ: A Key to Language

NLP provides intelligence to applications most of us use every day and is transforming the way humans and computers communicate. Chatbots already help customers get right to the point without the wait, answering questions and directing people to relevant resources and products at any hour of any day. NLP also lets people search smarter by matching the meaning of a query rather than just its keywords. There is a lot NLP can do in the future, and the SEQ2SEQ Training Model is a key to unlocking its full potential.
