```
cd models/tutorials/rnn/translate
python translate.py --data_dir [your_data_directory]
```
| File | What is in it? |
| --- | --- |
| tensorflow/tensorflow/python/ops/seq2seq.py | Library for building sequence-to-sequence models |
| models/tutorials/rnn/translate/seq2seq_model.py | Neural translation sequence-to-sequence model |
| models/tutorials/rnn/translate/data_utils.py | Helper functions for preparing translation data |
| models/tutorials/rnn/translate/translate.py | A binary that trains and runs the translation model |
```python
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
```
encoder_inputs
is a list of tensors representing encoder input data, corresponding to the letters A, B, C from the image above. Similarly, decoder_inputs
are tensors representing decoder input data. GO, W, X, Y, Z from the first picture.cell
argument is an instance of the tf.contrib.rnn.RNNCell
class, which determines which cell will be used in the model. You can use existing cells, for example, GRUCell
or LSTMCell
, or you can write your own. In addition, tf.contrib.rnn
provides a shell for creating layered cells, adding exceptions to cell input and output data, or other transformations. Read the RNN Tutorial for examples.basic_rnn_seq2seq
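For illustration, here is a minimal sketch of such a wrapped cell, assuming TensorFlow 1.x; the layer count, unit count, and keep probability are made up for the example:

```python
import tensorflow as tf

def make_cell(num_units, num_layers, keep_prob):
    cells = []
    for _ in range(num_layers):
        cell = tf.contrib.rnn.GRUCell(num_units)
        # DropoutWrapper adds dropout to the cell's outputs.
        cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)
        cells.append(cell)
    # MultiRNNCell stacks the wrapped cells into one multi-layer cell.
    return tf.contrib.rnn.MultiRNNCell(cells)

cell = make_cell(num_units=128, num_layers=2, keep_prob=0.8)
```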
`basic_rnn_seq2seq` returns two values: `outputs` and `states`. Both are lists of tensors of the same length as `decoder_inputs`. `outputs` corresponds to the decoder output at each time step; in the first picture these are W, X, Y, Z, EOS. The returned `states` represents the internal state of the decoder at each time step.
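Putting the pieces together, the call might look roughly like this. This is only a sketch, assuming TensorFlow 1.x where seq2seq.py is exposed as `tf.contrib.legacy_seq2seq`; the batch size, input size, and sequence lengths are illustrative:

```python
import tensorflow as tf

seq2seq = tf.contrib.legacy_seq2seq  # assumed location of seq2seq.py in TF 1.x

batch_size, input_size, num_units = 32, 10, 128

# One tensor per time step: 3 encoder steps (A, B, C) and 5 decoder steps
# (GO, W, X, Y, Z), matching the pictures above.
encoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
                  for _ in range(3)]
decoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
                  for _ in range(5)]

cell = tf.contrib.rnn.GRUCell(num_units)
outputs, states = seq2seq.basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)

# len(outputs) == len(decoder_inputs); each element is [batch_size, num_units].
```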
The models in `seq2seq.py` support both modes (feeding the decoder the provided inputs, or feeding it its own previous outputs) through the `feed_previous` argument. As an example, let us look at the following use of the embedding RNN model.

```python
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    embedding_size, output_projection=None,
    feed_previous=False)
```
In the `embedding_rnn_seq2seq` model, all inputs (both `encoder_inputs` and `decoder_inputs`) are integer tensors representing discrete values. They are embedded into a dense representation (for details on embeddings, see the Vector Representations tutorial), but to build these embeddings you have to specify the maximum number of discrete symbols: `num_encoder_symbols` on the encoder side and `num_decoder_symbols` on the decoder side.
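For example, the integer inputs for such a call could be built like this; a sketch, with made-up batch size, vocabulary sizes, and sequence lengths:

```python
import tensorflow as tf

batch_size = 32
num_encoder_symbols = 40000  # source vocabulary size (illustrative)
num_decoder_symbols = 40000  # target vocabulary size (illustrative)

# Integer token ids rather than dense vectors: one int32 tensor per time step.
encoder_inputs = [tf.placeholder(tf.int32, [batch_size], name="enc%d" % i)
                  for i in range(3)]
decoder_inputs = [tf.placeholder(tf.int32, [batch_size], name="dec%d" % i)
                  for i in range(5)]
```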
In the call above, we set `feed_previous` to False. This means the decoder uses the `decoder_inputs` tensors exactly as they are provided. If we set `feed_previous` to True, the decoder uses only the first element of `decoder_inputs`; all other tensors in the list are ignored, and the previous decoder output is fed in instead. This is how translations are decoded in our translation model, but it can also be used during training to make the model more robust to its own mistakes, roughly as in Bengio et al., 2015 (pdf).
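A common pattern, sketched below under the assumption of TensorFlow 1.x (`tf.contrib.legacy_seq2seq`) and with illustrative sizes, is to build one graph with `feed_previous=False` for training and a second, variable-sharing graph with `feed_previous=True` for decoding:

```python
import tensorflow as tf

seq2seq = tf.contrib.legacy_seq2seq  # assumed location of seq2seq.py in TF 1.x

enc = [tf.placeholder(tf.int32, [None]) for _ in range(3)]
dec = [tf.placeholder(tf.int32, [None]) for _ in range(5)]

def build(feed_previous):
    cell = tf.contrib.rnn.GRUCell(128)
    return seq2seq.embedding_rnn_seq2seq(
        enc, dec, cell,
        num_encoder_symbols=40000, num_decoder_symbols=40000,
        embedding_size=128, feed_previous=feed_previous)

# Training graph: the decoder consumes decoder_inputs exactly as given.
with tf.variable_scope("model"):
    train_outputs, _ = build(feed_previous=False)

# Decoding graph: only the first decoder input (GO) is used; every later step
# is fed the previous decoder output. Variables are shared with the graph above.
with tf.variable_scope("model", reuse=True):
    decode_outputs, _ = build(feed_previous=True)
```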
One more important argument used above is `output_projection`. If it is not specified, the outputs of the embedding model are tensors of shape batch size by `num_decoder_symbols`, since they represent the logits for each generated symbol. When training models with a large output vocabulary, i.e. with a large `num_decoder_symbols`, storing these big tensors becomes impractical. Instead, it is better to return smaller output tensors, which are later projected onto the large output tensor using `output_projection`. This allows our seq2seq models to be used with a sampled softmax loss, as described in Jean et al., 2014 (pdf).
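As a sketch of what this looks like (along the lines of what seq2seq_model.py does, but with made-up names and sizes), the projection is a pair of variables, and the sampled softmax loss is computed against the small decoder outputs:

```python
import tensorflow as tf

size = 128                 # decoder output size (illustrative)
target_vocab_size = 40000  # large output vocabulary (illustrative)
num_samples = 512          # classes sampled per step for the loss

# Projection from the small decoder output onto the full vocabulary.
w = tf.get_variable("proj_w", [size, target_vocab_size])
b = tf.get_variable("proj_b", [target_vocab_size])
output_projection = (w, b)

def sampled_loss(labels, decoder_output):
    # Sampled softmax works on the small [batch, size] decoder outputs
    # instead of the full [batch, target_vocab_size] logits.
    labels = tf.reshape(labels, [-1, 1])
    return tf.nn.sampled_softmax_loss(
        weights=tf.transpose(w), biases=b,
        labels=labels, inputs=decoder_output,
        num_sampled=num_samples, num_classes=target_vocab_size)
```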
In addition to `basic_rnn_seq2seq` and `embedding_rnn_seq2seq`, there are several more sequence-to-sequence models in `seq2seq.py`; take a look at them. They all have a similar interface, so we will not describe them in detail. For our translation model below, we use `embedding_attention_seq2seq`.
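Its call looks almost the same as `embedding_rnn_seq2seq` above; a sketch, again assuming TensorFlow 1.x and illustrative sizes:

```python
import tensorflow as tf

seq2seq = tf.contrib.legacy_seq2seq  # assumed location of seq2seq.py in TF 1.x

enc = [tf.placeholder(tf.int32, [None]) for _ in range(3)]
dec = [tf.placeholder(tf.int32, [None]) for _ in range(5)]
cell = tf.contrib.rnn.GRUCell(128)

# Same interface as embedding_rnn_seq2seq, plus attention over encoder outputs.
outputs, states = seq2seq.embedding_attention_seq2seq(
    enc, dec, cell,
    num_encoder_symbols=40000, num_decoder_symbols=40000,
    embedding_size=128, feed_previous=False)
```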
Source: https://habr.com/ru/post/430780/