How Did I Fail RNN

Why is an RNN more difficult to work with than an MLP or a CNN?

This is, of course, an open question. But for me, the trouble was mostly caused by text embeddings.

The problem we encountered was fitting the embedded text vectors into GPU memory: both the 6 GB on my local GTX 1660 Ti and the 12 GB on the gcloud Tesla K100 ran out.
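For a rough sense of scale (the numbers below are illustrative assumptions, not our actual dataset statistics): embedding 200,000 jokes of about 50 tokens each with 300-dimensional float32 vectors takes roughly 200,000 × 50 × 300 × 4 bytes ≈ 12 GB, which already fills either card before the model's parameters and activations are even loaded.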

The problem was fixed before the deadline by switching to torchtext. We did not change the tokenization at all, but used torchtext's BucketIterator to feed data into the network instead of batching the embeddings ourselves.

Without reading the implementation of torchtext, I'm guessing the package uses some kind of lazy-loading technique.

For a code example showing how to use torchtext.data, check out
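Since the link is missing here, below is a minimal sketch of how the (legacy) torchtext.data API can be used this way; the file name jokes.csv, the field layout, and the hyperparameters are assumptions for illustration, not our exact setup.

```python
import torch
from torchtext.data import Field, TabularDataset, BucketIterator  # legacy API (torchtext <= 0.8)

# Plain whitespace tokenization; torchtext numericalizes batches on the fly,
# so the whole corpus is never embedded and held in GPU memory at once.
TEXT = Field(tokenize=lambda s: s.split(), lower=True, batch_first=True)

# Assumed layout: a CSV with one joke per row in a single "text" column.
dataset = TabularDataset(path="jokes.csv", format="csv", fields=[("text", TEXT)])
TEXT.build_vocab(dataset, max_size=20_000)

# BucketIterator groups examples of similar length into the same batch,
# which reduces padding and keeps each batch small.
train_iter = BucketIterator(
    dataset,
    batch_size=32,
    sort_key=lambda ex: len(ex.text),
    sort_within_batch=True,
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

for batch in train_iter:
    token_ids = batch.text  # LongTensor [batch, seq_len]
```

The key point is that the iterator yields integer token ids one batch at a time; the nn.Embedding lookup happens inside the model, so only a single batch of embeddings lives on the GPU at any moment.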

Text Generation

Our model works, on some level. Following this tutorial, the model we built can be trained, but the jokes it generates are not understandable.
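For reference, here is a rough sketch of the kind of word-level LSTM language model such tutorials typically build; the class name, hyperparameters, and sampling loop are illustrative assumptions, not the exact code we used.

```python
import torch
import torch.nn as nn

class JokeLSTM(nn.Module):
    """Word-level language model: predict the next token at every position."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, hidden=None):
        x = self.embed(tokens)              # [batch, seq_len, embed_dim]
        out, hidden = self.lstm(x, hidden)  # [batch, seq_len, hidden_dim]
        return self.fc(out), hidden         # logits over the vocabulary

@torch.no_grad()
def sample(model, itos, start_ids, max_len=30, temperature=1.0):
    """Sample one token at a time, feeding each prediction back in as the next input."""
    model.eval()
    ids = list(start_ids)
    hidden = None
    inp = torch.tensor([ids])
    for _ in range(max_len):
        logits, hidden = model(inp, hidden)
        probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
        next_id = torch.multinomial(probs, 1).item()
        ids.append(next_id)
        inp = torch.tensor([[next_id]])
    return " ".join(itos[i] for i in ids)
```

Training minimizes cross-entropy between the prediction at position t and the actual token at position t+1; generation then samples from the model one word at a time.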

TODO: analyze the LSTM's behaviour on the text generation task in more detail.