Understanding a simple LSTM in PyTorch

Question

The output of the LSTM is the hidden state (h_t) of every hidden node in the final layer, at each time step.
hidden_size – the number of LSTM blocks per layer.
input_size – the number of input features per time-step.
num_layers – the number of hidden layers.
In total there are hidden_size * num_layers LSTM blocks.
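
To make these parameters concrete, here is a minimal sketch that builds an nn.LSTM and inspects its weight shapes. The sizes (input_size=10, hidden_size=20, num_layers=2) are arbitrary values picked for illustration; they are not taken from the question.

```python
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions, not from the question)
input_size, hidden_size, num_layers = 10, 20, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers)

# Each layer stacks the weights of its four gates, hence the factor of 4.
# Layer 0 reads the input features; layer 1 reads layer 0's hidden state.
w_ih_l0 = tuple(lstm.weight_ih_l0.shape)  # (4 * hidden_size, input_size)
w_ih_l1 = tuple(lstm.weight_ih_l1.shape)  # (4 * hidden_size, hidden_size)
w_hh_l0 = tuple(lstm.weight_hh_l0.shape)  # (4 * hidden_size, hidden_size)
print(w_ih_l0, w_ih_l1, w_hh_l0)
```

Only the first layer's input weights depend on input_size; every later layer takes hidden_size features from the layer below it.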

The input dimensions are (seq_len, batch, input_size).
seq_len – the number of time steps in each input stream.
batch – the size of each batch of input sequences.

The hidden and cell dimensions are: (num_layers, batch, hidden_size)
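
As a rough sketch of those shapes, the snippet below builds an input tensor and initial states and runs them through a non-bidirectional LSTM. The concrete sizes and the names x, h0 and c0 are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions)
seq_len, batch = 5, 3
input_size, hidden_size, num_layers = 10, 20, 2

rnn = nn.LSTM(input_size, hidden_size, num_layers)

x = torch.randn(seq_len, batch, input_size)        # (seq_len, batch, input_size)
h0 = torch.zeros(num_layers, batch, hidden_size)   # initial hidden state
c0 = torch.zeros(num_layers, batch, hidden_size)   # initial cell state

out, (hn, cn) = rnn(x, (h0, c0))

# The final states keep the same shape as the initial states.
print(hn.shape, cn.shape)
```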

output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t.

So each time step produces hidden_size * num_directions output features. You didn’t initialise the RNN to be bidirectional, so num_directions is 1 and output_size = hidden_size.
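
A quick way to check this is to compare the output shapes of a unidirectional and a bidirectional LSTM. This is only a sketch; the sizes are arbitrary assumptions, and the initial states are left out so PyTorch defaults them to zeros.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions)
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2
x = torch.randn(seq_len, batch, input_size)

uni = nn.LSTM(input_size, hidden_size, num_layers)
out_uni, (hn, cn) = uni(x)  # num_directions = 1

bi = nn.LSTM(input_size, hidden_size, num_layers, bidirectional=True)
out_bi, _ = bi(x)           # num_directions = 2

print(out_uni.shape)  # last dimension is hidden_size
print(out_bi.shape)   # last dimension is 2 * hidden_size

# In the unidirectional case, the last time step of the output is the
# final hidden state of the top layer.
same = torch.allclose(out_uni[-1], hn[-1])
print(same)
```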

Edit: You can change the number of outputs by using a linear layer:

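import torch
import torch.nn as nn

# rnn, input, h0 and c0 are the objects defined in the question
out_rnn, (hn, cn) = rnn(input, (h0, c0))  # out_rnn: (seq_len, batch, hidden_size)
lin = nn.Linear(hidden_size, output_size)
# Flatten the time and batch dimensions, project, then restore the shape
flat = out_rnn.reshape(seq_len * batch, hidden_size)
output = lin(flat).view(seq_len, batch, output_size)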
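
For reference, here is a self-contained sketch of the same idea that can be run as is. The sizes, including output_size=4, are arbitrary assumptions. It also shows that nn.Linear operates on the last dimension of its input, so the flattening step can be skipped.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions)
seq_len, batch = 5, 3
input_size, hidden_size, num_layers, output_size = 10, 20, 2, 4

rnn = nn.LSTM(input_size, hidden_size, num_layers)
lin = nn.Linear(hidden_size, output_size)

x = torch.randn(seq_len, batch, input_size)
out_rnn, (hn, cn) = rnn(x)  # out_rnn: (seq_len, batch, hidden_size)

# Option 1: flatten time and batch, project, then restore the shape
flat = out_rnn.reshape(seq_len * batch, hidden_size)
output_a = lin(flat).view(seq_len, batch, output_size)

# Option 2: apply the linear layer directly to the 3-D tensor
output_b = lin(out_rnn)

print(output_a.shape, output_b.shape)
```

Both options apply the same weights to every time step and batch element, so they should agree up to floating-point error.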

Note: for this answer I assumed that we’re only talking about non-bidirectional LSTMs.

Source: PyTorch docs.
