LSTM Parameters

```
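input_size: number of features in each input time step
hidden_size: number of features in the hidden state h (the output size per direction)
num_layers: number of stacked LSTM layers; default 1
bias: True or False; whether to use bias terms. If False, the layer has no b_ih or b_hh bias weights. Default True
batch_first: True or False. By default nn.LSTM expects input shaped (seq_len, batch, input_size), which differs from the batch-first layout usually used for CNNs; with batch_first=True the input becomes (batch, seq_len, input_size). Default False
dropout: if non-zero, applies dropout with this probability to the output of every layer except the last; default 0
bidirectional: if True, builds a bidirectional LSTM: the sequence is processed once left to right and once right to left, which doubles the output feature size. Default False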
```
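To make the parameter list concrete, here is a minimal sketch that builds an `nn.LSTM` and prints the shapes of its learnable weights. The sizes (10, 20, 2 layers) are arbitrary values chosen for illustration, not taken from the original post.

```python
import torch
from torch import nn

# Illustrative sizes only; any positive integers would do.
lstm = nn.LSTM(
    input_size=10,
    hidden_size=20,
    num_layers=2,
    bias=True,
    batch_first=False,
    dropout=0.1,
    bidirectional=True,
)

# Each weight stacks the four gates, so its first dimension is 4 * hidden_size.
param_shapes = {name: tuple(p.shape) for name, p in lstm.named_parameters()}
for name, shape in param_shapes.items():
    print(name, shape)

# With bias=False the bias_ih / bias_hh parameters are not created at all.
lstm_no_bias = nn.LSTM(input_size=10, hidden_size=20, bias=False)
no_bias_names = [name for name, _ in lstm_no_bias.named_parameters()]
print(no_bias_names)
```

Note that the second layer's input weight has `num_directions * hidden_size` columns, because in a bidirectional LSTM each layer receives the concatenated forward and backward outputs of the layer below.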
Inputs

– input (seq_len, batch_size, input_size)
– h_0 (num_layers * num_directions, batch_size, hidden_size)
– c_0 (num_layers * num_directions, batch_size, hidden_size)

Outputs

– output (seq_len, batch_size, num_directions * hidden_size)
– h_n (num_layers * num_directions, batch_size, hidden_size)
– c_n (num_layers * num_directions, batch_size, hidden_size)
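The input and output shapes listed above can be checked with a short forward pass. This is a sketch with arbitrary sizes; the variable names mirror the shape names used in the lists.

```python
import torch
from torch import nn

seq_len, batch_size, input_size = 5, 3, 10
hidden_size, num_layers, num_directions = 20, 2, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)

# Default layout (batch_first=False): sequence dimension comes first.
x = torch.randn(seq_len, batch_size, input_size)
h_0 = torch.zeros(num_layers * num_directions, batch_size, hidden_size)
c_0 = torch.zeros(num_layers * num_directions, batch_size, hidden_size)

output, (h_n, c_n) = lstm(x, (h_0, c_0))

print(output.shape)  # (seq_len, batch_size, num_directions * hidden_size)
print(h_n.shape)     # (num_layers * num_directions, batch_size, hidden_size)
print(c_n.shape)     # same shape as h_n
```

If `h_0` and `c_0` are omitted, `nn.LSTM` initializes both to zeros, so `lstm(x)` gives the same result here.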

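[Note] If batch_first = True, input is (batch_size, seq_len, input_size) and output is (batch_size, seq_len, num_directions * hidden_size). The shapes of h_n and c_n do not change.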
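A quick way to see the effect of `batch_first=True` is to compare shapes directly. The sketch below uses arbitrary sizes; it shows that only `input` and `output` switch to the batch-first layout, while `h_n` and `c_n` keep the batch in the second dimension.

```python
import torch
from torch import nn

batch_size, seq_len, input_size, hidden_size = 3, 5, 10, 20

lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

# Batch dimension first, as is common for CNN-style data pipelines.
x = torch.randn(batch_size, seq_len, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # (batch_size, seq_len, hidden_size)
print(h_n.shape)     # (num_layers * num_directions, batch_size, hidden_size)
print(c_n.shape)     # same shape as h_n
```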

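The figure above shows the data flow of an LSTM. Layer $i$ outputs a final hidden state $h_{n}^{(i)}$ (one per direction), so the first dimension is num_layers * num_directions. Each batch contains batch_size samples and every sample has its own state, so the second dimension is batch_size. The third dimension is the size of $h$ itself, i.e. hidden_size.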

$c_n$ has the same shape as $h_n$.
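Because the first dimension of `h_n` packs layers and directions together, it can help to unpack it. The following sketch, with arbitrary sizes, reshapes `h_n` to `(num_layers, num_directions, batch_size, hidden_size)` so that a single layer or direction can be indexed directly.

```python
import torch
from torch import nn

seq_len, batch_size, input_size = 5, 3, 10
hidden_size, num_layers, num_directions = 20, 2, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)
x = torch.randn(seq_len, batch_size, input_size)
_, (h_n, c_n) = lstm(x)

# Separate the packed first dimension into (layer, direction).
h_by_layer = h_n.reshape(num_layers, num_directions, batch_size, hidden_size)
c_by_layer = c_n.reshape(num_layers, num_directions, batch_size, hidden_size)

last_forward = h_by_layer[-1, 0]   # final forward state of the top layer
last_backward = h_by_layer[-1, 1]  # final backward state of the top layer
print(last_forward.shape, last_backward.shape)
```

With this ordering, the top layer's forward and backward states are `h_n[-2]` and `h_n[-1]` in the packed tensor.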

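output holds one entry for every word (time step) in the sentence, taken from the last layer, so its first dimension is seq_len. The second dimension is again batch_size. The third is hidden_size, or 2 * hidden_size when the forward and backward outputs of a bidirectional LSTM are concatenated, hence num_directions * hidden_size.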
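The concatenation along the last dimension can be undone by reshaping. This sketch, with arbitrary sizes, splits `output` of a bidirectional LSTM into its forward and backward halves.

```python
import torch
from torch import nn

seq_len, batch_size, input_size, hidden_size = 5, 3, 10, 20
num_directions = 2

lstm = nn.LSTM(input_size, hidden_size, bidirectional=True)
x = torch.randn(seq_len, batch_size, input_size)
output, _ = lstm(x)

# The last dimension is [forward features, backward features].
by_direction = output.reshape(seq_len, batch_size, num_directions, hidden_size)
forward_out = by_direction[:, :, 0]   # (seq_len, batch_size, hidden_size)
backward_out = by_direction[:, :, 1]  # (seq_len, batch_size, hidden_size)
print(forward_out.shape, backward_out.shape)
```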

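It follows that for a unidirectional LSTM (with the default batch_first = False), $output[-1,:,:] = h_n[-1,:,:]$. For a bidirectional LSTM this no longer holds as written: $h_n[-1,:,:]$ is the final state of the backward direction, which corresponds to the backward half of $output[0,:,:]$.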
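The relationship between `output` and `h_n` can be checked numerically. The sketch below uses arbitrary sizes and compares the two for a unidirectional and a bidirectional LSTM; in the bidirectional case the forward state matches the last time step and the backward state matches the first.

```python
import torch
from torch import nn

torch.manual_seed(0)
seq_len, batch_size, input_size, hidden_size = 5, 3, 10, 20
x = torch.randn(seq_len, batch_size, input_size)

# Unidirectional: the last time step of output equals the top layer's h_n.
uni = nn.LSTM(input_size, hidden_size, num_layers=2)
out_u, (h_u, _) = uni(x)
uni_match = torch.allclose(out_u[-1], h_u[-1])

# Bidirectional: h_n[-2] is the top layer's forward state (last time step),
# h_n[-1] is the top layer's backward state (first time step).
bi = nn.LSTM(input_size, hidden_size, num_layers=2, bidirectional=True)
out_b, (h_b, _) = bi(x)
fwd_match = torch.allclose(out_b[-1, :, :hidden_size], h_b[-2])
bwd_match = torch.allclose(out_b[0, :, hidden_size:], h_b[-1])

print(uni_match, fwd_match, bwd_match)
```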

Original post: https://www.cnblogs.com/zyb993963526/p/13786310.html
