LSTM and GRU
LSTM
Ignoring the bias terms:
$$\begin{align}
i_t&=\sigma(x_t\cdot W_i+h_{t-1}\cdot U_i)\\
f_t&=\sigma(x_t\cdot W_f+h_{t-1}\cdot U_f)\\
o_t&=\sigma(x_t\cdot W_o+h_{t-1}\cdot U_o)\\
\widetilde{C}_t&=\tanh(x_t\cdot W_c+h_{t-1}\cdot U_c)\\
C_t&=f_t\odot C_{t-1}+i_t\odot\widetilde{C}_t\\
h_t&=o_t\odot\tanh(C_t)
\end{align}$$
where:
>$i_t$: input gate
>$f_t$: forget gate
>$o_t$: output gate
>$\widetilde{C}_t$: new information (the candidate cell state)
>$\odot$: element-wise product

GRU: a variant of LSTM
A comparison of the two is shown in the figure:
GRU node update rule:
\[\begin{align}
z_t&=\sigma(x_t\cdot W_z+h_{t-1}\cdot U_z)\\
r_t&=\sigma(x_t\cdot W_r+h_{t-1}\cdot U_r)\\
\widetilde{h}_t&=\tanh(x_t\cdot W+(r_t\odot h_{t-1})\cdot U)\\
h_t&=(1-z_t)\odot h_{t-1}+z_t\odot\widetilde{h}_t
\end{align}
\]
where:
\(z_t\): update gate
\(r_t\): reset gate
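
To make the two update rules concrete, here is a minimal NumPy sketch of one LSTM step and one GRU step, written directly from the bias-free equations in this post. The function names (`lstm_step`, `gru_step`, `init_params`, `n_weights`), the dictionary keys, the toy dimensions, and the random initialisation are all assumptions made for illustration and are not taken from any particular library. The LSTM output uses the standard form, where the output gate scales `tanh` of the cell state element-wise.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(gates, input_dim, hidden_dim, rng):
    """One (W, U) pair per gate name; small random values, biases omitted."""
    return {
        g: (rng.normal(scale=0.1, size=(input_dim, hidden_dim)),
            rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)))
        for g in gates
    }

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM update; p maps 'i', 'f', 'o', 'c' to (W, U)."""
    def pre(g):
        W, U = p[g]
        return x_t @ W + h_prev @ U
    i_t = sigmoid(pre("i"))             # input gate
    f_t = sigmoid(pre("f"))             # forget gate
    o_t = sigmoid(pre("o"))             # output gate
    c_tilde = np.tanh(pre("c"))         # new (candidate) information
    c_t = f_t * c_prev + i_t * c_tilde  # element-wise cell update
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def gru_step(x_t, h_prev, p):
    """One GRU update; p maps 'z', 'r', 'h' to (W, U)."""
    W_z, U_z = p["z"]
    W_r, U_r = p["r"]
    W, U = p["h"]
    z_t = sigmoid(x_t @ W_z + h_prev @ U_z)          # update gate
    r_t = sigmoid(x_t @ W_r + h_prev @ U_r)          # reset gate
    h_tilde = np.tanh(x_t @ W + (r_t * h_prev) @ U)  # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_tilde

def n_weights(p):
    """Total number of scalar weights across all (W, U) pairs."""
    return sum(W.size + U.size for W, U in p.values())

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4            # toy sizes, chosen arbitrarily
lstm_params = init_params("ifoc", input_dim, hidden_dim, rng)
gru_params = init_params("zrh", input_dim, hidden_dim, rng)

x = rng.normal(size=(5, input_dim))     # a toy sequence of 5 input vectors
h_lstm, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
h_gru = np.zeros(hidden_dim)
for x_t in x:
    h_lstm, c = lstm_step(x_t, h_lstm, c, lstm_params)
    h_gru = gru_step(x_t, h_gru, gru_params)

print(h_lstm.shape, h_gru.shape)                         # (4,) (4,)
print(n_weights(lstm_params), n_weights(gru_params))     # 112 84
```

With these toy sizes the LSTM holds four `(W, U)` pairs and the GRU three, so the GRU needs three quarters of the weights (84 versus 112). The GRU also carries a single state vector `h`, where the LSTM carries both `h` and `C`.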