The difference between batch, iteration, and epoch:
Batch in deep learning (batch size, full batch, mini-batch, online learning), iterations, and epochs
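The terms above relate by simple arithmetic: one iteration is one parameter update on a single mini-batch, and one epoch is one full pass over the training set. A minimal sketch, with sample counts and batch size made up for illustration:

```python
import math

# Illustrative numbers (assumed, not from the article): 1000 training
# samples, mini-batch size 32, trained for 10 epochs.
num_samples = 1000
batch_size = 32
num_epochs = 10

# One iteration = one parameter update on one mini-batch,
# so one epoch takes ceil(N / batch_size) iterations.
iterations_per_epoch = math.ceil(num_samples / batch_size)

# One epoch = one full pass over the training set.
total_iterations = num_epochs * iterations_per_epoch

print(iterations_per_epoch)  # 32
print(total_iterations)      # 320
```

Full batch is the special case `batch_size == num_samples` (one iteration per epoch); online learning is `batch_size == 1`.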
GCN-related:
Graph Convolutional Networks (Thomas Kipf)
How powerful are Graph Convolutions? (review of Kipf & Welling, 2016)
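The core of both posts is the GCN propagation rule from Kipf & Welling (2016), H' = σ(D̂^{-1/2} Â D̂^{-1/2} H W), where Â = A + I adds self-loops and D̂ is Â's degree matrix. A minimal NumPy sketch; the toy graph and weights are assumptions for illustration:

```python
import numpy as np

# One GCN layer: H' = ReLU(D_hat^{-1/2} A_hat D_hat^{-1/2} H W),
# with A_hat = A + I (self-loops) and D_hat its degree matrix.
def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d = A_hat.sum(axis=1)                      # node degrees of A_hat
    D_inv_sqrt = np.diag(d ** -0.5)            # D_hat^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization
    return np.maximum(0, A_norm @ H @ W)       # ReLU activation

# Tiny example: 3-node path graph, 2-d features, 2 output channels.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.eye(3, 2)      # simple one-hot-style input features (assumed)
W = np.ones((2, 2))   # weights chosen arbitrarily for illustration
print(gcn_layer(A, H, W).shape)  # (3, 2)
```

Each layer thus mixes every node's features with its neighbors' before the linear map, which is why stacking k layers gives each node a k-hop receptive field.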
attention/Transformer:
Self-Attention Mechanisms in Natural Language Processing
The Illustrated Transformer, Chinese translation (the original English post, The Illustrated Transformer, is extremely detailed; highly recommended!)
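The mechanism both posts walk through is scaled dot-product self-attention: project the inputs to queries, keys, and values, score each query against every key, softmax the scores, and take the weighted sum of values. A minimal sketch; the token count, dimensions, and random weights are assumptions for illustration:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Project inputs to queries, keys, values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # scaled pairwise similarities
    # Row-wise softmax (shifted by the max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                   # weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))              # 4 tokens, 8-d embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Multi-head attention simply runs several such projections in parallel on smaller dimensions and concatenates the results.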
Dependency parsing and constituency parsing:
Constituent Parsing & Dependency Parsing: An Introduction to Syntactic Parsing
Batch Normalization and Layer Normalization:
A summary of BatchNormalization, LayerNormalization, InstanceNorm, GroupNorm, and SwitchableNorm
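The key distinction the summary draws is which axis the statistics are taken over: BatchNorm normalizes each feature across the batch, while LayerNorm normalizes each sample across its features. A minimal sketch (learnable scale/shift parameters omitted; the toy data is assumed):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Per-feature mean/variance over the batch dimension (axis 0).
    mu, var = x.mean(axis=0), x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Per-sample mean/variance over the feature dimension (axis 1).
    mu = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0]])    # batch of 2 samples, 3 features
print(batch_norm(x).mean(axis=0))  # ~0 per feature (column)
print(layer_norm(x).mean(axis=1))  # ~0 per sample (row)
```

InstanceNorm and GroupNorm follow the same pattern on image tensors, normalizing over spatial positions per channel and per channel group respectively.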