<Analysis> The GE2E loss function in speaker verification models

What is GE2E loss
- GE2E loss stands for generalized end-to-end loss. It focuses on how discriminative the embeddings are, and it is more effective than the TE2E (tuple-based end-to-end) loss function.
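Prerequisites
- Batch layout: each batch consists of N × M embeddings with shape (N, M, e): N speakers, M embeddings per speaker, and each embedding has length e.
- \(e_{ji}\): the i-th embedding of the j-th speaker.
- \(c_j\): the centroid of the j-th speaker (I think of it as the speaker's center vector), \(c_j = \frac{1}{M}\sum_{m=1}^{M} e_{jm}\).
- \(S_{ji,k}\): the similarity between \(e_{ji}\) and \(c_k\). We call S the similarity matrix. \(S_{ji,k} = w \cdot \cos(e_{ji}, c_k) + b\), where w and b are learnable parameters (with w > 0).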
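The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: the function names (`cosine`, `centroids`, `similarity_matrix`), the toy batch, and the values of `w` and `b` are assumptions made for the example, and embeddings are assumed to be non-zero so the cosine is defined.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def centroids(emb):
    """c_j = mean of the M embeddings of speaker j; emb has shape (N, M, e)."""
    return [[sum(dim) / len(speaker) for dim in zip(*speaker)] for speaker in emb]

def similarity_matrix(emb, w, b):
    """S[j][i][k] = w * cos(e_ji, c_k) + b, so S has shape (N, M, N)."""
    c = centroids(emb)
    return [[[w * cosine(e_ji, c_k) + b for c_k in c] for e_ji in speaker]
            for speaker in emb]

# Toy batch: N=2 speakers, M=2 embeddings each, e=2 dimensions.
emb = [
    [[1.0, 0.0], [1.0, 0.0]],  # speaker 0
    [[0.0, 1.0], [0.0, 1.0]],  # speaker 1
]
S = similarity_matrix(emb, w=10.0, b=-5.0)  # w and b are illustrative values
```

In a real model the embeddings would come from the network and w, b would be trained along with it; a tensor library would replace the nested lists.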
Formulas
- For each batch, the loss is \(L_G(X; W) = L_G(S) = \sum_{j,i} L(e_{ji})\)
- \(L(e_{ji})\) can be computed in two ways:
  - Contrast
    \(L(e_{ji}) = 1 - \operatorname{sigmoid}(S_{ji,j}) + \max_{1 \le k \le N,\, k \ne j} \operatorname{sigmoid}(S_{ji,k})\)
  - Softmax
    \(L(e_{ji}) = -S_{ji,j} + \log \sum_{k=1}^{N} \exp(S_{ji,k})\)
  - How to choose: the contrast formula performs better on TD-SV models, and the softmax formula performs better on TI-SV models.
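Both variants are easy to check by hand on a tiny similarity matrix. The sketch below is only an illustration under stated assumptions: `S` is a nested list of shape (N, M, N) indexed as `S[j][i][k]`, the function names are made up for this example, N is at least 2 (the contrast variant needs one negative speaker), and similarity values are moderate enough that `math.exp` does not overflow.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax_loss(S):
    """Sum over j, i of -S_{ji,j} + log(sum_k exp(S_{ji,k}))."""
    total = 0.0
    for j, speaker in enumerate(S):
        for row in speaker:  # row[k] holds S_{ji,k}
            m = max(row)     # subtract the max before exp for numerical stability
            log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
            total += -row[j] + log_sum_exp
    return total

def contrast_loss(S):
    """Sum over j, i of 1 - sigmoid(S_{ji,j}) + max_{k != j} sigmoid(S_{ji,k})."""
    total = 0.0
    for j, speaker in enumerate(S):
        for row in speaker:
            hardest_negative = max(sigmoid(s) for k, s in enumerate(row) if k != j)
            total += 1.0 - sigmoid(row[j]) + hardest_negative
    return total

# N=2 speakers, M=1 embedding each; every embedding is closest to its own centroid.
S = [[[5.0, -5.0]], [[-5.0, 5.0]]]
print(softmax_loss(S), contrast_loss(S))
```

Because every embedding here already matches its own speaker, both losses come out close to zero; swapping the two columns of S makes them large.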
Improvements
- When computing the similarity of a positive pair, that is \(S_{ji,j}\), removing \(e_{ji}\) from the computation of \(c_j\) gives better results; the paper reports that this makes training stable and helps avoid trivial solutions.
- TD-SV & TI-SV: TD-SV is text-dependent speaker verification and TI-SV is text-independent speaker verification. In TD-SV, the transcript of both enrollment and verification utterances is phonetically constrained, while in TI-SV there are no lexicon constraints on the transcript of the enrollment or verification utterances, exposing a larger variability of phonemes and utterance durations.
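The centroid exclusion from the first item above amounts to using a leave-one-out centroid, \(c_j^{(-i)} = \frac{1}{M-1}\sum_{m \ne i} e_{jm}\), whenever k = j, and the ordinary centroid \(c_k\) otherwise. Here is a rough sketch in plain Python, assuming M ≥ 2 and non-zero vectors; the function names and the toy batch are made up for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def similarity_matrix_exclusive(emb, w, b):
    """S[j][i][k] uses the leave-one-out centroid of speaker j when k == j."""
    c = [mean(speaker) for speaker in emb]  # ordinary centroids c_k
    S = []
    for j, speaker in enumerate(emb):
        rows = []
        for i, e_ji in enumerate(speaker):
            # Centroid of speaker j computed without e_ji itself (needs M >= 2).
            c_j_excl = mean(speaker[:i] + speaker[i + 1:])
            rows.append([w * cosine(e_ji, c_j_excl if k == j else c[k]) + b
                         for k in range(len(emb))])
        S.append(rows)
    return S

emb = [
    [[1.0, 0.0], [0.0, 1.0]],    # speaker 0: two orthogonal embeddings
    [[-1.0, 0.0], [-1.0, 0.0]],  # speaker 1: two identical embeddings
]
S = similarity_matrix_exclusive(emb, w=10.0, b=-5.0)  # illustrative w, b
```

For speaker 0, the ordinary centroid contains the embedding itself and would yield a cosine of about 0.71 with it; with the exclusion, the embedding is compared only against the other utterance and the cosine drops to 0, so an embedding can no longer score well just by matching itself.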
Reference
- Generalized End-to-End Loss for Speaker Verification, https://arxiv.org/pdf/1710.10467.pdf
Original article: https://www.cnblogs.com/dynmi/p/13343455.html
