TensorFlow2_200729 Series --- 9. Top-k Accuracy Example
I. Summary
One-sentence summary:
Use tf.math.top_k to get the array of indices of the k highest-probability items; the rest is ordinary computation. The key point is a handful of matrix operations.
def accuracy(output, target, topk=(1,)):
    maxk = max(topk)
    # print("maxk: ", maxk)
    batch_size = target.shape[0]
    # print("batch_size: ", batch_size)
    pred = tf.math.top_k(output, maxk).indices
    pred = tf.transpose(pred, perm=[1, 0])
    target_ = tf.broadcast_to(target, pred.shape)  # [maxk, b]
    correct = tf.equal(pred, target_)
    res = []
    for k in topk:
        correct_k = tf.cast(tf.reshape(correct[:k], [-1]), dtype=tf.float32)
        correct_k = tf.reduce_sum(correct_k)
        acc = float(correct_k * (100.0 / batch_size))
        res.append(acc)
    return res
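A minimal usage sketch (it mirrors the cells in section II below; the exact numbers depend on the random seed):
```
import tensorflow as tf

logits = tf.random.normal([4, 3])                          # 4 samples, 3 classes
probs = tf.math.softmax(logits, axis=1)                    # rows sum to 1
labels = tf.random.uniform([4], maxval=3, dtype=tf.int32)  # integer labels in 0..2

print(accuracy(probs, labels, topk=(1, 2, 3)))             # [top-1 %, top-2 %, top-3 %]
```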
1. Random normal distribution, 4 rows by 3 columns?
output = tf.random.normal([4, 3])
2. softmax, so that the probabilities sum to 1 along each row?
output = tf.math.softmax(output, axis=1) # the probabilities of picking item 1, item 2 and item 3 sum to 1
3. Find the index of the maximum value in output, along each row?
pred = tf.argmax(output, axis=1)
4. Broadcast target to the shape of pred?
target_ = tf.broadcast_to(target, pred.shape)
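A small sketch of what this broadcast produces (the values and shapes here are just illustrative):
```
import tensorflow as tf

target = tf.constant([1, 0, 2, 2])         # shape [4], one label per sample
target_ = tf.broadcast_to(target, [3, 4])  # shape [3, 4]: the label row repeated 3 times
print(target_.numpy())
# [[1 0 2 2]
#  [1 0 2 2]
#  [1 0 2 2]]
```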
II. Top-k Accuracy Example
Video location in the course corresponding to this blog:
import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.random.set_seed(2467)
In [4]:
# random normal distribution, 4 rows by 3 columns
output = tf.random.normal([4, 3])
output
Out[4]:
In [5]:
# softmax, so that the probabilities sum to 1 along each row
# i.e. the probabilities of picking item 1, item 2 and item 3 sum to 1
output = tf.math.softmax(output, axis=1)
output
Out[5]:
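A quick sanity check that every row of output now sums to 1 (a sketch; the exact values depend on the seed):
```
print(tf.reduce_sum(output, axis=1).numpy())  # approximately [1. 1. 1. 1.]
```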
In [8]:
# random labels in 0-2 as target
# uniform: uniform distribution over [minval, maxval); for integer dtypes maxval is exclusive
target = tf.random.uniform([4], maxval=3, dtype=tf.int32)
target
Out[8]:
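Since maxval is exclusive for integer dtypes, the labels land in {0, 1, 2}; a quick check (sketch):
```
print(tf.reduce_min(target).numpy(), tf.reduce_max(target).numpy())  # both within 0..2
```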
In [9]:
print('prob:', output.numpy())
In [10]:
# find the index of the maximum value in output, along each row
pred = tf.argmax(output, axis=1)
print('pred:', pred.numpy())
print('label:', target.numpy())
Judging the predictions from output and target
output is:
```
array([[0.28500617, 0.40185377, 0.31314012],
[0.4922847 , 0.133816 , 0.37389928],
[0.27983427, 0.43976992, 0.2803958 ],
[0.0258685 , 0.6705995 , 0.303532 ]], dtype=float32)>
```
These are the indices of the highest-probability item in each row of output:
pred: [1 0 1 1]
target is: label: [1 0 2 2]
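For the top-1 case the accuracy can be checked directly from pred and target (a sketch; note that tf.argmax returns int64, so target is cast to match):
```
top1 = tf.reduce_mean(
    tf.cast(tf.equal(pred, tf.cast(target, pred.dtype)), tf.float32)) * 100
print(top1.numpy())  # 50.0 for the pred/label pair above
```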
In [11]:
def accuracy(output, target, topk=(1,)):
    maxk = max(topk)
    # print("maxk: ", maxk)
    batch_size = target.shape[0]
    # print("batch_size: ", batch_size)
    # indices of the maxk highest-probability classes for each sample, shape [b, maxk]
    pred = tf.math.top_k(output, maxk).indices
    # transpose to [maxk, b]: row i now holds every sample's rank-(i+1) prediction
    pred = tf.transpose(pred, perm=[1, 0])
    target_ = tf.broadcast_to(target, pred.shape)
    # [maxk, b]
    correct = tf.equal(pred, target_)
    res = []
    for k in topk:
        # keep only the first k ranks, flatten, and count the hits
        correct_k = tf.cast(tf.reshape(correct[:k], [-1]), dtype=tf.float32)
        correct_k = tf.reduce_sum(correct_k)
        acc = float(correct_k * (100.0 / batch_size))
        res.append(acc)
    return res
Indices of the highest-probability item in each row of output: pred: [1 0 1 1]
target: label: [1 0 2 2]
From the above, the first two entries match, so the top-1 accuracy is 50%.
With only three items in total, top-3 is necessarily 100%.
Comparing the data carefully, you can also see that top-2 is exactly 100%:
label: [1 0 2 2]
output:
array([[0.28500617, 0.40185377, 0.31314012],
       [0.4922847 , 0.133816  , 0.37389928],
       [0.27983427, 0.43976992, 0.2803958 ],
       [0.0258685 , 0.6705995 , 0.303532  ]], dtype=float32)>
The values that [1 0 2 2] picks out of output:
array([[***       , 0.40185377, ***       ],
       [0.4922847 , ***       , ***       ],
       [***       , ***       , 0.2803958 ],
       [***       , ***       , 0.303532  ]], dtype=float32)>
Each of these picked values is within the top 2 of its row, so the top-2 accuracy is clearly 100%.
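This can be verified directly as well: check whether each label appears among the two largest-probability indices of its row (a sketch):
```
top2_idx = tf.math.top_k(output, k=2).indices                       # shape [4, 2]
hit = tf.reduce_any(tf.equal(top2_idx, tf.expand_dims(target, 1)), axis=1)
print(tf.reduce_mean(tf.cast(hit, tf.float32)).numpy() * 100)       # 100.0 here
```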
In [13]:
# topk specifies which top-k accuracies to compute
acc = accuracy(output, target, topk=(1,2,3))
print('top-1-3 acc:', acc)
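For comparison, TF2 also ships a built-in metric for the same check, tf.keras.metrics.sparse_top_k_categorical_accuracy; it returns a per-sample 0/1 tensor rather than a percentage (a sketch):
```
hits = tf.keras.metrics.sparse_top_k_categorical_accuracy(target, output, k=2)
print(tf.reduce_mean(hits).numpy() * 100)  # should agree with accuracy(output, target, topk=(2,))
```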
Testing the accuracy function in detail
In [14]:
topk=(1,2,3)
maxk = max(topk)
print("maxk: ",maxk)
In [15]:
print(target)
batch_size = target.shape[0]
print("batch_size: ",batch_size)
In [16]:
output
Out[16]:
In [18]:
# compute the indices of the k largest (highest-probability) entries of output
# e.g. for [0.28500617, 0.40185377, 0.31314012], sorted from largest to smallest, that is [1 2 0]
pred = tf.math.top_k(output, maxk).indices
print(pred)
print("===========================================")
# transpose
pred = tf.transpose(pred, perm=[1, 0])
print(pred)
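tf.math.top_k also returns the sorted values alongside the indices; after the transpose, row i of pred holds every sample's rank-(i+1) guess, which is what makes the correct[:k] slice below work (a sketch):
```
topk_out = tf.math.top_k(output, maxk)
print(topk_out.values.numpy())   # the probabilities themselves, sorted per row
print(topk_out.indices.numpy())  # same as pred before the transpose
```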
In [21]:
# broadcast target to the shape of pred
target_ = tf.broadcast_to(target, pred.shape)
print("对比pred和target_,结果很明显:")
print(pred)
print(target_)
print("===============================================")
# [maxk, b]
correct = tf.equal(pred, target_)
print(correct)
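Each column of correct corresponds to one sample and holds at most one True (the top-k indices within a row are all distinct), so slicing the first k rows keeps exactly the rank-1..k guesses (a sketch):
```
print(correct[:1].numpy())  # rank-1 hits only: one row, one entry per sample
print(correct[:2].numpy())  # rank-1 and rank-2 hits: a sample is a top-2 hit if its column has a True
```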
In [22]:
# now the top-k accuracies can easily be counted from correct
res = []
for k in topk:
    correct_k = tf.cast(tf.reshape(correct[:k], [-1]), dtype=tf.float32)
    correct_k = tf.reduce_sum(correct_k)
    acc = float(correct_k * (100.0 / batch_size))
    res.append(acc)
res
Out[22]:
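Because each column of correct has at most one True, the same result can be computed without the explicit sum, by asking whether any of the first k ranks hits (a sketch):
```
res_alt = [float(tf.reduce_mean(tf.cast(tf.reduce_any(correct[:k], axis=0),
                                        tf.float32)) * 100) for k in topk]
print(res_alt)  # matches res above
```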
In [ ]: