本系列文章由 @yhl_leo 出品,转载请注明出处。
文章链接: http://blog.csdn.net/yhl_leo/article/details/51793984
本文列举出几种python中常见的计算点积的方式,并统计随着向量维度的增大,各种方法的计算效率上的差异。
运行环境:
- CPU:Intel® Core™ i7-5930K @ 3.50GHz
- Python: 2.7.6
代码:
from itertools import izip, starmap, imap
import operator
import numpy as np
import time
r = range(10000)
# method 1
np.dot(r,r)
# method 2
sum(starmap(operator.mul, izip(r,r)))
# method 3
out = 0
for k in range(len(r)):
out += r[k] * r[k]
# method 4
sum(map(operator.mul,r,r))
# method 5
sum(imap(operator.mul,r,r))
# method 6
sum(i*j for i, j in zip(r, r))
统计在不同向量维度:
10, 100, 1000, 2000, 3000, 4000, 5000, 8000, 10000
各运行三次:
10
1. 0.000285 0.000188 0.000309
2. 0.000117 6.3e-05 9.4e-05
3. 9.9e-05 6.1e-05 9.2e-05
4. 8.6e-05 4.4e-05 7.6e-05
5. 5.7e-05 4e-05 6.99999999999e-05
6. 9.3e-05 6e-05 8.29999999999e-05
100
1. 0.000513 0.00052 0.000504
2. 0.000169 0.000162 0.000167
3. 0.000451 0.000311 0.000288
4. 0.000137 0.000144 0.000153
5. 0.000131 0.000138 0.000141
6. 0.000224 0.000271 0.000216
1000
1. 0.001683 0.001687 0.001679
2. 0.000664 0.00065 0.000661
3. 0.002238 0.002301 0.002582
4. 0.000821 0.00089 0.00088
5. 0.000707 0.000928 0.000822
6. 0.001958 0.001948 0.00193
2000
1. 0.003138 0.00306 0.003158
2. 0.001197 0.001089 0.001075
3. 0.005211 0.004113 0.004399
4. 0.001891 0.001826 0.001953
5. 0.001415 0.001456 0.00173
6. 0.003595 0.003884 0.004285
3000
1. 0.004468 0.004292 0.004507
2. 0.001842 0.001727 0.001637
3. 0.007802 0.007341 0.006858
4. 0.002548 0.002274 0.0022
5. 0.002374 0.002348 0.002335
6. 0.005697 0.005613 0.005669
4000
1. 0.005946 0.005987 0.005954
2. 0.002251 0.002102 0.002189
3. 0.009069 0.010478 0.009226
4. 0.003149 0.003699 0.003363
5. 0.003032 0.003536 0.003142
6. 0.012805 0.012598 0.012316
5000
1. 0.007411 0.00731 0.007234
2. 0.002744 0.002508 0.002576
3. 0.012194 0.01231 0.009216
4. 0.003953 0.003815 0.003936
5. 0.00354 0.002698 0.002948
6. 0.013849 0.012262 0.015122
8000
1. 0.010604 0.011742 0.011604
2. 0.004712 0.004703 0.005037
3. 0.020271 0.014874 0.020436
4. 0.007199 0.006417 0.007193
5. 0.006887 0.006889 0.006892
6. 0.021665 0.021659 0.021992
10000
1. 0.01461 0.013028 0.014307
2. 0.005814 0.005789 0.005875
3. 0.023581 0.025064 0.025116
4. 0.008041 0.008833 0.008868
5. 0.007898 0.008619 0.008925
6. 0.025248 0.02643 0.026212
取运行时间的均值,绘制成曲线图,可以看出,几种方法里,第2种方法的复杂度最小,随着向量维度的增加,时间消耗增加比较缓慢,而其他方法则相对较大。