NumPy Notes

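An array obtained by slicing with an index range is a view of the original array. It shares the same data buffer with the original, so a change made through one is visible in the other.
>>> import numpy as np
>>> a = np.array([0, 1, 100, 101, 4, 5, 6, 7, 8, 9]) # starting state of a, inferred from the outputs below
>>> b = a[3:7] # slicing creates a new array b that shares its data buffer with a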
>>> b
array([101, 4, 5, 6])
>>> b[2] = -10 # set the element of b at index 2 to -10
>>> b
array([101, 4, -10, 6])
>>> a # the element of a at index 5 is changed to -10 as well
array([ 0, 1, 100, 101, 4, -10, 6, 7, 8, 9])
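The view behaviour can be checked directly. The following is a minimal sketch, assuming a fresh array built with `np.arange` (the variable names are illustrative, not from the original notes); it uses `np.shares_memory` to contrast a slice with an explicit copy.

```python
import numpy as np

a = np.arange(10)

view = a[3:7]                 # basic slice: a view onto a's buffer
view[2] = -10                 # writes through to a[5]

independent = a[3:7].copy()   # explicit copy: owns its own data
independent[0] = 999          # does not touch a

shares_view = np.shares_memory(a, view)          # True for a slice
shares_copy = np.shares_memory(a, independent)   # False for a copy
print(a, shares_view, shares_copy)
```

When a slice has to be modified without affecting the source array, calling `.copy()` on it is the usual way to break the link.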
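When an integer sequence is used to access array elements, each element of the sequence is used as an index. The sequence can be a list or an array. An array obtained by indexing with an integer sequence does not share its data buffer with the original array.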
>>> x = np.arange(10,1,-1)
>>> x
array([10, 9, 8, 7, 6, 5, 4, 3, 2])
>>> x[[3, 3, 1, 8]] # take the elements of x at indices 3, 3, 1, 8 and build a new array from them
array([7, 7, 9, 2])
>>> b = x[np.array([3,3,-3,8])] # indices may be negative
>>> b[2] = 100
>>> b
array([7, 7, 100, 2])
>>> x # b and x do not share data, so the values in x are unchanged
array([10, 9, 8, 7, 6, 5, 4, 3, 2])
>>> x[[3,5,1]] = -1, -2, -3 # an integer-sequence index can also be used to assign to elements
>>> x
array([10, -3, 8, -1, 6, -2, 4, 3, 2])
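To make the copy semantics of integer-sequence indexing concrete, here is a small sketch that mirrors the session above. The name `picked` is illustrative; `np.shares_memory` is used only to confirm that the result has its own buffer.

```python
import numpy as np

x = np.arange(10, 1, -1)          # [10, 9, 8, 7, 6, 5, 4, 3, 2]

picked = x[[3, 3, 1, 8]]          # integer-sequence indexing returns a copy
picked[2] = 100                   # changes only the copy
is_copy = not np.shares_memory(x, picked)
x_before_assign = x.copy()        # snapshot: x is still untouched here

# Assigning THROUGH an integer-sequence index does modify x in place.
x[[3, 5, 1]] = -1, -2, -3
print(picked, x)
```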
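When a boolean array b is used as the index into an array x, the result collects every element of x whose corresponding position in b is True. An array obtained by indexing with a boolean array does not share its data buffer with the original array.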
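The notes give no session for boolean indexing, so the following is a minimal sketch with made-up data. It shows selection with an explicit mask, selection with a comparison, and in-place assignment through a mask.

```python
import numpy as np

x = np.arange(5, 0, -1)                           # [5, 4, 3, 2, 1]
mask = np.array([True, False, True, False, False])

selected = x[mask]        # elements where mask is True: [5, 3]
selected[0] = -1          # selected is a copy, so x is unchanged
x_snapshot = x.copy()

large = x[x > 2]          # a comparison produces the boolean mask: [5, 4, 3]

x[x > 2] = 0              # assignment through a mask modifies x in place
print(selected, large, x)
```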
#Multidimensional arrays
>>> a = np.arange(0, 60, 10).reshape(-1, 1) + np.arange(0, 6)
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])
>>> a[3:,[3,5]]
array([[33, 35],
       [43, 45],
       [53, 55]])
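#Structured arrays
>>> persontype = np.dtype({
'names':['name', 'age', 'weight'],
'formats':['S32','i', 'f']})
# S32 : a 32-byte string type; every field of a structure must have a fixed size, so the string length has to be specified
# i : 32-bit integer type, equivalent to np.int32
# f : 32-bit single-precision floating-point type, equivalent to np.float32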
>>> persontype
dtype([('name', 'S32'), ('age', '<i4'), ('weight', '<f4')])
>>> a = np.array([("Zhang",32,75.5),("Wang",24,65.2)],
dtype=persontype)
>>> a
array([('Zhang', 32, 75.5), ('Wang', 24, 65.19999694824219)],
      dtype=[('name', 'S32'), ('age', '<i4'), ('weight', '<f4')])
>>> a["name"]
array(['Zhang', 'Wang'],
      dtype='|S32')
>>> a[['name','age']]
array([('Zhang', 32), ('Wang', 24)],
      dtype=[('name', 'S32'), ('age', '<i4')])
>>> a['age']+200
array([232, 224])
>>> a['name'][0]='cao'
>>> a
array([('cao', 32, 75.5), ('Wang', 24, 65.19999694824219)],
      dtype=[('name', 'S32'), ('age', '<i4'), ('weight', '<f4')])
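A few properties of the structured dtype above can be verified in code. This is a sketch under two assumptions: it spells the formats as `i4` and `f4` to make the sizes explicit, and it uses bytes literals for the `S32` field so it behaves the same on Python 3. The variable names are illustrative.

```python
import numpy as np

persontype = np.dtype({
    "names": ["name", "age", "weight"],
    "formats": ["S32", "i4", "f4"],   # 32-byte string, int32, float32
})
people = np.array([(b"Zhang", 32, 75.5), (b"Wang", 24, 65.2)],
                  dtype=persontype)

record_size = persontype.itemsize     # 32 + 4 + 4 bytes per record

ages = people["age"]                  # a single field is a view
ages[0] = 33                          # so this also updates people

# 65.2 is not exactly representable in float32, which explains the
# 65.19999694824219 shown when the value is printed at full precision.
stored_weight = float(people["weight"][1])
print(record_size, people[0], stored_weight)
```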
#ufunc operations
>>> x = np.linspace(0, 2*np.pi, 10) # arithmetic progression: 10 evenly spaced values from 0 to 2*pi
>>> np.logspace(0, 2, 20) # geometric progression: 20 elements from 1 (10^0) to 100 (10^2)
array([ 1. , 1.27427499, 1.62377674, 2.06913808,
2.6366509 , 3.35981829, 4.2813324 , 5.45559478,
6.95192796, 8.8586679 , 11.28837892, 14.38449888,
18.32980711, 23.35721469, 29.76351442, 37.92690191,
48.32930239, 61.58482111, 78.47599704, 100. ])
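The relationship between the two functions can be sketched as follows: `np.logspace(start, stop, n)` with the default base 10 yields 10 raised to the values of `np.linspace(start, stop, n)`. The example uses 5 points instead of 20 to keep the numbers readable.

```python
import numpy as np

lin = np.linspace(0, 2, 5)        # constant difference between neighbours
log = np.logspace(0, 2, 5)        # constant ratio between neighbours

steps = np.diff(lin)              # all equal to 0.5
ratios = log[1:] / log[:-1]       # all equal to 10 ** 0.5

same = np.allclose(log, 10 ** lin)
print(lin, log, same)
```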

>>> x = np.linspace(0, 20, 11)
>>> x
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.])
>>> len(x)
11
>>> y=np.sin(x)
>>> z=np.sqrt(x)
>>> y
array([ 0. , 0.90929743, -0.7568025 , -0.2794155 , 0.98935825,
       -0.54402111, -0.53657292, 0.99060736, -0.28790332, -0.75098725,
        0.91294525])
>>> z
array([ 0. , 1.41421356, 2. , 2.44948974, 2.82842712,
        3.16227766, 3.46410162, 3.74165739, 4. , 4.24264069,
        4.47213595])
>>> x
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.])
>>> np.sin(x,x) # to write the result of sin directly into x, pass the array to be overwritten as the second argument of the ufunc
array([ 0. , 0.90929743, -0.7568025 , -0.2794155 , 0.98935825,
       -0.54402111, -0.53657292, 0.99060736, -0.28790332, -0.75098725,
        0.91294525])
>>> x
array([ 0. , 0.90929743, -0.7568025 , -0.2794155 , 0.98935825,
       -0.54402111, -0.53657292, 0.99060736, -0.28790332, -0.75098725,
        0.91294525])
>>> np.abs(x)
array([ 0. , 0.90929743, 0.7568025 , 0.2794155 , 0.98935825,
        0.54402111, 0.53657292, 0.99060736, 0.28790332, 0.75098725,
        0.91294525])
>>> x
array([ 0. , 0.90929743, -0.7568025 , -0.2794155 , 0.98935825,
       -0.54402111, -0.53657292, 0.99060736, -0.28790332, -0.75098725,
        0.91294525])
>>> np.abs(x,x) # likewise, the result overwrites x
array([ 0. , 0.90929743, 0.7568025 , 0.2794155 , 0.98935825,
        0.54402111, 0.53657292, 0.99060736, 0.28790332, 0.75098725,
        0.91294525])
>>> x
array([ 0. , 0.90929743, 0.7568025 , 0.2794155 , 0.98935825,
        0.54402111, 0.53657292, 0.99060736, 0.28790332, 0.75098725,
        0.91294525])
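The positional second argument used above is the ufunc's output array, which can also be passed by keyword as `out=`. A minimal sketch, using the same input as the session:

```python
import numpy as np

x = np.linspace(0, 20, 11)
expected = np.sin(x)              # computed into a new array first

result = np.sin(x, out=x)         # writes the result into x's own buffer
same_object = result is x         # the ufunc returns the out array itself

np.abs(x, out=x)                  # in-place absolute value, no temporary
print(same_object, x)
```

Writing into an existing array avoids allocating a temporary, which can matter for large arrays.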
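NumPy's ufuncs operate on whole arrays at once, so np.sin(x) is faster than a Python for loop that calls math.sin() on each element.
For a single value, however, np.sin(0.5) is slower than math.sin(0.5). NumPy is best thought of as a tool for batch operations.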
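One way to observe the difference is with `timeit`. This is only a sketch: the helper names `sin_loop` and `sin_vectorized` and the array size are arbitrary choices, and the measured times depend on the machine, so no figures are quoted here.

```python
import math
import timeit

import numpy as np

values = np.linspace(0, 2 * np.pi, 10_000)

def sin_loop(arr):
    """Call math.sin once per element in a Python-level loop."""
    return [math.sin(v) for v in arr]

def sin_vectorized(arr):
    """Let the ufunc process the whole array in one call."""
    return np.sin(arr)

loop_result = sin_loop(values)
vec_result = sin_vectorized(values)

loop_time = timeit.timeit(lambda: sin_loop(values), number=3)
vec_time = timeit.timeit(lambda: sin_vectorized(values), number=3)
scalar_math = timeit.timeit(lambda: math.sin(0.5), number=10_000)
scalar_np = timeit.timeit(lambda: np.sin(0.5), number=10_000)
print(loop_time, vec_time, scalar_math, scalar_np)
```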
>>> a = np.arange(0,4)
>>> b = np.arange(1,5)
>>> a+b
array([1, 3, 5, 7])
>>> np.add(a,b)
array([1, 3, 5, 7])
>>> np.add(a,b,c) # c is an existing array of length 100, so it cannot hold the 4-element result

Traceback (most recent call last):
  File "<pyshell#139>", line 1, in <module>
    np.add(a,b,c)
ValueError: operands could not be broadcast together with shapes (4) (4) (100)
>>> np.add(a,b,a) # the result overwrites a
array([1, 3, 5, 7])
>>> a
array([1, 3, 5, 7])
>>> a=[1,2,3,4]
>>> b=[2,3,4,5]
>>> a+b # + on built-in Python lists concatenates them
[1, 2, 3, 4, 2, 3, 4, 5]
>>> np.add(a,b)
array([3, 5, 7, 9])
#Operators
y = x1 + x2: add(x1, x2 [, y])
y = x1 - x2: subtract(x1, x2 [, y])
y = x1 * x2: multiply (x1, x2 [, y])
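y = x1 / x2: divide(x1, x2 [, y]), under Python 2 this uses integer division when both arrays hold integers; under Python 3, / always performs true division
y = x1 / x2: true_divide(x1, x2 [, y]), always returns the exact quotient
y = x1 // x2: floor_divide(x1, x2 [, y]), always rounds the result down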
y = -x: negative(x [,y])
y = x1**x2: power(x1, x2 [, y])
y = x1 % x2: remainder(x1, x2 [, y]), mod(x1, x2, [, y])
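The correspondence between operators and ufuncs in the table can be checked with a short sketch. It assumes Python 3, where `/` maps to true division; the sample values are arbitrary.

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 2, 2])

true_q = a / b            # same as np.true_divide(a, b): [3.5, 4.0, 4.5]
floor_q = a // b          # same as np.floor_divide(a, b): [3, 4, 4]
rem = a % b               # same as np.remainder(a, b):    [1, 0, 1]
powers = a ** b           # same as np.power(a, b):        [49, 64, 81]

# The optional y in the table is the output array.
out = np.empty(3)
np.true_divide(a, b, out=out)
print(true_q, floor_q, rem, powers, out)
```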

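2.2.1 Broadcasting
When a ufunc operates on two arrays, it computes on their corresponding elements, so it requires the two arrays to have the same shape. If the shapes differ, the following broadcasting rules are applied:
1. All input arrays are aligned to the one with the longest shape; shorter shapes are padded with 1s on the left.
2. The shape of the output array is the maximum of the input shapes along each axis.
3. An input array can take part in the computation if, along every axis, its length either equals that of the output array or is 1; otherwise an error is raised.
4. When an input array has length 1 along an axis, the single set of values on that axis is reused at every position along that axis.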
>>> a = np.arange(0, 60, 10).reshape(-1, 1)
>>> a
array([[ 0], [10], [20], [30], [40], [50]])
>>> a.shape
(6, 1)
>>> b = np.arange(0, 5)
>>> b
array([0, 1, 2, 3, 4])
>>> b.shape
(5,)
>>> c = a + b
>>> c
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44],
[50, 51, 52, 53, 54]])
>>> c.shape
(6, 5)
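The rules can be traced step by step in code. This sketch uses `np.broadcast` to compute the output shape and `np.broadcast_to` to spell out the stretching that `a + b` performs implicitly; the last part shows a shape pair that violates rule 3.

```python
import numpy as np

a = np.arange(0, 60, 10).reshape(-1, 1)   # shape (6, 1)
b = np.arange(0, 5)                       # shape (5,), treated as (1, 5)

out_shape = np.broadcast(a, b).shape      # per-axis maximum: (6, 5)

# Explicitly stretch both inputs to the output shape, then add.
a_full = np.broadcast_to(a, out_shape)    # column repeated 5 times
b_full = np.broadcast_to(b, out_shape)    # row repeated 6 times
explicit = a_full + b_full
implicit = a + b                          # NumPy does the same implicitly

# Shapes (6,) and (5,) differ on an axis where neither length is 1.
try:
    np.arange(6) + np.arange(5)
    mismatch_failed = False
except ValueError:
    mismatch_failed = True
print(out_shape, mismatch_failed)
```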

#Matrices
>>> a = np.matrix([[1,2,3],[5,5,6],[7,9,9]])
>>> a**-1 # inverse matrix
matrix([[-0.6 , 0.6 , -0.2 ],
        [-0.2 , -0.8 , 0.6 ],
        [ 0.66666667, 0.33333333, -0.33333333]])
>>> a*a**-1
matrix([[ 1.00000000e+00, 0.00000000e+00, -5.55111512e-17],
        [ 4.44089210e-16, 1.00000000e+00, -1.11022302e-16],
        [ 4.44089210e-16, 0.00000000e+00, 1.00000000e+00]])
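The NumPy documentation no longer recommends `np.matrix` for new code. As an alternative sketch (not the approach used in these notes), the same inverse can be computed on a regular `ndarray` with `np.linalg.inv` and the `@` matrix-multiplication operator.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [5, 5, 6],
              [7, 9, 9]], dtype=float)

a_inv = np.linalg.inv(a)          # inverse of a regular 2-D array
product = a @ a_inv               # @ is matrix multiplication

# Floating-point rounding leaves tiny off-diagonal residues, so compare
# against the identity with a tolerance rather than with ==.
is_identity = np.allclose(product, np.eye(3))
print(a_inv, is_identity)
```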
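#Writing and reading files. The data comes back one-dimensional
tofile writes the data of an array to a file in raw binary form. The output carries no format information (neither dtype nor shape), so when reading it back with numpy.fromfile you have to supply both yourself: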
>>> a = np.arange(0,12)
>>> a.shape = 3,4
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a.tofile("a.bin")
>>> b = np.fromfile("a.bin", dtype=np.float64) # read the data back as float64 (the np.float alias was removed in NumPy 1.24)
>>> b # the values read back are wrong
array([ 2.12199579e-314, 6.36598737e-314, 1.06099790e-313,
1.48539705e-313, 1.90979621e-313, 2.33419537e-313])
>>> a.dtype # check the dtype of a
dtype('int32')
>>> b = np.fromfile("a.bin", dtype=np.int32) # read the data back as int32
>>> b # the data is one-dimensional
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> b.shape = 3, 4 # set the shape of b to match that of a
>>> b # now the result is correct
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
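The round trip above can be written so that the dtype and shape are taken from the source array rather than typed by hand. A minimal sketch, assuming a temporary directory for the file and an explicit `int32` dtype so the result does not depend on the platform default:

```python
import os
import tempfile

import numpy as np

a = np.arange(12, dtype=np.int32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.bin")
    a.tofile(path)                                  # raw bytes only

    # Reading with the wrong dtype reinterprets the same 48 bytes.
    wrong = np.fromfile(path, dtype=np.float64)     # 6 meaningless floats

    # Reuse a's dtype and shape to restore the original array.
    restored = np.fromfile(path, dtype=a.dtype).reshape(a.shape)

print(wrong.shape, restored)
```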
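>>> a.tofile(r"d:\a1.bin",sep='#') # with sep given, the array is written as text, with elements separated by '#'
>>> np.save(r"d:\a.npy", a) # binary .npy format
>>> c = np.load(r"d:\a.npy") # the shape is preserved and no dtype has to be given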
>>> c
array([[ 0, 1, 2, 3],
       [ 4, 5, 6, 7],
       [ 8, 9, 10, 11]])
#np.savez() stores multiple arrays
>>> a = np.array([[1,2,3],[4,5,6]])
>>> b = np.arange(0, 1.0, 0.1)
>>> c = np.sin(b)
>>> np.savez("result.npz", a, b, sin_array = c)
>>> r = np.load("result.npz")
>>> r["arr_0"] # array a
array([[1, 2, 3],
       [4, 5, 6]])
>>> r["arr_1"] # array b
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
>>> r["sin_array"] # array c
array([ 0. , 0.09983342, 0.19866933, 0.29552021, 0.38941834,
0.47942554, 0.56464247, 0.64421769, 0.71735609, 0.78332691])
If you open result.npz with an archive tool, you will find three files inside: arr_0.npy, arr_1.npy and sin_array.npy, which hold the contents of arrays a, b and c respectively.
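The archive layout can also be inspected from Python. This sketch writes the same three arrays to a temporary directory (an assumption made to keep the example self-contained) and lists the members with the standard `zipfile` module.

```python
import os
import tempfile
import zipfile

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.arange(0, 1.0, 0.1)
c = np.sin(b)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "result.npz")
    np.savez(path, a, b, sin_array=c)     # positional arrays get arr_N names

    with np.load(path) as r:
        keys = sorted(r.files)            # array names stored in the archive
        loaded_c = r["sin_array"]

    with zipfile.ZipFile(path) as zf:     # an .npz file is a zip archive
        members = sorted(zf.namelist())

print(keys, members)
```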
# Reading and writing text files
numpy.savetxt and numpy.loadtxt can read and write 1-D and 2-D arrays:
>>> a = np.arange(0,12,0.5).reshape(4,-1)
>>> np.savetxt("a.txt", a) # by default the data is saved in '%.18e' format, separated by spaces
>>> np.loadtxt("a.txt")
array([[ 0. , 0.5, 1. , 1.5, 2. , 2.5],
[ 3. , 3.5, 4. , 4.5, 5. , 5.5],
[ 6. , 6.5, 7. , 7.5, 8. , 8.5],
[ 9. , 9.5, 10. , 10.5, 11. , 11.5]])
>>> np.savetxt("a.txt", a, fmt="%d", delimiter=",") # save as integers instead, separated by commas
>>> np.loadtxt("a.txt",delimiter=",") # the comma delimiter must also be given when loading
array([[ 0., 0., 1., 1., 2., 2.],
[ 3., 3., 4., 4., 5., 5.],
[ 6., 6., 7., 7., 8., 8.],
[ 9., 9., 10., 10., 11., 11.]])
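The choice of `fmt` determines whether the text round trip is lossless. The sketch below, which writes to a temporary directory, compares a format that keeps one decimal place with `%d`, which drops the fractional part of every value.

```python
import os
import tempfile

import numpy as np

a = np.arange(0, 12, 0.5).reshape(4, -1)      # shape (4, 6), halves included

with tempfile.TemporaryDirectory() as tmp:
    exact_path = os.path.join(tmp, "exact.txt")
    int_path = os.path.join(tmp, "int.txt")

    np.savetxt(exact_path, a, fmt="%.1f", delimiter=",")
    np.savetxt(int_path, a, fmt="%d", delimiter=",")

    exact = np.loadtxt(exact_path, delimiter=",")
    truncated = np.loadtxt(int_path, delimiter=",")

print(np.array_equal(exact, a), truncated[0])
```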

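All the examples in this section pass a file name, but an already opened file object can be passed as well. With load and save, for instance, using a file object makes it possible to store several arrays in a single npy file: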
>>> a = np.arange(8)
>>> b = np.add.accumulate(a)
>>> c = a + b
>>> f = open("result.npy", "wb") # Python 3 has no file(); use open()
>>> np.save(f, a) # save a, b and c to the file object f in order
>>> np.save(f, b)
>>> np.save(f, c)
>>> f.close()
>>> f = open("result.npy", "rb")
>>> np.load(f) # read the arrays back from the file object f in the same order
array([0, 1, 2, 3, 4, 5, 6, 7])
>>> np.load(f)
array([ 0, 1, 3, 6, 10, 15, 21, 28])
>>> np.load(f)
array([ 0, 2, 5, 9, 14, 20, 27, 35])
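The same sequential pattern works with any binary file-like object. As a sketch, the example below uses an in-memory `io.BytesIO` buffer in place of a file on disk (an assumption made so nothing is written to the file system); each `np.load` call consumes exactly one saved array.

```python
import io

import numpy as np

a = np.arange(8)
b = np.add.accumulate(a)      # running sum of a
c = a + b

buf = io.BytesIO()
for arr in (a, b, c):         # arrays are appended one after another
    np.save(buf, arr)

buf.seek(0)                   # rewind before reading
loaded = [np.load(buf) for _ in range(3)]   # same order as they were saved
print(loaded)
```

With a real file, a `with open(...) as f:` block closes the file automatically even if an error occurs partway through.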
Original article: https://www.cnblogs.com/nicolexu/p/5934335.html