本文以csr_matrix为例来说明sparse矩阵的使用方法,其他类型的sparse矩阵可以参考https://docs.scipy.org/doc/scipy/reference/sparse.html
csr_matrix是Compressed Sparse Row matrix的缩写组合,下面介绍其两种初始化方法
csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])
where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].
csr_matrix((data, indices, indptr), [shape=(M, N)])
is the standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.
上述官方文档给出了:稀疏矩阵的参数及其含义、稀疏矩阵的构造方式。阐述形式简单明了,读起来令人赏心悦目。
Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power
Advantages of the CSR format
- efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
- efficient row slicing
- fast matrix vector products
Disadvantages of the CSR format
- slow column slicing operations (consider CSC)
- changes to the sparsity structure are expensive (consider LIL or DOK)
上述官方文档时稀疏矩阵的一些特性以及csr_matrix的优缺点,并且在指明各种缺点的同时,提供了可以考虑的技术实现。
代码示例1
import numpy as np from scipy.sparse import csr_matrix row = np.array([0, 0, 1, 2, 2, 2]) col = np.array([0, 2, 2, 0, 1, 2]) data = np.array([1, 2, 3, 4, 5, 6]) a = csr_matrix((data, (row, col)), shape=(3, 3)).toarray()
print(a)
运行结果:
array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
代码示例2
indptr = np.array([0, 2, 3, 6]) indices = np.array([0, 2, 2, 0, 1, 2]) data = np.array([1, 2, 3, 4, 5, 6]) a = csr_matrix((data, indices, indptr), shape=(3, 3)).toarray() print(a)
允许结果:
array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])