COOrdinate format
As an example of a sparse matrix format, this section describes one of the sparse
formats implemented in SciPy, the COOrdinate format (COO). This is also known as the
"ijv" or "triplet" format, and stores the non-zero elements in three arrays, row, col,
and data. The data[i] value is the non-zero entry in row row[i] and column col[i] of
the matrix. The advantages of this format are:
fast format for constructing sparse matrices
fast conversions to/from the CSR and CSC formats
fast matrix-vector multiplication
fast elementwise operations (e.g. multiplying each element by 2 is just data * 2)
However, slicing using this format is difficult.
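To make the triplet layout concrete, the following is a minimal sketch, not SciPy's internal implementation, of how a matrix-vector product and an elementwise operation can be computed from the row, col, and data arrays alone. The array values and the vector x are illustrative assumptions.

```python
import numpy as np
from scipy import sparse

# Illustrative triplet arrays for a 4x4 matrix (assumed values).
row = np.array([0, 3, 1, 0])
col = np.array([0, 3, 1, 2])
data = np.array([4, 5, 7, 9])
x = np.array([1, 2, 3, 4])

# Matrix-vector product straight from the triplets:
# each stored entry k contributes data[k] * x[col[k]] to y[row[k]].
y = np.zeros(4, dtype=data.dtype)
np.add.at(y, row, data * x[col])

# Cross-check against SciPy's own product for the same matrix.
mtx = sparse.coo_matrix((data, (row, col)), shape=(4, 4))
assert np.array_equal(y, mtx @ x)

# Elementwise scaling only touches the data array; row and col are reused.
doubled = sparse.coo_matrix((data * 2, (row, col)), shape=(4, 4))
```

np.add.at is used rather than y[row] += ... so that contributions from repeated row indices accumulate instead of overwriting each other.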
Here are some examples of the COO matrix format using scipy.sparse.coo_matrix.
Again, these have been taken from scipy-lectures, which is an excellent resource and
contains examples of the other sparse matrix formats implemented in SciPy.
create empty COO matrix
from scipy import sparse
import numpy as np

mtx = sparse.coo_matrix((3, 4), dtype=np.int8)
mtx.todense()
Output:
matrix([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]], dtype=int8)
create using (data, ij) tuple
row = np.array([0, 3, 1, 0])
col = np.array([0, 3, 1, 2])
data = np.array([4, 5, 7, 9])
mtx = sparse.coo_matrix((data, (row, col)), shape=(4, 4))
mtx
mtx.todense()
Output:
>>> mtx
<4x4 sparse matrix of type '<class 'numpy.int64'>'
        with 4 stored elements in COOrdinate format>
>>> mtx.todense()
matrix([[4, 0, 9, 0],
        [0, 7, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 5]])
duplicate entries are summed together
row = np.array([0, 0, 1, 3, 1, 0, 0])
col = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 1])
mtx = sparse.coo_matrix((data, (row, col)), shape=(4, 4))
mtx.todense()
Output:
>>> mtx.todense()
matrix([[3, 0, 1, 0],
        [0, 2, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]])
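The summing seen here happens when the matrix is converted, while the COO object itself still holds every triplet that was passed in. As a hedged sketch, the sum_duplicates() method can be used to merge repeated (row, col) pairs in place. The names stored_before and stored_after are introduced only for illustration.

```python
import numpy as np
from scipy import sparse

# Same triplets as the duplicate-entry example: (0, 0) appears three times
# and (1, 1) appears twice.
row = np.array([0, 0, 1, 3, 1, 0, 0])
col = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 1])
mtx = sparse.coo_matrix((data, (row, col)), shape=(4, 4))

stored_before = mtx.nnz  # counts every stored triplet, duplicates included
mtx.sum_duplicates()     # merges repeated (row, col) pairs in place
stored_after = mtx.nnz   # one stored entry per distinct (row, col) pair
```

After the call, the dense result is unchanged, but the row, col, and data arrays are shorter.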
no slicing…
mtx[2, 3]
Output:
>>> mtx[2, 3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'coo_matrix' object is not subscriptable
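When individual entries or slices are needed, one common workaround is to convert the COO matrix to a format that supports indexing, such as CSR. The following is a minimal sketch that reuses the (data, ij) example values. The names csr, element, and first_row are illustrative.

```python
import numpy as np
from scipy import sparse

row = np.array([0, 3, 1, 0])
col = np.array([0, 3, 1, 2])
data = np.array([4, 5, 7, 9])
mtx = sparse.coo_matrix((data, (row, col)), shape=(4, 4))

# Convert to CSR, which supports element access and row slicing.
csr = mtx.tocsr()
element = csr[0, 2]              # single entry at row 0, column 2
first_row = csr[0, :].toarray()  # row slice, returned as a dense 1x4 array
```

Because conversion between COO and CSR is fast, building a matrix in COO and converting it once before indexing is often a reasonable pattern.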