Mathematical Background
This page provides the mathematical foundations underlying TensorRSVD. Readers familiar with tensor decompositions can jump directly to Randomized SVD or Tensor-Free Approach.
Tensors and Notation
For the purposes of this library, a tensor of order \(k\) is a multi-dimensional array
indexed by \(\mathcal{T}_{i_1, i_2, \ldots, i_k}\) where \(0 \le i_m < n_m\) for each mode \(m\). Ordinary matrices are order-2 tensors and vectors are order-1 tensors.
Callable representation. TensorRSVD never stores \(\mathcal{T}\) as a dense array. Instead it represents the tensor as a Python callable (i.e., function)
where each coordinate \(x_m = i_m / (n_m - 1)\) is the normalized grid position of index \(i_m\) in mode \(m\). This means the grid spans \([0, 1]\) uniformly in every mode.
The callable must be fully vectorized: each argument \(x_m\) is a NumPy (or JAX / CuPy) array, and the function returns an array of the same shape with the corresponding tensor values.
import numpy as np
# T(x0, x1, x2) = exp(-(x0^2 + x1^2 + x2^2)) (3-D Gaussian)
def gaussian(x0, x1, x2):
return np.exp(-(x0**2 + x1**2 + x2**2))
Mode-m Unfolding (Matricization)
The mode-\(m\) unfolding (or matricization) of \(\mathcal{T}\) is the matrix
obtained by making the mode-\(m\) index the row index and collapsing all remaining indices into a single column index in C (row-major) order.
Concretely, the element in row \(i_m\) and the column that corresponds to multi-index \((i_0, \ldots, i_{m-1}, i_{m+1}, \ldots, i_{k-1})\) is
Each row of \(T_{(m)}\) is a mode-m fiber: the 1-D section of the tensor obtained by varying index \(i_m\) while holding all other indices fixed.
Mode-m unfoldings are the key primitive behind HOSVD: the SVD of \(T_{(m)}\) gives the most important “directions” of \(\mathcal{T}\) along mode \(m\).
Tucker Decomposition and HOSVD
The Tucker decomposition expresses a tensor approximately as
where
\(\mathcal{G} \in \mathbb{C}^{r_1 \times r_2 \times \cdots \times r_k}\) is the core tensor,
\(U_m \in \mathbb{C}^{n_m \times r_m}\) is the factor matrix for mode \(m\) (orthonormal columns),
\(\times_m\) denotes the mode-m product: contracting the \(m\)-th index of \(\mathcal{G}\) with \(U_m\).
The integer tuple \((r_1, \ldots, r_k)\) is called the Tucker rank.
HOSVD construction. The Higher-Order SVD (HOSVD) [DLV2000] computes each factor matrix \(U_m\) as the leading \(r_m\) left singular vectors of the mode-m unfolding:
The mode-m singular values \(\sigma_1^{(m)} \ge \sigma_2^{(m)} \ge \cdots \ge 0\) (the diagonal of \(\Sigma_m\)) measure how much energy \(\mathcal{T}\) has in each direction along mode \(m\). Truncating to rank \(r_m\) discards the directions that carry least energy.
Note
TensorRSVD returns the factor matrices \(U_m\) and the mode-m singular values \(\Sigma_m\) but not the core tensor \(\mathcal{G}\). See Reconstructing the Tensor in the User Guide for how to recover \(\mathcal{G}\) if needed.
Randomized SVD
Computing the exact SVD of \(T_{(m)}\) requires forming the matrix explicitly, which costs \(\mathcal{O}(n_m \cdot N_m)\) memory and \(\mathcal{O}(n_m^2 \cdot N_m)\) time.
TensorRSVD uses the randomized SVD algorithm of Halko, Martinsson, and Tropp [HMT2011].
1. Range finder. Find an orthonormal matrix \(Q \in \mathbb{C}^{n_m \times \ell}\), \(\ell = r_m + p\), whose column space approximates the range of \(T_{(m)}\):
Draw a Gaussian random matrix \(\Omega \in \mathbb{C}^{N_m \times \ell}\).
Form the sketch \(Y = T_{(m)} \Omega \in \mathbb{C}^{n_m \times \ell}\).
Orthonormalize: \(Q, \_ = \mathrm{QR}(Y)\).
Power iterations (optional, controlled by
num_power_iterations): repeat \(q\) times\[Q \leftarrow \mathrm{QR}\!\left(T_{(m)}^\dagger Q\right), \qquad Q \leftarrow \mathrm{QR}\!\left(T_{(m)} Q\right).\]Each iteration sharpens the range estimate, at the cost of two additional passes through the operator.
The oversampling parameter \(p\) (num_oversamples) provides a cushion: a
value of 5–10 is usually sufficient for near-exact results.
2. Compression and small SVD. Project the operator into the \(\ell\)-dimensional subspace spanned by \(Q\) and perform a cheap dense SVD:
Compute \(B^\dagger = T_{(m)}^\dagger Q \in \mathbb{C}^{N_m \times \ell}\).
Factorize: \(B^\dagger = \hat{Q} R\) (QR decomposition).
Small SVD: \(R = \hat{U} \Sigma \hat{V}^\dagger\).
Recover left singular vectors: \(U_m = Q \hat{V}[:, :r_m]\).
Singular values: \(S_m = \Sigma[:r_m]\).
The dominant cost is Step 1 (two matrix-vector products with \(T_{(m)}\) per power iteration), each of which costs \(O(\ell \cdot n_m \cdot N_m)\). Since \(\ell \ll \min(n_m, N_m)\), this is far cheaper than exact SVD.
Tensor-Free Approach
The critical insight is that step 1 and step 2 only require matrix-matrix products \(T_{(m)} X\) and \(T_{(m)}^\dagger Y\), never the explicit matrix \(T_{(m)}\).
MatricizedTensorOperator implements these products
on the fly using the callable \(f\):
Forward product \(T_{(m)} X\): iterate over rows \(i_m\) of \(T_{(m)}\). Row \(i_m\) equals \(f(x_0, \ldots, x_{m-1}, x_m^{(i_m)}, x_{m+1}, \ldots, x_{k-1})\) evaluated over all combinations of the remaining indices with one vectorized call per row. Contract with the corresponding rows of \(X\) and accumulate.
Adjoint product \(T_{(m)}^\dagger Y\): the same row evaluations, but now each row value is an outer product contribution to \(T_{(m)}^\dagger Y\).
The tensor \(\mathcal{T}\) is never materialized as a dense array. Only \(\mathcal{O}(\ell \cdot n_m)\) intermediates are required at any one time.
References
Halko, N., Martinsson, P.-G., & Tropp, J. A. (2011). Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2), 217-288. https://doi.org/10.1137/090771806
De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000). A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4), 1253-1278. https://doi.org/10.1137/S0895479896305696