Mathematical Background

This page provides the mathematical foundations underlying TensorRSVD. Readers familiar with tensor decompositions can jump directly to Randomized SVD or Tensor-Free Approach.

Tensors and Notation

For the purposes of this library, a tensor of order \(k\) is a multi-dimensional array

\[\mathcal{T} \in \mathbb{C}^{n_1 \times n_2 \times \cdots \times n_k},\]

indexed by \(\mathcal{T}_{i_1, i_2, \ldots, i_k}\) where \(0 \le i_m < n_m\) for each mode \(m\). Ordinary matrices are order-2 tensors and vectors are order-1 tensors.

Callable representation. TensorRSVD never stores \(\mathcal{T}\) as a dense array. Instead it represents the tensor as a Python callable (i.e., function)

\[f : [0,1]^k \to \mathbb{C},\]

where each coordinate \(x_m = i_m / (n_m - 1)\) is the normalized grid position of index \(i_m\) in mode \(m\). This means the grid spans \([0, 1]\) uniformly in every mode.

The callable must be fully vectorized: each argument \(x_m\) is a NumPy (or JAX / CuPy) array, and the function returns an array of the same shape with the corresponding tensor values.

import numpy as np

# T(x0, x1, x2) = exp(-(x0^2 + x1^2 + x2^2))  (3-D Gaussian)
def gaussian(x0, x1, x2):
    return np.exp(-(x0**2 + x1**2 + x2**2))

Mode-m Unfolding (Matricization)

The mode-\(m\) unfolding (or matricization) of \(\mathcal{T}\) is the matrix

\[T_{(m)} \in \mathbb{C}^{n_m \times N_m}, \qquad N_m = \prod_{j \ne m} n_j,\]

obtained by making the mode-\(m\) index the row index and collapsing all remaining indices into a single column index in C (row-major) order.

Concretely, the element in row \(i_m\) and the column that corresponds to multi-index \((i_0, \ldots, i_{m-1}, i_{m+1}, \ldots, i_{k-1})\) is

\[\bigl[T_{(m)}\bigr]_{i_m,\, \mathrm{col}} = \mathcal{T}_{i_0, \ldots, i_{k-1}}.\]

Each row of \(T_{(m)}\) is a mode-m fiber: the 1-D section of the tensor obtained by varying index \(i_m\) while holding all other indices fixed.

Mode-m unfoldings are the key primitive behind HOSVD: the SVD of \(T_{(m)}\) gives the most important “directions” of \(\mathcal{T}\) along mode \(m\).

Tucker Decomposition and HOSVD

The Tucker decomposition expresses a tensor approximately as

\[\mathcal{T} \approx \mathcal{G} \times_1 U_1 \times_2 U_2 \cdots \times_k U_k,\]

where

  • \(\mathcal{G} \in \mathbb{C}^{r_1 \times r_2 \times \cdots \times r_k}\) is the core tensor,

  • \(U_m \in \mathbb{C}^{n_m \times r_m}\) is the factor matrix for mode \(m\) (orthonormal columns),

  • \(\times_m\) denotes the mode-m product: contracting the \(m\)-th index of \(\mathcal{G}\) with \(U_m\).

The integer tuple \((r_1, \ldots, r_k)\) is called the Tucker rank.

HOSVD construction. The Higher-Order SVD (HOSVD) [DLV2000] computes each factor matrix \(U_m\) as the leading \(r_m\) left singular vectors of the mode-m unfolding:

\[T_{(m)} = U_m \Sigma_m V_m^\dagger, \qquad U_m \in \mathbb{C}^{n_m \times r_m}.\]

The mode-m singular values \(\sigma_1^{(m)} \ge \sigma_2^{(m)} \ge \cdots \ge 0\) (the diagonal of \(\Sigma_m\)) measure how much energy \(\mathcal{T}\) has in each direction along mode \(m\). Truncating to rank \(r_m\) discards the directions that carry least energy.

Note

TensorRSVD returns the factor matrices \(U_m\) and the mode-m singular values \(\Sigma_m\) but not the core tensor \(\mathcal{G}\). See Reconstructing the Tensor in the User Guide for how to recover \(\mathcal{G}\) if needed.

Randomized SVD

Computing the exact SVD of \(T_{(m)}\) requires forming the matrix explicitly, which costs \(\mathcal{O}(n_m \cdot N_m)\) memory and \(\mathcal{O}(n_m^2 \cdot N_m)\) time.

TensorRSVD uses the randomized SVD algorithm of Halko, Martinsson, and Tropp [HMT2011].

1. Range finder. Find an orthonormal matrix \(Q \in \mathbb{C}^{n_m \times \ell}\), \(\ell = r_m + p\), whose column space approximates the range of \(T_{(m)}\):

  1. Draw a Gaussian random matrix \(\Omega \in \mathbb{C}^{N_m \times \ell}\).

  2. Form the sketch \(Y = T_{(m)} \Omega \in \mathbb{C}^{n_m \times \ell}\).

  3. Orthonormalize: \(Q, \_ = \mathrm{QR}(Y)\).

  4. Power iterations (optional, controlled by num_power_iterations): repeat \(q\) times

    \[Q \leftarrow \mathrm{QR}\!\left(T_{(m)}^\dagger Q\right), \qquad Q \leftarrow \mathrm{QR}\!\left(T_{(m)} Q\right).\]

    Each iteration sharpens the range estimate, at the cost of two additional passes through the operator.

The oversampling parameter \(p\) (num_oversamples) provides a cushion: a value of 5–10 is usually sufficient for near-exact results.

2. Compression and small SVD. Project the operator into the \(\ell\)-dimensional subspace spanned by \(Q\) and perform a cheap dense SVD:

  1. Compute \(B^\dagger = T_{(m)}^\dagger Q \in \mathbb{C}^{N_m \times \ell}\).

  2. Factorize: \(B^\dagger = \hat{Q} R\) (QR decomposition).

  3. Small SVD: \(R = \hat{U} \Sigma \hat{V}^\dagger\).

  4. Recover left singular vectors: \(U_m = Q \hat{V}[:, :r_m]\).

  5. Singular values: \(S_m = \Sigma[:r_m]\).

The dominant cost is Step 1 (two matrix-vector products with \(T_{(m)}\) per power iteration), each of which costs \(O(\ell \cdot n_m \cdot N_m)\). Since \(\ell \ll \min(n_m, N_m)\), this is far cheaper than exact SVD.

Tensor-Free Approach

The critical insight is that step 1 and step 2 only require matrix-matrix products \(T_{(m)} X\) and \(T_{(m)}^\dagger Y\), never the explicit matrix \(T_{(m)}\).

MatricizedTensorOperator implements these products on the fly using the callable \(f\):

  • Forward product \(T_{(m)} X\): iterate over rows \(i_m\) of \(T_{(m)}\). Row \(i_m\) equals \(f(x_0, \ldots, x_{m-1}, x_m^{(i_m)}, x_{m+1}, \ldots, x_{k-1})\) evaluated over all combinations of the remaining indices with one vectorized call per row. Contract with the corresponding rows of \(X\) and accumulate.

  • Adjoint product \(T_{(m)}^\dagger Y\): the same row evaluations, but now each row value is an outer product contribution to \(T_{(m)}^\dagger Y\).

The tensor \(\mathcal{T}\) is never materialized as a dense array. Only \(\mathcal{O}(\ell \cdot n_m)\) intermediates are required at any one time.

References

[HMT2011]

Halko, N., Martinsson, P.-G., & Tropp, J. A. (2011). Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2), 217-288. https://doi.org/10.1137/090771806

[DLV2000]

De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000). A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4), 1253-1278. https://doi.org/10.1137/S0895479896305696