Mathematical Background ======================= This page provides the mathematical foundations underlying TensorRSVD. Readers familiar with tensor decompositions can jump directly to :ref:`randomized-svd` or :ref:`tensor-free-approach`. .. contents:: Contents :local: :depth: 2 .. _tensors-and-notation: Tensors and Notation -------------------- For the purposes of this library, a tensor of order :math:`k` is a multi-dimensional array .. math:: \mathcal{T} \in \mathbb{C}^{n_1 \times n_2 \times \cdots \times n_k}, indexed by :math:`\mathcal{T}_{i_1, i_2, \ldots, i_k}` where :math:`0 \le i_m < n_m` for each mode :math:`m`. Ordinary matrices are order-2 tensors and vectors are order-1 tensors. **Callable representation.** TensorRSVD never stores :math:`\mathcal{T}` as a dense array. Instead it represents the tensor as a Python callable (i.e., function) .. math:: f : [0,1]^k \to \mathbb{C}, where each coordinate :math:`x_m = i_m / (n_m - 1)` is the *normalized* grid position of index :math:`i_m` in mode :math:`m`. This means the grid spans :math:`[0, 1]` uniformly in every mode. The callable must be fully vectorized: each argument :math:`x_m` is a NumPy (or JAX / CuPy) array, and the function returns an array of the same shape with the corresponding tensor values. .. code-block:: python import numpy as np # T(x0, x1, x2) = exp(-(x0^2 + x1^2 + x2^2)) (3-D Gaussian) def gaussian(x0, x1, x2): return np.exp(-(x0**2 + x1**2 + x2**2)) .. _matricization: Mode-m Unfolding (Matricization) --------------------------------- The mode-:math:`m` unfolding (or matricization) of :math:`\mathcal{T}` is the matrix .. math:: T_{(m)} \in \mathbb{C}^{n_m \times N_m}, \qquad N_m = \prod_{j \ne m} n_j, obtained by making the mode-:math:`m` index the row index and collapsing all remaining indices into a single column index in C (row-major) order. Concretely, the element in row :math:`i_m` and the column that corresponds to multi-index :math:`(i_0, \ldots, i_{m-1}, i_{m+1}, \ldots, i_{k-1})` is .. math:: \bigl[T_{(m)}\bigr]_{i_m,\, \mathrm{col}} = \mathcal{T}_{i_0, \ldots, i_{k-1}}. Each row of :math:`T_{(m)}` is a *mode-m fiber*: the 1-D section of the tensor obtained by varying index :math:`i_m` while holding all other indices fixed. Mode-m unfoldings are the key primitive behind HOSVD: the SVD of :math:`T_{(m)}` gives the most important "directions" of :math:`\mathcal{T}` along mode :math:`m`. .. _tucker-hosvd: Tucker Decomposition and HOSVD -------------------------------- The Tucker decomposition expresses a tensor approximately as .. math:: \mathcal{T} \approx \mathcal{G} \times_1 U_1 \times_2 U_2 \cdots \times_k U_k, where - :math:`\mathcal{G} \in \mathbb{C}^{r_1 \times r_2 \times \cdots \times r_k}` is the core tensor, - :math:`U_m \in \mathbb{C}^{n_m \times r_m}` is the factor matrix for mode :math:`m` (orthonormal columns), - :math:`\times_m` denotes the mode-m product: contracting the :math:`m`-th index of :math:`\mathcal{G}` with :math:`U_m`. The integer tuple :math:`(r_1, \ldots, r_k)` is called the Tucker rank. **HOSVD construction.** The Higher-Order SVD (HOSVD) [DLV2000]_ computes each factor matrix :math:`U_m` as the leading :math:`r_m` left singular vectors of the mode-m unfolding: .. math:: T_{(m)} = U_m \Sigma_m V_m^\dagger, \qquad U_m \in \mathbb{C}^{n_m \times r_m}. The mode-m singular values :math:`\sigma_1^{(m)} \ge \sigma_2^{(m)} \ge \cdots \ge 0` (the diagonal of :math:`\Sigma_m`) measure how much energy :math:`\mathcal{T}` has in each direction along mode :math:`m`. Truncating to rank :math:`r_m` discards the directions that carry least energy. .. note:: TensorRSVD returns the factor matrices :math:`U_m` and the mode-m singular values :math:`\Sigma_m` but not the core tensor :math:`\mathcal{G}`. See :ref:`user-guide-reconstruction` in the User Guide for how to recover :math:`\mathcal{G}` if needed. .. _randomized-svd: Randomized SVD -------------- Computing the exact SVD of :math:`T_{(m)}` requires forming the matrix explicitly, which costs :math:`\mathcal{O}(n_m \cdot N_m)` memory and :math:`\mathcal{O}(n_m^2 \cdot N_m)` time. TensorRSVD uses the randomized SVD algorithm of Halko, Martinsson, and Tropp [HMT2011]_. **1. Range finder.** Find an orthonormal matrix :math:`Q \in \mathbb{C}^{n_m \times \ell}`, :math:`\ell = r_m + p`, whose column space approximates the range of :math:`T_{(m)}`: 1. Draw a Gaussian random matrix :math:`\Omega \in \mathbb{C}^{N_m \times \ell}`. 2. Form the sketch :math:`Y = T_{(m)} \Omega \in \mathbb{C}^{n_m \times \ell}`. 3. Orthonormalize: :math:`Q, \_ = \mathrm{QR}(Y)`. 4. Power iterations (optional, controlled by ``num_power_iterations``): repeat :math:`q` times .. math:: Q \leftarrow \mathrm{QR}\!\left(T_{(m)}^\dagger Q\right), \qquad Q \leftarrow \mathrm{QR}\!\left(T_{(m)} Q\right). Each iteration sharpens the range estimate, at the cost of two additional passes through the operator. The oversampling parameter :math:`p` (``num_oversamples``) provides a cushion: a value of 5–10 is usually sufficient for near-exact results. **2. Compression and small SVD.** Project the operator into the :math:`\ell`-dimensional subspace spanned by :math:`Q` and perform a cheap dense SVD: 1. Compute :math:`B^\dagger = T_{(m)}^\dagger Q \in \mathbb{C}^{N_m \times \ell}`. 2. Factorize: :math:`B^\dagger = \hat{Q} R` (QR decomposition). 3. Small SVD: :math:`R = \hat{U} \Sigma \hat{V}^\dagger`. 4. Recover left singular vectors: :math:`U_m = Q \hat{V}[:, :r_m]`. 5. Singular values: :math:`S_m = \Sigma[:r_m]`. The dominant cost is Step 1 (two matrix-vector products with :math:`T_{(m)}` per power iteration), each of which costs :math:`O(\ell \cdot n_m \cdot N_m)`. Since :math:`\ell \ll \min(n_m, N_m)`, this is far cheaper than exact SVD. .. _tensor-free-approach: Tensor-Free Approach --------------------- The critical insight is that step 1 and step 2 only require matrix-matrix products :math:`T_{(m)} X` and :math:`T_{(m)}^\dagger Y`, never the explicit matrix :math:`T_{(m)}`. :class:`~tensorrsvd.core.MatricizedTensorOperator` implements these products on the fly using the callable :math:`f`: - **Forward product** :math:`T_{(m)} X`: iterate over rows :math:`i_m` of :math:`T_{(m)}`. Row :math:`i_m` equals :math:`f(x_0, \ldots, x_{m-1}, x_m^{(i_m)}, x_{m+1}, \ldots, x_{k-1})` evaluated over all combinations of the remaining indices with one vectorized call per row. Contract with the corresponding rows of :math:`X` and accumulate. - **Adjoint product** :math:`T_{(m)}^\dagger Y`: the same row evaluations, but now each row value is an outer product contribution to :math:`T_{(m)}^\dagger Y`. The tensor :math:`\mathcal{T}` is never materialized as a dense array. Only :math:`\mathcal{O}(\ell \cdot n_m)` intermediates are required at any one time. .. rubric:: References .. [HMT2011] Halko, N., Martinsson, P.-G., & Tropp, J. A. (2011). Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. *SIAM Review*, 53(2), 217-288. https://doi.org/10.1137/090771806 .. [DLV2000] De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000). A multilinear singular value decomposition. *SIAM Journal on Matrix Analysis and Applications*, 21(4), 1253-1278. https://doi.org/10.1137/S0895479896305696