New tensor algebra changes the rules of data analysis
Research published in PNAS describes a way to bridge the divide between matrix and tensor algebra.
We live in a multi-dimensional world, immersed in huge volumes of data. This data often involves complex, interlinked structures that span multiple dimensions. Processes and phenomena also exhibit multi-dimensional behavior, requiring their models to operate in high-dimensional settings.
Typically, we use matrix algebra to manipulate data in so-called vector embedding spaces. But such representations usually fail to preserve the underlying multi-dimensional integrity of an object: they either miss high-order links that go beyond pairwise relations or incur an overhead in encoding such relations. This is where tensor algebra comes into play, addressing multiple dimensions natively.
But there is a problem. While matrix algebra rests on a broad consensus distilled over centuries of mathematical research, there is no such standard for its multidimensional counterpart, tensor algebra. Several tensor algebra frameworks have been proposed over the years.1 Existing techniques that decompose tensor constructs into simpler, tangible entities, such as the canonical polyadic (CP),2 Tucker,3, 5 and tensor-train4 decompositions, have limitations and inconsistencies compared with matrix algebra. These issues have been hindering broad adoption of tensor algebra into mainstream use.
We want to change that.
In our latest paper, “Tensor-Tensor Algebra for Optimal Representation and Compression of Multiway Data,”6 published in Proceedings of the National Academy of Sciences of the United States of America, we describe a way to bridge the divide between matrix and tensor algebra. We’ve developed new algebraic constructs that natively represent and manipulate high-dimensional entities while preserving their multi-dimensional integrity.
The new tensor algebra could benefit many industries. Take autonomous vehicles. For them to work, they have to rapidly process large streams of high-dimensional data (say, video feeds and LIDAR) and be able to identify links across the spatial, channel and temporal dimensions. Then there is climate change. Climate imagery often involves time-evolving multi-spectral input woven with non-trivial local and causal correlations.
It all started with a feeling that something was amiss.
We felt that the standard practice of taking multi-dimensional data, matricizing it and performing matrix manipulations was suboptimal. The moment one unfolds a tensor into matrix form, important local correlations that associate entries of the tensor across multiple dimensions are lost.
However, the state of the art offered only partial remedies. For example, determining the rank of the so-called canonical tensor decomposition (known as CP) is an NP-hard problem.7 And the truncated approximation of the popular Tucker decomposition, and of its orthogonal variant HOSVD, does not, in general, yield an optimal low-rank approximation.3, 8
So we decided to devise a tensorial framework that could mimic some of the essential properties of matrix algebra: a new algebra with so-called matrix-mimetic properties. Among the desired mimetic properties, probably the most sought-after is an analog of the Eckart-Young optimality theorem. This theorem gives theoretical guarantees that truncated matrix decompositions offer an optimal approximation of the original matrix. This property is instrumental in many data analysis and compression tasks. But until now, no tensor algebra had offered an Eckart-Young-type result.
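To make the matrix version of that guarantee concrete, here is a minimal NumPy sketch of our own (not code from the paper): by the Eckart-Young theorem, keeping the top k singular triplets gives the best rank-k approximation of a matrix, and the Frobenius error equals the energy in the discarded singular values.

```python
import numpy as np

# Eckart-Young for matrices: the rank-k truncated SVD is the best rank-k
# approximation of A in the Frobenius (and spectral) norm.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 80))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 10
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

err = np.linalg.norm(A - A_k, "fro")
opt = np.sqrt(np.sum(s[k:] ** 2))  # optimal error: energy of discarded singular values
print(err, opt)                    # the two values agree up to floating-point error
```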
We started with the representation of the basic tensor product. We were inspired by the elegance with which block-circulant matrices, a convolutional structure, become block-diagonal after a Fourier transform. That resembled the duality we expected to see when shifting from one space to its dual. We wondered whether the other components necessary to form an algebra could be defined, and whether we could build decompositions with their help.
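As an illustration of that duality, here is a hedged sketch of the well-known t-product construction for third-order tensors (one instance of such a tensor-tensor product, not the paper's full generality): multiplying through a block-circulant arrangement of frontal slices is equivalent to an FFT along the third dimension, facewise matrix products in the transform domain, and an inverse FFT.

```python
import numpy as np

def t_product(A, B):
    """Tensor-tensor product of third-order tensors via an FFT along the
    third (tube) dimension; in the original domain this corresponds to a
    block-circulant matrix multiplication (illustrative sketch)."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.empty((A.shape[0], B.shape[1], A.shape[2]), dtype=complex)
    for i in range(A.shape[2]):
        # In the transform domain the block-circulant becomes block-diagonal,
        # so each frontal slice is an ordinary matrix product.
        C_hat[:, :, i] = A_hat[:, :, i] @ B_hat[:, :, i]
    return np.real(np.fft.ifft(C_hat, axis=2))

A = np.random.rand(4, 3, 5)
B = np.random.rand(3, 2, 5)
C = t_product(A, B)  # a 4 x 2 x 5 tensor
```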
We realized it was possible to generalize our framework, from the Fourier transform to any invertible transformation, and to prove the optimality of the approximated representations.
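A sketch of that generalization, assuming a transform given by an arbitrary invertible matrix M applied along the tube dimension (the function name m_product is our own illustrative choice): choosing M to be the DFT matrix recovers the Fourier-based product above.

```python
import numpy as np

def m_product(A, B, M):
    """Tensor-tensor product under an invertible transform M applied along
    the third dimension (an illustrative sketch of the construction)."""
    Minv = np.linalg.inv(M)
    A_hat = np.einsum('kt,ijt->ijk', M, A)           # apply M to every tube of A
    B_hat = np.einsum('kt,ijt->ijk', M, B)           # apply M to every tube of B
    C_hat = np.einsum('ijt,jlt->ilt', A_hat, B_hat)  # facewise matrix products
    return np.einsum('kt,ijt->ijk', Minv, C_hat)     # transform back

n3 = 5
M = np.random.rand(n3, n3) + n3 * np.eye(n3)         # some invertible transform
A, B = np.random.rand(4, 3, n3), np.random.rand(3, 2, n3)
C = m_product(A, B, M)                               # a 4 x 2 x 5 tensor
```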
One challenge of high-dimensional constructs is the excessive computational requirements they impose. Many applications today involve operating on large-dimensional datasets, and to handle the ever-increasing sizes we need to develop scalable algorithms.
Our new tensor algebra offers a way to manipulate tensor structures while preserving their dimensional integrity and offering favorable algebraic properties that so far were only found in matrix algebra. Our results indicate that truncating a decomposition in the new tensor algebra yields an optimal approximation, mirroring the optimality that the Eckart-Young theorem provides for matrix representations.
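To illustrate what such a truncation looks like in the Fourier-based instance of the algebra, here is a sketch of our own (not the paper's code) that truncates the SVD of each frontal slice in the transform domain and maps back; under this product, the truncation is known to be the best approximation of the prescribed tubal rank in the Frobenius norm.

```python
import numpy as np

def truncated_tsvd(A, k):
    """Tubal-rank-k truncation of a third-order tensor under the FFT-based
    tensor-tensor product: truncate the matrix SVD of each frontal slice in
    the transform domain, then transform back (illustrative sketch)."""
    A_hat = np.fft.fft(A, axis=2)
    Ak_hat = np.empty_like(A_hat)
    for i in range(A.shape[2]):
        U, s, Vt = np.linalg.svd(A_hat[:, :, i], full_matrices=False)
        Ak_hat[:, :, i] = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return np.real(np.fft.ifft(Ak_hat, axis=2))

A = np.random.rand(50, 40, 30)
A_5 = truncated_tsvd(A, 5)
print(np.linalg.norm(A - A_5))  # error of the tubal-rank-5 truncation
```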
We base this result on a spectral analysis of the residual error of the truncated tensor-tensor representation, compared against a matrix representation that spans a space of equal dimension. Our research also indicates that the tensor representation is superior to, and at worst on par with, the matrix representation.
To compare the new truncated tensor approximation with other tensorial frameworks, we show that several popular tensorial frameworks can be viewed as special cases of the new algebra and can be expressed within the new framework. In the paper, we demonstrate that the proposed decomposition in the new tensor algebra outperforms its counterparts.
Apart from applying the results to improve the data analysis by autonomous vehicles, our research could also be of use for other AI applications. For example, one could create an end-to-end neural network that ingests tensorial objects and performs tensorial computation. Numerical (multi)linear algebra that involves computation on matrices and tensors is at the heart of many domains including scientific computing, machine learning and AI, quantum computing, signal processing, data analysis, and others. We are confident that our results can take high-dimensional data processing to a whole new level, unveiling latent insights and opening windows to new dimensions.
This joint research was done in collaboration with Misha Kilmer, the William Walker Professor of Mathematics and Adjunct Professor of Computer Science at Tufts University; Haim Avron, associate professor in the Department of Applied Mathematics at Tel Aviv University; and Elizabeth Newman, distinguished visiting assistant professor in Scientific Computing at Emory University.
References
1. T. Kolda, B. Bader. Tensor decompositions and applications. SIAM Rev., 51(3), 455-500. (2009).
2. F. Hitchcock. The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6(1-4), 164-189. (1927).
3. L. De Lathauwer, B. De Moor, J. Vandewalle. A Multilinear Singular Value Decomposition. SIAM J. Matrix Anal. Appl., 21(4), 1253–1278. (2000).
4. I. Oseledets. Tensor-train decomposition. SIAM J. Sci. Comput., 33(5), 2295–2317. (2011).
5. L. Tucker. Implications of factor analysis of three-way matrices for measurement of change. Problems in Measuring Change, 15, 122-137. (1963).
6. M. Kilmer, L. Horesh, H. Avron, et al. Tensor-Tensor Algebra for Optimal Representation and Compression of Multiway Data. Proceedings of the National Academy of Sciences of the United States of America. VOL. PG. (2021).
7. V. De Silva, L. Lim. Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM J. Matrix Anal. Appl., 30(3), 1084–1127. (2008).
8. A. Cichocki, R. Zdunek, A. Phan, S. Amari. Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. John Wiley & Sons. (2009).