An Integrated Photonics Engine for Unsupervised Correlation Detection
Abstract
With more and more aspects of modern life and scientific tools becoming digitized, the amount of data being generated is growing exponentially. Fast and efficient statistical processing, such as identifying correlations in big data sets, is therefore becoming increasingly important, and this, on account of the various compute bottlenecks in modern digital machines, has necessitated new computational paradigms. Here we demonstrate one such novel paradigm, via the development of an integrated phase-change photonics engine. The computational memory engine exploits the accumulative property of Ge₂Sb₂Te₅ phase-change cells and the wavelength division multiplexing property of optics to deliver fully parallelized and co-located temporal correlation detection computations. We investigate this property and present an experimental demonstration of identifying real-time correlations in data streams on the social media platform Twitter and in high-traffic computing nodes in data centers. Our results demonstrate the use case of high-speed integrated photonics in accelerating statistical analysis methods.
Authors’ notes
Light is extremely efficient for digital data communication, thanks to the parallelism and speed of photons. For that very reason, fiber connections use optical signals to send digital data to and from your computer. While data is transferred in the optical domain at mind-blowingly high speeds, the computational operations which allow data processing, for example in streaming the video of a playful cat, use relatively slow electronics.
The data must be converted into electrical signals, stored in the data storage units, and then shuttled to data-processing units — the CPU or GPU on your computer — where they become meaningful information. This delay or latency is the “time lag” that we often experience on our gadgets. While maybe a minor annoyance for streaming videos, such latencies can be extremely detrimental in many fields, including healthcare, autonomous driving, and science experiments.
Our team at IBM’s research lab in Zurich, in collaboration with researchers from the universities of Muenster, Exeter, and Oxford, has developed a way to remove this latency from modern-day computing by asking: could “computing with light” be a solution?
In this Science Advances paper, we introduce a method of light-based real-time computing using an integrated photonic phase-change technology, demonstrating a non-von Neumann, all-optical in-memory computing framework for useful big data analytics.
Processing data on-the-fly
The general idea of building faster computers today is to make the processors faster. However, even with the fastest imaginable processors, processing would be limited by the data flow, which is bottlenecked by the optoelectrical conversion processes, as well as the communication delays due to the physical separation of memories and processors in current computer architectures.
Two approaches are being independently investigated in the field: The first is to compute at the locations where the data is stored, using the so-called framework of electronic in-memory computing. The second is the idea to compute at the speed of light using optical processing units.
By removing the need to shuttle data between memory and processing units, in-memory computing already brings significant latency reductions. So does optical processing, which, in addition to faster data movement, leverages the ability to multiplex photons for parallelized computing.
The holy grail is to combine the two approaches into one, creating in effect a photonic in-memory computing approach. However, this must be achieved in a way where data is communicated as well as processed fully in the optical domain. Together with our collaborators, we developed just that method.
By using the crystallization physics of phase-change materials, we found a way to compute on data while it is being transmitted. Think of commuting to work and getting part of the work done during the commute. This is precisely our approach here. Note that while interest in optical computing dates back to the 1960s, the field has recently experienced a renewal with the advent of integrated photonics and functional materials. Integrated photonic circuits are already in use for data signaling in data centers.
Detecting temporal correlations in data streams
To demonstrate a real-world use case, we set out to build a photonic in-memory computing system to solve the computationally difficult task of temporal correlation detection. Correlation detection is a statistical method used for detecting patterns and modeling structures in data streams. Determining the temporal correlation in and between data streams is important for a host of applications, ranging from social media analysis to financial forecasting, the detection of hacking threats, and much more. Importantly, there is no room for latency in this task, since the correlations must be computed on the data as it arrives, in real time.
We started this work in early 2020, during the COVID pandemic. We first developed a computational algorithm suited to the correlation detection task and tested it in simulations. Simultaneously, we laid the groundwork for modeling the device physics of phase-change materials and implemented a model with Prof. David C. Wright and his group at the University of Exeter. We figured out what format of devices and circuitry we needed and the parameters that would allow us to test our ideas. Over the course of a year, we built and characterized our photonic circuits at the Universities of Muenster and Oxford in the labs of Prof. Wolfram Pernice and Prof. Harish Bhaskaran.
Crucially, we validated two key new features:
- One is the accumulative behavior of the phase-change material germanium-antimony-tellurium (GST), which arises from its crystallization dynamics.
- The second is the ability to perform accumulation operations in parallel on multiple photonic phase-change memory devices using the wavelength division multiplexing property of optics. This is an entirely different approach.
The property of phase-change materials that has so far been utilized in the optical domain is the ability to program them to multiple transmission states for static analog weighting operations. Here, we take an approach where data is written directly into the phase-change memory and the computational result is obtained once the data-analysis task is complete.
Broadly, our approach utilizes the crystal growth dynamics in phase-change materials to integrate information. Each data stream is assigned a phase-change memory device, and a part of the device crystallizes whenever the light signal carries a bit value of 1 in the incoming data bitstream. As more and more of the device volume crystallizes, the transmission of the optical signal that is read out drops. Thus, the devices that end up with reduced transmission after an experiment are the ones whose data streams are correlated.
At no point is the incoming data shuttled, or stored. It is both recorded and computed within the same physical phase-change memory device.
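The accumulation idea described above can be illustrated with a small numerical toy model. This is only a sketch under stated assumptions, not the device model or the algorithm used in the experiments. The function names, the `gain` and `floor` parameters, and the synthetic stream generator are all hypothetical. In particular, scaling each write by the fraction of streams that are 1 at that instant is an assumption of this sketch (it is one simple way to make coincident 1s count more than isolated ones) and is not stated in the text above.

```python
import numpy as np


def make_streams(n_steps, n_streams, n_correlated, rate=0.1, coupling=0.8, seed=0):
    """Synthetic 0/1 streams (hypothetical test data, not the Twitter data set).

    The first `n_correlated` streams copy a shared hidden driver with
    probability `coupling`; all streams have the same average rate of 1s.
    """
    rng = np.random.default_rng(seed)
    streams = (rng.random((n_steps, n_streams)) < rate).astype(int)
    driver = rng.random(n_steps) < rate
    follow = rng.random((n_steps, n_correlated)) < coupling
    streams[:, :n_correlated] = np.where(
        follow, driver[:, None], streams[:, :n_correlated]
    )
    return streams


def simulate_accumulation(streams, gain=0.005, floor=0.2):
    """Toy accumulate-on-write model with one simulated device per stream.

    Returns a normalized transmission per device: 1.0 means fully amorphous,
    `floor` is the assumed transmission of a fully crystallized device.
    """
    streams = np.asarray(streams, dtype=float)
    n_devices = streams.shape[1]
    crystal = np.zeros(n_devices)  # crystallized fraction of each device
    for bits in streams:
        # Assumption: write strength follows the instantaneous collective activity.
        activity = bits.sum() / n_devices
        # Only devices whose bit is 1 crystallize further; growth saturates at 1.
        crystal += gain * activity * bits * (1.0 - crystal)
    return 1.0 - (1.0 - floor) * crystal


def flag_correlated(transmission):
    """Flag devices whose transmission fell below the midpoint of the observed range."""
    threshold = 0.5 * (transmission.min() + transmission.max())
    return np.flatnonzero(transmission < threshold)


streams = make_streams(n_steps=5000, n_streams=20, n_correlated=5)
transmission = simulate_accumulation(streams)
print("final transmission per device:", np.round(transmission, 3))
print("flagged as correlated:", flag_correlated(transmission))
```

In this toy setting every stream has the same average rate of 1s, so a device can only end up darker than its neighbors if its 1s tend to coincide with those of other streams. The update for all devices at a given time step is a single vectorized operation, which loosely mirrors the parallel, per-wavelength accumulation described earlier.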
By using this property, we demonstrated how correlations in social media data streams and high-traffic data centers can be efficiently detected and recorded. Our photonic engine analyzed over 100,000 tweets to find correlations on the social media platform Twitter, and detected anomalies in high-traffic data streams in data centers.
Our photonic correlator can be combined with existing photonic transceiver units, for example, to process data in real time for the above tasks. Other demanding applications include data analysis for very long baseline telescopes and particle accelerators, where computations can be performed and recorded during the course of the scientific experiment.
While there are some challenges to work out, such as scaling up our approach, our results are another example of the potential of optics in computing. They suggest that integrated photonics is coming of age and, in some cases, can even begin to challenge electronic computation.
This research was partially funded by the European Union’s Horizon 2020 research and innovation program (Fun-COMP project, Grant Number 780848).