Co-packaged optics deliver high-speed connectivity to supercharge generative AI computing

Optical fibers carry voice and data at high speeds across long distances, and IBM Research scientists are bringing this speed and capacity somewhere they haven’t previously gone: inside data centers and onto circuit boards, where they will help accelerate generative AI computing.

Scientists at IBM Research have announced a new set of advancements in chip assembly and packaging, called co-packaged optics, that promises to improve energy efficiency and boost bandwidth by bringing optical link connections inside devices and within the walls of data centers used to train and deploy large language models. This new process promises to increase the number of optical fibers that can be connected at the edge of a chip, a measure known as beachfront density, by six times. As artificial intelligence demands ever more bandwidth, this innovation will use the world’s first successful polymer optical waveguide to bring the speed and bandwidth of optics all the way to the edge of chips.

Early results suggest that switching from conventional electrical interconnects to co-packaged optics will slash energy costs for training AI models, speed up model training, and dramatically increase energy efficiency for data centers.

Today’s advanced chip and chip packaging technologies typically use electrical signals for the transistors in microelectronics that power phones, computers, and almost everything that we do. Transistors, for their part, have gotten many times smaller over the decades, enabling us to pack more capability into a given space. But even the most capable semiconductor components are only as fast as the connections between them.

an IBM polymer optical waveguide
IBM's prototype polymer optical waveguides bring the speed and bandwidth of fiber optic connections all the way to the edge of chips, replacing sluggish electrical connectors.

These connections make it possible for us to seamlessly use electronic devices in our daily lives — like when we drive our cars, which include chips in nearly every system from the seats to the tires. “Even your refrigerator has electronics in it to help everything operate properly,” says IBM Research engineer John Knickerbocker, a distinguished engineer of chiplets and advanced packaging.

Knickerbocker and his team are thinking smaller, though. Because optical connectors cost less and use less energy, they are strong candidates for improving chip-to-chip and device-to-device communication in data centers, where generative AI computing demands ever-higher bandwidth.

“Large language models have made AI very popular these days across the tech industry,” Knickerbocker says. “And the resulting growth of LLMs — and generative AI more broadly — is requiring exponential growth in high-speed connections between chips and data centers.”

IBM Research scientists in a lab looking at an optics module under a microscope
Hsianghan Hsu (left) and John Knickerbocker (right) inspect a polymer optical waveguide module under a microscope at IBM Research's global headquarters in Yorktown Heights, New York.

And while optical cables can carry data in and out of data centers, what happens inside is a different story. Even today’s most advanced chips still communicate via copper-based wires that carry electrical signals. It takes quite a bit of energy to make the link from the edge of a chip to a circuit board, then from the circuit board across miles of optical cable, and then back onto another module and onto another chip in a remote data center. Regardless of whether you’re transmitting data or a voice call, sending a signal seamlessly across all these junctions costs energy. Low-bandwidth wire connections within servers also slow down GPU accelerators, which sit idle as they wait for data.

Electrical signals use electrons to provide power and signal communication from one device to another. Optics, which has been used in communications technologies for decades, instead uses light to transmit data. Fiber optic cables, hair-thin and sometimes thousands of miles long, can transmit hundreds of terabits of data per second. Bundled together and insulated in cables that run beneath the ocean, optical fibers carry nearly all the global commerce and communications traffic that flows between continents.

Bringing the power of optical connections onto circuit boards and all the way to chips results in a more than 80% reduction in energy consumption compared to electrical connections, Knickerbocker and his colleagues have found — a reduction from 5 picojoules per bit to less than 1. Over thousands of chips and millions of operations, this means massive savings.
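
To make the arithmetic behind that figure concrete, here is a minimal Python sketch. The 5 and 1 picojoule-per-bit values come from the paragraph above (1 is used as an upper bound for "less than 1"). The one-petabyte workload and the function names are hypothetical, chosen only for illustration.

```python
def energy_reduction(before_pj_per_bit: float, after_pj_per_bit: float) -> float:
    """Fractional reduction in energy per bit when moving between two links."""
    return 1.0 - after_pj_per_bit / before_pj_per_bit


def transfer_energy_joules(num_bytes: float, pj_per_bit: float) -> float:
    """Energy in joules to move num_bytes over a link costing pj_per_bit."""
    bits = num_bytes * 8
    return bits * pj_per_bit * 1e-12  # 1 pJ = 1e-12 J


ELECTRICAL_PJ_PER_BIT = 5.0  # figure quoted in the article
OPTICAL_PJ_PER_BIT = 1.0     # upper bound; the article says "less than 1"
WORKLOAD_BYTES = 1e15        # hypothetical: one petabyte of traffic

reduction = energy_reduction(ELECTRICAL_PJ_PER_BIT, OPTICAL_PJ_PER_BIT)
electrical_j = transfer_energy_joules(WORKLOAD_BYTES, ELECTRICAL_PJ_PER_BIT)
optical_j = transfer_energy_joules(WORKLOAD_BYTES, OPTICAL_PJ_PER_BIT)

print(f"Energy reduction per bit: at least {reduction:.0%}")
print(f"Electrical link: {electrical_j:,.0f} J per petabyte moved")
print(f"Optical link:    at most {optical_j:,.0f} J per petabyte moved")
```

Because the optical figure is an upper bound, the computed reduction is a lower bound, which is consistent with the "more than 80%" wording.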

A small plastic case holds several polymer optical waveguide modules
John Knickerbocker carefully handles polymer optical waveguide modules in the lab. These connectors promise to reduce the time that GPUs sit idle as they wait for data during AI model training.

The Chiplet and Advanced Packaging team at IBM Research is seeking to streamline this system with co-packaged optics, an approach that promises to improve the efficiency and density of communication, both within and among chips. Part of bringing optical connections onto integrated circuit boards is building in transmitters and photodetectors to send and receive optical signals. Optical fibers are about 250 microns in diameter, around three times the width of a human hair. That may sound tiny, but four fibers add up to a millimeter, and as the millimeters add up you quickly run out of space at the edges of a chip.

The solution, as IBM Research scientists saw it, lies in the next generation of optical links that enable much denser connections: the polymer optical waveguide. This device makes it possible to line up high-density bundles of optical fibers right at the edge of a silicon chip so it can communicate directly out through the polymer fibers. High-fidelity optical connections require exacting tolerances of half a micron or less between a fiber and connector, a feat the team has now achieved.

Thanks to these approaches, the team has demonstrated the viability of a 50-micron pitch for optical channels, coupled to silicon photonics waveguides and a connector pluggable to single-mode glass fiber (SMF) arrays, using standard assembly packaging processes. This represents an 80% size reduction from the conventional 250-micron pitch, and testing indicates the pitch can shrink even further, down to 20 or 25 microns, which would correspond to a 1,000% to 1,200% increase in bandwidth.
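
The relationship between pitch and channel density can be sketched in a few lines of Python. This is a simple geometric estimate that assumes channels are laid out side by side along a straight chip edge with no keep-out regions. The pitch values are the ones quoted above, and the function names are illustrative.

```python
def channels_per_mm(pitch_um: float) -> float:
    """Optical channels that fit along one millimeter of chip edge."""
    return 1000.0 / pitch_um


def density_gain(baseline_pitch_um: float, new_pitch_um: float) -> float:
    """How many times more channels fit at the new pitch than at the baseline."""
    return baseline_pitch_um / new_pitch_um


def pitch_reduction(baseline_pitch_um: float, new_pitch_um: float) -> float:
    """Fractional reduction in pitch relative to the baseline."""
    return 1.0 - new_pitch_um / baseline_pitch_um


CONVENTIONAL_PITCH_UM = 250.0  # conventional fiber pitch from the article

for pitch_um in (250.0, 50.0, 25.0, 20.0):
    print(
        f"{pitch_um:5.0f} um pitch: "
        f"{channels_per_mm(pitch_um):5.1f} channels/mm, "
        f"{density_gain(CONVENTIONAL_PITCH_UM, pitch_um):4.1f}x the baseline density, "
        f"{pitch_reduction(CONVENTIONAL_PITCH_UM, pitch_um):.0%} smaller pitch"
    )
```

Under these assumptions, the 250-micron baseline gives four channels per millimeter, matching the "four fibers add up to a millimeter" rule of thumb.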

Exploded diagram of an IBM optical waveguide module
An exploded view of the prototype co-packaged optics module.

The insertion loss of a photonic integrated circuit (PIC) to SMF optical link has typically been 1.5 to 2 decibels (dB) per channel, but in this case it has been demonstrated to be below 1.2 dB per full optical link. In addition, demonstrations with 18.4-micrometer-pitch optical waveguides have shown crosstalk below -30 dB, indicating that this co-packaged optics technology is scalable to very high bandwidth density for chip interconnection.
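
Decibel figures are easier to compare once converted to fractions of optical power. The Python sketch below uses the standard conversion, where a loss of x dB corresponds to a transmitted power fraction of 10^(-x/10). It treats the 30 dB crosstalk figure as coupled power sitting 30 dB below the signal, which is an assumption about how that number should be read.

```python
def db_to_power_fraction(loss_db: float) -> float:
    """Fraction of optical power remaining after a loss of loss_db decibels."""
    return 10 ** (-loss_db / 10)


TYPICAL_LOSS_DB = (1.5, 2.0)  # typical PIC-to-SMF range quoted in the article
DEMONSTRATED_LOSS_DB = 1.2    # upper bound quoted for the full optical link
CROSSTALK_DB = 30.0           # assumed: coupled power 30 dB below the signal

for loss_db in (*TYPICAL_LOSS_DB, DEMONSTRATED_LOSS_DB):
    fraction = db_to_power_fraction(loss_db)
    print(f"{loss_db:.1f} dB insertion loss -> {fraction:.1%} of the light gets through")

leak = db_to_power_fraction(CROSSTALK_DB)
print(f"{CROSSTALK_DB:.0f} dB isolation -> {leak:.2%} of a neighbor's power leaks in")
```

The decibel scale is logarithmic, so a few tenths of a dB matter: each 1 dB of loss removes roughly a fifth of the remaining light.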

This means that, by taking a page from the telephone industry’s book, they can transmit multiple wavelengths of light per optical channel, which has the potential to boost bandwidth by at least 4,000% and as much as 8,000%.
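
As a rough illustration of why wavelength multiplexing compounds with denser channels, consider the Python sketch below. Every number in it is hypothetical: the article does not say how many wavelengths per channel or what data rate per wavelength the team has in mind, so the values here only show how the factors multiply.

```python
def aggregate_bandwidth_gbps(
    channels: int, wavelengths_per_channel: int, gbps_per_wavelength: float
) -> float:
    """Total bandwidth when each channel carries several independent wavelengths."""
    return channels * wavelengths_per_channel * gbps_per_wavelength


# All values below are hypothetical placeholders, not IBM figures.
CHANNELS = 20                # e.g. one millimeter of edge at a 50-micron pitch
GBPS_PER_WAVELENGTH = 100.0  # assumed data rate per wavelength

single = aggregate_bandwidth_gbps(CHANNELS, 1, GBPS_PER_WAVELENGTH)
for wavelengths in (1, 4, 8):
    total = aggregate_bandwidth_gbps(CHANNELS, wavelengths, GBPS_PER_WAVELENGTH)
    print(f"{wavelengths} wavelength(s) per channel: {total:,.0f} Gb/s ({total / single:.0f}x)")
```

The point of the sketch is that channel count and wavelength count multiply, so gains from tighter pitch and gains from multiplexing stack rather than merely add.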

Beyond the fiber-to-chip and fiber-to-board connections, they’re also reinforcing conventional glass fibers with high-strength polymers, a move that improves durability and efficiency but also requires advanced modeling simulations of optical lengths to ensure that light can transmit through multiple components without any losses — the ‘co-packaging’ of it all.

Scientists and technicians walk through a clean room where electronic components are being developed
The polymer optical waveguides are tested at IBM's plant in Bromont, Québec. There, they are put through heat and cold cycles, high-humidity conditions, and mechanical stress testing.

This development process also includes industry-standard reliability stress testing to ensure all the optical and electrical links still work when they go through the stresses seen during fabrication and application use. Components are subjected to temperatures ranging from -40°C to 125°C, as well as mechanical durability testing to confirm that the optical fibers can endure bending without breaking or incurring data losses. This testing takes place at IBM Research’s global headquarters in Yorktown Heights, New York, as well as at IBM’s plant in Bromont, Québec.

“The big deal is not only that we’ve got this big density enhancement for communications on module, but we've also demonstrated that this is compatible with stress tests that optical links haven’t been passing in the past,” says Knickerbocker. IBM’s modules are meant to be compatible with standard electronic passive advanced packaging assembly processes, which can lead to lower production costs. With this innovation, IBM can produce co-packaged optics modules at its Bromont facility.

The team is building out a roadmap for the next steps this technology will take, including soliciting feedback from IBM clients and enabling co-packaged optics to meet generative AI compute business needs. “We’ll also be working with the component suppliers to position them for this next step of technology,” Knickerbocker says, “as well as positioning them for the ability to support production quantities, not just prototypes.”
