Aug 26th, 2025
Dr. Carlos Berto, Director of Network Engineering at Axiom
Brian Chang, Technical Writer/Editor
Recent developments in AI and high-performance computing (HPC) have generated plenty of excitement in the IT world. That excitement comes at a cost, however.
The increasing compute demands of AI/HPC workloads put much greater stress on our data centers. Industry figures show that AI/HPC rack power consumption can push 80 kW, with racks of upper-tier chips pushing 120 kW. At those power levels, essential data center components heat up much faster, potentially imposing a thermal bottleneck on data center performance.
How can we upgrade our thermal management?
Traditional air cooling is currently the most common cooling solution for data centers. It is very effective in standard deployments, but it lacks the firepower for high-density AI/HPC clusters.
Several cooling technologies show great potential for mitigating these thermal concerns. They have yet to see wide-scale adoption, but they offer a level of cooling conducive to high-end computing.
Enhance power efficiency with liquid cooling
The emerging leader in this trend is liquid cooling. Whereas traditional air cooling relies solely on fans and heatsinks to remove excess heat, liquid cooling bolsters heat dissipation by introducing liquid coolants to the system.
Because liquid moves heat far more efficiently than air, liquid cooling can drastically reduce operating temperatures even in workload-intensive environments. Spending trends reflect this: the liquid cooling market hit $1.5 billion last year and is expected to reach $6.2 billion within five years, a roughly fourfold increase.
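To put that growth figure in perspective, a quick sketch of the implied compound annual growth rate (CAGR) from the two market figures cited above:

```python
# Implied compound annual growth rate for the liquid cooling market,
# using the figures cited in the text ($1.5B -> $6.2B over five years).
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction."""
    return (end / start) ** (1 / years) - 1

growth_factor = 6.2 / 1.5          # ~4.1x overall
annual_rate = cagr(1.5, 6.2, 5)    # ~0.33, i.e. ~33% per year

print(f"{growth_factor:.1f}x overall, {annual_rate:.0%} per year")
```

A fourfold jump over five years works out to roughly 33% compounded annual growth.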
Let's examine our choices for liquid cooling:
D2C: Direct-to-Chip
Direct-to-chip (D2C) cooling is a straightforward concept that addresses heat directly at the source. It puts the cooling mechanism in direct contact with essential components such as the CPU and GPU. D2C cold plates are mounted onto these components, which easily generate the most heat in the system.
These cold plates are made of thermally conductive material, shaped as rectangular blocks with a network of tubes or channels running through them. Liquid coolant flows through these channels, absorbing the excess heat that builds up around the component sockets.
D2C cold plates are very effective, removing around 75% of the heat at the source. The remaining 25%, however, still has to be removed by air cooling, which can leave systems running into the same overheating issues as before.
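A quick sketch of what that residual 25% means at the rack power levels discussed earlier (the 75% capture figure comes from the text; the rack powers are illustrative):

```python
# Residual air-cooling load when D2C captures ~75% of rack heat.
D2C_CAPTURE = 0.75  # fraction of heat removed at the cold plates

def residual_air_load_kw(rack_power_kw: float, capture: float = D2C_CAPTURE) -> float:
    """Heat (kW) that the facility's air cooling must still remove."""
    return rack_power_kw * (1 - capture)

for rack_kw in (80, 120):
    print(f"{rack_kw} kW rack -> {residual_air_load_kw(rack_kw):.0f} kW left to air cooling")
```

Even with D2C in place, a 120 kW rack still leaves roughly 30 kW for the air-cooling plant, which is itself the full load of a dense conventional rack.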
Immersion cooling
A more comprehensive cooling method that has been gaining serious consideration among operators of hyperscale data centers is immersion cooling.
In immersion cooling, system components are submerged in dielectric liquids: fluids that do not conduct electricity. Bathing components in dielectric liquid removes heat much faster than air cooling does and over a far wider area than a cold plate.
Just as importantly, the dielectric liquid can come into contact with energized components without damaging them. Being able to cool the entire system without risk to expensive hardware lets network operators keep their data centers running seamlessly and with peace of mind.
Immersion cooling can be separated into two different types:
Single-phase
This type of immersion cooling uses a dielectric coolant that never changes state. It remains liquid throughout the entire cycle: it absorbs heat, travels to an external heat exchanger to shed it, and returns to the server rack.
Two-phase
In two-phase cooling, the coolant cycles repeatedly between liquid and vapor to enhance cooling. The coolant starts in liquid form and is boiled until it turns into vapor. Although boiling a coolant in a server environment may seem counterintuitive, the phase change from liquid to vapor absorbs far more heat than liquid heating alone. The coolant is also engineered with a low boiling point (below 60 degrees Celsius) so it boils readily at operating temperatures.
The generated vapor rises, is cooled by a condenser at the top of the tank, and drips back down as liquid. Two-phase cooling is a closed-loop system with sealed or semi-sealed tanks to limit coolant loss and contamination.
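A rough sketch of why the phase change matters so much. The numbers below are representative of fluorocarbon immersion fluids in the sub-60 °C boiling class, not specifications for any particular product:

```python
# Why boiling the coolant helps: per kilogram of fluid, the latent heat
# of vaporization dwarfs the sensible heat picked up by a modest
# temperature rise. Values are illustrative assumptions for a
# fluorocarbon immersion coolant, not vendor specs.
LATENT_HEAT_KJ_PER_KG = 88        # heat absorbed by boiling 1 kg of fluid
SPECIFIC_HEAT_KJ_PER_KG_K = 1.1   # heat absorbed per kg per degree C
DELTA_T_C = 10                    # typical temperature rise in single-phase flow

sensible = SPECIFIC_HEAT_KJ_PER_KG_K * DELTA_T_C  # 11 kJ/kg
advantage = LATENT_HEAT_KJ_PER_KG / sensible      # ~8x

print(f"phase change absorbs ~{advantage:.0f}x more heat per kg of coolant")
```

Under these assumptions, each kilogram of boiling coolant carries away several times the heat of the same kilogram merely warming up, which is the core advantage of the two-phase approach.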
With the high thermal conductivity and heat capacity of its fluids, immersion cooling outperforms other cooling technologies, removing 100 percent of the heat without any assistance from traditional air cooling.
Rear door heat exchangers (RDHx)
Rear door heat exchangers are a hybrid liquid cooling technology. Like D2C cooling, RDHx units provide cooling close to the source of the heat: a liquid-fed radiator is mounted on the rear door of the server rack. Hot exhaust air passes through the radiator, which transfers the heat to a liquid coolant; once cooled, the air is recirculated into the data center.
RDHx is not strictly a liquid cooling strategy, as it combines liquid cooling elements with air cooling technology. This hybrid approach, however, is highly effective and can also remove 100% of the heat.
What are the drawbacks of liquid cooling?
Although liquid cooling provides major benefits and higher efficacy than air cooling, we still have to consider the potential drawbacks of these methods.
On paper, liquid cooling delivers superior cooling in and around the high heat generation areas that air cooling might not be able to reach. In practice, however, liquid cooling systems are generally more complex to install and they do have several vulnerabilities.
If we take a closer look at the two-phase immersion cooling system mentioned earlier, the risks can be substantial. For one, the vaporization and condensation cycle subjects the system to much higher pressures. There are also risks such as coolant leaks and the use of environmentally hazardous chemicals such as PFAS (per- and polyfluoroalkyl substances) fluids. As such, installing a liquid cooling system requires detailed knowledge of its operating mechanisms and skilled technicians to configure it properly.
What are some cooling technologies for network infrastructures?
As data requirements scale relentlessly, thermal concerns also apply to the networking components of a data center infrastructure. High-data-rate transceivers are key building blocks in any network, but they are susceptible to the same overheating issues and bound by the same thermal constraints as compute processors.
OSFP transceivers (high-end, eight-lane modules) can consume upwards of 40 W per module. At data rates of hundreds of gigabits per second, heat builds up fast. Given that these transceivers also operate in close proximity to other crucial network components, this is another problem to consider.
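To see why transceiver heat adds up, a quick sketch of the aggregate optics load in a fully populated switch. The ~40 W per-OSFP figure comes from the text; the 32-port switch configuration is an illustrative assumption:

```python
# Aggregate transceiver heat in a fully populated switch.
# 32 OSFP ports is an illustrative configuration, not a spec.
PORTS = 32
WATTS_PER_OSFP = 40  # per-module power draw cited in the text

optics_heat_w = PORTS * WATTS_PER_OSFP
print(f"{optics_heat_w} W ({optics_heat_w / 1000:.2f} kW) per switch from transceivers alone")
```

Well over a kilowatt of heat per switch from the optics alone, before the switch ASIC itself is counted, all concentrated along the faceplate.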
For high-data-rate transceivers, several new technologies can help keep the heat in check, namely:
Thermoelectric Coolers (TECs):
These coolers are implemented in long-range transceivers (FR, LR, ZR) and hold laser diode temperatures within a steady range to prevent signal degradation and wavelength drift.
High-K Aluminum Alloy Housings:
Traditionally, transceiver housings use zinc alloys, which tend to have poorer heat dissipation. Replacing them with lightweight, corrosion-resistant high-K aluminum alloy housings can greatly improve heat dissipation.
Snap-On Heat Sinks:
Snap-on heat sinks maximize surface area in even the most compact modules. These swappable heat sinks greatly increase heat dissipation by giving heat more area to escape through, while allowing for easier maintenance and upgrades.
Liquid Cooling compatible transceiver technology:
High-data-rate transceivers should also be paired with liquid cooling technologies to further bolster cooling:
Liquid Cooling & Phase-Change Materials (PCM):
Liquid Cooling and PCM can enhance cooling efficiency in high-density deployments, which is why they have been gaining traction for 800G and 1.6T modules.
Micro TECs:
Transceiver manufacturers such as Axiom are also starting to implement ultra-small thermoelectric coolers in compact transceivers. Micro TECs give network operators much more granular temperature control on 800G+ modules. They also maintain laser diode temperatures to preserve signal integrity.
Immersion-Compatible Transceivers:
With immersion cooling gaining significant traction, system components must also be compatible with liquid-based cooling architectures. Immersion-compatible transceivers from Axiom are engineered to operate at full capacity even while submerged in dielectric fluids, delivering greater power efficiency in HPC environments with advanced cooling.
Axiom optimizes power efficiency and bolsters cooling in the data center
With these advanced cooling technologies in place, data center operators can ramp up their data center infrastructures to handle the high-intensity, arduous tasks that HPC and AI applications require of our data center hardware.
Axiom features state-of-the-art transceivers with built-in cooling methods such as micro TECs and liquid cooling/PCM technology, and offers 800G transceivers fully compatible with immersion cooling. Our network architecture teams can guide your data center through seamless implementation of liquid cooling to mitigate thermal concerns.
The robust combination of Axiom data center solutions with liquid cooling technologies can greatly optimize power efficiency and reduce the carbon footprint of our data centers, ensuring that we can build more sustainable infrastructures for next-generation applications.
To talk to an engineer about liquid cooling implementation or high-data rate transceivers with advanced cooling for HPC environments, contact us today: sales@axiomupgrades.com.