How High-Quality PCB Design Impacts Performance in AI Data Centers

PCBs are the backbone of modern electronics because they route power, data, and signals between processors, memory, sensors, peripheral devices, and other electronic components. In PCBs built for AI and data center infrastructure, the circuits that interconnect high-performance GPUs, processors, tensor units, and other onboard devices must handle high data throughput while managing extreme electrical and thermal stress. This requires a meticulous PCB design process, including precise material selection and careful PCB layout. Done right, high-quality PCB design benefits AI and data center infrastructure in the following ways.

Impact of High-Quality PCB Design on AI Data Center Infrastructure Performance

Better Thermal Management

The GPUs and CPUs that run AI processing in data centers have massive compute power to process large data sets during training. This intensive work generates a lot of heat, which can damage both the processors and the rest of the motherboard, resulting in expensive outages.

High-quality PCB design practices counter this by integrating heat sinks, thermal vias, and metal core layers, and by using materials with high thermal conductivity to dissipate heat. Together, these measures help the processors and PCBs stay within their recommended temperature limits.
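
To give a rough sense of why thermal vias matter, the following Python sketch estimates the conductive thermal resistance of a plated via barrel using R = L / (k * A). The function names, the board and via dimensions, and the copper conductivity value are illustrative assumptions rather than figures from a specific design, and the model ignores heat spreading, via fill, and the surrounding laminate.

```python
import math

# Approximate thermal conductivity of copper in W/(m*K); an assumed textbook value.
COPPER_K = 385.0


def via_thermal_resistance(board_thickness_m, drill_diameter_m,
                           plating_thickness_m, k=COPPER_K):
    """Thermal resistance (K/W) of one via, modeled as a hollow copper cylinder."""
    outer_r = drill_diameter_m / 2
    inner_r = outer_r - plating_thickness_m
    # Cross-sectional area of the copper barrel that conducts heat.
    area = math.pi * (outer_r ** 2 - inner_r ** 2)
    return board_thickness_m / (k * area)


def via_array_resistance(n_vias, board_thickness_m, drill_diameter_m,
                         plating_thickness_m, k=COPPER_K):
    """N identical vias in parallel divide the single-via resistance by N."""
    single = via_thermal_resistance(board_thickness_m, drill_diameter_m,
                                    plating_thickness_m, k)
    return single / n_vias


# Hypothetical example: 1.6 mm board, 0.3 mm drill, 25 um plating.
single = via_thermal_resistance(1.6e-3, 0.3e-3, 25e-6)
array = via_array_resistance(16, 1.6e-3, 0.3e-3, 25e-6)
print(f"one via: {single:.1f} K/W, 16 vias in parallel: {array:.1f} K/W")
```

Under these simplified assumptions, placing many vias under a hot component lowers the resistance of the heat path in proportion to the via count, which is why via arrays are common beneath power-hungry chips.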

Advanced PCB manufacturers have also developed techniques for embedding microfluidic cooling channels within the layer structure of multilayer boards. These channels drastically enhance cooling, especially around “hot” components like GPU chips, so that hyperscale AI data centers can run at full speed without failure.

Reduced Signal Delays

AI training is a highly demanding task that requires multiple interconnected GPUs to work simultaneously while sharing data to coordinate their computations. Board-to-board interconnect speeds can be improved with optical fiber connections. Inside the board, however, each chip must rely on copper traces and vias within the layers to link it to other components and, eventually, to the peripheral ports.

To ensure these copper traces are up to the task, the PCB design should:

  • Feature reduced trace length (especially high-speed paths between chips and memory)
  • Have an HDI (high-density interconnect) layout that allows more interconnections per square inch
  • Optimize the layer stack-up
  • Use low-loss materials
  • Reduce via usage
  • Use controlled and consistent impedance routing while avoiding discontinuities, such as sharp corners and sudden trace width changes
  • Implement differential pair routing
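
The effect of the first and fourth points can be approximated with the standard relation between propagation delay, trace length, and effective dielectric constant: delay = length * sqrt(eps_eff) / c. The Python sketch below is a minimal illustration; the function name and the example lengths and dielectric constants are assumptions chosen for demonstration, not values for any particular laminate.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def propagation_delay_ps(length_mm, eps_eff):
    """One-way signal delay in picoseconds along a trace.

    eps_eff is the effective dielectric constant seen by the signal
    (an input the designer must supply; it depends on material and geometry).
    """
    velocity = C_M_PER_S / math.sqrt(eps_eff)
    return (length_mm / 1000.0) / velocity * 1e12


# Hypothetical comparison: a long trace on a higher-Dk material versus
# a shortened trace on a lower-Dk material.
baseline = propagation_delay_ps(length_mm=50, eps_eff=4.0)
improved = propagation_delay_ps(length_mm=30, eps_eff=3.0)
print(f"baseline: {baseline:.1f} ps, improved: {improved:.1f} ps")
```

Delay scales linearly with length but only with the square root of the dielectric constant, so shortening high-speed paths usually has the larger effect on latency.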

More Compact Hardware Installations

Data centers are large, so space might not seem like an issue, but it is. Increasing computing density raises the processing power per rack, so the same floor space can hold more compute power.

This compactness gives investors significant cost savings in the following areas:

  • Lowers Real Estate Costs: There is less need to acquire or lease land to build new data centers.
  • Enables Infrastructure Consolidation: High computing density setups need fewer cabinets, reduced cabling, less networking equipment, and lower management complexity.
  • Reduces Running Costs: Setup costs will be higher, but the compact data center will need less power over time because the energy used for cooling and for powering the compute hardware will be lower.

To achieve this high level of computing density, design techniques like HDI with segregated layer stacking, intelligent routing to minimize interference, component embedding, and thermal management must be considered.

Reduces Power Consumption

AI data centers are notorious for their power consumption. The servers that host the GPUs and CPUs responsible for AI training and inference account for around 60% of the total power needed to run a single location, while the other 40% goes into storage systems, networking equipment, cooling, UPS power backups, lighting, office equipment, and so on. Efficient AI PCB designs can therefore save thousands of kilowatt-hours annually, resulting in thousands of dollars' worth of savings in utility costs for data center operators.

This efficient design is achieved by implementing techniques like:

  • Shortening and widening traces to reduce electrical resistance and heat conversion losses
  • Using dedicated power and ground planes to lower impedance
  • Optimizing component layout to lower parasitic inductance and capacitance
  • Adding decoupling capacitors to stabilize the voltage and reduce high-frequency noise, which allows operation at lower voltages
  • Isolating components in functional blocks to enable power gating, where sections of the PCB can be cut off from power when idle
  • Selecting modern, low-power components to mount to the board
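
The first technique in the list follows directly from the resistance formula R = rho * L / (W * t) and the conduction loss P = I^2 * R. The Python sketch below works through one hypothetical power trace; the dimensions, the current, and the helper names are assumptions for illustration, and the resistivity is the approximate room-temperature value for copper.

```python
# Approximate resistivity of copper near 20 degrees C, in ohm-metres.
COPPER_RESISTIVITY = 1.68e-8
HOURS_PER_YEAR = 8760


def trace_resistance(length_m, width_m, thickness_m, rho=COPPER_RESISTIVITY):
    """DC resistance of a rectangular copper trace: R = rho * L / (W * t)."""
    return rho * length_m / (width_m * thickness_m)


def conduction_loss_w(current_a, resistance_ohm):
    """Power converted to heat in the trace: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


def annual_energy_kwh(power_w, hours=HOURS_PER_YEAR):
    """Energy used by a constant load over a year, in kilowatt-hours."""
    return power_w * hours / 1000.0


# Hypothetical 10 A power trace on 35 um (1 oz) copper.
before = trace_resistance(length_m=0.100, width_m=2e-3, thickness_m=35e-6)
after = trace_resistance(length_m=0.050, width_m=4e-3, thickness_m=35e-6)

for label, r in (("original", before), ("shorter and wider", after)):
    loss = conduction_loss_w(10, r)
    print(f"{label}: {r * 1000:.1f} mOhm, {loss:.2f} W, "
          f"{annual_energy_kwh(loss):.1f} kWh/year")
```

Halving the length and doubling the width cuts the resistance, and therefore the heat loss, to a quarter in this example. The saving per trace is small, but it repeats across every power path on every board in a rack.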

Eliminates Computation Errors

AI hardware runs at extremely high speeds (gigahertz frequencies), and minute distortions along the signal paths can lead to computational errors. These distortions primarily include crosstalk and reflections, which can cause data corruption, system freezes, and timing failures. Since these are binary systems, a distortion basically means a 0 being read as a 1 or vice versa.

High-quality PCB design calls for controlled impedance routing to minimize reflections, a low via count along high-speed lines to cut parasitic inductance and capacitance, and differential pair routing to cancel noise and maintain timing.
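
The link between impedance control and reflections can be shown with the standard reflection coefficient, gamma = (Z_load - Z0) / (Z_load + Z0), which gives the fraction of the incident voltage that bounces back at an impedance discontinuity. The Python sketch below is a minimal illustration; the 50 ohm line and the mismatched values are assumed example numbers.

```python
def reflection_coefficient(z_load, z0):
    """Fraction of the incident voltage reflected where impedance changes
    from z0 (the line) to z_load (the discontinuity or termination)."""
    return (z_load - z0) / (z_load + z0)


# Hypothetical 50 ohm trace meeting different impedances.
for z_load in (50.0, 55.0, 65.0, 100.0):
    gamma = reflection_coefficient(z_load, z0=50.0)
    print(f"Z_load = {z_load:5.1f} ohm -> {abs(gamma) * 100:.1f}% reflected")
```

A perfectly matched line reflects nothing, while each width change, via, or connector that shifts the impedance sends part of the signal back toward the driver, where it can distort later bits.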

Other measures, like applying the 3W rule (spacing adjacent traces at least three trace widths apart, measured center to center) and separating high-speed digital circuits from their analog counterparts, reduce EMI-related computation errors. Decoupling capacitors and solid power and ground planes create a stable Power Delivery Network (PDN) that delivers clean, stable voltage to the chips and memory.
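
One commonly used rule of thumb for sizing a PDN, which is not specific to this article, is the target impedance: the supply voltage multiplied by the allowed ripple fraction, divided by the worst-case transient current. The Python sketch below applies it to made-up numbers; the function name, the 0.8 V rail, the 3% ripple budget, and the 100 A current step are all assumptions for illustration.

```python
def pdn_target_impedance(vdd_v, ripple_fraction, transient_current_a):
    """Maximum PDN impedance (ohms) that keeps supply ripple within budget.

    Rule of thumb: Z_target = (Vdd * allowed ripple) / transient current.
    """
    return vdd_v * ripple_fraction / transient_current_a


# Hypothetical accelerator core rail.
z_target = pdn_target_impedance(vdd_v=0.8, ripple_fraction=0.03,
                                transient_current_a=100.0)
print(f"target impedance: {z_target * 1000:.2f} mOhm")
```

Low-voltage, high-current rails like this one leave a budget of well under a milliohm, which is why solid planes and many decoupling capacitors placed close to the chip are needed to keep the voltage stable.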

How To Ensure These High-Quality PCB Design Benefits Are Achieved

Design and Prototyping Collaboration

The line between AI hardware design and manufacturing needs to be as thin as possible to ensure seamless testing and collaboration during development. Each fault in the prototype and subsequent improvement must be handled quickly and efficiently so that the final production board gets to the market quickly and performs as intended.

Standardization and Scalability

To ensure consistency in the design and manufacturing quality of AI accelerator boards, PCB designers and manufacturers must adopt industry standards and automation, such as AI-assisted design tools and automated design rule checks. These measures are critical for identical, stable, and predictable performance in each server and rack across data centers.
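
As a rough idea of what an automated design rule check does, the Python sketch below flags traces that are narrower than a minimum width or that break the 3W spacing rule (center-to-center spacing of at least three trace widths). The `Trace` class, the rule thresholds, and the sample data are hypothetical; real DRC engines in EDA tools work on full board geometry and far larger rule sets.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    name: str
    width_mm: float
    # Center-to-center distance to the nearest neighbouring trace.
    pitch_to_neighbor_mm: float


def check_rules(traces, min_width_mm=0.1, spacing_factor=3.0):
    """Return a list of (trace name, message) tuples for every violation."""
    violations = []
    for t in traces:
        if t.width_mm < min_width_mm:
            violations.append((t.name, "width below minimum"))
        if t.pitch_to_neighbor_mm < spacing_factor * t.width_mm:
            violations.append((t.name, "violates 3W spacing"))
    return violations


# Hypothetical routed nets.
board = [
    Trace("clk", width_mm=0.15, pitch_to_neighbor_mm=0.5),
    Trace("d0", width_mm=0.15, pitch_to_neighbor_mm=0.3),
    Trace("thin", width_mm=0.08, pitch_to_neighbor_mm=0.5),
]

for name, message in check_rules(board):
    print(f"{name}: {message}")
```

Running the same checks on every revision of every board is what makes the results repeatable, since the rules are applied identically regardless of who did the layout.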

Conclusion

AI is still in its infancy, and the compute power required to run it will grow significantly as it matures. High-quality PCB design is critical to making that compute power available by delivering the benefits described above. PCB design and manufacturing engineers are also investing heavily in research into more efficient circuit implementations, so the best is yet to come.
