Empowering Engineers, Enabling Innovation: The AI Journey
ABSTRACT. As we enter a new era of accelerated innovation, AI is not just augmenting engineering—it is redefining it. From tackling the most complex design challenges to unleashing creative breakthroughs, AI is becoming an indispensable partner in the engineering process. In this keynote, we explore how AI is transforming engineering workflows and enabling new levels of innovation. For example, in high-speed, nonlinear signal integrity analysis, traditional methods fall short under the complexity of modern interconnects. While standalone AI techniques like Bayesian optimization can reduce simulation time from weeks to days, only human-AI collaboration has shown the ability to compress this process to mere hours—bringing practical, real-time solutions into reach.
Beyond efficiency gains, AI is a catalyst for engineering creativity. Once engineers begin solving problems with AI, they unlock new pathways for innovation. In the field of thermal and power analysis for advanced SoC design, the development of an AI tool—SuperGrid—reduced analysis time from two weeks to seconds. This breakthrough spurred a wave of innovation, expanding analysis coverage by orders of magnitude and driving better-informed design decisions across the board. Looking forward, the rise of agentic AI—autonomous systems capable of planning, reasoning, and acting across devices and platforms—is set to revolutionize engineering once again. At Intel, we built the SuperBuilder platform to empower engineers to create their own AI agents. The latest version supports the MCP (Model Context Protocol) framework and runs across diverse hardware, enabling secure, on-device AI agents like our IEEE editor agent, which drastically speeds up technical paper reviews.
Join us on this journey as we showcase how AI is not just a tool but a partner—empowering engineers, amplifying creativity, and enabling innovation at an unprecedented scale.
ABSTRACT. In this paper, we introduce a minimum-mean-squared-error (MMSE)-based crosstalk-aware equalizer approach that enhances signal integrity (SI) in high-speed chiplet interconnects. Existing MMSE-based methods treat lanes in isolation or overlook inter-channel correlation, yielding suboptimal performance in dense, high-speed links.
By jointly exploiting inter-channel correlations, our method computes feed-forward equalizer (FFE) tap weights that proactively suppress crosstalk and minimize interference-induced distortion. We validate our approach using a fast statistical-eye simulator driven by a voltage-transfer-function (VTF) model, verified to within 5% error against a commercial simulator. Evaluated on a 20-lane, 16 Gb/s/lane interconnect, the proposed approach enhances signal integrity by at least 37.6% under worst-case crosstalk versus standard MMSE, and outperforms crosstalk-aware MMSE variants by over 13.7%. These results establish the proposed approach as a practical solution for designing robust, high-density chiplet interconnects.
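The crosstalk-aware tap computation described above can be sketched as a regularized least-squares problem: the FFE taps minimize residual ISI on the victim lane plus the energy coupled in through each aggressor's crosstalk response. This formulation and all numeric values below are illustrative assumptions, not the authors' actual cost function or channel data:

```python
import numpy as np

def ffe_taps_xtalk_mmse(h_thru, h_xtalk, n_taps=5, delay=2, reg=1e-6):
    """Crosstalk-aware MMSE FFE tap solve (illustrative sketch).

    Minimizes ||H w - e_delay||^2 + sum_k ||X_k w||^2, i.e. residual ISI
    on the victim plus energy leaked through each aggressor's coupling
    response, with a small diagonal regularizer for conditioning.
    """
    def conv_matrix(h, n_taps):
        # full convolution matrix so that M @ w == np.convolve(h, w)
        M = np.zeros((len(h) + n_taps - 1, n_taps))
        for j in range(n_taps):
            M[j:j + len(h), j] = h
        return M

    H = conv_matrix(np.asarray(h_thru, float), n_taps)
    A = H.T @ H + reg * np.eye(n_taps)
    for hx in h_xtalk:                       # add each aggressor's penalty
        X = conv_matrix(np.asarray(hx, float), n_taps)
        A += X.T @ X
    d = np.zeros(H.shape[0])
    d[delay] = 1.0                           # desired: clean delayed cursor
    return np.linalg.solve(A, H.T @ d)

# toy victim pulse with post-cursor ISI, plus one aggressor coupling tap set
w = ffe_taps_xtalk_mmse([1.0, 0.4, 0.1], [[0.0, 0.2, 0.05]], n_taps=3, delay=0)
```

The joint solve differs from lane-by-lane MMSE only in the extra `X.T @ X` terms, which is what lets the taps trade a little cursor amplitude for suppressed crosstalk.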
Temperature-Dependent SPICE Models for UCIe Interconnects
ABSTRACT. This work presents a methodology for generating temperature-dependent channel models for die-to-die interfaces. The parametrized models are represented as distributed RLGC networks. The modeling methodology is demonstrated for ×32 and ×64 UCIe interfaces implemented on a silicon interposer, achieving average accuracy greater than 99% when predicting channel loss and crosstalk. These surrogate models are over five orders of magnitude faster than a single electromagnetic simulation, significantly expediting design space exploration.
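For a single uniform line, a distributed RLGC representation of this kind reduces to the standard telegrapher's-equation evaluation. The sketch below uses made-up per-unit-length values, not the paper's extracted parameters, and shows how a loss metric such as |S21| follows directly from R, L, G, C:

```python
import numpy as np

def rlgc_s21(f, R, L, G, C, length, z0=50.0):
    """|S21| of a uniform transmission line from per-unit-length RLGC.

    Standard telegrapher's-equation evaluation: R, L, G, C are per-metre
    values (possibly temperature-dependent), length is in metres, and the
    line is terminated in z0 at both ports.
    """
    w = 2 * np.pi * np.asarray(f, float)
    Zs = R + 1j * w * L              # series impedance per unit length
    Yp = G + 1j * w * C              # shunt admittance per unit length
    gamma = np.sqrt(Zs * Yp)         # propagation constant
    Zc = np.sqrt(Zs / Yp)            # characteristic impedance
    # ABCD parameters of the line, converted to S21 for equal ports
    A = np.cosh(gamma * length)
    B = Zc * np.sinh(gamma * length)
    Cp = np.sinh(gamma * length) / Zc
    denom = A + B / z0 + Cp * z0 + A
    return np.abs(2 / denom)

# hypothetical interposer-like numbers: 1 mm line, lossy thin copper
f = np.array([1e9, 10e9])
s21 = rlgc_s21(f, R=5e3, L=2.5e-7, G=1e-3, C=1e-10, length=1e-3)
```

A surrogate in the spirit of the paper would make R, L, G, C functions of temperature and geometry and then reuse exactly this kind of closed-form evaluation in place of a full-wave solve.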
Signal Integrity Design and Analysis of Inter-Stack Crosslink for 3D-Heterogeneous Integrated High Bandwidth Memory (3D-HI-HBM) Module
ABSTRACT. In this paper, we design and analyze an inter-stack crosslink interconnect for the 3-dimensional heterogeneous integrated high bandwidth memory (3D-HI-HBM) module. Each interconnect component, including the crosslink through-silicon via (TSV), bump, and channel, is designed considering the physical dimensions and routing feasibility within the design constraints of the HBM module. The designed interconnect is analyzed in terms of signal integrity (SI) through S-parameter and eye-diagram simulation. Analysis results show that the inter-stack crosslink operating at 16 Gb/s can provide a total inter-stack bandwidth of 2 TB/s while maintaining routing feasibility. In addition, when operating at an extended 32 Gb/s data rate, the analysis indicates that further advances to the inter-stack crosslink are required to provide the extended bandwidth of over 4 TB/s and enable further performance scaling between 3D-HI-HBM modules.
Ground Resonance On-Wafer Measurement up to 67 GHz and Modeling with Physical Equivalent Circuit
ABSTRACT. We present a set of simplified ground-resonance test-coupon designs to experimentally validate resonance diagnosis using the proposed one-dimensional ground-resonance (GS) physical-based transmission-line (PBTL) equivalent circuit model for high-speed connectors. The GS-PBTL simulation results accurately predict on-wafer measured S-parameters and agree well with 3D HFSS FEM simulations up to 67 GHz. This work provides a fast, physically intuitive tool for signal-integrity (SI) diagnosis with measurement-to-simulation correlation.
Separate-Cavity Optimization Method for PCIe Gen6.0 PCB-Connector Pad Design Using Particle Swarm Optimization (PSO) Algorithm
ABSTRACT. This paper proposes a method for optimizing cavity dimensions to minimize impedance fluctuations in the interconnection area between a PCB and a connector. The approach involves separating the cavity region based on the connector contact point, characterizing each part using 2D-based equations with the Finite Element Method (FEM) and the Djordjevic-Sarkar model, and utilizing the Particle Swarm Optimization (PSO) algorithm to find the cavity dimensions that minimize impedance variations. The proposed method has been verified experimentally through improvements observed in S-parameters, TDR responses, and 4-level pulse amplitude modulation (PAM-4) eye diagrams of EDSFF-based test coupons. In particular, the top eye height and width improve from 42.3 mV to 72.7 mV and from 3.8 ps to 7.0 ps, respectively.
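The PSO step above can be illustrated with a minimal swarm optimizer. The objective below is a hypothetical stand-in for the 2D-equation-based impedance model, not the paper's actual FEM characterization:

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm optimizer (toy sketch).

    bounds: (lo, hi) arrays, one entry per cavity dimension being tuned.
    w is inertia; c1/c2 weight the personal-best and global-best pulls.
    """
    lo, hi = (np.asarray(b, float) for b in bounds)
    x = rng.uniform(lo, hi, size=(n_particles, lo.size))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([objective(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random((2,) + x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)          # keep particles in bounds
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# hypothetical stand-in objective: squared deviation of a modelled cavity
# impedance profile from its target, as a function of two dimensions
target = np.array([0.8, 0.3])
obj = lambda d: np.sum((d - target) ** 2)
best, best_f = pso_minimize(obj, ([0, 0], [1, 1]))
```

In the paper's setting, `objective` would evaluate the separate-cavity impedance model and return a measure of impedance variation rather than this toy quadratic.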
A Conditional Diffusion Framework for Sample-Efficient Thermal Modeling in 3DICs
ABSTRACT. Thermal constraints pose a critical challenge in modern three-dimensional integrated circuits (3DICs) due to the increasing density of transistors and the limited heat dissipation paths. We propose a conditional diffusion-based generative framework that accurately predicts static temperature distributions from power maps with low latency. Our model introduces a fusion-based U-Net architecture named HeatDiffUNet that integrates power and temperature information across multiple scales, improving conditioning effectiveness and thermal localization. Our proposed method achieves a temperature difference error of 2.31°C using only 200 training samples, outperforming existing methods by as much as 68%. Notably, even when provided with five times more training data, both the prior techniques are unable to match the predictive accuracy demonstrated by our model. Additionally, our approach excels in hotspot localization, achieving an average deviation of just 2.47 pixels at merely 500 training samples. This performance is notably superior to the existing works by up to 46%, despite these baselines being trained on twice the amount of data. These results underscore our method's exceptional suitability for accurate thermal analysis and effective targeted cooling strategies in early-stage 3D IC design.
Machine-Learning-Based Optical I/O Design and Analysis Automation for Co-Packaged Optics
ABSTRACT. This paper demonstrates a machine-learning-based methodology for optical I/O design and analysis for co-packaged optics. The trained neural network model achieves over 98% accuracy in predicting results compared to full-physical simulations. The implementation, developed with open-source scripting, offers broad compatibility with major commercial modeling tools and measurement-based data input, enabling scalable, simulation-free design automation for next-generation optical interconnects.
High-Speed Channel Identification Using Contrastive Learning
ABSTRACT. This paper presents a contrastive learning approach with dual encoders to learn a joint embedding space that identifies an unknown channel within a known set. One encoder learns an embedding space from the channel output waveform and the receiver filter configuration, whereas the second encoder learns an embedding space for the corresponding eye diagram. We validate the approach using real-world measurements from a channel emulator for channels with varying loss and receivers with different equalization settings, running at 15 Gb/s NRZ.
At test time, we measure top-n accuracy, which determines whether the correct label is present within n samples. The approach determines the correct BER plot for a given channel waveform and receiver filter setting, with top-3/9/27 accuracies of 61.5%, 87%, and 97.38%, respectively. Conversely, when searching for the correct underlying channel waveform for a given receiver filter setting and BER contour plot, we find the top-1/3/5 accuracies to be 72.75%, 92.78%, and 95.75%, respectively. The proposed approach enables a fully data-driven forward and inverse solution while retaining independent encoder models for different downstream tasks.
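Dual-encoder training objectives of this kind are commonly implemented as a symmetric InfoNCE loss over a batch of matched pairs. The sketch below assumes that formulation; the paper's exact loss and encoder architectures are not reproduced here:

```python
import numpy as np

def log_softmax(m):
    """Row-wise log-softmax with the usual max-shift for stability."""
    m = m - m.max(axis=1, keepdims=True)
    return m - np.log(np.exp(m).sum(axis=1, keepdims=True))

def info_nce(z_wave, z_eye, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of z_wave (waveform+filter encoder output) is the positive
    pair of row i of z_eye (eye-diagram encoder output); every other
    row in the batch acts as a negative.
    """
    z_wave = z_wave / np.linalg.norm(z_wave, axis=1, keepdims=True)
    z_eye = z_eye / np.linalg.norm(z_eye, axis=1, keepdims=True)
    logits = z_wave @ z_eye.T / temperature          # cosine similarities
    loss_w2e = -np.mean(np.diag(log_softmax(logits)))    # waveform -> eye
    loss_e2w = -np.mean(np.diag(log_softmax(logits.T)))  # eye -> waveform
    return 0.5 * (loss_w2e + loss_e2w)

rng = np.random.default_rng(1)
z = rng.standard_normal((8, 16))
loss_aligned = info_nce(z, z)                        # perfectly matched pairs
loss_random = info_nce(z, rng.standard_normal((8, 16)))
```

Top-n retrieval at test time then amounts to ranking the same similarity matrix: for a query waveform embedding, the n highest-scoring eye-diagram embeddings are the candidates checked for the correct label.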
ABSTRACT. Ground resonances in high-speed connectors critically impact the signal integrity of data links, inducing extra signal loss and crosstalk degradation. In this work, the dielectric add-in for multimode phase matching (DAMPM) technique is developed to minimize the phase difference between the signal and ground propagation modes in multi-channel connectors. Dielectric segments are incorporated into the peripheral component interconnect express (PCIe) low-profile fast-pass input/output (FPIO) connector to reduce mutual energy coupling. As a result, ground resonances are effectively suppressed, with differential near-end crosstalk (DDNEXT) and far-end crosstalk (DDFEXT) reduced by 11 dB and 19 dB, respectively, at the first resonance (12.65 GHz) on the receiver (Rx) side. For 128 GT/s pulse amplitude modulation 4-level (PAM4) transmission, the signal eye height (EH) and eye width (EW) improve by 89% and 65%, respectively.
Corner Modeling and Sensitivity Analysis of a Rough-Foil Microstrip using Machine Learning
ABSTRACT. Impedance-attenuation process corners are produced for a differential microstrip design with 18 varying design features. Rough copper surfaces are included for detailed loss modeling, and feature tolerances are based on manufacturing data where applicable. Machine learning is used to produce impedance and attenuation surrogate models, which require a relatively small training and test dataset compared with other approaches, despite the large design space. The active subspaces of these models are identified using symbolic differentiation and interpreted as sensitivity measures to inform tolerance updates and identify key design variables.
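Active-subspace identification follows the standard gradient-covariance construction: average the outer product of the surrogate's gradient over the design space and take the leading eigenvectors. Below is a minimal sketch on a hypothetical quadratic surrogate; the 18-feature microstrip models themselves are not reproduced:

```python
import numpy as np

rng = np.random.default_rng(2)

def active_subspace(grad_f, samples, k=2):
    """Estimate a k-dimensional active subspace from gradient samples.

    C = E[grad f . grad f^T] is approximated by Monte Carlo over the
    design space; its leading eigenvectors span the directions along
    which the surrogate output varies most, and the eigenvalue decay
    quantifies how sensitive the output is to each direction.
    """
    grads = np.array([grad_f(x) for x in samples])
    C = grads.T @ grads / len(samples)
    eigvals, eigvecs = np.linalg.eigh(C)       # ascending order
    order = np.argsort(eigvals)[::-1]          # flip to descending
    return eigvals[order], eigvecs[:, order[:k]]

# hypothetical surrogate: impedance depends strongly on the first two
# (normalized) features and only weakly on the rest
weights = np.array([5.0, 2.0, 0.1, 0.05])
grad_f = lambda x: 2 * weights**2 * x          # gradient of sum(w_i^2 x_i^2)
X = rng.uniform(-1, 1, size=(500, 4))
eigvals, W = active_subspace(grad_f, X, k=2)
```

With a symbolic surrogate, `grad_f` comes for free from differentiating the fitted expression, which is exactly what makes this sensitivity measure cheap to compute over a large design space.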
High-Bandwidth Memory (HBM) Network Switch Architecture Design with Low Latency and Efficient Energy Consumption for HBM Centric Computing
ABSTRACT. For the first time, this paper proposes a high bandwidth memory network switch (HBM-NS) architecture. This architecture expands the memory capacity of the GPU-HBM module and reduces latency and energy consumption for efficient HBM-centric computing by near-memory computing (NMC) cores in the HBM's logic die. The HBM-NS is a crossbar-type switch chip that controls the direction of data flow. Each HBM-NS is placed among the corners of four neighboring HBMs to reduce the interconnect length of the data path between HBMs. To verify our architecture, we designed the HBM-NS's on-chip interconnect and extracted RLC lumped-circuit components using a 3D EM simulator to evaluate latency and energy consumption. The results show that the proposed architecture reduces energy consumption and latency by up to 32.1% and 55%, respectively, compared to an architecture without the HBM-NS.
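Latency and energy estimates from extracted lumped RLC components are commonly first-order approximated by the Elmore delay and the α·C·Vdd² dynamic switching energy. The sketch below uses toy segment values, not the extracted HBM-NS parasitics, and ignores inductance for simplicity:

```python
def elmore_delay(rc_ladder):
    """Elmore delay of an RC ladder: each segment's resistance times
    the total capacitance downstream of it, summed over segments."""
    total = 0.0
    for i, (r, _) in enumerate(rc_ladder):
        downstream_c = sum(c for _, c in rc_ladder[i:])
        total += r * downstream_c
    return total

def switching_energy(c_total, vdd, activity=1.0):
    """Dynamic energy drawn per switching cycle: alpha * C * Vdd^2."""
    return activity * c_total * vdd ** 2

# toy 3-segment extracted interconnect: (R in ohms, C in farads) per segment
ladder = [(10.0, 1e-13), (10.0, 1e-13), (10.0, 1e-13)]
tau = elmore_delay(ladder)                                 # seconds
e = switching_energy(sum(c for _, c in ladder), vdd=1.1)   # joules
```

Shortening the corner-placed data paths shrinks every (R, C) entry in the ladder, which is why both the delay and energy figures improve together in the reported results.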
Design and Analysis of Scalable On-Package Memory Expansion Architectures for GPU-HBM System Considering Signal Integrity
ABSTRACT. In this paper, we propose scalable on-package memory expansion architectures to address the growing memory demands of large-scale AI inference workloads. To achieve high bandwidth and low latency, memory management logic (MML) is embedded in the high bandwidth memory (HBM) base die, enabling unified and transparent access to the expanded memory. The extended memory is interconnected via the Universal Chiplet Interconnect Express (UCIe) interface, enabling compatibility with diverse memory types. Two heterogeneous integration strategies are explored: one integrates the extended memory on a shared silicon interposer using UCIe-advanced (UCIe-A), while the other routes it through the package substrate via UCIe-standard (UCIe-S). The interconnects are designed and evaluated using electromagnetic-solver and SPICE simulations to analyze signal integrity (SI). The results confirm that both architectures satisfy UCIe specifications, demonstrating the feasibility of supporting high-speed, high-capacity on-package memory scaling. These findings suggest a promising architectural approach for enhancing on-package memory capacity in next-generation AI accelerators.
Design and Analysis of Power/Ground TSVs considering Static IR Drop for Next-generation HBM
ABSTRACT. In this paper, we investigate how Power/Ground (P/G) Through-Silicon-Via (TSV) design—including placement and geometry—affects static IR drop in next-generation High Bandwidth Memory (HBM). As large-scale AI computing demands higher memory bandwidth and more DRAM stacking, IR drop has become a critical challenge in designing the power delivery network (PDN). We evaluate three representative design cases: first, expanding P/G TSV placement across the base and core dies; second, increasing the number of voltage source ports; and third, adjusting P/G TSV density and dimensional parameters within a unit area. Simulation results show that while both P/G TSV placement and geometry help reduce static IR drop, the former has a greater impact. Therefore, efficient IR drop mitigation requires balancing P/G TSV placement, density, and power delivery design.
Power Integrity Design and Analysis of High Bandwidth Memory-centric Memory Pooling Computing Architecture
ABSTRACT. In this paper, we design and analyze a proposed High Bandwidth Memory (HBM)-centric memory pooling computing architecture in terms of power integrity (PI). The proposed architecture enables consolidated memory pooling across multiple processors, reducing data copying latency. To implement this architecture, diverse I/O interfaces must be integrated on a common VDDQ domain to achieve a compact and simplified power delivery network (PDN) in the customized HBM base die. However, integrating various interfaces on a common VDDQ domain deteriorates PI in terms of simultaneous switching noise (SSN). To solve this issue, we evaluate three PDN design cases that use different splitting points within the hierarchical PDNs. The results show that the fully unified hierarchical PDN design achieves lower SSN than the others. Our work provides design guidelines for common VDDQ domains in heterogeneous I/O integration scenarios for the proposed HBM-centric memory pooling computing architecture.