10:30 | Ultra dense SRAM Cell Test Challenges PRESENTER: Uma Srinivasan ABSTRACT. This paper discusses test techniques used to create an exceptionally reliable processor using an ultra-dense SRAM cell. The test challenges are addressed from the context of stability, in a high-performance processor chip. We will discuss methods to alleviate the read stability fails by enabling the most optimal set of repairs to the highly repairable custom cache arrays without overrunning the total repair capacity of the chip. This paper demonstrates off chip repair calculation strategies to efficiently repair outlier SRAM cells by prioritizing BIST algorithms, test temperature, voltage, and other parameters. |
11:00 | A Fine and Massive Test Methodology for Analyzing Core Characteristics in the Development of Next Generation DRAM PRESENTER: Min-Kyu Kim ABSTRACT. Exact characterization of circuits in packaged chips has been essential to designing semiconductor products. However, it is hard to directly measure large numbers of circuits at the desired locations in packaged chips. In this paper, we study a method for characterizing circuits in the chip by implementing a unique operation that is not used in normal DRAM operation. Using our technique, mismatch in millions of sensing amplifiers can be extracted at the chip level, and the performance of the mismatch-cancellation circuit can also be measured, which can be used to develop next-generation sensing amplifiers. |
11:30 | Sisyphus: Cross-Layer Efficiency Across NVM Technologies in Compute-in-Memory Architectures PRESENTER: Mehdi Tahoori ABSTRACT. Compute-in-Memory (CiM) with Non-Volatile Memory (NVM) offers promising performance and power efficiency for data-intensive tasks. NVM properties impact CiM architecture efficiency, requiring rapid design space exploration. We introduce Sisyphus, the first cross-layer framework for architecture research, integrating STT-MRAM, ReRAM, and PCM-based CiM designs into gem5 for performance, power, and resilience evaluation. Sisyphus enables holistic analysis by running workloads on CPU-CiM systems, comparing them to CPU-only baselines. Our experiments show how Sisyphus identifies the best NVM type based on optimization goals. |
10:30 | Stress Aware Quiescent Current Test Optimization PRESENTER: Shubhendu Shrivastava ABSTRACT. Voltage stress-Iddq signature has supported zero defect efforts, but scaling and test time constraints have caused latent Gate Oxide (GOx) shorts. This paper proposes three methods to optimize signature detection in digital IC testing: 1) Critical Thickness Model (CTM), identifying the minimum stress time for MOSFETs with GOx < 3nm, reducing test costs and yield loss. 2) Stress Coverage Quantification Algorithm (SCQA), evaluating actual stress coverage. 3) Coverage Maximization Algorithm (CMA), reducing voltage stress test escapes. CTM reported minimum stress time 10e-3 lower than current practice, SCQA reported a 6.2% coverage difference at the transistor level compared to ATPG, and CMA reduced voltage stress test escapes by 10%, improving quality. |
11:00 | Enhancing Timing Predictability in Automotive Electronics: Addressing Aging and Temperature Distributions PRESENTER: Jason W.-Y. Cheng ABSTRACT. Traditional timing-prediction methods that assume constant temperature for automotive electronics can lead to errors of up to 17% compared to models that incorporate temperature distribution over 30 years of aging, resulting in inaccurate reliability assessments. To address this limitation, this paper introduces a highly reliable timing-prediction framework that integrates functional behaviors, aging effects, and temperature distributions for accurate analysis. By leveraging a two-phase machine-learning approach for cell-delay prediction and eliminating false paths at the design level, our solution achieves a maximum critical-path timing error of only 2.22% compared to SPICE simulations. Additionally, it significantly improves computational efficiency, achieving speed-ups of up to 203.97×. |
11:30 | Full enablement of Very Low Voltage testing to deliver Zero Defect Quality automotive products PRESENTER: Stephen Traynor ABSTRACT. Scan VLV testing is an established industry practice to screen latent defects. To be effective these tests must be run near intrinsic voltages of silicon. We discuss the experiments that helped productize this in single digit FinFET technology. |
11:45 | Embedded Trace: A Key Enabler for Silicon Lifecycle Management PRESENTER: Vivek Chickermane ABSTRACT. A key requirement for SLM is to detect and isolate functional failures that escape structural tests. Embedded trace, which can collect time-stamped data to analyze the trajectory of software transactions involving the CPU, memory, I/Os, peripherals, and other sub-systems, is a key component of any silicon analytic system. This presentation will use an industrial case study of an embedded trace system based on the Efficient Trace for RISC-V (E-Trace) specification to highlight the central position it occupies in building a comprehensive silicon debug and continuous monitoring solution. Results on some industrial benchmarks demonstrate the efficiency of this approach. |
11:49 | CP-Bench: A PyTorch Test Suite to Detect AI Hardware Failure, Performance Degradation, and Silent Data Corruption PRESENTER: Xun Jiao ABSTRACT. The growing complexity in manufacturing and operating the hardware in AI clusters leads to significant challenges in reliability. To tackle this issue, we present CP-Bench, an open-source, Configurable and Parameterizable, PyTorch-level test suite designed to test AI hardware failure, performance degradation, and silent data corruption (SDC). Built upon open source projects, CP-Bench contains 30+ AI workloads (e.g., Llama). We have deployed CP-Bench for use cases throughout Meta’s hardware lifecycle, from manufacturing to in-production diagnostics, based on which we detect/reproduce various hardware issues such as SDC. Notably, some of these issues were not caught by vendor’s tooling. |
11:53 | In-Field Testing using In-System Embedded Deterministic Test as a solution to alleviate Silent Data Corruption in AI designs PRESENTER: Varun Sehgal ABSTRACT. In-field test has long relied on built-in self-test (BIST). This involves power-on, power-off, and testing during device operation. Hyperscale datacenters that often run large-scale AI/ML applications need to test devices periodically in the field. This is necessary to prevent interruptions caused by Silent Data Corruption (SDC). For this, targeted portions of the device must be accessible for testing while the rest of the device is running in functional mode. This poster explores a method to test the device in the field using In-System Embedded Deterministic Test (IS-EDT) patterns delivered through a Streaming Scan Network (SSN). The In-System Test Controller (ISTC) runs IS-EDT patterns and can also use an IJTAG network to run BIST capabilities. |
10:30 | Power Side-Channel Vulnerabilities of a RISC-V Cryptography Accelerator Integrated into CVA6 via Core-V eXtension Interface (CV-X-IF) PRESENTER: Behnam Farnaghinejad ABSTRACT. Modern RISC-V designs integrate cryptographic accelerators to boost performance, yet their power side-channel vulnerabilities remain underexplored. This work presents a pre-silicon evaluation methodology using simulated power traces, employing KL divergence, Correlation Power Analysis (CPA), and Differential Power Analysis (DPA) at the RTL level. Demonstrated on a CVA6-based AES accelerator integrated via the Core-V-Extension Interface (CV-X-IF), the method reveals leakage trends consistent with FPGA-based power measurements. While hardware AES shows improved resilience over software, key extraction remains feasible, underscoring the need for early-stage analysis. The approach offers a generalizable framework for assessing side-channel leakage in RISC-V cryptographic accelerators. |
11:00 | QuEST: Quantitative Entropy-based Security and Trojan Detection Framework for Confidentiality Verification PRESENTER: Domenic Forte ABSTRACT. Modern semiconductor design heavily relies on the integration of IPs from 3PIP vendors to improve design efficiency. However, such collaboration introduces security concerns, as adversaries acting as outsourced IP vendors can insert hardware Trojans. Typically, confidentiality verification is applied for hardware Trojan detection, as Trojans function by leaking sensitive information. In this paper we present QuEST, a confidentiality verification framework that identifies data leakage by analyzing statistical dependencies between inputs and outputs. Moreover, QuEST applies Shannon entropy-based metrics to quantify the extent of data leakage. The proposed framework successfully detects data leakage caused by hardware Trojans in 11 Trust-hub benchmarks. |
11:30 | Pseudo Random Low Power Built in Self Test PRESENTER: Dale Meehl ABSTRACT. In this paper, we present new PRPG gating concepts that can help control the input scan switching activity but still allow scan chains to get pseudo-random 0/1’s during the scan load operations. This allows for a predictable Low Power pseudo random scan input data to help achieve fault mark off without the need to shut off scan chains or custom PRPG logic. |
11:45 | Improving Error Tolerance and Scalability in Pseudo-Boolean SAT-based Generic Side-Channel Analysis PRESENTER: Shakil Ahmed ABSTRACT. Pseudo-Boolean Satisfiability (PBSAT) can be used to perform automated power side-channel analysis, i.e. recover secret keys from the power consumption information of a generic Boolean circuit. These PBSAT procedures however have higher complexity due to the NP-hard nature of PBSAT solving, and can be more sensitive to noise. In this paper we propose a formal circuit slicing procedure to improve runtime, and procedures based on pseudo-Boolean optimization (PBOPT) to improve error tolerance. We show orders of magnitude improvement in runtime via the slicing routine, and improvements in key accuracy in the face of error that would cripple the original PBSAT routines on a set of generic benchmark circuits. |
11:49 | Glitter PUF: A Passive Anti-Tamper PUF Based On Images Of Glitter Reflection PRESENTER: Noeël Moeskops ABSTRACT. In this paper we introduce a passive physical anti-tampering Physical Unclonable Function (PUF) based on glitters that can protect the entire IC and/or PCB. As a case study, a prototype of the proposed glitter based PUF has been developed using a Raspberry Pi (RPi) camera, a resin coating layer containing glitters and a 3D printed case. Using actual drill measurements, our findings indicate that even drilling with a 0.1mm diameter drill can be detected and lead to a wrong key. |
11:53 | An SMT-Based Method for Identifying State-Holding Elements in Extracted Netlists PRESENTER: Aric Fowler ABSTRACT. Hardware description language (HDL) netlists extracted from reverse-engineered integrated circuits (ICs) are described at the transistor level, thereby obscuring any internal sequential circuitry. Existing methods to extract sequential behavior from transistor-level netlists rely upon prior knowledge of the design, such as the location of memory state cells or the number of state-holding nets, which may not be available. Toward identifying state-holding elements in an extracted netlist without help from any such information, we propose a new methodology which combines graph searching with satisfiability modulo theories (SMT) solving to detect and locate state-holding nets. |
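As a rough illustration of what "state-holding" means in the abstract above, the sketch below checks a small gate-level Boolean model for nets that can settle at more than one stable value under the same inputs. It is not the authors' method: it swaps the SMT solver for brute-force enumeration, works on gates rather than transistors, and the names `state_holding_nets`, `latch`, and the net names are hypothetical.

```python
from itertools import product

def state_holding_nets(inputs, gates):
    """Return nets that have more than one stable value for some input vector.

    `gates` maps each internal net to a function of the full assignment.
    Brute-force stand-in for a solver query (assumption, not the paper's flow).
    """
    nets = list(gates)
    holding = set()
    for in_vals in product([False, True], repeat=len(inputs)):
        env_in = dict(zip(inputs, in_vals))
        stable = []
        for net_vals in product([False, True], repeat=len(nets)):
            env = {**env_in, **dict(zip(nets, net_vals))}
            # A stable assignment satisfies every gate equation at once.
            if all(gates[n](env) == env[n] for n in nets):
                stable.append(env)
        for n in nets:
            # Two stable solutions that disagree on n: n can hold state.
            if len({s[n] for s in stable}) > 1:
                holding.add(n)
    return holding

# Hypothetical example: NOR-based SR latch plus one purely combinational net.
latch = {
    "q": lambda e: not (e["r"] or e["qb"]),
    "qb": lambda e: not (e["s"] or e["q"]),
    "both": lambda e: e["s"] and e["r"],
}
result = state_holding_nets(["s", "r"], latch)
```

With `s = r = 0` the latch equations admit two stable solutions, so `q` and `qb` are reported, while the combinational net `both` is not. Enumeration grows exponentially with net count, which is one reason a solver-based formulation is attractive for real netlists.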
13:30 | A Novel Omnidirectional 3D Test Access Architecture for Future System-on-Wafer (SoW) Applications PRESENTER: Hiroyuki Iwata ABSTRACT. System-on-Wafer (SoW) technology integrates multiple chiplets on a single wafer substrate, delivering enhanced performance for high-computation applications. This paper introduces an innovative 3D test access architecture for complex SoW systems. Our proposed omnidirectional test access architecture, compliant with IEEE Std. 1838, provides a scalable, plug-n-play testing solution for comprehensive multi-directional testing. It enables efficient scan path configuration, optimizing test scheduling while minimizing power consumption and timing impacts. A case study of a SoW system with four chiplets connected in a 2x2 configuration validates the architecture feasibility and effectiveness, demonstrating its potential to improve SoW testability and reliability. |
14:00 | Chiplets' Die-to-Die Interconnect Repair Language (IRL) PRESENTER: Po-Yao Chuang ABSTRACT. Chips with multiple interconnected dies offer significant advantages over those with a single monolithic die, driving the rapid adoption of multi-die packages in the market. Die-to-die interconnects are typically realized as large, dense arrays of fine-pitch micro-bumps or hybrid bonds, prone to manufacturing defects like shorts and opens. Typically spare interconnects are included to "repair" defective ones. This paper introduces an Interconnect Repair Language (IRL), based on Google's Protocol Buffers, to describe all repair provisions. We demonstrate IRL with an example for UCIe-Advanced 2.0 and present cost/benefit metrics for repair solutions alongside potential EDA tools based on the proposed IRL. |
14:30 | Fault Modeling and Testing of Chiplet-to-Chiplet Interconnects in Fan-out Wafer-Level Packaging PRESENTER: Partho Bhoumik ABSTRACT. Fan-out wafer-level packaging enables high-performance chiplet integration but faces manufacturing challenges such as warpage, die shift, and delamination, leading to various defects in Cu pillar, redistribution layers and solder balls. To address this, we propose a defect analysis and testing framework that maps defects to equivalent faulty circuits for precise fault characterization. A built-in self-test architecture with an embedded ring oscillator detects weak opens, bridging, and coupling faults while quantifying the size of these defects. Our framework mitigates fault aliasing across voltage corners and transistor sizes, enhancing diagnostics, yield learning, and silicon lifecycle management. HSPICE simulations are performed to validate the effectiveness of this framework in 7 nm CMOS technology. |
13:30 | DRONE: Delay Defect and Marginality Targeted Scan Tests to Observe Insidious Errors PRESENTER: Chaitali Oak ABSTRACT. Small defects and excessive process variability can result in timing failures. To expose such failures, we discuss five scan test flavors targeting timing-critical faults. Silicon results on a recent cloud server microprocessor are described. |
14:00 | Efficient Delay Fault Characterization of Resistive Open Defects in Standard Cells Using Resistive Fault Dominance PRESENTER: Gowsika Dharmaraj ABSTRACT. Delay characterization of standard cells under resistive open defects is of increasing concern due to aggressive timing margins in digital circuits. The problem is made worse by the large number of open defect sites in standard cells combined with a wide range of defect resistance values for each site. To alleviate the resultant simulation complexity, we propose the concept of resistive fault dominance (RFD) for resistive open defects. RFD eliminates simulations of certain open defects with intermediate defect resistance values that are guaranteed to exceed specified timing margins for standard cells based on tests for specific “dominant” resistive open defects. An algorithmic delay characterization methodology is developed. |
14:30 | Small Delay Defect Diagnosis via Timing-Aware Fault Simulation with Variant Delay Insertion PRESENTER: Hao-Yu Yang ABSTRACT. As semiconductor technology advances, small delay defects have become a major concern in System-on-Chip testing due to shrinking timing margins. This paper presents an SDD diagnosis method integrating timing-aware fault simulation with injected delay selection and a mismatch-weighted score method to enhance diagnostic accuracy. Rather than relying on fixed delay values, the proposed method determines injected delays based on the slack of transition paths, generating multiple delay sizes to improve fault simulation resolution. The mismatch-weighted score calculation adjusts contributions of failures, enhancing defect ranking and mitigating process variation effects. Experimental results demonstrate that the proposed method significantly improves SDD localization and reduces fault candidates, outperforming conventional approaches in accuracy and efficiency. |
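The idea of choosing injected delays from path slack, described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: a delay of size d is assumed to fail a path exactly when d exceeds that path's slack, and the function names, the `margin` value, and the score weights are all hypothetical.

```python
def candidate_delays(path_slacks, margin=0.01):
    """One injected delay per distinct slack value, set slightly above it.

    Each distinct slack is a threshold where the set of failing paths changes,
    so these sizes cover every distinguishable fail behavior (assumed model).
    """
    return [round(s + margin, 6) for s in sorted(set(path_slacks.values()))]

def failing_paths(path_slacks, delay):
    """Paths predicted to fail when `delay` is injected at the fault site."""
    return {p for p, s in path_slacks.items() if delay > s}

def mismatch_weighted_score(predicted, observed, miss_weight=1.0, extra_weight=0.5):
    """Reward matches; penalize the two mismatch kinds with separate weights."""
    matched = len(predicted & observed)
    missed = len(observed - predicted)   # observed fail not predicted
    extra = len(predicted - observed)    # predicted fail not observed
    return matched - miss_weight * missed - extra_weight * extra

# Hypothetical slacks (ns) for paths through one fault site, and tester fails.
slacks = {"p1": 0.10, "p2": 0.25, "p3": 0.25, "p4": 0.40}
observed = {"p1", "p2", "p3"}

delays = candidate_delays(slacks)
scores = {d: mismatch_weighted_score(failing_paths(slacks, d), observed) for d in delays}
best_delay = max(scores, key=scores.get)
```

In this toy case the middle delay size explains all three observed fails without predicting a fourth, so it scores highest. A single fixed delay size could miss that distinction entirely.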
13:30 | IC-PEPR: PEPR Testing Goes Intra-Cell PRESENTER: Chris Nigh ABSTRACT. The rise of datacenters that employ large volumes of advanced chips has enabled the realization and quantification of negative impacts from manufacturing test escapes, including field failures and silent data corruptions. More comprehensive fault models have been proposed to close exposed gaps in test quality, but have shown significant increases in fault lists over traditional fault models. In this work, we propose an enhanced test metric that targets physical regions by flattening cell-internal physical structures within the full circuit layout, then developing an optimized fault set to exhaustively test each region's signals. |
14:00 | Defect-Finding with Timing-Partitioned Small-Delay-Defect Methodology: Silicon Practice on N2 PRESENTER: Hao-Yu Yang ABSTRACT. As semiconductor manufacturing processes continue to shrink, the impact of small-delay defects becomes increasingly significant. Traditional Automatic Test Pattern Generation (ATPG) methods often fail to detect these defects, as they are masked by critical paths. This work proposes a Timing-Partitioned Small-Delay-Defect (TPSDD) methodology to address this issue. By analyzing timing path delays and grouping transition paths based on their delay sizes, ATPG patterns can be generated to specifically test these groups, revealing hidden small-delay defects. Additionally, techniques for identifying outlier dies and defect candidates are presented, validated through testing on the N2 process node. Our silicon results confirm the effectiveness of TPSDD in detecting small-delay defects. |
14:30 | Using Unique Fail Bits to Maximize Chain Diagnosis Coverage for Silicon Defects PRESENTER: Wu-Tung Cheng ABSTRACT. Diagnosis simulation based on stuck-at faults cannot expose all diagnosis problems because silicon defects don’t behave as stuck-at faults. The failure-trigger conditions of silicon defects are generally more complicated and act as intermittent stuck-at-faults. A unique fail bit exists on one fault but not on another fault and is used in this paper to distinguish an intermittent fault from others. Diagnosis coverage can be calculated using the number of unique fail bits to estimate silicon diagnosis resolution before volume production. Diagnosis test patterns and diagnosis points can be added to increase unique fail bits to maximize diagnosis coverage for silicon defects. |
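To make the unique-fail-bit idea in the abstract above concrete, the following sketch counts, for each ordered pair of faults, the fail bits seen for one fault but not the other, and reports the fraction of pairs that can be told apart. The coverage formula here is an assumption for illustration only; the abstract does not give the authors' exact definition, and the names `unique_fail_bits`, `diagnosis_coverage`, and `min_unique` are hypothetical.

```python
from itertools import permutations

def unique_fail_bits(fail_bits_a, fail_bits_b):
    """Fail bits observed for fault A that never appear for fault B."""
    return fail_bits_a - fail_bits_b

def diagnosis_coverage(fault_signatures, min_unique=1):
    """Fraction of ordered fault pairs (a, b) where a has at least
    `min_unique` unique fail bits relative to b (assumed metric)."""
    pairs = list(permutations(fault_signatures, 2))
    if not pairs:
        return 1.0
    distinguished = sum(
        1
        for a, b in pairs
        if len(unique_fail_bits(fault_signatures[a], fault_signatures[b])) >= min_unique
    )
    return distinguished / len(pairs)

# Hypothetical signatures: each fail bit is a (pattern, scan cell) pair.
signatures = {
    "f1": {("p1", 3), ("p2", 7)},
    "f2": {("p1", 3)},
    "f3": {("p1", 3), ("p2", 7)},
}
coverage = diagnosis_coverage(signatures)
```

Here `f1` and `f3` have identical signatures and `f2` is a subset of both, so only two of the six ordered pairs are distinguishable. Adding a pattern that produces a fail bit for only one of `f1` or `f3` would raise the count, which mirrors the role of the added diagnosis patterns mentioned in the abstract.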