09:00 | Design for Security: The Hardware-Up Principle SPEAKER: Simha Sethumadhavan ABSTRACT. In this talk, I will describe a new design principle for security: the hardware-up principle. Hardware-up security means that security must be engineered like hardware instead of being built like software. I will discuss how systems designed for security from hardware-up offer unique advantages unavailable in current protection systems: a smaller attack surface, energy-efficient execution, and the ability to reason about security compositionally. I will illustrate hardware-up benefits through case studies. For the first hardware-up case study, I will discuss how we can prevent attackers from taking advantage of unintentional hardware design flaws. Taking microarchitectural side channels as an example, I will discuss a new methodology that computer architects can use to reason about microarchitectural side channels at processor design time. Attackers can also intentionally weaken hardware to break systems. In the second case study, I will discuss how hardware itself can be created in a manner that provides assurance that its security has not been compromised due to design-time backdoors. I will describe the first static analysis tool for detecting hardware backdoors and our technique for silencing backdoors. I will mention a prototype built using our technique that incurs less than 8% area overhead and negligible performance overheads. Finally, time permitting, I will describe a hardware malware detector, a first of its kind, that is vastly simpler to implement compared to a traditional software malware detector. |
10:30 | A 86 nA and sub-1 V CMOS voltage reference without resistors and special devices SPEAKER: Yanhan Zeng ABSTRACT. A sub-1 V and ultra-low power consumption voltage reference has been implemented in a standard 0.18 μm CMOS process, without using resistors and special threshold voltage devices. A temperature coefficient (TC) of 37.8 ppm/°C in a temperature range of -40 °C to 60 °C is achieved. The supply voltage ranges from 1 V to 3 V, and the line sensitivity (LS) is 0.02%/V. When VDD is minimum, the supply current measured is 86 nA at room temperature, and the power supply rejection ratios (PSRRs) without any filter capacitor at 100 Hz and 10 MHz are lower than -56 dB and -9.5 dB, respectively. |
11:00 | A Low-Offset Dynamic Comparator with Area-Efficient and Low-Power Offset Cancellation SPEAKER: Xiaopeng Zhong ABSTRACT. A low-offset two-stage dynamic comparator has been proposed for parallel multi-channel processing. Low offset is achieved from two aspects: 1st-stage offset cancellation and 2nd-stage offset suppression. A fully dynamic offset cancellation scheme based on current auto-zeroing is adopted to effectively cancel out the 1st-stage offset. It features small area overhead and low energy consumption. For the 2nd-stage offset suppression, a high gain of the 1st-stage dynamic amplifier is designed by optimizing the overdrive voltage of input transistors. To maintain low offset performance across a wide range of input common-mode voltages, stable low overdrive voltage of the input pair is required. Thus, a tail current source is employed for the 1st stage to ensure constant common-mode current. As a result, the overdrive voltage is stably kept low in various operation conditions. The proposed comparator has been designed in a 0.18 μm CMOS process. It operates under a supply voltage of 1.2 V at 10 MHz. Simulation results have verified the low-offset property of the comparator. The offset (1 sigma) is reduced from 19.25 mV to 1.296 mV after cancellation and it remains constant with the input common-mode voltage changing from 0 V to 0.8 V. The energy consumption is 149.17 fJ/Conv. |
11:30 | Σ-Δ Based Force-Feedback Capacitive Micro-machined Sensors: Extending the Input Signal Range SPEAKER: Ayman Ismail ABSTRACT. Operating MEMS capacitive sensors in negative feedback mode results in improved bandwidth and lower sensitivity to process and temperature variation. To overcome the non-linearity of the quadratic voltage-to-force relation in capacitive feedback, a two-level voltage feedback signal is often used. Therefore, a single-bit Σ-Δ modulator represents a practical way to implement force-feedback sensor interface systems. However, single-bit Σ-Δ modulators have a limited input range that is less than the available full scale, dictated by the actuation voltage value. This is caused by quantizer overload and the consequent reduction in quantizer effective gain as the input signal approaches full scale. In this work, a solution is proposed that allows extending the input signal range of Σ-Δ based capacitive sensors beyond the limit imposed by single-bit operation. The proposed technique is applied to the design of a MEMS-based accelerometer, and results in an increase in the input signal range from 35g to 40g, and an improvement in signal-to-noise ratio from 130.2 dB to 137 dB, at the same actuation voltage level. |
10:30 | Analyzing the Behavior of FinFET SRAMs with Resistive Defects SPEAKER: Thiago Copetti ABSTRACT. The miniaturization of CMOS technology is likely to reach its limit due to short-channel effects. New transistor technologies, including FinFET technology, were developed to deal with this effect and enable the continuous scaling-down of technological nodes. Alongside the constant scale-down of integrated circuit technology, the increasing need to store more and more information has resulted in Static Random Access Memories (SRAMs) occupying a large part of Systems-on-Chip (SoCs). Thus, it remains unknown whether fault models used to characterize faults in CMOS memory circuits are sufficiently accurate to represent the behavior of FinFET-based memories. In this context, a study of the functional implications of resistive manufacturing defects in FinFET-based SRAMs is presented. In more detail, a fault model for FinFET-based SRAMs as well as a complete analysis of the static and dynamic faults are presented. The proposed analysis has been performed through SPICE simulations adopting a 20 nm technology library. |
11:00 | Library Pruning and Sigma Corner Libraries for Power Efficient Variation Tolerant Processor Pipelines SPEAKER: Mini Jayakrishnan ABSTRACT. Error tolerance techniques are widely used to protect processor pipelines from variation-induced timing errors. The redundancy inside error detection flip-flops results in power and area overheads. In this paper, we propose two standard cell library tuning techniques to optimize an error-tolerant processor pipeline for power and area savings. The design utilizes positive slack available in the pipeline stages and re-distributes it to the preceding error-prone critical paths using slack balancing flip-flops. Library pruning analyses the power and area metrics of the flip-flop cells to derive a power-efficient subset of the original library. We use statistical sigma corner libraries to replace the critical flip-flop fan-in cone for further power optimization. Results show that the proposed library tuning techniques give power reductions of 47% and area reductions of 2.8% in the execute stage module of the processor pipeline. |
11:30 | Improving post-silicon error detection with topological selection of trace signals SPEAKER: Binod Kumar ABSTRACT. Drastic growth in design complexity of VLSI circuits has increased the chances of bugs escaping to first released silicon. This has resulted in an increased emphasis on post-silicon validation and debug, which is typically hindered by limited observability of internal signals. Trace buffers assist in curbing this bottleneck by storing selected signal states for a limited number of clock cycles. For efficient use of these on-chip buffers, devising proper selection criteria is of utmost importance. Maximization of restoration of untraced signals is a widely utilized signal selection principle. However, this approach is not very effective for error detection. This paper proposes a trace signal selection technique based on error transmission, taking into account the topology of the design. The proposed selection methodology can be effectively applied to trace as well as a combination of trace and scan based observability techniques. Experimental evaluation of the proposed trace signal selection methodology on different design errors indicates improvement in error detection as compared to restorability based selection methodologies. |
13:00 | Enabling Efficient System Design Using Vertical Nanowire Transistor Current Mode Logic SPEAKER: Joonseop Sim ABSTRACT. Vertical Nanowire-FET (VNFET) is a promising candidate to succeed in industry mainstream due to its superior suppression of short-channel effects and area efficiency. However, to design logic gates, CMOS is not an appropriate solution due to the process incompatibility with VNFET, which creates a technical challenge for mass production. In this work, we propose a novel VNFET-based logic design, called VnanoCML (Vertical Nanowire Transistor-based Current Mode Logic), which addresses the process issue while significantly improving power and performance of diverse logic designs. Unlike the CMOS-based logic, our design exploits current mode logic to overcome the fabrication issue. Furthermore, we reduce drain-to-source resistance of VnanoCML, which results in higher performance improvement without compromising the subthreshold swing. In order to show the impact of the proposed VnanoCML, we present two key logic designs, SRAM and full adder, and also evaluate the application-level effectiveness of digital designs for image processing and mathematical computation. Our proposed design improves the fundamental circuit characteristics including output swing, delay time and power consumption compared to conventional planar MOSFET-based (PFET) circuits. Consequentially, our architecture-level results show that VnanoCML can enhance the performance and power by 16.4× and 1.15×, respectively. Furthermore, we show that VnanoCML improves the energy-delay product by 38.5× on average compared to PFET-based designs. |
13:30 | Defect-Aware Synthesis for Reconfigurable Single-Electron Transistor Arrays SPEAKER: Juinn-Dar Huang ABSTRACT. As fabrication process exploits even deeper submicron technology, power consumption is becoming one of the most critical obstacles in electronic circuit and system designs nowadays. Meanwhile, the leakage power is dominating the power consumption. Various emerging nanodevices have been developed to tackle the leakage power issue in recent years. The single-electron transistor (SET) is regarded as one of the most promising devices since several works have successfully demonstrated that it can operate with only few electrons at room temperature. Therefore, the reconfigurable SET array has been proposed to continue Moore’s Law due to its ultra-low power consumption. Nevertheless, most existing synthesis algorithms assume the given SET array is defect-free. Hence, mapping a correct synthesis outcome onto a faulty SET array still yields an erroneous result. In this paper, we propose a new synthesis algorithm that guarantees the correct functionality in the presence of defects. Furthermore, the proposed technique can sometimes benefit from those defects to further reduce the mapping area. In certain cases, the required area in a faulty SET array is even smaller than that in a fault-free one. Experimental results show that our new algorithm can synthesize moderately large circuits in a reasonable runtime and achieve an area reduction of 14% as compared to the prior art. |
14:00 | A Wearable Neuro-Degenerative Diseases Detection System based on Gait Dynamics SPEAKER: Wala Saadeh ABSTRACT. Neurodegenerative disorders (NDDs) are chronic diseases of the human central nervous system that cause degradation in mobility and cognitive functioning. Continuous assessment of gait for patients with NDDs is a crucial element of future care and treatment. This paper presents a wearable NDD detection system that monitors the person’s gait and infers 3 key gait features: stride time, its fluctuation and autocorrelation decay factor based on data extracted from an unobtrusive force resistive sensor embedded in patient’s shoe. It is designed to distinguish between different NDDs: (Huntington’s disease (HD), Parkinson Disease (PD), and Amyotrophic Lateral Sclerosis (ALS)) and healthy individuals using only 3 features. The proposed NDD classification algorithm is verified experimentally using a full FPGA implementation with patients’ recordings from Physionet Gait Dynamics dataset. It achieves a classification accuracy of 93.8%, 89.1%, 94% and 93.3%, for ALS, HD, PD, and healthy person, respectively, from a total set of 64 subjects. |
13:00 | A Low Power, Programmable Bias Inverter Quantizer (BIQ) Flash ADC SPEAKER: Mahesh Kumar Adimulam ABSTRACT. In this paper, a low-power, programmable bias inverter quantizer (BIQ) flash ADC for communication and bio-potential signal processing applications is presented. The comparator of the proposed BIQ flash ADC is designed using a digital inverter with cascode PMOS and NMOS bias transistors at the top and bottom; the voltage range of the bias transistors controls the switching threshold values of the inverter. The BIQ flash ADC increases the operating frequency due to reduced input gate load capacitance, and its area and power consumption are reduced due to smaller device dimensions compared to conventional comparator-based flash ADCs and inverter-based ADCs. The BIQ ADC is programmable for 5-bit, 6-bit, 7-bit and 8-bit resolutions by using control bits, which scale the power consumption based upon ADC operation. The BIQ flash ADC results are compared with conventional comparator-based flash ADCs and inverter-based ADCs for different resolutions. The proposed ADC is designed in a 90 nm standard CMOS process and occupies a core area of 0.0676 mm². The performance parameters of the BIQ flash ADC design are found to be a differential non-linearity (DNL) of ±0.30 LSB, an integral non-linearity of ±0.51 LSB, a signal-to-noise-and-distortion ratio (SNDR) of 46.18 dB, and an effective number of bits (ENOB) of 7.38 at 1.0 V supply voltage. The power consumption of this ADC at DC/lower input frequencies is 220 nW, and at higher input frequencies up to 2 GHz it is 18.5 mW. |
13:30 | Low-Jitter Plain Vanilla CMOS CDR with Half-Rate Linear PD and Half Rate Frequency Detector SPEAKER: Solomon Serunjogi ABSTRACT. This paper presents a dual loop Clock and Data Recovery (CDR) circuit for high-end, low data rate, wireless transfer (100-200kb/s). Firstly, design tradeoffs for the single loop variant of the CDR are formulated which include jitter transfer (JT) function in frequency domain and long term jitter in time domain. These design rules are then used for the realization of a dual loop CDR consisting of tristate half rate frequency detector (FD), half rate linear phase detector (PD), bootstrapped current switch charge pump (CP) and ring based 4-phase VCO. All building blocks (except CP) are realized with plain vanilla CMOS digital circuits. In the proposed design, the output of the tri-state FD is zero when in lock and has no contribution to VCO jitter. In addition, the linear PD yields zero phase difference under the same lock condition. As a consequence, the CDR circuit can work with low jitter for low power applications. |
14:00 | A Pulsed-Decimal Technique for Single-channel, Dynamic Signaling for IoT Applications SPEAKER: Shahzad Muzaffar ABSTRACT. Pulsed-Index Communication (PIC) is a recent technique for single-channel communication which is based on the principle of transferring the indices of only the ON bits in the form of a series of pulse streams. In this paper, we present a modified version of PIC which is based on the same underlying idea but with key improvements in data rate and reliability. The proposed technique is called Pulsed-Decimal Communication (PDC). Like PIC, PDC is a protocol for single-channel, high-data rate, low-power dynamic signaling that does not require any clock and data recovery. It however achieves higher data rates by introducing a three-step algorithm, comprising a segmentation, an encoding, and a sub-segmentation step. The segmentation step is used to split the data word into smaller segments and therefore smaller decimal numbers to represent them. The encoding step reduces the number of ON bits in the data and relocates them to lower indices. The sub-segmentation step is used to split further the segments into smaller sub-segments. The complete process reduces the number of pulses required to transmit binary data, thus improving the data rate. Compared with PIC, PDC achieves a 78% improvement in data rate and is more reliable as it eliminates the variations in the number of symbols to be transmitted. An FPGA and an ASIC (65nm technology) implementation of the protocol show that the low-power operation and small footprint of PIC are maintained in PDC, which consumes around 25uW of power at a clock frequency of 25MHz with a gate count of approximately 2150. |
15:00 | Template based synthesis for high performance computing SPEAKER: Masahiro Fujita ABSTRACT. A general trend in high performance computing is to implement given software on top of custom hardware realized by FPGAs and other programmable devices. This can deliver not only higher performance but also energy-efficient computing. However, efficient implementation algorithms for hardware and software can be significantly different, and it is highly desirable for hardware implementations to use intensively pipelined operations and mostly-local communications, especially for throughput-directed computing such as big data analysis. Typical high-level synthesis methods may not concentrate on these issues, as they target general hardware designs. Performance-directed synthesis targeting throughput-based computations, rather than traditional high-level synthesis techniques, is proposed based on "template-based" approaches. With templates, given data flow graphs are automatically converted into high-performance ones for FPGAs by using SAT-based automatic refinement methods. Several experimental examples are presented to demonstrate the proposed synthesis method. The performance of the FPGA-based implementation can be enhanced by orders of magnitude. |
15:30 | Synthesis of Multi-variate Stochastic Computing Circuits SPEAKER: Kiyoung Choi ABSTRACT. Stochastic computing (SC) is a promising technique to enhance computing efficiency in terms of area, power, and error tolerance with slight compromise of the accuracy. This paper presents a novel approach to automatic synthesis of an SC circuit from a given arithmetic expression with multiple variables. It first extracts building blocks called iSC kernels from the given expressions and then synthesizes an SC circuit by using the iSC kernels. Experimental results demonstrate the efficiency of the proposed technique in terms of synthesis time and the quality of the generated circuit. |
15:00 | Continuous Authentication of UAV Flight Command Data using Behaviometrics SPEAKER: Abdulhadi Shoufan |
15:30 | A Multiple Valued Logic Approach for the Synthesis of Garbled Circuits SPEAKER: Stelvio Cimato |
16:00 | Evolution of Logic Locking SPEAKER: Muhammad Yasin |