12:00 | Femto-Watt CMOS Voltage Reference Design ABSTRACT. This paper presents the design and sizing of a femto-watt voltage reference based on the temperature compensation of N-type and P-type standard transistors. A short-channel dimension transistor working in triode region is added between the source and gate terminals of the traditional self-bias structure, which demonstrates efficient low-power and low-voltage operations. The circuit design was carried out in a 180 nm process. Simulation results verified the proper circuit operation and produced a reference voltage of 65.7 mV. With a minimum supply voltage of 120 mV, the circuit consumes 252 fW at room temperature, has mean and best temperature coefficients (TCs) of 90.31 ppm/°C and 10.07 ppm/°C, respectively, from -40 to 120 °C, and presents a power supply rejection ratio (PSRR) of -79.9 dB. |
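The figures quoted in the abstract above can be related with a little arithmetic. The sketch below assumes the common "box method" definition of temperature coefficient; the abstract does not state which definition the authors used. It also assumes the 252 fW figure was obtained at the 120 mV minimum supply. The sample voltages are hypothetical and the function names are illustrative only.

```python
def box_method_tc_ppm(v_ref_samples, t_min_c, t_max_c):
    """Box-method TC in ppm/°C: (Vmax - Vmin) / (Vmean * (Tmax - Tmin)) * 1e6."""
    v_max = max(v_ref_samples)
    v_min = min(v_ref_samples)
    v_mean = sum(v_ref_samples) / len(v_ref_samples)
    return (v_max - v_min) / (v_mean * (t_max_c - t_min_c)) * 1e6


def supply_current(power_w, supply_v):
    """Average supply current implied by a power figure at a given supply voltage."""
    return power_w / supply_v


# Hypothetical reference-voltage samples (volts) across the -40..120 °C sweep.
samples_v = [0.0657, 0.0658, 0.0656]
tc_ppm = box_method_tc_ppm(samples_v, -40.0, 120.0)

# Assumption: 252 fW is drawn at the 120 mV minimum supply.
i_supply = supply_current(252e-15, 0.120)

print(f"TC of the hypothetical sweep: {tc_ppm:.2f} ppm/°C")
print(f"Implied supply current: {i_supply * 1e12:.2f} pA")
```

Under those assumptions, the quoted power corresponds to a supply current of roughly 2.1 pA.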
12:20 | Pseudo-Differential Time-Domain Integrator Using Charge-Based Time-Domain Circuits ABSTRACT. This work proposes a pseudo-differential time-domain integrator using half-delay time-domain registers and adders relying on charge-based time-domain circuits. It is implemented in a 65-nm CMOS technology and performs first-order integration of time-domain information within the range [4 ns, -4 ns] across temperatures from -40 °C to 80 °C. It consumes 740 µW with a supply voltage of 1.2 V at a 100 MHz clock frequency. A delay-locked-loop (DLL) based foreground calibration is used to compensate for process and temperature variations. |
12:40 | A 1V, 450pS OTA Based on Current-Splitting and Modified Series-Parallel Mirrors ABSTRACT. This work presents a low-power, pS-range transconductor (OTA) based on a new transconductance reduction technique. The circuit is intended for bio-sensing in low-power wearable biomedical devices. Instead of pseudo-resistor-based techniques, the proposed OTA-C combines two circuit techniques: current splitting and modified series-parallel current mirrors. With a 1 V power supply, the new concept is demonstrated through a 450 pA/V transconductor, which was used to build a low-pass filter with a large time constant while keeping a small-size capacitor. The circuits were designed and simulated in the TSMC 0.18 µm node using Cadence's IC design tools. The new topology combines characteristics such as low transconductance, low power, and moderate gain, which fills some important needs in bio-sensing circuit design. |
13:00 | On-chip Diffusion Charge Redistribution Ladder Converter for Photovoltaic Systems with Mismatch ABSTRACT. Mismatch losses are one of the main causes of reduced performance in photovoltaics. Not only do they lower the Maximum Power Point (MPP) level, but they also create multiple local maxima, which makes MPP tracking (MPPT) more difficult. The Diffusion Charge Redistribution (DCR) strategy uses the internal capacitance of the solar cells to mitigate the impact of mismatch. This work explores the 3:2 ladder DCR converter as a step towards fully integrated DCR. The proposed circuit uses 130 nm CMOS technology and avoids any external capacitor. It shows an improvement in the maximum output power of up to 60% under shading conditions. Additionally, simulation shows the impact of shading on the MPP according to which cell of the DCR is affected. We also present an MPPT based on an Artificial Neural Network, which is able to compute the MPP in three steps with an error of 6.1% when used in transistor-level simulations and trained with an ideal simulation dataset. |
13:20 | ISFET Array Readout System with Integrated 12 bit A/D Conversion for Lab-on-Chip Applications PRESENTER: Hugo Hernandez ABSTRACT. This work presents a current-mode readout front-end for a 128x64 array of ISFETs integrated in 180 nm 1P6M CMOS technology. The most relevant ISFET systems have not been implemented using single-device configurations; instead, ISFETs in an array configuration have been used. The proposed front-end architecture linearly digitizes the output current of the ISFET array through a current conveyor, a transimpedance amplifier, and a 12 bit SAR ADC. A slave I2C digital circuit controls the array ISFET selection decoders and serializes the ADC output. A linearity of R² = 99.94%, a current consumption of 450 μA, and a low-frequency ENOB of 11.24 bit were achieved in post-layout simulation. The implemented chip occupies 0.52 mm². Prototypes have just arrived from the foundry, and experimental measurements will be performed soon. |
13:40 | Simulating large neural networks embedding MLC RRAM as weight storage considering device variations PRESENTER: Markus Fritscher ABSTRACT. In this paper we present a method to evaluate the behavior of neural network (NN) architectures with respect to the error rate of RRAM devices used as weight storage, as a function of fabrication variances. While the behavior of non-ideal RRAM devices can cause system failures (e.g. due to bit flips) in traditional computer architectures, NNs exhibit inherent redundancy, which makes these applications more tolerant of device variability. Therefore, we analyze the fabrication variances of RRAM cells used as weight storage in systolic-array-based NN architectures and bring these device-level properties to the system level to show if and how an NN application will be affected. Previous works were based on mixed-signal simulations and lack the throughput needed to evaluate nets of meaningful size. Our approach uses modern neural network libraries along with an abstraction of the device properties and can thus run five to six orders of magnitude faster than a traditional approach. |
12:00 | A TensorFlow and System Simulator Integration Approach to Estimate Hardware Metrics of Convolution Accelerators ABSTRACT. GPUs became the reference platform for both the training and inference phases of Convolutional Neural Networks (CNNs), because their architecture is well suited to the CNN operators. However, GPUs are power-hungry architectures. A path to enable the deployment of CNNs in energy-constrained devices is adopting hardware accelerators for the inference phase. The design space exploration of CNNs using standard approaches, such as RTL, is limited due to their complexity. Thus, designers need frameworks enabling design space exploration that deliver accurate hardware estimation metrics to deploy CNNs. This work proposes a framework to explore the design space of hardware accelerators, providing power, performance, and area (PPA) estimations. The heart of the framework is a system simulator with TensorFlow as the front-end and performance estimations obtained from physical synthesis as the back-end. Results evaluate the energy trade-off when varying the number of convolutional layers. |
12:20 | Design Considerations for the Development of Computational Resistive Memories ABSTRACT. Resistive memories are currently receiving considerable attention for the development of future computational memory. Several works have demonstrated logic gates which use data stored in the state of the resistive switching devices (memristors) as logic input/output. However, requirements of a topological nature, such as crossbar array compatibility and word-wise memory operation, along with the impact of device nonidealities, have not always been considered. In this work we present some key design considerations contributing towards the development of resistive RAM (ReRAM)-based computational memories. We comment on proper logic design styles which are independent of memristor device technology features and tolerant to variability. Moreover, we present a segmented 1T1R ReRAM architecture with functional features that benefit latency and flexibility of data movement, required for multi-level (sequential) in-memory logic computations and arithmetic operations. |
12:40 | A New QDI Asynchronous Pipeline with Two-Phase Delay-Insensitive Global Communication ABSTRACT. Nowadays, digital circuits are implemented in Ultra Deep-Sub-Micron (UDSM) MOS technology. In UDSM-MOS technology, communication between subsystems may require several clock cycles, which imposes an increasing penalty in latency and power on communication in both directions. The QDI (Quasi Delay Insensitive) class of asynchronous circuits is a design alternative in UDSM-MOS technology because it does not use a clock signal, so it eliminates the problems related to the clock signal and is highly robust to PVT (process, supply voltage, and temperature) variations, which is pertinent to UDSM-MOS technology due to its greater variability. Usually, the QDI class operates on the four-phase protocol, but this incurs a cost in communication between subsystems. In this paper, we propose a new asynchronous QDI pipeline architecture. Unlike other proposals, this architecture communicates with the environment using the QDI LEDR (Level-Encoded Dual-Rail) protocol, which is two-phase and therefore reduces communication latency and power compared with the four-phase QDI protocol. Through a case study, we show the efficiency of the new QDI pipeline. |
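The two-phase property of LEDR mentioned in the abstract above can be illustrated with a small software model. This is a sketch of the generic LEDR encoding, not the pipeline proposed in the paper; the function name, the initial rail state, and the convention that the phase equals value XOR parity are assumptions made for illustration.

```python
def ledr_encode(bits, start_state=(0, 0)):
    """Encode a bit sequence as LEDR (value rail, parity rail) code words.

    Assumed convention: phase = value XOR parity, and the phase alternates
    with every new token, so no return-to-zero (spacer) step is needed.
    """
    value, parity = start_state
    phase = value ^ parity
    words = []
    for bit in bits:
        phase ^= 1            # every new token flips the phase
        value = bit           # the value rail carries the data bit directly
        parity = bit ^ phase  # the parity rail is chosen to produce the new phase
        words.append((value, parity))
    return words


def rails_changed(prev, curr):
    """Number of rails that toggled between two consecutive code words."""
    return (prev[0] ^ curr[0]) + (prev[1] ^ curr[1])


data = [1, 1, 0, 1, 0, 0]
words = ledr_encode(data)
previous = (0, 0)
for bit, word in zip(data, words):
    print(bit, word, "rails toggled:", rails_changed(previous, word))
    previous = word
```

In this model exactly one rail toggles per transmitted bit, which is the reason a two-phase protocol needs fewer signal transitions per token than a four-phase protocol with a spacer state.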
13:00 | Chronos: an Abstract NoC-based Manycore with Preserved Temporal and Spatial Traffic Distribution ABSTRACT. The time spent assessing application performance through clock-cycle simulators is a bottleneck of NoC-based manycore design, thus requiring higher abstraction levels at early design stages. However, high-level synchronization of processing and communication in such systems is an enormous challenge. This work develops and validates Chronos, an untimed abstraction of an NoC-based manycore, built with Open Virtual Platform, that seeks precise traffic modeling so as to preserve the temporal and spatial distributions of the physical implementation. Results show the similarity of the temporal and spatial traffic distributions compared to a reference RTL-level platform. |
13:20 | FPGA Implementation of a New PUF Based on Galois Ring Oscillators PRESENTER: Miguel Garcia-Bosque ABSTRACT. In this paper, a new Physically Unclonable Function (PUF) has been implemented and tested. The idea behind this PUF is to compare the bias of the same Galois ring oscillator implemented in different locations within the FPGA. The implemented PUF has been compared to a Ring Oscillator PUF in terms of reproducibility, random-like response and uniqueness. Furthermore, the spatial correlation of the bias of the Galois oscillators has been studied and compared to the spatial correlation of the frequency of regular ring oscillators. |
13:40 | Hardware Trojan with Frequency Modulation ABSTRACT. The use of third-party IP cores in implementing applications in FPGAs has given rise to the threat of malicious alterations through the insertion of hardware Trojans. To address this threat, it is important to predict the way hardware Trojans are built and to identify their weaknesses. This paper describes a logic family for implementing robust hardware Trojans, which can evade the two major detection methods, namely unused-circuit identification and side-channel analysis. This robustness is achieved by encoding information in frequency rather than amplitude so that the Trojan trigger circuitry's state will never stay constant during 'normal' operation. In addition, the power consumption of Trojan circuits built using the proposed logic family can be concealed with minimal design effort and supplementary hardware resources. Defense measures against hardware Trojans with frequency modulation are described. |
12:00 | A Wearable Wireless Sensing System for Capturing Human Arm Motion ABSTRACT. Compact wearable technology is required in a wide variety of fields including engineering, medical and sport science. Wearable technology is versatile in its applications: it can be used to monitor and track human movement, including in clinical applications. Classification studies of trajectory data are required for a variety of hand and limb movement tracking experiments. Automatic classification using machine learning techniques has the potential to increase the reliability and efficiency of predicting outcomes without the need for manual intervention. This work presents a wearable electronic sensing device for tracking and classifying real-time hand strike motion in a combat sport activity. The developed low-cost system consists of a small-footprint printed-circuit board of dimensions 11 mm x 24 mm x 4 mm with integrated motion sensors operating at 3.3 V, contributing to a battery running time of 3 hours. This meets the requirements for the targeted application. The K-Nearest Neighbor machine learning algorithm was adopted for the classification of hand combat techniques, yielding a classification and an optimal strike prediction accuracy of 99%, using only 20% of the available dataset. |
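For readers unfamiliar with the K-Nearest Neighbor classifier named in the abstract above, a minimal sketch follows. It is a generic majority-vote KNN with Euclidean distance, not the authors' implementation; the two-dimensional feature vectors and the labels "jab" and "hook" are hypothetical placeholders for whatever motion features and strike classes the system actually uses.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D motion features (for example, peak acceleration and peak angular rate).
train_points = [(0.1, 0.2), (0.0, 0.4), (0.3, 0.1), (5.0, 5.2), (4.8, 5.1), (5.3, 4.9)]
train_labels = ["jab", "jab", "jab", "hook", "hook", "hook"]

print(knn_predict(train_points, train_labels, (0.2, 0.3)))
print(knn_predict(train_points, train_labels, (5.1, 5.0)))
```

KNN needs no training phase beyond storing labeled samples, which is one reason it is a common choice for small embedded datasets.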
12:20 | Study of a Voltage-Mode Readout Configuration for Micromachined CMOS Transistors for Uncooled IR Sensing PRESENTER: Elisabetta Moisello ABSTRACT. Micromachined CMOS transistors, dubbed "TMOS", have been developed in recent years as a novel type of uncooled thermal sensor. The TMOS consists of a thermally isolated suspended transistor, fabricated in a 130-nm process and released by dry etching, which absorbs thermal radiation, inducing an increase of the transistor temperature and, therefore, generating a signal by changing the transistor I-V characteristics. With respect to conventional thermal sensors, as the TMOS is an active sensing element, it features advantages in terms of internal gain, resulting in high temperature sensitivity, which makes the TMOS particularly appealing. The TMOS sensing performance depends on the transistor operating region and on its configuration. In this paper, different configurations are investigated by means of Cadence simulations, in order to identify the voltage-mode readout configuration which maximizes the sensor performance. Voltage-mode, and not current-mode, readout is considered in order to be able to directly compare the TMOS performance with that of an integrated micromachined thermopile sensor, which, given its characteristics, only supports voltage-mode readout. |
12:40 | Estimating Cole-Impedance Parameters from Limited Frequency-Band Impedance Measurements ABSTRACT. Particle-swarm optimization is applied to estimate the parameters of four different Cole-impedance models using impedance measurements limited to frequencies within 1 kHz to 1 MHz, representative of typical ranges for bioimpedance applications. The accuracy of the extracted parameters is compared for cases with N = 201, 25, and 5 datapoints collected using a Keysight E4990A impedance analyzer and N = 8 collected using a MAX30001 integrated circuit. The Cole-impedance estimations using the 201 and 25 frequency datasets show < 0.5% relative error compared to the ideal values. Estimations using the 5 and 8 frequency datasets do show significant deviations when measurements do not capture the plateauing impedance regions at low and high frequency. |
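The remark about plateauing impedance regions in the abstract above can be made concrete with the textbook single-dispersion Cole model, Z(f) = R_inf + (R0 - R_inf) / (1 + (j 2 pi f tau)^alpha). The abstract does not specify which four model variants were fitted, so the model form, the function name, and the parameter values below are assumptions chosen for illustration.

```python
import math


def cole_impedance(freq_hz, r0, r_inf, tau, alpha):
    """Single-dispersion Cole impedance at frequency `freq_hz` (hertz)."""
    jw_tau = (1j * 2 * math.pi * freq_hz * tau) ** alpha
    return r_inf + (r0 - r_inf) / (1 + jw_tau)


# Hypothetical parameters: R0 = 1 kΩ, R_inf = 100 Ω, tau = 10 µs, alpha = 0.8.
params = dict(r0=1000.0, r_inf=100.0, tau=1e-5, alpha=0.8)

# Far below and far above the dispersion, |Z| approaches R0 and R_inf respectively.
for f in (1e-6, 1e3, 1e6, 1e15):
    z = cole_impedance(f, **params)
    print(f"f = {f:.0e} Hz  |Z| = {abs(z):.3f} ohm")
```

If the measured band (here 1 kHz to 1 MHz) does not extend into both plateaus, R0 and R_inf are only weakly constrained by the data, which is consistent with the deviations the abstract reports for sparse datasets.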
13:00 | Design and implementation of a trans-impedance amplifier for a miniaturized saturated absorption spectrometer PRESENTER: Kevin Sosa ABSTRACT. This paper presents the design, manufacture, and testing of a trans-impedance amplifier for a miniaturized saturated absorption spectrometer. The amplifier has a bandwidth of 32 kHz, a dc gain of 80.3 dB, and an output voltage swing of 4.84 Vpp. The device can handle an input optical power up to 800 µW, where the photodetection electronics have a Noise Equivalent Power of 0.3 pW/sqrt(Hz). The circuit board has a size of 27 × 27 mm². Signals acquired with our miniaturized device perfectly match those recorded with a commercial table top optical setup. |
13:20 | Radiation-Hardness-by-Design Latch-based Triple Modular Redundancy Flip-Flops ABSTRACT. The paper presents an alternative Single-Event Effect (SEE)-tolerant Triple Modular Redundancy (TMR) circuit topology for space applications. The proposed D-flip-flop circuit scheme is fully digitally designed and includes a local Single Event Transient (SET) filter for SET mitigation on the datapath. A latch-based master-slave decomposition using commercially available unhardened standard cell components is selected. A high cell layout density of about 95% is obtained by a multi-row cell placement arrangement. The baseline concept is examined by analog simulation and verified by electrical measurements. Corresponding shift register test vehicles are implemented in 0.13 µm BiCMOS technology. In a proof of concept, the SEE robustness of a selected candidate is measured at an LET of 46.1 MeV cm²/mg. |