Program for Thursday, February 25th

LASCAS2021: 12TH IEEE LATIN AMERICAN SYMPOSIUM ON CIRCUITS AND SYSTEMS

08:00-09:30 Session 18: Tutorial: Systematic Design of Analog CMOS Circuits Using Lookup Tables

09:30-10:00 Session 19: Networking

10:00-11:00 Session 20: Keynote G. De Micheli: Digital Design Algorithms for Emerging Technologies

Chair:

11:00-12:00 Break

12:00-14:00 Session 21A: Digital Techniques for Emerging Applications

Chair:

12:00	Ricardo Escobar, Luis Miguel Prócel, Lionel Trojman, Marco Lanuzza and Ramiro Taco
High-Speed and Low-Energy Dual-Mode Logic based Single-Clock-Cycle Binary Comparator
ABSTRACT. This paper presents an energy-efficient single-clock-cycle binary Dual-Mode Logic (DML)-based comparator optimized to operate in the dynamic mode. A parallel-prefix architecture is implemented to ensure high speed, whereas low power consumption is achieved by reducing the switching activity of internal nodes. Domino Logic (DL) and DML implementations are compared in terms of delay and energy for different supply voltages in a 32 nm technology. We demonstrate an average improvement of 5% in both energy and delay when the DML design operates in the dynamic mode compared to its conventional domino counterpart. Moreover, the DML design operating in the static mode saves up to 43% of the energy consumption compared to the equivalent domino logic-based implementation.
12:20	Mariana Toledo, Sandro Matheus Marques, Thiarles Soares Medeiros, Fábio Rossi, Marcelo Caggiani Luizelli, Antonio Carlos Schneider Beck Filho and Arthur Lorenzon
EDP Optimization of Parallel Applications via CPU Frequency Scaling on AMD Processors
ABSTRACT. Dynamic Voltage and Frequency Scaling (DVFS) has been widely used to improve the use of computational resources when a system is executing parallel applications. Moreover, as parallel applications behave differently (e.g., in CPU usage and shared memory accesses), DVFS methods must be able to deal with the characteristics of the application at hand. However, as we show in this paper, the DVFS governors already available in many Linux distributions cannot deal with such a scenario, and provide a trade-off between performance and energy consumption (energy-delay product, EDP) that is far from the best possible. Given that, we propose a dynamic run-time approach to optimize the EDP of parallel applications running on AMD processors that automatically selects the ideal CPU frequency and Boosting operating mode according to the characteristics of the application at hand. When executing sixteen well-known parallel applications on two multicore architectures, we show that our approach provides EDP improvements of up to 38% compared to the ondemand DVFS governor.
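As a rough illustration of the metric named in the abstract above, the energy-delay product can be computed as energy multiplied by execution time, and a frequency can be picked to minimize it. The sketch below is not the authors' approach: the function names and the sample measurements are invented for illustration only.

```python
# Hypothetical sketch: choose the CPU frequency with the lowest energy-delay product.
# All measurements below are made up for illustration, not taken from the paper.

def edp(energy_joules: float, time_seconds: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    return energy_joules * time_seconds


def best_frequency(measurements: dict) -> float:
    """Return the frequency (GHz) whose (energy, time) pair minimizes the EDP."""
    return min(measurements, key=lambda f: edp(*measurements[f]))


# frequency in GHz -> (energy in J, execution time in s); invented values
samples = {
    2.2: (90.0, 12.0),   # EDP = 1080
    3.0: (100.0, 9.0),   # EDP = 900
    3.6: (130.0, 8.0),   # EDP = 1040
}

chosen = best_frequency(samples)
print(chosen, edp(*samples[chosen]))
```

In this toy data set the highest frequency is fastest but not the best choice, because its extra energy outweighs the time saved.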
12:40	Henrique Kessler, Marcello Muñoz, Plínio Finkenauer, Leomar da Rosa Jr. and Vinícius V. Camargo
Electrical Evaluation of Logic Network Generation Methods for Supergates using SwitchCraft
ABSTRACT. Recent developments in electronic design automation tools vastly reduce the design cost of static CMOS complex gates (SCCG), enabling an alternative approach to logic synthesis. Despite the many design strategies targeting the transistor networks in SCCGs, their comparisons are often limited to metrics such as the number of transistors used or the total circuit stack, lacking an in-depth electrical evaluation. This work presents an electrical comparison of three different design techniques. The study evaluates the 3982 logic functions of the 4-input P-class, and it shows that topologies that optimize the pull-up and pull-down networks individually present better overall electrical characteristics. The results also suggest that reducing the logic gate stack or the number of transistors does not necessarily lead to better performance, showing that optimizing only for these parameters does not translate into electrical improvement.
13:00	Pedro Pereira, Guilherme Paim, Guilherme Ferreira, Eduardo Costa, Sérgio Almeida and Sergio Bampi
Exploring Approximate Adders for Power-Efficient Harmonics Elimination Hardware Architectures
ABSTRACT. This paper explores approximate adders (AA) in a harmonic elimination system using Least Mean Square (LMS) filters. The Lower-Part-OR Adder (LOA), Error Tolerant Adder (ETA-I), Truncation adder (Trunc) and Copy adder are used throughout the harmonic elimination system. Since the filtering system is a 16-bit circuit, the approximation level parameter (K) of the adders is varied from 1 to 8. The Root-Mean-Square Error (RMSE) and Mean Absolute Error (MAE) metrics show that the filtering is efficient for all AA circuits with K=4. The LOA remains efficient for filtering the signal up to K=5, but with higher power dissipation. Therefore, the results point to "Copy_b" (K=4) as the most efficient AA for the harmonic elimination system, with 7.5% less area and 21.8% less power than the version with precise adders.
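To make the approximation level K in the abstract above concrete, here is a minimal software sketch of one common formulation of the Lower-Part-OR Adder, together with the MAE and RMSE metrics. It is an assumption-laden illustration, not the hardware evaluated in the paper: the names `loa_add` and `error_metrics` are hypothetical, and the paper's LOA variant may differ in how it forms the carry into the exact part.

```python
import math

# Hypothetical behavioral model of a Lower-Part-OR Adder (LOA).
# Assumption: the k low bits are OR-ed, and the carry into the exact upper
# part is the AND of the most significant bits of the two low parts.

def loa_add(a: int, b: int, k: int, width: int = 16) -> int:
    """Approximate sum of two unsigned `width`-bit operands."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if k == 0:
        return a + b                                  # no approximate part
    low = (a | b) & ((1 << k) - 1)                    # approximate lower part
    carry_in = (a >> (k - 1)) & (b >> (k - 1)) & 1    # AND of bit k-1 of a and b
    high = (a >> k) + (b >> k) + carry_in             # exact upper part
    return (high << k) | low


def error_metrics(pairs, k):
    """Return (MAE, RMSE) of loa_add against exact addition over operand pairs."""
    errors = [loa_add(a, b, k) - (a + b) for a, b in pairs]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


print(loa_add(10, 6, 2))                      # exact sum would be 16
print(error_metrics([(10, 6), (12, 3)], 2))
```

Raising `k` widens the OR-ed part, which in hardware shortens the carry chain at the cost of larger errors.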
13:20	Guilherme Ferreira, Pedro T. L. Pereira, Guilherme Paim, Eduardo Costa and Sergio Bampi
A Power-Efficient FFT Hardware Architecture Exploiting Approximate Adders
ABSTRACT. This work presents an energy-efficient Fast Fourier Transform (FFT) hardware architecture exploiting approximate adder circuits. The FFT hardware architecture consists of a fixed-point, fully sequential architecture with a radix-2 butterfly with decimation in time (DIT). In this paper, we explore a set of approximate adders (LOA, ETA-I, Copy-A, Copy-B, Trunc0, Trunc1) in the butterfly by varying the approximation level (K term). The Root-Mean-Square Error (RMSE) metric shows which approximation level allows FFT processing without significant signal loss. The results show that our best proposed FFT, employing the Trunc0 approximate adder with K=10, saves up to 35% of power dissipation compared to the FFT with the original radix-2 butterfly using the synthesis tool operators.
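For readers unfamiliar with the radix-2 decimation-in-time butterfly mentioned in the abstract above, the following is a minimal floating-point sketch. It uses exact arithmetic, so it only shows where the additions and subtractions sit in the butterfly; it does not model the fixed-point datapath or the approximate adders studied in the paper. The function names are hypothetical.

```python
import cmath

# Hypothetical reference model of a radix-2 DIT FFT (exact, floating point).

def butterfly(a: complex, b: complex, w: complex):
    """Radix-2 butterfly: one twiddle multiplication, one add, one subtract."""
    t = w * b
    return a + t, a - t


def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_dit(x[0::2])   # transform of even-indexed samples
    odd = fft_dit(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out


print(fft_dit([1, 1, 1, 1]))
```

In an approximate-adder study, the `a + t` and `a - t` operations are the natural places to substitute an approximate adder model.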

12:00-14:00 Session 21B: Power and Energy Circuits and Systems

Chair: Alejandro Oliva

12:00	Thomas Eleftherios Dimitrios Kizas, Lorenzo Crespi, Piero Malcovati and Andrea Baschirotto
A Library of High-Level Models for the Simulation of DC-DC Converters
ABSTRACT. Custom integrated DC-DC converters are frequently used to power system-on-chip architectures for achieving high-quality voltage regulation at the best possible efficiency. However, high-level models of such converters are usually incompatible with analog simulators and require the designer to re-assemble the circuit in the final design environment to verify the design with real transistor models. This process is both time-consuming and error-prone. This paper presents a library of parametrizable, high-level macro-models for DC-DC converters, consisting of custom Verilog-A modules and cells from the standard libraries. The macro-models are compatible with analog simulators and offer a significant amount of flexibility to the designer.
12:20	Lionel Trojman, David Rivadeneira, Marco Villegas, Eliana Acurio, Marco Lanuzza, Luis-Miguel Procel and Ramiro Taco
RF-DC Multiplier for RF Energy Harvester based in 32nm and TFET technologies
ABSTRACT. In this work, we study the effect of technology scaling on different full-wave rectifier topologies using the Cross-Coupled Differential Drive (CCDD) strategy. For a conventional CCDD scaled from 90 nm to 32 nm, the PCE and VCE remain the same, while a large degradation of the dynamic range and sensitivity is observed. This effect can be slightly limited by using a self-body-bias CCDD topology. However, the use of TFETs avoids this degradation and provides a large VCE and output voltage for input voltages lower than 300 mV. To extend this VCE to input voltages > 300 mV, we use a CCDD topology that increases the load-driving capability. Interestingly, this not only increased the output voltage for large Vin but also yielded a larger PCE than expected for this topology.
12:40	Alexandre Quenon, Evelyne Daubie, Véronique Moeyaert and Fortunato Carlos Dualibe
On the Possibility to Use Energy Harvesting on Beta Radiation in Nuclear Environments
PRESENTER: Alexandre Quenon
ABSTRACT. This paper presents experimental results showing that the energy contained in beta radiation can be harvested by using diodes. A single BPW34 photodiode generates around 12 pA dc, which is enough to power integrated analog blocks.
13:00	Carlos A. Pinheiro Jr., Fabián Olivera and Antonio Petraglia
A Three-Stage Charge Pump with Forward Body Biasing in 28 nm UTBB FD-SOI CMOS
ABSTRACT. Energy harvesting techniques provide solutions for powering battery-free circuits or even for charging storage elements such as batteries or supercapacitors. In this paper, a three-stage charge pump that is appropriate for thermoelectric and photovoltaic energy harvesting is designed in a 28 nm ultra-thin body and buried oxide (UTBB) fully-depleted silicon-on-insulator (FD-SOI) CMOS technology. Taking advantage of the FD-SOI substrate characteristics, the forward-body-biasing (FBB) technique is used to improve the switch conductances. Extensive simulation results validate the proper operation of the harvesting system at a minimum input voltage of 200 mV and show a peak efficiency of 56% at an input voltage of 300 mV and a load current of 100 nA.

12:00-14:00 Session 21C: Electronic Design Automation and Digital Circuit Design

Chair: Victor Grimblatt

12:00	Augusto Hoppe, Juergen Becker and Fernanda Kastensmidt
High-speed Hardware Accelerator for Trace Decoding in Real-Time Program Monitoring
ABSTRACT. Multicore processors are currently the focus of new and future critical-system architectures. However, they introduce new problems with regard to safety and security requirements. Real-time control flow monitoring techniques have been proposed as solutions to detect the most common types of program errors and security attacks. We propose a new way to use the latest debug and trace architectures to achieve full and isolated real-time control flow monitoring. We present an online trace decoder FPGA component as a solution in the search for scalable and portable monitoring architectures. Our FPGA accelerator achieves real-time CPU monitoring using only 8% of the resources in a Zynq-7000 FPGA.
12:20	Anselm Breitenreiter, Oliver Schrape, Marko Andjelkovic and Milos Krstic
Reliability Analysis in Less than 200 Lines of Code
ABSTRACT. Answer Set Programming (ASP) is proposed as a compact and versatile approach to circuit analysis. Using upsets in registers as an example, we demonstrate how to perform reliability analysis in less than 200 lines of code. Through an efficient problem encoding, we achieve an input data format similar to a Verilog netlist, so that extensive preprocessing is avoided. No development of algorithms is required, as the analysis relies on elaborate and highly optimized ASP solvers. Exemplary results for a wide range of circuits are presented and potential optimizations are pointed out.
12:40	Muhammed Mustafa Kızmaz and Salih Ergün
A CMOS Implementation of the Tent Map for Random Number Generation
ABSTRACT. A new tent map based random number generator (RNG) is designed in TSMC 65 nm CMOS technology. Simulation results verify that the generated random sequences successfully pass the randomness tests in the FIPS-140-2 and NIST 800-22 test suites. Superior to other studies in the literature, our RNG satisfies the randomness tests without post-processing. Moreover, the bit generation rate can be increased in exchange for higher power consumption. Thus, with the architecture used in this work, robust RNGs needed for security applications can be implemented with higher data rates.
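The tent map underlying the abstract above can be illustrated in a few lines of software. This is only a numerical sketch under stated assumptions: the function name `tent_map_bits`, the slope `mu = 1.99`, and the 0.5 bit threshold are illustrative choices, not parameters from the paper, and a floating-point iteration like this is not a substitute for the analog circuit and is not suitable for security use.

```python
# Hypothetical numerical sketch of bit generation from a tent map.
# Assumption: x_{n+1} = mu * min(x_n, 1 - x_n), one bit per iteration by
# thresholding at 0.5. A slope slightly below 2 is used because mu = 2
# collapses to zero after a few dozen steps in binary floating point.

def tent_map_bits(x0: float, n: int, mu: float = 1.99):
    """Iterate the tent map n times from x0 and return the thresholded bits."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    bits = []
    x = x0
    for _ in range(n):
        x = mu * min(x, 1.0 - x)           # tent map update
        bits.append(1 if x >= 0.5 else 0)  # one output bit per iteration
    return bits


print(tent_map_bits(0.3, 16))
```

The same seed always reproduces the same sequence here, whereas a hardware generator relies on physical noise to make the trajectory unpredictable.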
13:00	Isadora Oliveira, Marcelo Danigno, Paulo Butzen and Ricardo Reis
Benchmarking Open Access VLSI Partitioning Tools
ABSTRACT. Most algorithms used in VLSI CAD tackle NP-hard problems and face scalability issues arising from the ever-increasing circuit size. Partitioning enables a divide-and-conquer strategy, allowing the use of complex algorithms by reducing the problem to instances of smaller size. Several open-access tools have arisen to tackle the partitioning problem. This work performs a comprehensive evaluation of four partitioning tools, investigating their performance on actual benchmarks. We investigate graph and hypergraph partitioning, considering different graph models of hypergraphs and edge-weighting schemes. The analysis compares graph and hypergraph partitionings in terms of hyperedge cut, number of terminals, and runtime. Finally, the robustness of the tools is evaluated by testing different numbers of final partitions and how they are balanced. The results show differences of up to 2X in hyperedge cut and more than 1000X in runtime. The presented data can define the applicability and advantages of each tool and possible methods to alleviate their shortcomings.
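The hyperedge cut used as a comparison metric in the abstract above counts the nets whose cells end up in more than one partition. A minimal sketch follows; the function name and the tiny netlist are hypothetical and serve only to show how the metric is computed, not how any of the benchmarked tools work.

```python
# Hypothetical sketch: hyperedge cut of a given partitioning.
# A hyperedge (net) is cut when its vertices fall into more than one partition.

def hyperedge_cut(hyperedges, part_of) -> int:
    """Count hyperedges spanning at least two partitions."""
    return sum(1 for edge in hyperedges if len({part_of[v] for v in edge}) > 1)


# Invented example: three nets over five cells, split into two partitions.
nets = [("a", "b", "c"), ("c", "d"), ("d", "e")]
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}

print(hyperedge_cut(nets, partition))
```

Moving cell `c` to partition 1 would cut the three-pin net instead of the two-pin one, leaving the count unchanged, which shows why the metric alone does not determine a unique best partition.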
13:20	Jeferson González-Gómez, Steven Ávila-Ardón, Jonathan Rojas-González, Andrés Stephen-Cantillano, Jorge Castro-Godínez, Carlos Salazar-García, Muhammad Shafique and Jörg Henkel
TailoredCore: Generating Application-Specific RISC-V-based Cores
ABSTRACT. One challenge imposed by the ubiquitous computing of embedded systems is the need for power- and energy-efficient implementations, particularly because many of them are battery-operated. In this sense, tailored application-specific processors can meet the resource requirements of a specific application in the most efficient way. In this paper, we present TailoredCore, a design methodology to generate application-specific processors based on a core architecture implementation. This methodology analyzes the application to be executed and produces a customized RISC-V core with the resources required, while reducing the hardware overhead due to, for instance, instructions and registers that are not needed. Using TailoredCore, we achieve up to 38% savings in registers and 12% in logic elements when generating cores for five CHStone benchmark applications and implementing them on an FPGA. These area savings also correspond to a reduction of the required power and energy.
13:40	Konstantinos Touloupas and Paul Peter Sotiriadis
Analog and RF Circuit Constrained Optimization Using Multi-Objective Evolutionary Algorithms
ABSTRACT. This paper presents a simulation-based optimization method for automatic sizing of analog and RF IC blocks. It introduces a combination of a state-of-the-art multi-objective evolutionary algorithm (EA) with a new constraint handling approach to effectively explore the high-dimensional constrained design space typical of every analog and RF IC block design. An additional modification in the core of the EA is also proposed for efficiently handling mixed continuous-integer parameter search spaces. The methodology is illustrated on a Nested-Current-Mirror amplifier and a Wideband Low Noise Amplifier, achieving better results than typical constraint handling approaches.

12:00-14:00 Session 21D: Iberchip

14:00-14:30 Session 22: Closing Session
