The iCub is a humanoid robot designed to support research in embodied AI. At 104 cm tall, the iCub is the size of a five-year-old child. It can crawl on all fours, walk, and sit up to manipulate objects. Its hands have been designed to support sophisticated manipulation skills. The iCub is distributed as Open Source under the GPL licenses and can now count on a worldwide community of enthusiastic developers. The entire design is available for download from the project’s repositories (http://www.iCub.org). More than 30 robots have been built so far and are available in laboratories across Europe, the US, Korea, Singapore, and Japan. It is one of the few platforms in the world with a sensitive full-body skin to handle physical interaction with the environment, possibly including people. I will present the iCub project in its entirety, showing how it is evolving towards fulfilling the dream of a personal humanoid in every home.
Power consumption has become the most important design goal in a wide range of electronic systems, especially when dealing with smart/autonomous sensing systems for application domains such as Internet of Things (IoT), Wearable Devices, Robotics, and Prosthetics. The continuing device scaling and ever-increasing demand for higher computing power are two driving forces toward ultra-low power design strategies: for instance, the typical power consumption in some current sensory systems is on the order of 100 milliwatts and is expected to be 100 times more in order to respond to the application demands. Seeking to improve energy efficiency, designers have turned to optimization methods at several levels, from the system level down to the transistor device level. On the other hand, power supply represents a limiting factor in smart sensory systems whose form factor constrains battery size. Endowing the sensory systems with harvesters that collect energy from the environment represents a promising solution to achieve the long-life goal for truly energy-autonomous (self-powered) devices. In this perspective, this special session aims to illustrate the challenges facing designers of high-performance and energy-efficient/autonomous circuits for sensory systems. It will address a cross-layer approach and span various methodologies, techniques, and architectures paving the way towards Energy Efficient Autonomous Smart Sensory Systems.
The presentation goes through the evolution of Smart Power technologies, from ICs integrating a few power elements, for motor drives or DC/DC converters, to ICs integrating a huge number of HV drivers (up to 200 V) for many current application fields, such as systems for echography medical imaging or MEMS actuator drivers. Particular attention will be given to the analysis of the main design difficulties and the related solutions implemented to reach special performance.
14:45 | Interface circuits based on FPGA for tactile sensor systems SPEAKER: unknown ABSTRACT. Development of tactile sensing systems is motivated by the possibility of application in many domains such as robotics, prosthetics, and industrial automation. This paper provides a functionality assessment of an interface electronic circuit prototype for tactile sensing systems. The circuits are based on the DDC112U and a Xilinx Spartan-6 FPGA. An experimental setup is used to measure the signals generated from a single tactile sensor. Experimental results validate the correct functionality of the proposed interface when the measured voltage and charge are analyzed in terms of the input force. Moving to the new system may pave the way towards a fully integrated SoC for e-skin development. |
14:45 | Approximate FPGA Implementation of CORDIC for Tactile Data Processing using Speculative Adders SPEAKER: Marta Franceschi ABSTRACT. In most robotic and biomedical applications, interest in real-time embedded systems with tactile ability has been growing. For example, in prosthetics, a dedicated portable system is needed for developing wearable devices. The main challenges for such systems are low latency, low power consumption and reduced hardware complexity. In order to improve hardware efficiency and reduce power consumption, approximate computing techniques have been assessed. This strategy is suitable for error-tolerant applications involving a large amount of data to be processed, which perfectly fits tactile data processing. This paper presents the first case study of applying Inexact Speculative Adders (ISA) to the FPGA implementation of a Coordinate Rotation Digital Computer (CORDIC) module within the Machine Learning algorithm of a tactile data processing system. The design has been synthesized and implemented on a Xilinx ZYNQ-7000 ZC702 device. Preliminary results have shown a dynamic power reduction of up to 40% and a latency reduction of up to 21% compared to a conventional CORDIC module, at the cost of a negligible average relative error of 0.049% for sine and 0.003% for cosine computations. |
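For readers unfamiliar with CORDIC, the following is a minimal sketch of a conventional rotation-mode CORDIC in fixed point, written in Python for illustration. It is not the authors' design: the 16-bit fractional format, the iteration count, and the function name `cordic_sin_cos` are assumptions, and all additions here are exact. Each iteration uses only shifts and three add/subtract operations, which is where an approximate adder such as an ISA would plausibly be substituted in hardware.

```python
import math

FRAC_BITS = 16          # assumed fixed-point format (not taken from the paper)
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def cordic_sin_cos(theta, iterations=16):
    """Return (sin, cos) of theta in radians, for |theta| up to about 1.74 rad."""
    # Elementary rotation angles atan(2^-i), stored in fixed point.
    atan_table = [round(math.atan(2.0 ** -i) * ONE) for i in range(iterations)]
    # The micro-rotations stretch the vector, so start from the inverse gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = round(gain * ONE), 0, round(theta * ONE)
    for i in range(iterations):
        # Rotate toward z == 0 using shifts and add/subtract only.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - atan_table[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + atan_table[i]
    return y / ONE, x / ONE
```

Calling `cordic_sin_cos(math.pi / 6)` should give values close to 0.5 and 0.866; the residual error depends on the assumed word length and iteration count.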
14:45 | Investigation on the optimal pipeline organization in RISC-V multi-threaded soft processor cores SPEAKER: unknown ABSTRACT. Internet-of-Things end-nodes with relatively limited production volume can benefit from FPGA implementations of computing platforms based on dedicated computational units controlled by a soft processor core. The inherently multi-tasking nature of the processor operation demands cost-effective and energy-efficient multi-threaded execution, synthesized as a multi-core architecture or a multi-threaded single core. This work presents an experimental exploration of microarchitecture design solutions for multi-threaded single cores, specifically addressing soft processor core implementations on FPGA. We report detailed quantitative results on resource utilization, performance and energy efficiency of the different solutions, varying the pipeline organization, thread pool size, active thread count and voltage. |
14:45 | A Convolutional Neural Network Fully Implemented on FPGA for Embedded Platforms SPEAKER: unknown ABSTRACT. The Convolutional Neural Network (CNN) algorithm allows fast and precise image recognition. Today, this ability is highly requested for quickly analysing complex video-streams in the embedded system domain. In this paper, we present an FPGA implementation designed addressing portability and power constraints. The designed architecture implements a full CNN model on an embedded FPGA device, reducing external memory requirement by 84% with respect to the software version. Power and performance characterization results show that the proposed implementation is 3 times more power efficient than a serial version on a general purpose CPU, and equivalently efficient to a 16-times parallelized version on the same processor. |
14:45 | A 10-bit Radiation-Hardened by Design (RHBD) SAR ADC for Space Applications SPEAKER: unknown ABSTRACT. This work presents a rad-hard by design (RHBD) 10-bit 1 MHz SAR ADC for space applications. The goal is to design a radiation-tolerant SAR ADC by using radiation-hardened by design (RHBD) techniques at both circuit and layout levels. The design takes into account the various effects of radiation that could damage the circuits in ionising radiation environments. A conventional SAR ADC with a charge-redistribution capacitive DAC was the starting point to which RHBD techniques were applied. The SAR ADC was implemented and fabricated in a standard 0.15-µm CMOS process by LFoundry. The prototype has an active area of 212 x 285 µm² and consumes 1.23 mW. Measurement results show an ENOB of 9.6 bits in the band of interest, 1 to 10 kHz, at full-scale input voltage. The resulting figure of merit is 792 fJ/conversion-step. |
14:45 | Design & Analysis of a nanowire SGFET-based 10GHz Frequency Synthesizer SPEAKER: Sotoudeh Hamedi-Hagh ABSTRACT. A low-power frequency-synthesizer PLL designed using the nanowire SGFET technology is proposed. The output frequency of the system is 10.3125 GHz, synthesized from a reference input of 156.25 MHz. The design utilizes a phase-frequency detector, a charge-pump, a 2nd order passive loop-filter, a current-starved ring voltage-controlled oscillator, and a frequency divider that consists of both TSPC and static flip-flops. The PLL has a wide tuning range from 2.8 GHz to 14.4 GHz, a phase margin of 54.53 degrees, a closed-loop bandwidth of 9.47 MHz, and a dc power consumption of 34.82 uW with 1V power supply. |
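The two frequencies quoted in this abstract imply the overall feedback division ratio. The short sketch below works it out, assuming an integer-N loop in lock (output frequency equals the ratio times the reference); the abstract does not say how the ratio is split between the TSPC and static flip-flop stages, so that is left open. The function name `divider_ratio` is illustrative.

```python
from fractions import Fraction

F_REF_HZ = Fraction(156_250_000)       # 156.25 MHz reference, from the abstract
F_OUT_HZ = Fraction(10_312_500_000)    # 10.3125 GHz output, from the abstract


def divider_ratio(f_out_hz, f_ref_hz):
    """Feedback division ratio of a locked PLL: f_out = N * f_ref."""
    return f_out_hz / f_ref_hz


ratio = divider_ratio(F_OUT_HZ, F_REF_HZ)
print(ratio)  # an exact Fraction; an integer value means an integer-N divider suffices
```

Using `Fraction` keeps the arithmetic exact, so the check that the ratio is an integer does not depend on floating-point rounding.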
14:45 | A Cross-Coupled Redundant Sense Amplifier for Radiation Hardened SRAMs SPEAKER: unknown ABSTRACT. This paper proposes a redundant sense amplifier for radiation-hardened SRAMs, based on a latched cross-coupled topology. First, the most common issues related to SRAM reliability and performance in radiation environments are discussed. Then, after an analytical study of the transient effects induced by radiation, a novel redundant sensing scheme based on a systematic design flow is presented and discussed. Simulation results, carried out in a standard 130-nm CMOS process, demonstrate the effectiveness of the solution. |
14:45 | Partially Reconfigurable IP Protection System with Ring Oscillator Based Physically Unclonable Functions SPEAKER: unknown ABSTRACT. The scale of counterfeiting activities is increasing day by day. These activities are encountered especially in the electronics market. In this paper, a countermeasure against counterfeiting of intellectual property (IP) on Field-Programmable Gate Arrays (FPGAs) is proposed. FPGA vendors provide bitstream encryption as an IP security solution, for example in battery-backed or non-volatile FPGAs. However, these solutions are secure only as long as the decryption key can be kept away from third parties. Key storage and key transfer over insecure channels pose risks for these solutions. In this work, physically unclonable functions (PUFs) have been used for key generation. Generating a key from a circuit in the device solves the key transfer problem. The proposed system goes through different phases when it operates. Therefore, the partial reconfiguration feature of FPGAs is essential for the feasibility of the proposed system. |
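As background on how a ring-oscillator PUF can yield key bits, here is a minimal sketch of the commonly used pairwise-comparison scheme: each bit records which of two nominally identical oscillators runs faster on a given device. The abstract does not describe the authors' bit-extraction method, so the pairing of neighbouring oscillators, the function name `ro_puf_bits`, and the example counter values are all assumptions for illustration.

```python
def ro_puf_bits(counts):
    """Derive one bit per oscillator pair from cycle counts taken over a fixed window.

    counts: list of integer counter values, one per ring oscillator.
    Pairs are (0, 1), (2, 3), ...; a trailing unpaired oscillator is ignored.
    """
    bits = []
    for i in range(0, len(counts) - 1, 2):
        # Manufacturing variation decides which oscillator of the pair is faster.
        bits.append(1 if counts[i] > counts[i + 1] else 0)
    return bits


# Hypothetical counter readings from eight oscillators on one device.
example_counts = [10512, 10498, 10377, 10391, 10460, 10455, 10402, 10440]
key_bits = ro_puf_bits(example_counts)
```

A practical design would also need error correction, since pairs with nearly equal frequencies can flip with temperature or voltage; that part is outside this sketch.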
14:45 | An Analytical Model of the Delay Generator for the Triggering of Particle Detectors at CERN LHC SPEAKER: unknown ABSTRACT. This paper presents an analytical model of a tapped shift-register based delay generator, which is currently implemented in the High Momentum Particle Identification Detector (HMPID) at CERN and will be upgraded in the coming years. This work aims to verify whether this delay generator can be optimized to provide a delay range of 525 ns with a resolution of 1 ns. In particular, this paper studies how the clock jitter affects the generated delay and its linearity, and predicts how the current architecture will perform at a higher frequency of operation. The conclusions drawn from the analytical model are then verified using both a simulation model and an FPGA implementation of the delay generator. |
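To make the target figures concrete, the sketch below models an idealised tapped shift register in which each tap adds exactly one clock period of delay. This is a simplifying assumption, not the paper's analytical model: it ignores jitter and non-linearity, and the function names are illustrative.

```python
import math


def tap_delay_ns(tap_index, clock_period_ns):
    """Ideal delay at a given tap: one clock period per register stage."""
    return tap_index * clock_period_ns


def taps_needed(delay_range_ns, resolution_ns):
    """Number of taps required to cover a delay range at a given step size."""
    return math.ceil(delay_range_ns / resolution_ns)


def clock_frequency_mhz(resolution_ns):
    """Clock frequency whose period equals the desired delay resolution."""
    return 1000.0 / resolution_ns


taps = taps_needed(525, 1)          # taps for a 525 ns range in 1 ns steps
f_clk = clock_frequency_mhz(1)      # clock needed for a 1 ns step, in MHz
```

Under this idealisation, a 1 ns resolution requires a 1 GHz shift clock, which suggests why behaviour at a higher operating frequency is a central question of the study.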
14:45 | Feasibility Study of an Ultra High Speed Current-Mode SAR ADC SPEAKER: unknown ABSTRACT. This paper presents the feasibility study of a low-power 5-bit synchronous current-mode SAR ADC primarily targeted at ultra high speed applications. The circuit uses a voltage-current converter at the front end and a current-steering DAC. The discrete analog output is digitized by a latch and standard SAR logic in the feedback path, using a binary search algorithm. The proposed scheme exploits the compatibility of SAR ADCs with advanced technology nodes and provides an excellent opportunity to achieve ultra high speeds. The circuit exhibits a sampling rate of up to 4 GS/s with a full-scale differential current of 1 mA (pk-pk) or a differential voltage of 300 mV (pk-pk). The proposed circuit is designed in a 28-nm CMOS process, achieves a figure of merit of 18.3 fJ/conv.-step and dissipates about 2.35 mW with a 0.9 V supply voltage. |
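The binary search performed by SAR logic, and the figure of merit quoted above, can be sketched as follows. This is a behavioural illustration only: it assumes an ideal unipolar DAC and comparator (the paper's circuit is differential), and it assumes the commonly used Walden definition, FoM = P / (2^ENOB x f_s), with ENOB taken as the nominal 5 bits because the abstract does not state it. Function names are illustrative.

```python
def sar_convert(i_in_ua, full_scale_ua=1000.0, bits=5):
    """Ideal SAR conversion: test one bit per cycle, MSB first."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)                      # tentatively set this bit
        i_dac_ua = trial * full_scale_ua / (1 << bits)  # ideal DAC output
        if i_in_ua >= i_dac_ua:                        # comparator (latch) decision
            code = trial                               # keep the bit
    return code


def walden_fom_joules(power_w, enob_bits, sample_rate_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob_bits * sample_rate_hz)


code = sar_convert(300.0)                      # 300 uA input on a 1 mA full scale
fom = walden_fom_joules(2.35e-3, 5, 4e9)       # assumed ENOB = 5 bits
```

With these assumptions the computed figure of merit lands within about 0.1 fJ of the reported 18.3 fJ/conv.-step; the small difference may come from rounding or from a measured ENOB slightly different from 5.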
After decades during which clocked logic has imposed its discipline across all fields of digital design, there is today a world-wide resurgence of interest in asynchronous logic design techniques, so that asynchronous logic can be expected to win niches in the digital electronics business within the next few years. The main reason is that an asynchronous design paradigm is capable of addressing the impact of increased process variability, power and thermal bottlenecks, high fault rates, aging, and scalability issues prevalent in emerging densely packed integrated circuits. Starting from the current limitations of synchronous/clocked design and from the common arguments for migrating to an asynchronous design style, this special session provides a pragmatic survey of the state of the art in asynchronous design techniques and in one of its most promising emerging application areas, namely Globally Asynchronous Locally Synchronous (GALS) systems. Thanks to this special session, the NGCAS audience will be able to stay technically up to date on one of the hottest debates in the digital design community: does clockless design really have a future as a result of the growing clock distribution concerns and/or of the growing need for fine-grained and adaptive power management? Far from providing a comprehensive answer, the special session aims to keep the debate alive by presenting the latest research outcomes from leading European experts in the field, spanning from challenging asynchronous circuit design issues to novel system design concepts and prototypes, going through emerging clockless communication architectures and associated synthesis tool flows.
16:15 | Asynchronous and GALS Design – Overview and Perspectives SPEAKER: unknown ABSTRACT. Asynchronous circuit design was introduced many decades ago but has so far seen only limited industrial application. This paper summarizes the basic asynchronous principles and techniques, the latest developments, and the outlook. |
16:35 | Asynchronous Arbitration Primitives for New Generation of Circuits and Systems SPEAKER: unknown ABSTRACT. This paper presents an overview of a family of asynchronous arbitration primitives designed to increase the resilience and efficiency of the new generation of circuits and systems. We cover primitives for synchronisation and decision-making with an emphasis on interfacing analog and digital worlds, sampling of non-persistent signals, and efficient handling of correlated sensor events. |
16:55 | Cost-Effective and Flexible Asynchronous Interconnect Technology for GALS Networks-on-Chip SPEAKER: unknown ABSTRACT. Fine-grained power management of largely-integrated manycore systems is becoming mainstream in order to deal with tight power budgets. As a result, some level of asynchrony is becoming inevitable for correct system-level operation. Asynchronous interconnection networks naturally provide such asynchrony, however their commercial uptake depends on the capability to overcome two fundamental barriers: their area and dynamic power overhead as well as the limited computer-aided design (CAD) tool support for their automated design. This paper presents a novel design point for on-chip asynchronous communication, combining design flexibility with small footprint and cost effectiveness. It relies on a parameterizable switching fabric designed on top of a two-phase communication protocol and a bundled-data encoding scheme, combined with a predictable hierarchical synthesis flow with mainstream industrial tools. |
17:15 | 3D Asynchronous Network-on-Chip or How to Extend the GALS paradigm to 3D Architecture SPEAKER: Pascal Vivet ABSTRACT. For High Performance Computing (HPC), the never-ending quest for additional computing capability is hitting hard limits, mostly the power wall and the memory wall. 3D technology, using so-called TSVs (Through-Silicon Vias), offers the possibility to integrate more cores, closer to the memories, with reduced power consumption thanks to smaller communication distances. For such 3D technologies, the main challenges are to design an energy-efficient 3D communication infrastructure, to offer a test strategy and to handle the related power and thermal dissipation issues. The Globally Asynchronous Locally Synchronous (GALS) paradigm has been widely studied in order to decouple the timing domains in large Systems-on-Chip. GALS is a perfect enabler of Network-on-Chip communication infrastructure, where IPs are locally synchronous while the Network-on-Chip is fully asynchronous. In this context, we present an innovative 3D asynchronous Network-on-Chip as a solution to efficiently implement a 3D GALS system, providing robust inter-layer communication, avoiding inter-layer clocking, and providing fast and energy-efficient 3D links. Recent circuit results demonstrate the feasibility of this approach and pave the way to further system perspectives. |
17:35 | Timing Organization of a Real-Time Multicore Processor SPEAKER: unknown ABSTRACT. Real-time systems need a time-predictable computing platform. Computation, communication, and access to shared resources need to be time-predictable. We use time division multiplexing to statically schedule all computation and communication resources, such as access to main memory or message passing over a network-on-chip. We use time-driven communication over an asynchronous network-on-chip to enable time division multiplexing even in a globally asynchronous, locally synchronous multicore architecture. Using time division multiplexing at all levels of the architecture yields a time-predictable multicore processor where we can statically analyze the worst-case execution time of tasks. |
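The reason static time division multiplexing makes worst-case analysis tractable can be shown with a small sketch. It assumes a simple round-robin slot table with equal slot lengths for a shared resource such as main memory; the actual slot assignment in the presented processor is not given in the abstract, and the function names are illustrative.

```python
def slot_owner(cycle, num_cores, slot_cycles):
    """Index of the core that owns the shared resource at a given cycle."""
    return (cycle // slot_cycles) % num_cores


def cycles_until_slot(core_index, cycle, num_cores, slot_cycles):
    """Cycles a core must wait, from `cycle`, until its next slot begins."""
    period = num_cores * slot_cycles
    slot_start = core_index * slot_cycles
    return (slot_start - cycle) % period


def worst_case_wait(core_index, num_cores, slot_cycles):
    """Longest possible wait, found by checking every cycle of one period."""
    period = num_cores * slot_cycles
    return max(
        cycles_until_slot(core_index, c, num_cores, slot_cycles)
        for c in range(period)
    )
```

Because the schedule is fixed and periodic, the wait for any core is bounded by one schedule period regardless of what the other cores do, which is the property a static worst-case execution time analysis relies on.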