09:30 | Robustness Challenges in the Internet of Things SPEAKER: Yervant Zorian ABSTRACT. The Internet of Things (IoT) is an extremely fragmented market and can be defined as anything from sensors to small servers. It is estimated that over 30 billion IoT devices will ship by 2020. The ability to sense countless amounts of information that communicates to the cloud is driving innovation into IoT applications, such as in wearable devices (for health, fitness or infotainment applications) and in machine-to-machine applications (in smart appliances, smart cities or commerce). It has become crucial for today’s IoT chips to use a range of new solutions during the design stage to ensure the robustness of manufacturing test, field reliability and security. DFT designers need to use new test and reliability solutions to enable power reductions during test, concurrent test, isolated debug and diagnosis, pattern porting, calibration, and uniform access. Moreover, the per unit IoT price remains a key factor in high volume production. Thus, minimizing the test cost while meeting the above technical issues is one of the major challenges of the IoT industry. This presentation, besides discussing the key trends and challenges of IoT, will cover solutions to handle the wide range of potential robustness challenges during all periods of the IoT lifecycle from design, post silicon bring-up, volume production, to in-system operation. |
11:00 | Targeting Inter Set Write Variation to Improve the Lifetime of Non-Volatile Caches using fellow sets SPEAKER: Sukarn Agarwal ABSTRACT. The high density and low static power of non-volatile memory (NVM) technologies have made them popular candidates for the memory hierarchy, including caches. Writes within a cache set are governed by the access pattern as well as the replacement policy, leading to a large variation in write counts. This variation is of concern because it leads to early breakdown of the NVM cells, reducing the effective lifetime. This paper presents a technique to improve the lifetime of non-volatile caches by reducing the inter-set write variation. Our policy partitions the cache sets into groups called fellow groups. Every set has two logical parts: Normal and Reserved. Sets within a fellow group can use the reserved parts of their fellow sets to distribute the writes uniformly. Experimental results using full-system simulation show that the proposed technique significantly reduces inter-set write variation compared with the baseline and existing techniques. |
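The abstract does not say how inter-set write variation is measured. As a rough illustration only, the sketch below assumes it is the coefficient of variation (standard deviation divided by mean) of the per-set write counts; the function name and the sample counts are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def inter_set_variation(writes_per_set):
    """Coefficient of variation of per-set write counts (assumed metric)."""
    avg = mean(writes_per_set)
    if avg == 0:
        return 0.0  # no writes at all, so no variation
    return pstdev(writes_per_set) / avg


# Hypothetical write counts for a 4-set cache.
skewed = [100, 0, 0, 100]    # two hot sets absorb every write
uniform = [50, 50, 50, 50]   # same total, spread evenly

skewed_variation = inter_set_variation(skewed)
uniform_variation = inter_set_variation(uniform)
```

Under this assumed metric, spreading the same number of writes evenly across sets drives the value toward zero, which is the effect the fellow-group policy aims for.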
11:30 | Low-Overhead Asymmetric Frequency Control for On-Chip Network Interconnects SPEAKER: Pedro Campos ABSTRACT. The growing impact of the network on the overall power consumption of many-core systems introduces a need for mechanisms that reduce the power required for data communication without significantly impacting performance. This paper proposes a low-overhead mechanism for frequency control of individual channels in a Network-on-Chip system. The proposed mechanism is low-overhead, distributed and easy to tune for varying traffic, which are crucial aspects in a many-core context. This is primarily achieved by the asymmetry of the controller, which defines different responses for increases and decreases in performance requirements, allowing a channel to operate at the minimum required frequency without adversely affecting stability, necessary for subsequent voltage scaling. |
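The abstract gives no details of the controller, so the following is only a toy sketch of what an asymmetric response could look like: the channel frequency rises in large steps when demand increases and falls in small steps when demand drops. Every name, threshold and step size here is an assumption made for illustration, not the mechanism proposed in the paper.

```python
def next_frequency(freq, utilization, f_min=0.2, f_max=1.0,
                   high=0.8, low=0.4, step_up=0.2, step_down=0.05):
    """Return the next normalized channel frequency (hypothetical rule).

    The response is asymmetric: a large step up when utilization is high,
    a small step down when utilization is low.
    """
    if utilization > high:
        freq += step_up       # react quickly to rising demand
    elif utilization < low:
        freq -= step_down     # back off slowly to avoid oscillation
    return min(f_max, max(f_min, freq))


raised = next_frequency(0.5, 0.9)    # high load: large increase
lowered = next_frequency(0.5, 0.1)   # low load: small decrease
held = next_frequency(0.5, 0.6)      # in between: unchanged
capped = next_frequency(1.0, 0.9)    # clamped at the maximum
```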
12:00 | Power-aware and Cost-efficient State Encoding in Non-volatile memory based FPGAs SPEAKER: Chengmo Yang ABSTRACT. Non-volatile memory (NVM)-based FPGAs are expected to replace traditional SRAM-based FPGAs to achieve higher scalability, lower leakage power, and better reliability. The flip-flops on FPGAs can be implemented with NVM elements such as Magnetic Tunnel Junctions (MTJ). Flip-flops are used to implement finite state machines (FSMs) in sequential logic, and the design cost, including power and hardware, may vary significantly with the state encoding strategy. In this work, to reduce power consumption and hardware cost, a new state encoding algorithm is proposed to reduce bit flips during state transitions within a limited number of flip-flops. The proposed scheme, consisting of a transition graph model, an encoding graph conflict removing algorithm and a one-pass dynamic encoding algorithm, can reduce state transition bit flips by up to 47.6% and the number of flip-flops by up to 96% compared with current popular encoding solutions. |
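To make the cost being minimized concrete, the sketch below counts flip-flop bit flips as the Hamming distance between the codes of consecutive states and compares two textbook encodings (plain binary and Gray code) on a hypothetical 4-state cyclic FSM. It does not implement the encoding algorithm proposed in the paper; the FSM and all names are assumptions.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two state codes differ."""
    return bin(a ^ b).count("1")


def total_bit_flips(transitions, encoding):
    """Sum of bit flips over a list of (source, destination) transitions."""
    return sum(hamming(encoding[s], encoding[d]) for s, d in transitions)


# Hypothetical FSM that cycles S0 -> S1 -> S2 -> S3 -> S0.
transitions = [("S0", "S1"), ("S1", "S2"), ("S2", "S3"), ("S3", "S0")]

binary = {"S0": 0b00, "S1": 0b01, "S2": 0b10, "S3": 0b11}
gray = {"S0": 0b00, "S1": 0b01, "S2": 0b11, "S3": 0b10}

binary_flips = total_bit_flips(transitions, binary)  # 1 + 2 + 1 + 2
gray_flips = total_bit_flips(transitions, gray)      # 1 + 1 + 1 + 1
```

For this cycle the Gray encoding needs 4 bit flips per round trip against 6 for plain binary, with the same two flip-flops.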
11:00 | Intelligent Embedded and Real-Time ANN-based Motor Control for Multi-Rotor Unmanned Aircraft Systems SPEAKER: Theocharis Theocharides ABSTRACT. Constant technological advancements in commercial multirotor unmanned aerial vehicles (drones) have resulted in their deployment in more and more applications, ranging from entertainment to disaster management, and many more domains. However, in contrast to their powerful and diverse entrance into our lifestyle and society, they do not yet provide sufficient intrinsic fail-safe mechanisms to prevent accidents that can arise due to technical problems or unforeseen flight incidents such as turbulent winds, inexperienced pilots, and so on. In this work, we therefore propose the use of an integrated intelligent motor controller, which is trained to recognize incidents directly from the on-board sensors (barometer, gyroscope, compass and accelerometer) and reacts in real time by adjusting the drone’s motors. The goal is to provide a small, low-power, intelligent, real-time, built-in controller for multirotor UAVs that will be able to recognize a dangerous scenario right before it happens and start taking countermeasures to keep the drone safe, and provide the pilot with a bigger reaction-time window. We propose the use of an artificial neural network, implemented on a lightweight embedded processing board, that is able to recognize and react in real time to various turbulent situations. Experimental results suggest that our controller is able to respond properly and in a timely manner to wind changes (turbulence), allowing the drone to maintain its expected state and path. |
11:30 | Extending OpenVX for Model-based Design of Embedded Vision Applications SPEAKER: Nicola Bombieri ABSTRACT. Developing computer vision applications for low-power heterogeneous systems is increasingly gaining interest in the embedded systems community. Even more interesting is the tuning of such embedded software for the target architecture when this is driven by multiple constraints (e.g., performance, peak power, energy consumption). Indeed, developers frequently run into system-level inefficiencies and bottlenecks that cannot be quickly addressed by traditional methods. In this context, OpenVX has been proposed as the standard platform to develop portable, optimized and power-efficient applications for vision algorithms targeting embedded systems. Nevertheless, adopting OpenVX for rapid prototyping, early algorithm parametrization and validation of complex embedded applications is a very challenging task. This paper presents a methodology to integrate a model-based design environment with OpenVX. The methodology allows applying Matlab/Simulink for the model-based design, parametrization, and validation of computer vision applications. Then, it allows for the automatic synthesis of the application model into an OpenVX description for hardware- and constraints-aware application tuning. Experiments have been conducted with an application for digital image stabilization developed through Simulink and then automatically synthesized into OpenVX-VisionWorks code for an NVIDIA Jetson TX1 board. |
12:00 | Energy-aware Task Scheduling for Near Real-time Periodic Tasks on Heterogeneous Multicore Processors SPEAKER: Takashi Nakada ABSTRACT. Near real-time periodic tasks, which are popular in multimedia streaming applications, have deadlines that are longer than the input intervals thanks to buffering. For such applications, conventional frame-based scheduling cannot realize optimal scheduling due to its shortsighted deadline assumption. To realize globally optimal execution of these applications, we propose a novel task scheduling algorithm that takes advantage of the long deadline. We confirmed that our approach can exploit the longer deadline and reduce the average power consumption by up to 65%. |
12:30 | Building World Class R&D Capacity in Semiconductor Technology SPEAKER: Rafic Makki ABSTRACT. This talk provides an overview of the strategies and initiatives employed to develop a world-class R&D ecosystem in Abu Dhabi. The talk shows how industry, government, and academia came together to develop all the required components of the ecosystem, including infrastructure, human capital and access to leading-edge process technologies. The talk discusses the current state of the ecosystem and its future prospects. |
13:45 | On the In-field Testing of Spare Modules in Automotive Microprocessors SPEAKER: Davide Piumatti ABSTRACT. Currently, most of the available strategies devised for in-field testing of microprocessor cores in the automotive market are mainly oriented toward testing the more representative functional modules. However, most modern architectures also include a series of sparse computational components that perform very specific functions, for example merging modules, masked and reverse operation modules, circular buffers, and special counters able to speed up the final application. In this paper, we provide a set of guidelines for the generation of Software-Based Self-Test programs that can be used to functionally test these modules during the in-field operation of the processor core. For every one of these modules, ad-hoc techniques are illustrated. The experimental results were gathered on a multi-core design manufactured by STMicroelectronics. |
13:45 | Restoration protocol: Lightweight and secure devices authentication based on PUF SPEAKER: Brisbane Ovilla-Martinez ABSTRACT. Several authentication protocols based on Physically Unclonable Functions (PUFs) have been proposed to authenticate low-cost hardware devices. PUFs are hardware primitives able to generate unique device identifiers that depend on inherent device variations. The preliminary steps of a PUF authentication protocol are for the manufacturer to obtain and store the reference response (used as the device's secret identifier). This reference response is compared (exactly or within a small threshold) with the response given by the device during its normal use. However, the response provided by a PUF is unstable. Consequently, correction mechanisms have been proposed to increase the response stability, although this solution is very expensive. Generally, the correction mechanisms consume more area than the PUF itself; hence, these solutions are infeasible for resource-constrained devices. This article presents an ultra-lightweight device authentication protocol based on PUF noise characterization. The proposed PUF protocol adapts the reference response to the generated response of the device without leaking any information to an adversary. In addition, the restoration protocol is implemented without using complex mechanisms like fuzzy extractors. The workload is performed on the server side, which has more resources. The security analysis and the experimental validation, using real PUF responses obtained from TERO-PUF, demonstrate the viability of the proposed protocol. Moreover, the proposed algorithm notably increases the stability of the generated identifier without any hardware overhead on the device side. |
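The threshold comparison mentioned in the abstract can be sketched as a Hamming-distance check between the stored reference response and a fresh, possibly noisy response. This shows only that baseline comparison step, not the restoration protocol itself; the bit strings, the threshold value and the function names are hypothetical.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count differing positions between two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b))


def authenticate(reference: str, response: str, threshold: int) -> bool:
    """Accept the device if its response is within `threshold` bit errors."""
    return hamming_distance(reference, response) <= threshold


# Hypothetical 16-bit responses.
reference = "1011001110001011"   # stored at enrollment
noisy = "1011001010001111"       # same device, two noisy bits
other = "0100110001110100"       # a different device (all bits differ)

accepted = authenticate(reference, noisy, threshold=3)
rejected = authenticate(reference, other, threshold=3)
```

A larger threshold tolerates more PUF noise but also makes it easier for a different device to be accepted, which is the tension that correction mechanisms are meant to resolve.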
13:45 | Implementation and Analysis of Hotspot Mitigation in Mesh NOCs by Cost-Effective Deflection Routing Technique SPEAKER: Abhijit Das ABSTRACT. Network-on-Chip (NoC) serves as an efficient communication framework among the components of Chip Multi-Processors (CMPs). With the increasing number of computation-intensive applications, communication between cores also increases, which creates high congestion and degrades network performance. Handling congestion is a key network management issue in NoCs. Hotspots are non-uniform traffic formations near cores, where some cores have to handle relatively higher traffic than other cores. Prolonged presence of these hotspots increases the communication latency of packets flowing through them. This work proposes a novel approach to identify destination hotspots; upon identification, packets are de-routed away from these cores using a cost-effective deflection routing technique. Experimental results show that in highly congested networks, our approach detects destination hotspots with great accuracy. De-routing packets away from hotspots helps such cores achieve congestion relief, thereby decreasing the flit latency of packets flowing in the network. |
13:45 | Non-Regression Approach for the Behavioral Model Generator in Mixed-Signal System Verification SPEAKER: Chien-Nan Liu ABSTRACT. Building the behavioral model for each analog circuit is an efficient approach for mixed-signal system verification. If an automatic model generator is available, it is useful for designers to reduce the extra efforts. Instead of modeling the relationship between circuit inputs and outputs directly, a divide and conquer approach is proposed in [8] to divide the circuit into several small building blocks and model the behavior of each block easily. Although the regression efforts have been greatly alleviated in this structure-based approach, the preparation of the training patterns is still a big issue. In this work, a different approach is proposed to build the behavioral model of each internal block in structure-based approach without regression. Therefore, no training patterns are required in the calibration process. As shown in the experimental results, the model accuracy is still kept in the proposed approach while the efficiency of behavioral model generator is greatly improved. |
13:45 | A sub-µW Bio-potential Front End in 65nm CMOS SPEAKER: Yonatan Kifle ABSTRACT. A bio-potential amplifier intended for continuous, long-term monitoring of vital signs must operate at the lowest possible power budget. Moreover, a compact active area, which directly affects portability, is essential. This paper presents a 0.55 μW auto-gain-controlled bio-potential amplifier implemented in 65 nm 1P7M CMOS for an ECG signal classifier SoC. A chopper-stabilized amplifier is designed at a 0.6 V supply voltage to mitigate the near-DC offset and flicker noise. The input ECG signal level is further set by the gain level of the variable gain amplifier (VGA) to achieve maximum swing at the ADC. The whole system is integrated in a core area of 0.10 mm² and can operate over a wide supply voltage range of 0.6-1.2 V. |
13:45 | Applying IJTAG-Compatible Embedded Instruments for Lifetime Enhancement of Analog Front-Ends of Cyber-Physical Systems SPEAKER: Ahmed Ibrahim ABSTRACT. In safety-critical cyber-physical systems, analog front-ends combined with many-processors are being increasingly employed. An example is an imminent collision detection chip for cars. Such a complex system requires zero downtime and very high dependability despite aging issues under harsh environmental conditions. By monitoring the health status of the processor cores on-line and taking appropriate counteractions if required, we have accomplished this goal in the past via IJTAG-compatible embedded instruments and appropriate embedded software. This paper extends this approach to the analog/mixed-signal front-ends of these systems, thereby creating a new uniform approach in design and test methodology, as well as streamlined fault management. An IJTAG-compatible voltage monitor is introduced for measuring aging-generated offset in OpAmps and SAR ADCs, as well as a delay-monitoring embedded instrument for detecting timing issues in ADCs. In addition, two-stage countermeasures, such as digitized recalibration and subsequent replacement, are presented to increase, by factors, the lifetime of the analog front-end of Cyber-Physical Systems-on-Chips. |
15:00 | A New Approach For Constructing Logic Functions After ECO SPEAKER: Amir Masoud Gharehbaghi ABSTRACT. Engineering change orders (ECOs) are small changes in the design due to last-minute bug fixes or spec changes. In this paper, we focus on functional ECO in a logic design and try to construct the new logic function, reusing the existing logic as much as possible. Traditional approaches try to find appropriate locations in the original design whose modification may result in the new functionality. However, those methods usually fail if additional inputs are required. We propose a new approach based on iterative SAT solving to find the inputs of the function for the given internal nodes, or the primary outputs that are the target of the ECO, out of all the internal signals and primary inputs, such that it is guaranteed to be able to correct the functionality without explicitly generating the functions. Our experimental results on ITC’99 benchmarks show the efficiency and effectiveness of our approach, especially for hard ECO cases. |
15:30 | Exploring the Use of the Finite Element Method for Electromigration Analysis in Future Physical Design SPEAKER: Matthias Thiele ABSTRACT. Addressing electromigration (EM) during physical design has become crucial to ensure reliable integrated circuits. Simulation methods, such as the finite element method (FEM), are increasingly overwhelmed by the complexity of the task. It is predicted that, with further technology scaling, FEM will no longer be usable for full-chip EM analysis because of this complexity. To address this bottleneck, we present a new methodology of FEM-based full-chip EM analysis for future technologies down to 10 nanometer feature sizes. Our solution reduces analysis costs significantly by establishing pre-validated layout patterns without losing accuracy of the verification results. Our full-chip meta-model EM analysis allows speedups of at least 10X compared to current FEM-based verification methods. |
16:00 | Multiple Reset Domains Verification Using Assertion Based Verification SPEAKER: Islam Ahmed ABSTRACT. Current System on Chip (SoC) designs operate in multiple domains such as clock, reset and power domains. This is done to afford various functionalities existing on different IPs that can work in different configurations. Data propagating across multiple reset domains with the absence of correct synchronizers may be corrupted and unreliable. This paper presents an efficient technique to dynamically validate multiple reset domains violations. The proposal is to first automatically model these violations using System Verilog Assertions (SVA), then instrument the design with the generated assertions and then verify the instrumented design using simulation or formal methods to prove existence of these violations. The paper first categorizes all the possible reset violations, then writes a unique assertion logic for every category that when de-asserted implies a bug. The framework proves effectiveness in finding issues on real designs with multiple resets. |
15:00 | Process-Aware Side Channel Monitoring for Embedded Control System Security SPEAKER: Farshad Khorrami ABSTRACT. Cyber-physical systems (CPS) are interconnections of heterogeneous hardware and software components (e.g., sensors, actuators, physical systems/processes, computational nodes and controllers, and communication subsystems). Increasing network connectivity of CPS computational nodes facilitates maintenance and on-demand reprogrammability and reduces operator workload. However, such increasing connectivity also raises the potential for cyber-attacks that attempt unauthorized modifications of run-time parameters or control logic in the computational nodes to hamper process stability or performance. In this paper, we analyze the effectiveness of real-time monitoring using digital and analog side channels. While analog side channels might not typically provide sufficient granularity to observe each loop iteration of the code in the CPS, the temporal averaging inherent to side channel sensory modalities enables observation of persistent changes to the contents of a computational loop through their resulting effect on the level of activity of the device. Changes to code can be detected by observing readings from side channel sensors over a period of time. Experimental studies are performed on an ARM-based single board computer. |
15:30 | DFS Covert Channels on Multi-Core Platforms SPEAKER: Jeyavijayan Rajendran ABSTRACT. Covert channels provide a secret communication medium between two malicious processes to stealthily exfiltrate information in violation of the security policy of a system. In this paper, we demonstrate a new covert timing channel attack that exploits the CPU operating frequencies with different power governors in a real system environment. In particular, we establish how two colluding processes, a trojan and a spy, can modulate the CPU frequency to create a powerful, high-capacity and robust covert channel. We implement this covert channel in both a single-threaded and a simultaneous multi-threading (SMT) environment and show the feasibility of such communication. Our experiments on an Intel Xeon server platform demonstrate dynamic frequency scaling covert channels that can achieve up to 20 bits/second. |
16:00 | Pushing the Limits Further: Sub-Atomic AES SPEAKER: Markus Stefan Wamser ABSTRACT. While throughput has for a long time been the main focus of optimisation, with the IoT leaving the state of prototypes, the need for compact and lightweight implementations of cryptographic primitives is on the rise again. Along with the development of new tailored primitives and standards, such as PRESENT, the search for small implementations of the AES has gained momentum again. This culminated in the recent publication of the AtomicAES architecture by Banik et al., who reported a design size of just over 2000 GE. In this work we design a new 8-bit serial architecture from scratch that enables us to push the area requirement for a fully featured AES primitive further down by almost 10% of the theoretical gap left by AtomicAES for optimisation. Aside from setting a new record for an architecture with full functionality for encryption and decryption including keyschedule as well as for a pure encryption architecture, our design is flexible enough to allow arbitrary replacement of the SBox architecture from single-cycle approaches to multi-stage pipelined approaches as are required for high operation frequencies or for protection against side-channel attacks, e.g. through Threshold Implementations. We also answer in the affirmative the open question whether the AES reverse keyschedule can be implemented without additional hardware over the forward keyschedule. |
16:30 | A Self-powered IoT SoC Platform for Wearable Health Care SPEAKER: Mohammed Ismail ABSTRACT. This talk will focus on an IoT System-on-Chip (SoC) presented as an example of IoT work in the UAE and as part of the UAE SRC (Semiconductor Research Corp) Center of Excellence on Energy Efficient Electronic Systems (aka ACE4S http://www.src.org/program/grc/ace4s/), involving researchers from five UAE universities developing new technologies for innovative self-powered wireless sensing and monitoring SoC platforms. The research targets applications in self-powered chip sets for use in public health, ambient intelligence, safety and security, and IoT. ACE4S is the first SRC center of excellence outside the US. One such application, which we will discuss in detail, is a groundbreaking self-powered IoT SoC platform for wearable health care. More specifically, we will present a novel fully integrated ECG signal processing system for the prediction of ventricular arrhythmia using a unique set of ECG features extracted from two consecutive cardiac cycles. Two databases of heart signal recordings from the American Heart Association (AHA) and the MIT PhysioNet were used as training, test and validation sets to evaluate the performance of the proposed system. The system achieved an accuracy of 99%. The ECG signal is sensed using a flexible, dry, graphene-based technology and the system is powered by harvesting human thermal energy. The system architecture is implemented in GlobalFoundries’ 65 nm CMOS process, occupies 0.112 mm² and consumes 2.78 µW at an operating frequency of 10 kHz from a supply voltage of 1.2 V. To our knowledge, this is the first SoC implementation of an ECG-based processor that is capable of predicting ventricular arrhythmia hours before the onset and with an accuracy of 99%. |
17:00 | Development of Advanced Microclimate and Urban Energy Analysis Modeling Environment and Its Validation by Wide Area Sensor Networks and Remote Sensing for Future Adaptation of Urban Infrastructure SPEAKER: Prashanth Marpu ABSTRACT. The last decade has seen a big shift towards urbanization all over the world and this trend is even more prominent in the arid lands of the developing world. Desert cities, especially in the Middle East, have seen tremendous growth in the last decade housing an ever-increasing population. The link between urbanization and regional micro-climate adaptation is expected to have a major role in the future development of desert cities, whose quest for sustainable solutions often clashes with the challenges imposed by an extremely harsh environment. With varying landscape and extents, the rapidly developing desert cities represent intense micro-climate scenarios. Monitoring the urban environment, especially temperature and thermal comfort is the first step to increase understanding of the urban environment to plan future urban planning efforts more efficiently. The Internet of Things (IoT) related technological advances of the recent years have presented a wide set of tools to monitor and model urban microclimate with greater detail. Ubiquitous connectivity, availability of sensors, low cost and power computing platforms, as well as a plethora of affordable cloud services have set a highly favorable and low-effort scene in the development and deployment of urban monitoring network sensors. Leveraging the aforementioned tools and taking into account the extent of the required environmental variables for modelling an intense urban microclimate such as Abu Dhabi’s, we developed a network of specialized urban weather stations (UWSs). 
Each UWS is capable of performing high resolution and accuracy measurements of a large set of environmental variables including air temperature (at different heights), humidity, land surface temperature, global horizontal irradiation (GHI), wind speed, building facade temperature, etc. Furthermore, the UWSs feature two-way communication with an online server for data transmission, health monitoring and over-the-air updates using an HSPA/WCDMA modem. The data is used in the modeling framework to simulate urban thermal flows thereby mapping urban thermal comfort and estimating building energy usage. The final intent is to develop a tool for city managers and planners to conduct simulation-aided design of Abu Dhabi’s downtown, showing them how the construction of an additional building, park, street, or other infrastructure will impact the urban heat flow. |