DCIS 2023: 38TH CONFERENCE ON DESIGN OF CIRCUITS AND INTEGRATED SYSTEMS
PROGRAM FOR FRIDAY, NOVEMBER 17TH

09:00-10:20 Session 14A: R5A Programmable Devices
Location: Room A
09:00
Implementing a CNN in FPGA Programmable Logic for NILM Applications

ABSTRACT. Non-Intrusive Load Monitoring (NILM) techniques are gaining popularity in the field of energy savings. Generally implemented through the use of smart meters, the main challenge with these devices is that they operate at very low sampling rates. To address this issue, FPGA-based systems have been proposed to capture instantaneous currents and voltages at higher sampling frequencies in the kHz range. However, the limitation of these architectures lies in the fact that the acquired windows are often transmitted upstream to the cloud for the application of load classification algorithms based on machine learning, relying on high-bandwidth communications available onsite. This work proposes an alternative approach by implementing the classification algorithms in the same acquisition and processing system, by using custom convolutional neural networks on mid-range FPGA devices. Brevitas and FINN frameworks are used for the quantization-aware training, as well as for the generation of a peripheral that may be integrated into any FPGA-based SoC (System-on-Chip) architecture. The proposed approach allows the whole processing involved in NILM techniques to be integrated into a single embedded system. Preliminary experimental results demonstrate the effectiveness of the proposed approach.

09:20
Flexible Deep-pipelined FPGA-based Accelerator for Spiking Neural Networks

ABSTRACT. Spiking neural networks (SNNs) promise to perform tasks currently done by classical artificial neural networks (ANNs) faster, in smaller footprints and using less energy. To gain more insight into these new information processing architectures, researchers need tools to simulate their networks. However, these structures can be challenging in terms of complexity and size, so there is an increasing need for acceleration. In contrast to CPU and GPU-based systems, FPGAs are ideal, as their distributed architecture and parallel nature are very well suited for spiking neuron implementation.

This work proposes a clock-driven FPGA-based spiking neuron processing core architecture, which is flexible enough to simulate groups of neurons with arbitrary neuron models. Taking advantage of deep pipelining and fast distributed memory, a particular implementation of this core with 2048 Leaky-Integrate-and-Fire neurons and 128 synapses per neuron is able to use less area per neuron than current state-of-the-art implementations while also reaching low simulation step times.
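
In a clock-driven simulation, every neuron is updated once per time step. The following is a minimal sketch of one such step for a Leaky-Integrate-and-Fire neuron. The leak factor, threshold and reset value are hypothetical illustration values, not parameters from the paper, and the sketch does not model the pipelined hardware itself.

```python
def lif_step(v, input_current, leak=0.9, threshold=1.0, v_reset=0.0):
    """Advance one LIF neuron by one clock step; return (new_v, spiked).

    All default parameter values are assumptions chosen for illustration.
    """
    v = leak * v + input_current  # leaky integration of the summed synaptic input
    if v >= threshold:            # membrane potential crossed the threshold: spike
        return v_reset, True
    return v, False


def simulate(inputs, **params):
    """Run one neuron over a sequence of per-step input currents."""
    v, spikes = 0.0, []
    for current in inputs:
        v, spiked = lif_step(v, current, **params)
        spikes.append(spiked)
    return spikes
```

With a constant input of 0.5 per step, the membrane potential in this sketch reaches the threshold on the third step.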

09:40
A Simple Power Analysis of an FPGA implementation of a polynomial multiplier for the NTRU cryptosystem

ABSTRACT. As quantum computing technology advances, the security of traditional cryptographic systems is becoming increasingly vulnerable. To address this issue, Post-Quantum Cryptography (PQC) has emerged as a promising solution that can withstand the brute force of quantum computers. However, PQC is not immune to attacks that exploit weaknesses in implementation, such as Side Channel Attacks (SCAs). SCAs can extract secret keys by analyzing physical characteristics of the device, such as its power consumption, while it performs cryptographic operations. Simple Power Analysis (SPA) is a type of SCA that uses power consumption measurements to extract sensitive information. By applying SPA to a specific hardware implementation of a PQC algorithm such as NTRU, potential vulnerabilities can appear in the Arithmetic Unit (AU) in charge of the multiplication operation. The effectiveness of this analysis in extracting sensitive information has been evaluated through extensive experiments, in which different countermeasures and strategies have been proposed and an accelerated algorithm has been implemented. The results demonstrate that SPA can point out security breaches in the NTRU implementation, indicating an issue that can affect PQC in the future.

10:00
A Security Comparison between AES-128 and AES-256 FPGA implementations against DPA attacks

ABSTRACT. As AES is the standard symmetric cipher selected by NIST, it is the best-known and most widely used block cipher. Consequently, security threats against it are constantly rising and increasingly powerful. With the upcoming scenario of quantum computing, these threats have become a front-line concern in the crypto community. Although it is claimed that using larger key sizes in symmetric-key algorithms is enough to make implementations quantum-resistant against brute-force attacks, this paper shows that both AES-128 and AES-256 are vulnerable to Power Analysis attacks. It presents a security comparison of AES-128 and AES-256 against Differential Power Analysis (DPA) attacks. Through experimental attacks on FPGA AES implementations, results show that although AES-256 reaches a greater level of security than AES-128, it is still vulnerable to this kind of attack. Specifically, we have obtained 75% of the bytes needed to find the original key for AES-128 but only 28.125% for AES-256 by performing the same attack.
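
The reported percentages can be turned into byte counts with a short calculation. This sketch assumes the percentages are taken over the full key length in bytes (16 for AES-128, 32 for AES-256), which the abstract does not state explicitly. The function name is hypothetical.

```python
def recovered_key_bytes(key_bits, fraction):
    """Number of key bytes corresponding to a recovered fraction of the key.

    Assumption: the fraction refers to the whole key (key_bits / 8 bytes).
    """
    return fraction * (key_bits // 8)


aes128 = recovered_key_bytes(128, 0.75)     # 12 of 16 bytes
aes256 = recovered_key_bytes(256, 0.28125)  # 9 of 32 bytes
```

Under that assumption, the same attack recovers 12 of 16 key bytes for AES-128 and 9 of 32 for AES-256.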

09:00-10:20 Session 14B: R5B Embedded Systems and SoC
Location: Room B
09:00
Timing requirements on multi-processing and reconfigurable embedded systems with multiple environments
PRESENTER: Sara Alonso

ABSTRACT. In recent years, Multi-Processor System on Chip (MPSoC) devices have become more and more common in embedded systems, due to the possibilities and flexibility they offer. They allow running different environments simultaneously on a single platform, usually one for general-purpose applications and another one for real-time processes. This is possible with software techniques such as hypervisors or Asymmetric Multi-Processing (AMP) frameworks; however, these can significantly affect the system in terms of latency. This work summarizes the main latencies a reconfigurable multi-core device with virtualized environments can have and proposes methods to measure and quantify them.

09:20
Definition of a SoC Architecture for a High-Rate Correlator Bank
PRESENTER: David Molto

ABSTRACT. Interest in indoor optical positioning systems has increased over the past decade, as they achieve an accuracy in the range of centimeters in three dimensions (3D) using light emitting diodes (LEDs) and photoreceptors. Because they need high-rate position updates, these systems often pose challenges when trying to achieve real-time prototypes capable of processing the incoming signals from the receivers to provide a final position estimate. In this context, this work presents a System-on-Chip (SoC) architecture for implementing the signal processing associated with an infrared positioning system, based on a set of four QADA photoreceptors acting as anchors at known positions and on a single LED to be positioned (more LEDs may be involved by implementing a medium access technique). The definition and design of the specific peripheral, which is integrated in the SoC architecture, is proposed, analyzing the effect of the corresponding fixed-point notation. The proposal has been validated in a preliminary way by comparing its results with certain test patterns at every stage.

09:40
Digital Spectroscopy Channel Integrated into a SoC for Testing and Analysis

ABSTRACT. The spectroscopy systems used to measure particle parameters nowadays are all digital and can be developed using digital processing on low-cost SoC platforms. These systems have fast ADCs that allow signal processing algorithms to be applied in hardware, enabling real-time data processing. By using a low-cost instrumentation SoC, real-time spectroscopy data can be obtained. Additionally, by integrating a DAC into the system, signals can also be generated for laboratory testing. Finally, by programming the hardware in the FPGA and the software in the SoC, an efficient, flexible, and easily reprogrammable system can be obtained while keeping the cost low.

09:00-10:20 Session 14C: R5C Communications - Design of power aware circuits and systems
Location: Room C
09:00
Making Digital N-Path Mixers
PRESENTER: Hasan Moussa

ABSTRACT. This paper focuses on a Digital N-Path Mixer for radiofrequency systems, which is synthesized from a VHDL RTL model in a 65 nm CMOS technology. The proposed architecture targets a 4-path mixer including a Finite Impulse Response filter of order 3, and is driven by a Self-Timed Ring Oscillator used as a multiphase generator. This digital system exhibits a power consumption of 0.73 mW, occupies an active area of 0.004 mm², and can operate in a frequency range of up to 1 GHz. Its performance is compared with that of its analog counterpart.

09:20
ADC Architectural Study for Digitally-Assisted Multi-Gigabit Data Communication Transceivers
PRESENTER: Pedro Barba

ABSTRACT. This paper presents a methodology for comparing both SAR and pipeline-SAR ADC architectures based on a target ENOB. First, a system-level model is used to select the parameters needed to achieve a given ENOB. Then, the power consumption of the solution is estimated. Finally, the different architectures are compared based on that estimation. A sample rate of 25 GHz, 7 bits of ENOB and a metastability probability of 10^-12 have been used as a reference considering the requirement for a PAM-4, 25 Gbps ADC-based transceiver. A design space for these specifications has been obtained.
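
For context, a target ENOB maps to a required signal-to-noise-and-distortion ratio through the standard relation SNDR = 6.02 * ENOB + 1.76 dB, which holds for a full-scale sine input. The sketch below only applies that textbook relation. It is not the system-level model used in the paper, and the function names are hypothetical.

```python
def sndr_db_from_enob(enob):
    """SNDR in dB required for a given ENOB (standard full-scale sine relation)."""
    return 6.02 * enob + 1.76


def enob_from_sndr_db(sndr_db):
    """Inverse relation: ENOB achieved for a measured SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


target = sndr_db_from_enob(7)  # about 43.9 dB for the 7-bit reference case
```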

09:40
SET and SEU Hardened Clock Gating Cell

ABSTRACT. Clock gating is a common approach for reducing dynamic power consumption in digital designs. It is achieved by inserting special clock gating cells in the circuit, which make it possible to switch off the clock signal to selected flip-flops. However, as the clock gating cell is composed of a latch and a logic gate, it may be affected by Single Event Upsets (SEUs) and Single Event Transients (SETs) in radiation environments such as space. Given that a single clock gating cell may be connected to many flip-flops, a fault in one clock gating cell may lead to a circuit or system malfunction. Therefore, solutions for SET and SEU mitigation in clock gating cells are necessary for rad-hard designs. This work introduces a clock gating cell design based on the use of a delay element and a guard gate to filter input SETs, and triple modular redundancy (TMR) to mitigate SEUs. The TMR approach also provides enhanced immunity to permanent errors. The proposed designs have been optimized to minimize the number of transistors and enhance the SET robustness of internal nodes.

10:00
A Compact Double-Exponential Circuit for Single Event Transient (SET) Emulation
PRESENTER: Sebastian Bota

ABSTRACT. Single event transients (SETs) have been regarded as a source of malfunction in nano-scale CMOS components. A detailed evaluation of the SET related effects and application of appropriate measures for their mitigation are fundamental tasks in the conception of reliable radiation hardened integrated circuits. In this paper we present a compatible CMOS circuit to emulate the SETs produced by radiation. The circuit has a control input to modulate the value of the injected charge, which can facilitate the experimental measurement of the critical charge related to a given component. We have found that using a 65 nm technology it is possible to reproduce a double exponential pulse with an error of 1.2% in the peak current and 8.2% in pulse width for an injected charge of 10 fC.
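
A double-exponential pulse is commonly written as I(t) = Q / (tau_f - tau_r) * (exp(-t / tau_f) - exp(-t / tau_r)), whose time integral equals the injected charge Q. The sketch below evaluates that model numerically. The rise and fall time constants are hypothetical values chosen for illustration, and only the 10 fC charge comes from the abstract.

```python
import math


def set_current(t, q=10e-15, tau_r=5e-12, tau_f=50e-12):
    """Double-exponential SET current in amperes at time t (seconds).

    tau_r and tau_f are assumed values, not taken from the paper.
    """
    return q / (tau_f - tau_r) * (math.exp(-t / tau_f) - math.exp(-t / tau_r))


def injected_charge(q=10e-15, tau_r=5e-12, tau_f=50e-12, t_end=2e-9, steps=20000):
    """Integrate the pulse with the midpoint rule to check the total charge."""
    dt = t_end / steps
    total = sum(set_current((k + 0.5) * dt, q, tau_r, tau_f) for k in range(steps))
    return total * dt
```

Integrating the pulse over a window much longer than tau_f recovers the injected charge, which is a quick sanity check on any emulated waveform.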

10:20-11:00 Coffee Break
10:20-12:00 Session 15: Poster session
FPGA Implementation of Sherman-Morrison Formula Using High-Level Synthesis and Graphical Blocks Programming

ABSTRACT. Matrix inversion is used in many applications. While the inversion methods are well known mathematically, few articles present the optimal operation scheduling structure for FPGA implementation. This FPGA optimization problem seems to be the case for the general Sherman-Morrison formula. This algorithm is well known for its low complexity when solving a system of linear equations where matrices vary in time. The smaller the variation, the lower the complexity. This paper proposes reformulating the equations to optimize the FPGA implementation regarding resources and latency. The presented work is aimed toward electric circuit simulation but is kept in a general form. The implementation uses high-level synthesis and graphical blocks programming from Vivado-HLS™ and Vivado System Generator™, respectively. The results show that the proposed implementation is efficient in a floating-point format. The latency is lower than that of comparable works in floating-point and almost as low as that of some works in fixed-point, which allows us to diminish the precision loss from Sherman-Morrison.
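
For reference, the Sherman-Morrison formula updates a known inverse after a rank-one change: (A + u v^T)^-1 = A^-1 - (A^-1 u v^T A^-1) / (1 + v^T A^-1 u). The sketch below is a plain software illustration of that textbook formula using nested lists. It does not reflect the reformulated equations or the scheduling proposed in the paper.

```python
def sherman_morrison(a_inv, u, v):
    """Return the inverse of (A + u v^T) given a_inv = A^-1 (list of rows)."""
    n = len(a_inv)
    # Column vector A^-1 u
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    # Row vector v^T A^-1
    vt_a_inv = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]
    # Scalar 1 + v^T A^-1 u; the update is undefined when it is zero
    denom = 1.0 + sum(v[k] * a_inv_u[k] for k in range(n))
    if denom == 0.0:
        raise ValueError("A + u v^T is singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * vt_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]
```

The update costs on the order of n² operations, compared with roughly n³ for inverting the modified matrix from scratch.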

Complexity Reduction of Baseband Volterra Modeling for Low Memory Cases
PRESENTER: Stanislas Dubois

ABSTRACT. This paper deals with the behavioral modeling of nonlinear systems. In particular, it addresses the modeling of near-carrier intermodulation distortion (IMD) in baseband, i.e. on complex signals, after I/Q demodulation. A reduction and a simplification of the Volterra BaseBand Series model (VBBS) are thus proposed. The proposed models are then compared to the literature according to several criteria: the computational complexity and the accuracy of the modeling of the variation in frequency of the distortion. The simulations carried out show a good trade-off between computational complexity and linearization performance for low memory depths, when compared to literature models.

RTL Modeling of the RV32I Architecture with SystemC

ABSTRACT. SystemC is a library for modeling and simulation of complex systems that allows several levels of abstraction, from transaction-level and data-flow, to register-transfer-level (RTL). In this paper, the process of modeling the RV32I ISA of the RISC-V processor in RTL is described. The architecture is fully-pipelined and cycle accurate. For simplicity, a Harvard memory architecture is considered with a latency of 1 cycle. The process of adding extensions is illustrated using integer multiplication and division as examples. Finally, simulation speed is assessed for a number of programs.

Maximum Operating Frequency Self-Tuning System on FPGAs Using Dynamic Reconfiguration

ABSTRACT. This paper proposes the use of a dynamic clock frequency reconfiguration technique to optimize the performance of a circuit by changing its clock frequency so that it works at its maximum operating frequency. The proposed technique utilizes an FPGA reconfigurable clock generator circuit that changes the generated clock frequency for a circuit in real time. By systematically increasing the clock frequency and monitoring the circuit's response, the real maximum operating frequency can be determined. The effectiveness of the proposed technique is demonstrated through simulation and through experimental results obtained with a purpose-built experimental system. Results show that it can accurately detect the maximum operating frequency of a circuit while maintaining its reliability and integrity.

Ciber-physical System Arquitecture for AUTOSAR Automotive Framework based on 5G communications

ABSTRACT. This paper presents a cyber-physical system architecture for the automotive sector which includes the design and development of AUTOSAR-compliant ECU-5G hardware and firmware, which enables Edge Computing applications and acts as a gateway between the AES128-encrypted CAN network of the vehicle and a Control Centre via a secure 5G connection, thanks to the implementation of a VPN connection and the use of an SSL certificate. The Control Centre is based on a microservices architecture that includes a non-relational database that guarantees the integrability of vehicle data, heterogeneous by nature, a Cloud Computing data processing service that allows the hosting of the desired machine learning algorithms, and an intuitive user interface that enables the management of the monitored vehicles and assets by the end user. Finally, the solution is experimentally validated through its integration in a real cyber-physical framework composed of a racing vehicle, where an ECU 5G together with an ECU BMS responsible for monitoring its battery are deployed, and a Control Centre, as well as through a comparison with other systems available on the market.

Exploring Open-Source and Proprietary Design Tools to Implement a Symmetric Cipher on FPGAs

ABSTRACT. In the age of digital systems ubiquity, designing specialized solutions requires tailored tools. Proprietary software and hardware may present limitations that hinder future adaptability, thus, open-source design tools have emerged as an alternative solution to overcome these issues. They offer benefits such as being free, flexible, customizable, and community-driven. This paper evaluates and compares the design flow of a symmetric cipher implemented on Xilinx Zynq 7000 SoC devices using Xilinx Vivado software against Lattice iCE40 LP/HX devices using APIO open-source software tools. The comparison has been made in terms of performance, functionality, and cost. The methodology involved evaluating design requirements, features, support options, and testing to provide an analysis of the advantages and disadvantages of each tool highlighting the trade-offs between open-source and proprietary design tools.

Design and Evaluation of a RISC-V based SoC for Satellite on-board Networking
PRESENTER: Armando Astarloa

ABSTRACT. SpaceWire is a communication protocol that has become widely used in spacecraft for connecting instruments to data processors, mass-memory, and control processors. Field-Programmable Gate Arrays (FPGAs) have been a popular choice for implementing SpaceWire nodes due to their flexibility to meet unique requirements of each program or product. This paper presents a comparative study of two implementations of SpaceWire nodes, based on two different FPGA technologies, AMD-Xilinx SRAM-based and Microchip (Microsemi) FLASH-based. The study compares the resource requirements and estimated power consumption of both implementations, using the same HDL SpaceWire IP core, with the SRAM-based one incorporating a 32-bit Microblaze soft-CPU, and the FLASH-based one using a 32-bit RISC-V CPU. The obtained results are compared, and the paper concludes that FLASH-based FPGAs are more suitable for applications that require high reliability, tamper resistance, and fast, reliable restarts. In contrast, SRAM-based FPGAs are preferred in applications that require high performance and reconfigurability. The study shows that both FPGA technologies are capable of implementing SpaceWire nodes effectively and efficiently, and designers can choose the technology that best suits the specific requirements of each project.

Improvement of the estimation of execution cycles of Application SW for Cross-Compiled Simulation of RISC-V Platforms using AI
PRESENTER: Eugenio Villar

ABSTRACT. The open-HW RISC-V architecture opens the way for the development of ad-hoc digital systems in which the processing platform is fully configured to optimize the global performance of the application being implemented. The final system configuration is decided using HW/SW co-design. Efficient HW/SW co-design requires flexible SW simulation able to easily accommodate any change in the HW architecture. Cross-compiled simulation can provide this flexibility. In cross-compiled simulation, each basic block in the application code is annotated with the time it would take in the target processor being simulated. Usually, this time is obtained as the sum of the estimated execution times of the instructions in the block. In this paper, we propose the use of a Neural Network to improve the accuracy of the method. The application asm for the target CPU is analyzed and the number of cycles required by each basic block is back-annotated. The execution of the code on the host CPU provides the performance during simulation. Results obtained show that Machine Learning improves accuracy w.r.t. traditional back-annotation, in which each instruction is associated with a fixed number of cycles.
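
The traditional baseline described above, where each instruction contributes a fixed number of cycles and a basic block is annotated with their sum, can be sketched as follows. The cycle table is entirely hypothetical and does not describe any real RISC-V core. The neural-network estimator proposed in the paper is not shown.

```python
# Hypothetical fixed cycle costs per mnemonic, for illustration only.
FIXED_CYCLES = {"add": 1, "lw": 2, "mul": 3, "beq": 2}


def annotate_block(mnemonics, table=FIXED_CYCLES, default=1):
    """Estimate a basic block's cycles as the sum of fixed per-instruction costs."""
    return sum(table.get(op, default) for op in mnemonics)


def annotate_program(blocks):
    """Map each basic-block label to its estimated cycle count."""
    return {label: annotate_block(ops) for label, ops in blocks.items()}
```

A fixed table like this cannot capture pipeline or memory effects that depend on context, which is the gap a learned estimator aims to close.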

Time-domain Architectures for Interfacing Phase Change Memory
PRESENTER: Amadeo de Gracia

ABSTRACT. Phase Change Memory (PCM) is one of the leading non-volatile memory technologies offering high-density, multi-level solutions. Unfortunately, the circuitry required to read and write such devices has not been fully solved due to the highly non-linear behaviour of these devices and the complex voltage and current ranges required for reliable operation. Most of the proposed interfaces require very large area and power consumption, and are accompanied by an odd form factor. In this paper, we explore the time-domain interfaces among the possible solutions for driving a PCM cell. These proposed designs relate the physical quantities of the devices to a time variable. In this work, we provide a proof of concept for the implementation of time domain interfacing architectures on this particular type of memristive cell.

Experimental cartography generation methodology for Electromagnetic Fault Injection Attacks

ABSTRACT. Electromagnetic Fault Injection (EMFI) is one of the methods used to inject faults in circuits for different purposes, from security analysis to the study of the resilience of circuits against environmental conditions. Focusing on secure cryptographic applications, in order to study the vulnerability of a circuit and perform successful attacks, it is necessary to induce the electromagnetic (EM) field at a very specific point on the surface of the circuit. This aspect, together with extra inconveniences such as metal shields in the last metal layers of chips, results in very poor fault injection efficiency with this technique, hindering the possibilities of performing the attacks. This paper presents an experimental cartography generation methodology to carry out automatic EMFI attacks. The presented methodology improves the efficiency of fault injection attacks by showing the areas where the circuit presents greater vulnerabilities to EM disturbances, allowing the attacks to be focused on those points. An SRAM is used as a demonstrator vehicle. Results show that, by following the steps of the proposed methodology, it is possible to detect the point with the maximum fault injection efficiency, along with great precision in the fault injection, to the point of being able to inject a single-bit fault in the SRAM.

Exploration of Fast Sinewave Pattern Generation and Projection in a SoC-based System for Spatial Frequency Domain Imaging Applications
PRESENTER: Pallab Sutradhar

ABSTRACT. Spatial Frequency Domain Imaging (SFDI) is an optical imaging technique that can extract optical properties from images with non-invasive procedures. This approach is being explored for a number of pre-clinical and clinical applications such as oncology and face transplant surgeries. The technique requires the projection of patterns onto the surface of the target biological tissue. In the current experiment, the patterns are to be projected onto brain tissue, and hyperspectral (HS) images are then to be captured to extract optical properties. Earlier research has used standard desktop computers to generate pattern projections. However, embedded systems for similar purposes have not been vigorously explored. Two embedded devices, a Raspberry Pi (RPI) and a Zynq-7000 SoC, have been chosen as the Single Board Computer (SBC) based system and the System-on-Chip (SoC) based system, respectively. Both systems were tested for the generation of 8-bit depth cosine-wave patterns and their projection through a Digital Light Processing (DLP) projector onto the brain tissue. To the best of our knowledge, this has not been done before in any SoC environment. It takes around 15.36 ms and 31.11 ms, respectively, for the RPI and Zynq-7000 SoC to generate one pattern frame, which shows the Zynq-7000 SoC gives 2x more speedup than the RPI. It has also been found that the Zynq-7000 SoC is able to compute 1 pixel per clock for each pattern frame. Furthermore, the Root Mean Squared Error (RMSE) values found for different phase inputs in this experiment show that the pixel values computed by the SoC are very similar to those computed by the RPI. Therefore, it can be concluded that the SoC and multiprocessor system-on-chip (MPSoC) are potential candidates for further implementation and experimentation of the complete pattern projection system.

QChain Node: an IoT based Mote for Remote Medicine Quality Monitoring

ABSTRACT. Several monitoring tools and systems are available today that can be used to keep medicine quality under control during storage in medical environments. Those tools are mandatory for medications that are sensitive to changes in the environment, light, or mechanical stress, due to the number of critical factors that must be monitored. When patients are discharged from the hospital because they can carry out the treatment at home, they must take the medications with them. As has been demonstrated, these medications are then not stored diligently, in large part because the equipment available in our homes does not perform adequately. In this paper we present a proposal for a sensor node capable of remotely monitoring the conservation status of the drug. The solution is integrated into a possible IoT platform that would provide real-time data to the hospital pharmacy department. The design of the node, the characteristics of the contemplated sensors and the implementation are detailed in this work. The validation results show that efficient use of the battery has been achieved and that the solution is versatile enough to respond to multiple remote monitoring scenarios.

12:00-13:00 Session 16A: R6A New Computing Paradigms
Location: Room A
12:00
Novel Iterative Hebbian Learning Rule for Oscillatory Associative Memory
PRESENTER: Manuel Jiménez

ABSTRACT. Alternative paradigms to the von Neumann computing scheme are currently arousing huge interest. Oscillatory neural networks (ONNs) using emerging phase-change materials constitute an energy-efficient, massively parallel, brain-inspired, in-memory computing approach. The encoding of information in the phase pattern of frequency-locked, weakly coupled oscillators makes it possible to exploit their rich nonlinear dynamics and their synchronization phenomena for computing. A single fully connected ONN layer can implement an auto-associative memory comparable to that of a Hopfield network. The Hebbian learning rule is the most widely adopted method for configuring ONNs for such applications, despite its well-known limitations. Other approaches that perform better than the Hebbian rule are not useful for ONN training due to the constraints imposed by its physical implementation. This paper proposes a new approach and compares it with previous work. The proposed method has been shown to produce competitive results in terms of pattern recognition accuracy with reduced precision in synaptic weights, and to be suitable for online learning.
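
As background, the classic Hebbian rule for a Hopfield-style auto-associative memory sets each weight to the averaged product of the corresponding pattern components. The sketch below shows that baseline rule with a simple sign-based recall on bipolar patterns. It is not the iterative rule proposed in the paper, and it models a Hopfield network rather than coupled oscillators.

```python
def hebbian_weights(patterns):
    """Classic Hebbian weights: w[i][j] = (1/n) * sum over patterns of p[i] * p[j]."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-coupling
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, state, sweeps=5):
    """Asynchronous sign updates starting from a (possibly corrupted) state."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            field = sum(w[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if field >= 0 else -1
    return state
```

Storing one six-element pattern and flipping a single element, the recall step in this sketch restores the stored pattern.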

12:20
Comparative Analysis of Neural Network Implementations for NILM Applications
PRESENTER: Alvaro Hernandez

ABSTRACT. Non-Intrusive Load Monitoring (NILM) comprises a set of techniques that try to disaggregate the energy consumption in a household or building, based on the measurements coming from a single-point smart meter. In this process, a key step is the load identification of the different appliances that may be switched on/off during a certain interval under analysis. For that purpose, machine-learning techniques, including deep neural networks, have recently been applied. These classification algorithms may often imply a high computational cost, which might compromise a possible edge-computing implementation on the local smart meters. In this context, this work presents a preliminary comparison between two different real-time implementations of two neural networks, a dense one and a convolutional one, applied to load classification. The implementations are based on an FPGA approach and on a processor. In general terms, preliminary results show that the FPGA solution provides lower latencies than the processor one, at the expense of requiring a higher design effort and introducing a fixed-point quantization error.

12:40
Stochastic Computing-based on-chip Training Circuitry for Reservoir Computing Systems

ABSTRACT. Reservoir Computing (RC) is considered an optimal computational framework for the analysis of temporal data, where the training of RC systems is usually implemented through the use of a linear regression. Most of the RC hardware solutions present in the literature perform the training process off-chip at the server level, which increases processing time and overall power dissipation. This work proposes a non-iterative supervised learning method for RC systems that can be implemented in hardware using Stochastic Computing techniques. The proposal presents considerable advantages in terms of energy efficiency, hardware complexity and processing speed compared to traditional off-chip learning methods.

12:00-13:00 Session 16B: R6B Embedded Systems and SoC
Location: Room B
12:00
SoC Architecture for Acquisition and Processing of the EMG Signal

ABSTRACT. EMG (electromyography) is a technique used to measure the electrical activity of muscles and nerves during contraction and relaxation, and is used in a variety of clinical and research applications. In particular, it is very useful for the diagnosis and evaluation of neuromuscular disorders and for the control of electromechanical devices or prostheses. This signal has a low SNR and needs to undergo conditioning processes including amplification, filtering and digitisation for further processing. In order to improve the SNR of the EMG signals, this work describes a System-on-Chip (SoC) architecture for the acquisition and processing of the EMG signals, offering a modular and high-performance solution. These signals could be applied to the movement of a therapeutic exoskeleton with the aim of improving active rehabilitation therapies for patients with incomplete spinal cord injury. The proposed architecture provides a modular solution that allows signal digitization to be performed as close as possible to the electrode, minimizing transmission losses, signal noise and artifacts. In addition, sampling is performed at a higher rate than in commercial acquisition systems, while supporting significant processing throughput. The architecture uses the ADS1298R integrated circuit for multichannel acquisition and correct conditioning of the EMG signals, and integrates the communication module through a Serial Peripheral Interface (SPI) to carry out the configuration and data transfers. In this case, the acquired signal is processed using a moving-average-based algorithm followed by thresholding to establish the muscle activity. The identified muscle activity could be used as an active reference to activate the therapeutic exoskeleton. The results show the validation of the proposed architecture with an EMG signal with a sampling rate of 8 kSPS.
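
A moving average followed by thresholding can be sketched in a few lines. The version below rectifies each sample before averaging, which is an assumption on our part, and the window length and threshold are hypothetical values. The abstract does not give the actual parameters or the hardware structure.

```python
def detect_activity(samples, window=4, threshold=0.5):
    """Flag muscle activity where the moving average of |sample| exceeds a threshold.

    Rectification, window and threshold are illustrative assumptions.
    """
    flags = []
    buffer = []  # most recent rectified samples, at most `window` long
    acc = 0.0    # running sum of the buffer
    for s in samples:
        rectified = abs(s)
        buffer.append(rectified)
        acc += rectified
        if len(buffer) > window:
            acc -= buffer.pop(0)  # drop the oldest sample from the running sum
        flags.append(acc / len(buffer) > threshold)
    return flags
```

Keeping a running sum instead of re-adding the whole window at each step needs one addition and one subtraction per sample, the kind of structure that maps naturally to a streaming implementation.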

12:20
Time-Sensitive Networking to meet Hard-real Time Boundaries on Edge Intelligence Applications
PRESENTER: Jesús Lázaro

ABSTRACT. This paper introduces an AI video analytic application that has been implemented on an edge-computing device. The application is designed to perform real-time object detection, specifically targeting road signaling cones. The primary focus of this work is to demonstrate the device's capability to accelerate the inference of AI models and video compression using dedicated hardware. Moreover, the critical information, including the location and size of the objects detected, is transmitted as hard-real-time traffic using deterministic Ethernet. This use case has been implemented in the context of the NATO Generic Architecture for Land Systems (NGVA). The paper provides an overview of the approach taken, including the hardware and software used, as well as the design flow followed to implement the solution. The results of the implementation are discussed in the concluding section of the paper, along with areas for future work.

12:40
SoC FPGA-based Multichannel Data Acquisition System with Linux-Baremetal AMP for Applications in the Field of Astrophysics

ABSTRACT. This paper describes the design of a System-on-Chip FPGA-based system for acquisition and processing of up to thirty-two simultaneous channels, its integration in a DAS module and its validation. The MPSoC receives the digital signals from two high-resolution, high-speed ADCs (24 bits and 256 kSPS), adds UNIX real-time stamps to the samples with second or millisecond resolution, processes the samples (e.g. averages a temporal series of samples) and sends the processed samples as binary files to a remote server over an Ethernet connection. In addition, the design allows the module to be controlled and configured by commands or by an interactive configuration menu, since the DAS is composed of a configurable gain amplifier and analogue-to-digital conversion boards and a controllable internal power supply, and it is even capable of shutting it down in the event of a power supply failure. The processing system is implemented as a hardware/software solution. The Processing System implements an Asymmetric Multi-Processing System, coordinated using OpenAMP for data transfer and synchronization. The applications follow a client-server model running on Linux OS, where the server runs on the ZedBoard and the client runs on a remote Linux computer. Both applications communicate via a TCP/IP connection.

13:30-15:00 Lunch Break