Program for Friday, November 28th

DCIS2025: 40TH CONFERENCE ON DESIGN OF CIRCUITS AND INTEGRATED SYSTEMS

08:00-09:00 Session 12: Registration

09:00-10:30 Session 13

Keynote: TBD

10:30-11:00 Coffee Break & Posters

11:00-12:30 Session 14A: Hardware Accelerators

Chairs:

Javier Uceda and Jaime Jiménez

11:00	Francisco Albertuz and Mario Garrido Hardware-Efficient Gaussian and Sobel Filters for Real-Time Image Processing on FPGA ABSTRACT. This work presents compact Gaussian and Sobel convolutional filters for efficient image smoothing and edge detection on field-programmable gate arrays (FPGAs). To design the architectures, we mathematically analyze the convolution operations performed by both filters and develop computationally efficient implementations. We also apply several digital circuit optimization techniques, such as resource sharing, operation sequencing, and memory optimization, to further reduce area usage and power consumption. The proposed approaches are also suitable for real-time high-speed applications, since they are designed to work with a continuous data stream at a high clock frequency. The implemented circuits have been analyzed regarding resource usage, speed, and power consumption. A comparison with several previous FPGA implementations from the literature shows that the proposed designs significantly reduce logic usage and latency. Memory usage and power consumption are also reduced compared to generic convolutional architectures.
11:30	Lluís Ribas-Xirgo Hardware implementation of the Hungarian algorithm for optimum task assignments ABSTRACT. Optimal solutions of task assignment problems, such as subcarrier allocation for OFDM, can be found in polynomial time with the Hungarian algorithm. The sequential nature of the procedure makes it difficult to accelerate its execution. In this work, several hardware implementations of a state version of the algorithm are presented and compared. Results show that the single-memory architecture delivers energy consumption similar to that of the software version, but multi-block memory systems can significantly cut execution times as well as hardware resources.
12:00	Pablo Hormigo-Jimenez and Javier Hormigo Configurable Ultra-High-Throughput QRD FPGA Accelerators for small matrices ABSTRACT. QR decomposition is an essential operation in matrix algebra that is applicable in many fields, such as signal processing, automatic control, communications, and physics simulations. The QRD computation is the system's bottleneck in many of these applications. This paper presents a configurable ultra-high-throughput accelerator for FPGAs designed using a High-Level Synthesis language. The accelerator is arranged as a 2D-systolic array of Givens rotators based on the CORDIC algorithm. Its dimension can be configured at compilation time to fit the matrix size. Similarly, the degree of parallelism in the rotators can be configured, so that the computation throughput ranges from one n x n matrix every 2n clock cycles up to one matrix every clock cycle.
12:15	Rubén Nieto, Silvia Iniesta, Santiago Murano, Pedro R. Fernández and Susana Borromeo FPGA-Based Implementation of sEMG Feature Extraction and Movement Classification with MLP ABSTRACT. The difficulty in determining movement intention in patients with partial spinal cord injury or stroke sequelae has created the need to develop efficient and intelligent frameworks to decode movement intention through muscle signals. However, the analysis of surface electromyography (sEMG) signals from the lower limbs presents challenges due to their susceptibility to noise and the complexity of extracting meaningful features and establishing robust classification models. Despite these challenges, sEMG signal analysis enables non-invasive assessment of muscle activity, which is fundamental in rehabilitation and assistive technologies, optimizing the monitoring and control of motor function. This study proposes a method for sEMG feature extraction and lower limb movement classification using a neural network based on a multilayer perceptron (MLP). The objective is to explore an efficient method for classifying lower limb movements through sEMG signals, implementing the neural network on a System-on-Chip (SoC) device based on FPGAs. The classification is based on the acquisition of sEMG signals from eight muscle groups to differentiate between sitting, standing up, remaining still, and walking movements. To achieve this, features such as Root Mean Square (RMS), Mean Absolute Value (MAV), Integrated sEMG (IEMG), and Simple Squared Integration (SSI) are extracted from the sEMG signal and fed into the network. The results show an average classification accuracy of 92.50%. From a hardware perspective, it is shown that implementation on a Zynq-7000-based SoC device is feasible, suggesting future development directions for real-time applications.
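The four time-domain features named in the sEMG abstract above (RMS, MAV, IEMG, SSI) have standard textbook definitions over a window of samples. The sketch below computes them in plain Python. The function name `semg_features` and the example window are illustrative assumptions, not taken from the paper, and the authors' FPGA implementation may use different windowing or fixed-point arithmetic.

```python
import math


def semg_features(window):
    """Return RMS, MAV, IEMG and SSI for one window of sEMG samples.

    Hypothetical helper for illustration only; uses the common
    textbook definitions of the four features.
    """
    n = len(window)
    if n == 0:
        raise ValueError("window must contain at least one sample")
    iemg = sum(abs(x) for x in window)  # Integrated sEMG: sum of |x|
    ssi = sum(x * x for x in window)    # Simple Squared Integration: sum of x^2
    return {
        "RMS": math.sqrt(ssi / n),      # Root Mean Square: sqrt(mean of x^2)
        "MAV": iemg / n,                # Mean Absolute Value: mean of |x|
        "IEMG": iemg,
        "SSI": ssi,
    }


# Made-up four-sample window, only to exercise the function.
features = semg_features([1.0, -1.0, 2.0, -2.0])
```

Computing IEMG and SSI first lets MAV and RMS reuse the two running sums, which is also how a streaming hardware datapath would typically share accumulators.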

11:00-12:30 Session 14B: RF & Communications

Chairs:

Roc Berenguer and Jose Ángel Miguel Díaz

11:00	Uxua Esteban-Eraso, Gesler Ramos, Santiago Celma, Francisco José Torcal-Milla and Carlos Sanchez-Azqueta Design of a CMOS Transmitter Chain for Satellite on the Move Communications ABSTRACT. One of the main features expected of 6G networks is their ability to have a three-dimensional (3D) extension. In contrast with current communications networks, in which the transmission of the signals is carried out at the surface level, this will allow true global coverage including oceans and vast unpopulated areas. 3D global coverage has become a strategic goal and efforts are being put into its implementation with the current deployment of 5G networks and in satellite communications (SATCOM). This requires minimizing the energy losses experienced by electromagnetic waves in the GHz bands. To achieve this goal, the most promising solution is the use of active antenna arrays incorporating smart beamforming techniques to achieve directionality in signal transmission. This work presents the design and simulation of the transmitter path of an active antenna array operating within the European downlink frequency band for SATCOM on the move (SOTM) applications (17.7 to 21.2 GHz). It incorporates a compact phase shifter based on a vector-sum architecture and a power amplifier based on the concatenation of neutralized differential common-source pairs. The transmitter has been designed in a 65 nm CMOS MM/RF process. It achieves a phase resolution of 11.25° over the 360° range, with a gain of approximately 22.5 dB using a 5-bit control word while consuming 30 mW.
11:30	Alvaro Urain, David Del Rio, Andoni Beriain, Hector Solar, Roc Berenguer and Aleksei Nerushenko Design of a 160-210 GHz SiGe HBT Square-Law Detector for Total Power Radiometers ABSTRACT. This paper presents a square-law detector with a performance suitable for radiometric applications that is designed with the SiGe 0.13 μm BiCMOS SG13G2 technology offered by IHP. The implemented topology is based on an HBT transistor in common-emitter topology with a differential output. The proposed detector is centered around 178 GHz and presents a great balance between the post-layout responsivity, noise, and power consumption performance when compared to the rest of the works of the SoA, with a simulated maximum responsivity of 160 kV/W, a minimum NEP of 1.37 pW/√Hz, and a power consumption of 0.32 mW.
12:00	F. Bonfiglio-Buendía, Natalia-Abel Fernández-García, P. López, Victor M. Brea and Diego Cabello CMOS SPDT Switch Topologies in the Frequency Range of 6 to 20 GHz ABSTRACT. This work builds on a recent contribution from the literature to improve isolation on single-pole single-throw switches through the combination of an additional transistor on the gate of the series transistors and a custom fabrication process. This paper explores how said gate transistor affects the different figures of merit of different topologies of single-pole double-throw switches in the 6 to 20 GHz range for a 130 nm technology process with standard CMOS RF transistors. The insights drawn from post-layout simulations show that the additional gate transistor leads to higher isolation at the cost of a worse overall figure of merit that combines area, isolation, power, and insertion loss.
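The 11.25° phase resolution quoted in the SATCOM transmitter abstract earlier in this session follows directly from the 5-bit control word, if one assumes the phase shifter divides the 360° range into uniform steps. The short sketch below checks that arithmetic. The function names are hypothetical and the uniform-step assumption is ours, not a statement about the actual vector-sum circuit.

```python
def phase_step_deg(bits):
    """Ideal phase step of a shifter with 2**bits uniform states over 360 degrees."""
    return 360.0 / (1 << bits)


def phase_states_deg(bits):
    """List every ideal phase setting, one per control code (assumed uniform)."""
    step = phase_step_deg(bits)
    return [code * step for code in range(1 << bits)]


# 5-bit control word, as stated in the abstract: 32 states.
states = phase_states_deg(5)
```

With 5 bits there are 32 codes, so the ideal step is 360° / 32 = 11.25°, matching the figure in the abstract.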

12:30-14:00 Session 15A: AI-Driven Development of High-Performance Electronic Systems and Applications-2

Chairs:

Pablo Sanchez and Soledad Escolar

12:30	Juan Gallego, José Ferreira, Luís Alves, Daniel Vázquez, João Bispo, Alfonso Rodríguez, Nuno Paulino and Andrés Otero Acceleration of C/C++ Kernels and ONNX Models on CGRAs with MLIR-Based Compilation ABSTRACT. Executing AI at the edge is challenging due to tight energy and computational constraints. Heterogeneous platforms, particularly those incorporating CGRA, offer a compelling trade-off between hardware specialization and programmability, supporting spatially distributed and energy-efficient computation. Despite their potential, the deployment of applications on CGRA accelerators remains limited by the lack of practical toolchains and methodologies. In this work, we propose a compilation flow based on MLIR to enable the seamless integration of both C/C++ kernels and ONNX-based AI models into a RISC-V system augmented with a CGRA accelerator. Our approach extracts the underlying DFG from the high-level representation. It maps it onto the CGRA using an ILP mapper that accounts for the accelerator's architectural constraints. A custom backend completes the toolchain by generating the necessary binaries for coordinated execution across the RISC-V processor and the CGRA. This framework enables the practical deployment of heterogeneous edge workloads, combining the flexibility of software execution with the efficiency of hardware acceleration.
13:00	Maryam Katebzadeh, Daniel Vaquez, Andres Otero and Alfonso Rodriguez A Framework for Automated CGRA Design Space Exploration with Genetic Algorithm Optimization ABSTRACT. The rapid growth of compute-intensive applications has created a pressing need for computing architectures that effectively balance flexibility, efficiency, and performance. While Field-Programmable Gate Arrays (FPGAs) offer a good level of flexibility, they suffer from high configuration overhead and energy consumption. Coarse-Grained Reconfigurable Architectures (CGRAs) provide a more energy-efficient alternative with lower configuration costs. They can be customized for domain-specific applications by modifying their coarse-grained processing elements to execute particular sequences of operations. In fact, their domain-specific nature can be used to further improve their energy efficiency and reduce their area overhead by exploiting computing fabric specialization. This can be achieved by replacing homogeneous processing elements with a subset of heterogeneous, more optimized ones that are specifically suited to the target application domain. However, achieving an optimal CGRA configuration requires extensive design space exploration (DSE), which involves evaluating many architectural possibilities. Existing CGRA frameworks struggle with slow and inefficient exploration due to long runtimes and constrained customization options. These issues make it hard to find the best configurations rapidly. To tackle these challenges, this paper presents Genetic Algorithm-based CGRA Generator (GA-CG), a framework that enhances DSE in the CGRA design process. GA-CG uses a genetic algorithm to discover an efficient structural configuration, thereby improving resource utilization and reducing power consumption.
13:30	Irene Merino-Fernandez, José Manuel Cruz Acosta, Javier del Pino and Sunil L. Khemchandani Machine Learning for Microwave Pixelated Structures Design ABSTRACT. The growing complexity of wireless communication systems has highlighted the need for innovative methods to optimize the design of passive radiofrequency (RF) networks. This work presents a novel AI-driven approach for the electromagnetic-free design and optimization of square-shaped passive RF filter models. The method relies on 16×16 matrices composed of randomly placed metallic squares and ports. These structures are used to generate a large and diverse dataset, which feeds a deep artificial neural network (ANN) trained to predict scattering parameters (S-parameters) with high accuracy. Due to the vast design space, with more than 2^256 possible configurations, genetic algorithms (GAs) are used to guide the optimization process, employing the ANN for real-time evaluation. This strategy eliminates the reliance on time-consuming electromagnetic simulations while enabling the efficient exploration of complex square-based architectures, ultimately achieving high-performance RF filter designs with minimal computational cost.
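The design-space size cited in the pixelated-structures abstract above can be reproduced with a quick count, assuming each cell of the 16×16 matrix is simply metal or empty. Port placement is ignored here, which is presumably why the abstract says "more than". The variable names are illustrative only.

```python
GRID_SIDE = 16                 # 16x16 pixel matrix, as stated in the abstract

cells = GRID_SIDE * GRID_SIDE  # number of pixels in the grid
layouts = 2 ** cells           # assumed: each pixel is either metal or empty
digits = len(str(layouts))     # decimal digits, to give a sense of scale

print(f"{cells} cells -> 2^{cells} layouts, a {digits}-digit number")
```

A count of this size (roughly 1.16 × 10^77) rules out exhaustive enumeration, which is consistent with the abstract's choice of a genetic algorithm guided by a fast surrogate model.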

12:30-14:00 Session 15B: Quantum and low power

Chairs:

Eduard Alarcon and Jose Maria Lopez Villegas

12:30	Aleksei Nerushenko, Hector Solar, Roc Berenguer and Alvaro Urain A 1.15 mW SiGe BiCMOS Cryogenic LNA for Superconducting Qubit Readout with 4.5 K Noise Temperature from 4 to 9 GHz ABSTRACT. This work presents the design and post-layout simulation of a cryogenic SiGe BiCMOS low-noise amplifier (LNA) for superconducting transmon-qubit readout in quantum processors scaling beyond hundreds of qubits. The target specifications are first established and justified by surveying the state of the art in cryogenic LNAs and modeling the dispersive qubit readout process. The LNA is implemented with three cascaded stages in common-emitter configuration and employs tuned inductive matching and parallel peaking networks in each stage to optimize noise, gain flatness, and bandwidth while maintaining minimal DC power consumption. The amplifier draws only 1.15 mW from a 0.15 V supply and occupies 0.252 mm². Post-layout simulations confirm input/output S-parameter matching better than –10 dB, 41–44 dB gain with <3 dB ripple, <5 K noise temperature across 4–9 GHz, and a worst-case OP1 dB compression point of –19.96.6 dBm. A comparative analysis demonstrates that SiGe BiCMOS offers a favorable trade-off between InP HEMT’s low noise and CMOS’s integration potential for large-scale quantum processors.
13:00	Ainhoa Leal, Luis Montal, Aleksei Nerushenko, Hector Solar and Roc Berenguer A Methodology for Cryogenic Modeling of CMOS Technology Based on BSIM-BULK ABSTRACT. This paper presents a methodology for adjusting the BSIM-BULK model (formerly BSIM6) to simulate the behavior of NMOS and PMOS transistors at cryogenic temperatures, down to 4.2 K. The study analyzes existing characterization data for 28 nm bulk CMOS processes, identifying the threshold voltage (VTH), subthreshold swing (SS), and low-field mobility (μ0) as the transistor parameters most significantly impacted by cryogenic temperatures. Based on this analysis, it is shown that VTH is expected to increase by 100-150 mV from 300 K to 4.2 K, SS approximately 60 mV/decade from 300 K to 4.2 K, and μ0 increases, approximately doubling from 300 K to 4.2 K. In addition, a practical strategy is proposed to modify specific parameters that capture these temperature dependencies on the BSIM-BULK model. The different methods to verify the cryogenic behavior are described and applied to 3 μm/28 nm NMOS and PMOS transistors at simulation level for validation.
13:15	Muhammad Umer Khalid, Trond Ytterdal and Snorre Aunet Robust DTMOS Schmitt-Trigger Circuits in 130 nm SOI CMOS for Sub-100 mV Supply Voltage ABSTRACT. This work proposes to utilize a dynamic threshold voltage MOSFET (DTMOS) technique for Schmitt-Trigger-based circuits that significantly enhances the Ion/Ioff ratio while improving robustness against process variations and mismatch. We designed and validated DTMOS Schmitt Trigger (DST) inverter and NAND gates in a commercial 130 nm SOI CMOS technology. Comprehensive post-layout simulations compared noise margins, power dissipation and propagation delay against standard Schmitt-Trigger implementations. Monte Carlo analysis demonstrates that our proposed circuits achieve 99.9% yield at an ultra-low supply voltage of 60 mV. Evaluation of 11-stage inverter- and NAND-based ring oscillators revealed that the DST-based circuits deliver 24-27% improved energy efficiency and 30-37% reduced delay at 90 mV operation, with only a small area overhead. Furthermore, the minimum operating voltage is reduced by 12.5%, with the DST inverter demonstrating functionality at voltages as low as 40 mV.
13:45	Tom Bergmann, Joel Damiens, Alfonso Hildebrand Rueda, Stephane Lacouture, Remy Cellier and Nacer Abouchi A Programmable, Negative, and Dynamically Biased Sampler for Ultra-Low Power Body-Bias Generators in 18nm FD-SOI PRESENTER: Tom Bergmann ABSTRACT. With the rapid growth of the Internet of Things (IoT), the demand for Ultra-Low Power (ULP) circuits has increased, making power management circuits a critical need. In this context, Fully Depleted Silicon on Insulator (FD-SOI) technology is well suited thanks to its enhanced body-biasing capabilities. An Adaptive Body-Biasing (ABB) circuit is therefore needed and must meet the ULP constraints. To address this challenge, a programmable negative sampling solution optimized for ABB circuits in 18 nm FD-SOI technology is proposed. The sampling rate ranges from 10 kHz to 100 MHz, and the design introduces specific techniques to improve power consumption and area efficiency. The circuit is composed of a modified 6-bit binary-weighted Capacitive Digital to Analog Converter (CDAC) coupled with a dynamic comparator. This solution exhibits only dynamic power consumption, making it a suitable solution for frequency regulation of the ABB circuit. This implementation can achieve a power reduction of up to 460× at 10 kHz and a silicon area reduction of 7.5× compared to a previously implemented design that relies on static bias functions in the same technology.

14:00-14:30 Session 16: Closing Ceremony

Chairs:

Eugenio Villar and Marisa Lopez-Vallejo

14:30-16:00 Lunch
