Understanding the Effect of Task Granularity on Execution Time in Asynchronous Many-Task Runtime Systems
SparCity: An Optimization and Co-Design Framework for Sparse Computation
A GPU Architecture Aware Fine-Grain Pruning Technique for Deep Neural Networks
An Automata-Based Approach to Profit Optimization of Cloud Brokers in IaaS Environment
Kernel Fusion in OpenCL
Integrating Fog Computing and Blockchain Technology for Applications with Enhanced Trust and Privacy
European Processor Initiative and EU Projects Towards Exascale Computing
A Log-Linear (2+5/6)-Approximation Algorithm for Parallel Machine Scheduling with a Single Orthogonal Resource
FleCSI 2.0: the Flexible Computational Science Infrastructure Project
Enabling Support for Zero Copy Semantics in an Asynchronous Task-Based Programming Model
Data Management in EpiGraph COVID-19 Epidemic Simulator
An MPI-Parallel Algorithm for Mapping Complex Networks onto Hierarchical Architectures
On Using Modern C++ and Nested Recursive Task Parallelism for HPC Applications with AllScale
Decentralisation over Privacy: an Analysis of the Bisq Trade Protocol
An Experimental Study of SYCL Task Graph Parallelism for Large-Scale Machine Learning Workloads
Low-Overhead Reuse Distance Profiling Tool for Multicore
Algorithm Design for Tensor Units
G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU
High Performance Computing with Java Streams
Scalable Hybrid Parallel ILU Preconditioner to Solve Sparse Linear Systems
Accelerating Graph Applications Using Phased Transactional Memory
A Fault Tolerant and Deadline Constrained Sequence Alignment Application on Cloud-Based Spot GPU Instances
A Novel Bi-Objective Optimization Algorithm on Heterogeneous HPC Platforms for Applications with Continuous Performance and Linear Energy Profiles
Smart Contract Based Public Procurement to Fight Corruption
Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization
Exploring the Impact of Node Failures on the Resource Allocation for Parallel Jobs
Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems
Automatic Low-Overhead Load-Imbalance Detection in MPI Applications
Efficient GPU Computation Using Task Graph Parallelism
Towards a Broadcast Time-Lock Based Token Exchange Protocol
HPC for Bioinformatics: the Genetic Sequence Comparison Quest for Performance
Sustaining Performance While Reducing Energy Consumption: a Control Theory Approach
Porting Sparse Linear Algebra to Intel GPUs
GPU Accelerated Mahalanobis-Average Hierarchical Clustering Analysis
Continuous Self-Adaptation of Control Policies in Automatic Cloud Management
Interferences Between Communications and Computations in Distributed HPC Systems
Application-Based Fault Tolerance for Numerical Linear Algebra at Large Scale
SMART: a Tool for Trust and Reputation Management in Social Media
Particle-In-Cell Simulation using Asynchronous Tasking
Plan-Based Job Scheduling for Supercomputers with Shared Burst Buffers
Towards an Efficient Sparse Storage Format for the SpMM Kernel in GPUs
Data Management Model to Program Irregular Compute Kernels on FPGA: Application to Heterogeneous Distributed System
Collaborative, Distributed, Scalable and Low-Cost Platform Based on Microservices, Containers, Mobile Devices and Cloud Services to Solve Compute-Intensive Tasks
Monitoring Collective Communication Among GPUs
E2EWatch: an End-to-End Anomaly Diagnosis Framework for Production HPC Systems
Elastic Deep Learning Using Knowledge Distillation with Heterogeneous Computing Resources
Locality-Aware Scheduling of Independent Tasks for Runtime Systems
Taming Tail Latency in Key-Value Stores: a Scheduling Perspective
Fault-Tolerant LU Factorisation Is Low Cost
Outsmarting the Atmospheric Turbulence for Ground-Based Telescopes Using the Stochastic Levenberg-Marquardt Method