TALK KEYWORD INDEX
This page contains an index consisting of author-provided keywords.
| 3 | |
| 3D reconstruction | |
| A | |
| Activation Sparsity | |
| actor-based concurrency | |
| Adaptive scheduling | |
| agentic-ai | |
| AI | |
| AI Accelerators | |
| AIPC | |
| Algorithmic Skeletons | |
| AlphaTensor | |
| AMR | |
| Analog in-memory computing | |
| analytics | |
| Apache Spark | |
| Application malleability | |
| Approximate Computing | |
| Approximation ratio | |
| ARM SVE | |
| Articulation points | |
| Asynchronous | |
| Asynchronous Data Processing | |
| Asynchronous Programming | |
| auction | |
| Auto-scaling | |
| auto-scheduling | |
| Autonomy Loops | |
| Autoscaling | |
| B | |
| Batch processing | |
| Benchmark Suite | |
| benchmark suite optimization | |
| Benchmarking | |
| Benchmarks | |
| Biconnected components | |
| Big Data | |
| Bioinformatics | |
| bottleneck detection | |
| Branch prediction | |
| Byzantine failures | |
| C | |
| Cache | |
| Cache management | |
| Caching / Paging | |
| Caching update | |
| carbon cost | |
| Carbon emissions | |
| CAS | |
| CCLs | |
| ccNUMA | |
| CFD | |
| Chapel | |
| Checkpointing | |
| cloud | |
| cloud computing | |
| Cloud computing | |
| Cloud Continuum | |
| Cloud Robotics | |
| cloud-to-thing | |
| Clouds | |
| CloudSim | |
| Clustering | |
| CNN Inference | |
| Co-Design Framework | |
| co-running | |
| Code generation | |
| collaborative system of systems | |
| Collective Algorithms | |
| Collective communication | |
| Competitive analysis | |
| compilation | |
| Compiler Optimization | |
| Composability | |
| Compressed Communication | |
| Computational Efficiency | |
| Computational Fluid Dynamics | |
| computational storage device | |
| computational workflows | |
| compute | |
| Compute Express Link (CXL) | |
| Computed Tomography | |
| Computer architecture | |
| Computing Continuum | |
| Concurrent kernel execution | |
| Consensus | |
| Continuous Profiling | |
| continuum | |
| Convolutional Neural Networks | |
| Convolutional Neural Networks (CNN) | |
| counting queries | |
| CPU | |
| CPU utilization | |
| Critical Path | |
| cross-facility workflows | |
| CUDA | |
| CUDA/HIP | |
| Cut vertices | |
| D | |
| DAG | |
| Data Analysis of Scientific Computing | |
| Data Augmentation | |
| Data Center | |
| Data Classification | |
| Data Compression | |
| data logistic | |
| Data preprocessing | |
| data redistribution | |
| data streaming | |
| Data structure | |
| data structures | |
| Data-intensive applications | |
| database | |
| datacenter | |
| Dataflow Optimization | |
| dataflow programming | |
| Debugging | |
| Decentralized | |
| Decentralized Federated Learning | |
| Decentralized Systems | |
| Decoupled AllReduce | |
| deep domain adaptation | |
| Deep Learning | |
| Deep Learning Serving Systems | |
| Deep Neural Network | |
| dense vectors | |
| Dependency Flagging System | |
| Dependency-aware Transaction Processing | |
| Device Heterogeneity | |
| DevOps | |
| Diffusion Model | |
| Diffusion Model Accelerator | |
| Diffusion Transformers | |
| Digital Twin | |
| Directed Acyclic Task Graph (DATG) scheduling QR factorization | |
| Disaggregated memory | |
| Distributed Computing | |
| Distributed computing | |
| Distributed deep learning | |
| Distributed Dense Linear Algebra | |
| Distributed Sparse Linear Algebra | |
| Distributed Systems | |
| Distributed training | |
| Distributed Training and Inference | |
| distributed workflows | |
| distributed-computing | |
| DMA | |
| DNN Training | |
| Docker containers | |
| DPDK | |
| DSL | |
| Dual-Cache | |
| Dynamic caching method | |
| Dynamic Graphs | |
| Dynamic programming | |
| Dynamic Resource Allocation | |
| dynamic resource allocation | |
| Dynamic Resource Management | |
| Dynamic resource management | |
| Dynamic Resources | |
| E | |
| eBPF | |
| Edge Accelerators | |
| Edge AI | |
| Edge computing | |
| Edge Network | |
| edge platform | |
| Edge-AI | |
| Edge-Cloud Continuum | |
| Education | |
| Efficiency | |
| Efficient Inference on Local platforms | |
| Elastic Computing | |
| Elastic HPC | |
| Electronic Design Automation | |
| Elixir | |
| Embedding Table | |
| Emerging Memory System | |
| Empirical Comparison | |
| energy awareness | |
| Energy consumption | |
| Energy Efficiency | |
| Energy measurement | |
| energy performance | |
| Energy-Aware 3D Gaussian Splatting | |
| energy-aware algorithms | |
| Energy-Aware Scheduling | |
| Energy-aware software engineering | |
| Ethernet | |
| Evolutionary computation | |
| evolving applications | |
| F | |
| FaaS | |
| Fault Tolerance | |
| Fault-Free | |
| Federated Learning | |
| FIM | |
| First-principles materials simulation | |
| Floating-Point Non-Associativity | |
| Flooding | |
| Flowshop Scheduling | |
| fog | |
| FPGA | |
| FPGA Accelerator | |
| FPGA Demonstrator | |
| FPGAs | |
| function-as-a-service | |
| Functional array languages | |
| G | |
| GANs | |
| garbage collection | |
| Gate sizing | |
| GENE | |
| GENE-X | |
| Generate code | |
| Generative Adversarial Networks | |
| Genomics | |
| Gigapixel Images | |
| GPU | |
| GPU Acceleration | |
| GPU allocation | |
| GPU architectures | |
| GPU cache management | |
| GPU code generation | |
| GPU Computing | |
| GPU parallel | |
| GPU power modeling | |
| GPU programming | |
| GPU scheduling | |
| GPUs | |
| Grace Hopper | |
| Gradient Compression | |
| Graph Neural Network (GNN) | |
| Graph Neural Networks | |
| graph partitioning | |
| Graph Processing | |
| Graph Sampling | |
| GraphBLAS | |
| Graphics Processing Unit (GPU) | |
| Green Computing | |
| Green's Function | |
| Green500 | |
| Grid’5000 | |
| GROMACS | |
| H | |
| hardware acceleration | |
| Hardware Accelerator | |
| Hardware overprovisioning | |
| Hardware-Efficient Inference | |
| Heterogeneous | |
| Heterogeneous architecture | |
| Heterogeneous computing | |
| Heterogeneous Density Problem | |
| High Performance Computing | |
| High performance training | |
| High-Performance Computing | |
| High-Performance Computing (HPC) | |
| High-performance computing (HPC) systems | |
| High-performance numerical computing | |
| HPC | |
| HPC applications | |
| HPC Cluster | |
| HPC Edge-To-Cloud | |
| HPC workloads | |
| Hybrid DMA-Cache | |
| Hyper-parameter optimization | |
| I | |
| I/O malleability | |
| imperfect verification | |
| Importance Sampling | |
| in situ | |
| Independent Learning | |
| Index structures | |
| Inference | |
| Inference Acceleration | |
| Inference Optimization | |
| Intermediate language | |
| IoT Sensors | |
| IR | |
| iterative algorithm | |
| J | |
| Job Scheduling | |
| K | |
| Kernel pairing | |
| Knowledge Graph (KG) | |
| Kubernetes | |
| KV cache | |
| L | |
| Large deep neural network training | |
| Large Graph | |
| Large Language Models | |
| Large-scale graphs | |
| latency detection | |
| lazy evaluation | |
| LBM | |
| Livestock Monitoring | |
| LLM | |
| LLM Inference | |
| LLM serving | |
| LLMs | |
| LLVM | |
| Load Balancing | |
| load-balancing | |
| log-structured file system | |
| loop fusion | |
| loop tiling | |
| Low-Power GPU Rendering | |
| M | |
| Machine Learning | |
| Machine Learning Workflows | |
| Malleability | |
| Medical applications | |
| Memory Hierarchy | |
| Memory Mapping | |
| Memory Resource Provisioning | |
| Memory Saving | |
| Metaheuristics | |
| Meteorological model | |
| Mixture-of-Experts | |
| MLIR | |
| Model Compression | |
| Model Partitioning | |
| Modelling | |
| molecular dynamics | |
| molecular dynamics simulation | |
| Monitoring | |
| MPI | |
| MPI Collective I/O | |
| MT-3000 | |
| Multi-Agent Reinforcement Learning | |
| Multi-DNN accelerators | |
| Multi-DNN Inference Serving | |
| Multi-GPU Training | |
| multi-rail communication | |
| multi-site workflows | |
| Multi-threaded | |
| multilinear algebra | |
| N | |
| Neural networks | |
| Neural operators | |
| Neuromorphic Computing | |
| noise injection | |
| nonblocking execution | |
| Nonlinear constrained optimization | |
| Novel Architectures | |
| Nowcasting | |
| numerical linear algebra | |
| O | |
| Observability | |
| Offloading | |
| Omnitrees | |
| On-chip memory | |
| Online training | |
| OpenCL benchmarks | |
| OpenMP | |
| Operational data analytics (ODA) | |
| Operation Fusion | |
| Operation Split | |
| Optimal Transport | |
| Optimizations | |
| Osteosarcoma | |
| Out-of-core processing | |
| oversubscription | |
| P | |
| PageRank | |
| Parallel | |
| Parallel algorithms | |
| Parallel Branch-and-Bound | |
| Parallel Computing | |
| Parallel Computing on GPUs | |
| parallel computing performance | |
| Parallel Graph Computations | |
| Parallel Processing | |
| Parallel Programming | |
| Parallel Programming Automation | |
| Parallel Programming Models | |
| Parallel SGD | |
| Parallel skeletons | |
| parallel-in-time | |
| Parallelism | |
| Parsl | |
| Particle Swarm Optimization | |
| Partitioning | |
| Peer-to-peer Networks | |
| Performance | |
| performance analysis | |
| Performance Analysis Tools | |
| Performance evaluation | |
| Performance optimization | |
| Performance Prediction | |
| Performance Tuning | |
| Permissioned blockchain framework | |
| Phase Analysis | |
| PMIx | |
| Polyhedral compilation | |
| Portability | |
| Power consumption | |
| Privacy-Preserving Machine Learning | |
| Processing-in-Memory | |
| Program Analysis | |
| Programming | |
| Programming Languages | |
| Programming models | |
| Pruning | |
| PTX | |
| Pyramidal Analysis | |
| Q | |
| QoS | |
| quality of service | |
| Quantization | |
| Quantum Algorithm | |
| Quantum Data Storage | |
| Quantum Image Processing | |
| Quantum Signal Processing | |
| R | |
| Radio Astronomy | |
| Random Walks | |
| Real-Time Rendering Performance | |
| Recommender system | |
| Reduced precision | |
| refactoring | |
| Rejection Sampling | |
| Remote Offloading | |
| Reproducibility | |
| Residual | |
| resilience | |
| Resource Adaptivity | |
| Resource Allocation | |
| Resource Management | |
| Resource usage coordination | |
| RISC-V | |
| RMA | |
| ROS2 | |
| RTL simulation | |
| Run-off | |
| Runtime Analysis Tools | |
| Runtime systems | |
| Rust | |
| S | |
| SaaS | |
| Sampled Simulation | |
| Scalable | |
| Scalable Vector Extension | |
| scalar product | |
| Scheduling | |
| scheduling | |
| Scheduling and resource management | |
| Scientific Workflows | |
| Scientific workflows | |
| Sequencing read alignment | |
| Sequential Least-Squares Quadratic Programming (SLSQP) | |
| Serverless | |
| Serverless Computing | |
| service-level agreement | |
| Shared Memory | |
| silent error | |
| Simulation | |
| Simulation Point | |
| simulations | |
| Skipping Non-Zero (SkipNZ) | |
| SLA | |
| SLACK | |
| Slurm | |
| SMT Processors | |
| Software architecture | |
| Software Development | |
| software reengineering | |
| Sparse Architectures | |
| Sparse LU factorization | |
| Sparse Matrix Multiplication | |
| Sparse Tensor Cores | |
| sparse vectors | |
| Spiking Neural Network | |
| Staleness | |
| Statistical Analysis | |
| Stencil | |
| stencil operations | |
| Stream Processing | |
| Streaming Graph Processing Systems (SGPSs) | |
| Sub-batching and Sub-batch merging | |
| Subtoken | |
| Sunway architecture | |
| supercomputers | |
| Sustainable AI | |
| sustainable computing | |
| Swirl | |
| SYCL | |
| system throughput | |
| Systems | |
| Systems for Machine Learning | |
| T | |
| Task graph | |
| Task graph computing system | |
| task graph parallelism | |
| Task-Based Programming | |
| Task-Operator Co-Scheduling | |
| Task-parallel linear algebra computations | |
| Tasking | |
| TBA1 | |
| TBA2 | |
| TBA3 | |
| tensor contraction ordering | |
| tensor decomposition | |
| tensor-train decomposition | |
| Termination | |
| testbed | |
| Thread-to-Core Allocation Policies | |
| Tianhe new-generation supercomputer | |
| Top500 | |
| Trace-driven Simulation | |
| Tracing and monitoring | |
| transfer learning | |
| Transform code | |
| Transformer | |
| Triangle Counting | |
| Trusted Execution Environment | |
| Tsunami Forecasting | |
| V | |
| Vector Processor | |
| Vector Unit | |
| Vertical scaling | |
| virtual machines | |
| virtual topologies | |
| Virtualization | |
| Vision Transformer | |
| Visualization | |
| W | |
| Weather Radar | |
| Wedge-Parallel Approaches | |
| WHPC | |
| Workflows | |
| workload prediction | |
| Write optimization | |