ABSTRACT. Range aggregation queries are a fundamental operation
in data analysis: they compute statistics such as the maximum,
average, and standard deviation of the subset of records
specified by the range condition of the query.
Since data analysis has a trial-and-error nature, data scientists
repeatedly issue large numbers of range aggregation queries with
varying range conditions, resulting in heavy workloads.
Partial aggregation methods accelerate such repetitive range
aggregation queries by partitioning the data into several groups,
caching a partial query result for each group, and answering
queries by combining those partial results, leveraging the
divide-and-conquer characteristic of aggregation operations.
However, conventional partial aggregation methods face a
severe trade-off between the amount of cached aggregations
and performance: they need to cache n aggregation values
to avoid I/Os, where n is the size of the domain of the
attributes in the selection conditions.
Motivated by this issue, this paper presents the
Adaptive Partial Aggregation Tree (APA-Tree),
which can reduce the number of cached aggregation values
to an arbitrarily defined s. Under this constraint, APA-Tree
tries to minimize I/Os under the locality-of-reference
assumption: it computes aggregation values finely
on frequently accessed data and coarsely on rarely accessed
data.
As a result, APA-Tree successfully accelerates
repetitive aggregation queries with a smaller number of
pre-computed aggregation values than conventional
methods. Experimental results confirm that APA-Tree
outperforms conventional partial aggregation methods in
terms of the amount of I/O, especially in skewed workloads.
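
The general partial aggregation idea described in this abstract can be
illustrated with a minimal sketch. It is not the APA-Tree itself: it assumes
fixed-size blocks over an in-memory list, and the names Partial, build_cache,
and range_aggregate are hypothetical. Fully covered blocks are answered from
cached partial results, and only the records at the edges of the range are
read individually, with the count of such reads standing in for I/O.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Partial:
    """Partial aggregate of one group; the average is total / count."""
    total: float
    count: int
    maximum: float


def build_cache(values: List[float], block: int) -> List[Partial]:
    """Partition values into fixed-size blocks (an assumption of this
    sketch) and cache one partial aggregate per block."""
    cache = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        cache.append(Partial(sum(chunk), len(chunk), max(chunk)))
    return cache


def range_aggregate(values: List[float], cache: List[Partial], block: int,
                    lo: int, hi: int) -> Tuple[Partial, int]:
    """Aggregate values[lo:hi]. Returns the result and the number of
    individual records read (a stand-in for I/O)."""
    total, count, maximum, raw_reads = 0.0, 0, float("-inf"), 0
    i = lo
    while i < hi:
        if i % block == 0 and i + block <= hi:
            # The whole block lies inside the range: combine its cached partial.
            p = cache[i // block]
            total += p.total
            count += p.count
            maximum = max(maximum, p.maximum)
            i += block
        else:
            # Edge of the range: fall back to reading the record itself.
            v = values[i]
            total += v
            count += 1
            maximum = max(maximum, v)
            raw_reads += 1
            i += 1
    return Partial(total, count, maximum), raw_reads


values = [float(v) for v in range(1, 21)]   # 1.0 .. 20.0
cache = build_cache(values, block=4)
result, raw_reads = range_aggregate(values, cache, 4, lo=3, hi=14)
```

With smaller blocks, fewer edge records are read but more partial results
must be cached, which is the trade-off the abstract refers to.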
Enhancement of Algebraic Block Multi-Color Ordering for ILU Preconditioning and Its Performance Evaluation in Preconditioned GMRES Solver
SPEAKER: unknown
ABSTRACT. Algebraic block multi-color ordering is known as a parallelization method for a sparse triangular solver. In the previous work, we confirmed the effectiveness of the method in a multi-threaded ICCG solver for a linear system with a symmetric coefficient matrix. In this study, we enhance the method so as to deal with an unsymmetric coefficient matrix. We develop a multi-threaded ILU-GMRES solver based on the enhanced method and evaluate its performance in terms of both the runtime and the number of iterations.
A Tool Supported Approach to Precisely Identify Memory Performance Problems
SPEAKER: unknown
ABSTRACT. In high performance computing applications, many
performance problems are caused by the memory system.
Such performance bugs are hard to identify precisely.
Thus, analysis tools play an important role in performance
optimization. We present a specialized memory
performance analysis tool which relies on Linux Perf
to interface with the hardware. Our tool design is simple,
easy to use, easily extensible, and provides support for
many existing and upcoming processors without significant
implementation effort. This tool is able to guide
programmers towards the most promising optimization
opportunities. It can report the location in the
source code and the objects concerned and, in most cases,
point out the specific performance problem occurring at
this location. We demonstrate this in a number of case
studies which include applications from PARSEC.