2015 IEEE High Performance
Extreme Computing Conference
(HPEC ’15)
Nineteenth Annual HPEC Conference
15 - 17 September 2015
Westin Hotel, Waltham, MA USA
Graph & Sparse Data 1
1:00-2:40 in Eden Vale C1 - C2
Chair: Richard Lethin / Reservoir
Invited Talk: Faster Parallel Graph BLAS Kernels and
New Graph Algorithms in Matrix Algebra
Dr. Aydin Buluc, Research Scientist - Lawrence Berkeley
National Lab
[Best Student Paper Finalist]
Graphulo Implementation of Server-Side Sparse
Matrix Multiply in the Accumulo Database
Dylan Hutchison, University of Washington, Jeremy
Kepner, Vijay Gadepally, MIT Lincoln Laboratory, Adam
Fuchs, Sqrrl
The Apache Accumulo database excels at distributed
storage and indexing and is ideally suited for storing
graph data. Many big data analytics compute on
graph data and persist their results back to the
database. These graph calculations are often best
performed inside the database server. The
GraphBLAS standard provides a compact and
efficient basis for a wide range of graph applications
through a small number of sparse matrix operations.
In this article, we discuss a server-side
implementation of GraphBLAS sparse matrix
multiplication that leverages Accumulo’s native, high-
performance iterators. We compare the mathematics
and performance of inner and outer product
implementations, and show how an outer product
implementation achieves optimal performance near
Accumulo’s peak write rate. We offer our work as a
core component to the Graphulo library that will
deliver matrix math primitives for graph analytics
within Accumulo.
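The inner- versus outer-product distinction this abstract compares can be illustrated in a minimal, database-free sketch (the dict-of-coordinates layout and function names below are illustrative, not Graphulo's API). The inner product finishes each output entry before emitting it, while the outer product streams additive partial sums, the pattern a summing combiner can absorb at write time:

```python
from collections import defaultdict

def inner_product(A, B):
    """C[i,j] = sum over k of A[i,k]*B[k,j]: each entry is complete
    when emitted, but needs a row of A and a column of B together."""
    rows_a, cols_b = defaultdict(dict), defaultdict(dict)
    for (i, k), v in A.items():
        rows_a[i][k] = v
    for (k, j), v in B.items():
        cols_b[j][k] = v
    C = {}
    for i, arow in rows_a.items():
        for j, bcol in cols_b.items():
            s = sum(arow[k] * bcol[k] for k in arow.keys() & bcol.keys())
            if s:
                C[(i, j)] = s
    return C

def outer_product(A, B):
    """For each shared index k, emit column k of A times row k of B;
    partial sums for the same (i, j) combine additively, so results
    can be written out as a stream of increments."""
    cols_a, rows_b = defaultdict(dict), defaultdict(dict)
    for (i, k), v in A.items():
        cols_a[k][i] = v
    for (k, j), v in B.items():
        rows_b[k][j] = v
    C = defaultdict(float)
    for k in cols_a.keys() & rows_b.keys():
        for i, a in cols_a[k].items():
            for j, b in rows_b[k].items():
                C[(i, j)] += a * b
    return dict(C)
```

Both routines compute the same product; the difference is access pattern, which is what makes the outer product a better fit for a write-optimized store.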
An Accelerated Procedure for Hypergraph
Coarsening on the GPU
Lin Cheng, Hyunsu Cho, Peter Yoon, Trinity College
One of the obstacles in accelerating sparse graph
applications using GPUs is load imbalance, which in
certain cases causes threads to stall. We investigate
a specific application known as hypergraph
coarsening and explore a technique for addressing
load imbalance. A hypergraph is a generalization
of a graph in which one edge may connect more than
two nodes. Many problems of interest may be
expressed in terms of optimal partitioning of
hypergraphs where the edge cut is minimized. The
most costly step in hypergraph partitioning is
hypergraph coarsening, the process of grouping
nodes with similar connectivity patterns into one
node to yield a new hypergraph with fewer nodes.
Hypergraph coarsening proves to be computationally
challenging on GPUs because many hypergraphs
exhibit an irregular distribution of connections. To
address the resulting load imbalance, we explore a
novel task allocation scheme to distribute work more
evenly among GPU threads.
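The coarsening step itself, grouping nodes with similar connectivity into a single coarse node, can be sketched in a few lines. This hedged illustration takes the strictest case (nodes whose hyperedge memberships are identical are merged); real coarseners merge merely *similar* nodes via matching heuristics, and the data layout here is illustrative:

```python
def coarsen(hyperedges):
    """hyperedges: list of sets of node ids. Merge nodes that appear
    in exactly the same set of hyperedges into one coarse node."""
    # signature of a node = the set of hyperedges containing it
    membership = {}
    for e_id, pins in enumerate(hyperedges):
        for v in pins:
            membership.setdefault(v, set()).add(e_id)
    # group nodes by identical signature
    groups = {}
    for v, sig in membership.items():
        groups.setdefault(frozenset(sig), []).append(v)
    # assign coarse ids (ordered by smallest member for determinism)
    mapping = {}
    for c_id, vs in enumerate(sorted(groups.values(), key=min)):
        for v in vs:
            mapping[v] = c_id
    # rewrite each hyperedge over the coarse node ids
    coarse = [sorted({mapping[v] for v in pins}) for pins in hyperedges]
    return mapping, coarse
```

On a GPU the per-node work in this loop varies with connection count, which is exactly the load-imbalance problem the paper's task-allocation scheme targets.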
[Best Paper Finalist]
A Task-Based Linear Algebra Building Blocks
Approach for Scalable Graph Analytics
Michael M. Wolf, Jonathan W. Berry, Dylan T. Stark, Sandia
It is challenging to obtain scalable HPC performance
on real applications, especially for data science
applications with irregular memory access and
computation patterns. To drive co-design efforts in
architecture, system, and application design, we are
developing miniapps representative of data science
workloads. These in turn stress the state of the art in
Graph BLAS-like Graph Algorithm Building Blocks
(GABB). In this work, we outline a Graph BLAS-like,
linear algebra based approach to miniTri, one such
miniapp. We describe a task-based prototype
implementation and give initial scalability results.
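The linear-algebra flavor of a triangle miniapp can be sketched with the classic identity that, for a simple undirected graph with 0/1 adjacency matrix A, the triangle count is trace(A^3)/6. This is a generic illustration of the matrix-based approach, not Sandia's miniTri formulation:

```python
def count_triangles(adj):
    """adj: dict node -> set of neighbors (symmetric, no self-loops).
    Summing |N(i) & N(j)| over adjacent pairs (i, j) is the sparse
    evaluation of (A^2) masked by A; each triangle is counted 6 times
    (3 vertex choices x 2 orientations)."""
    total = 0
    for i, nbrs in adj.items():
        for j in nbrs:
            total += len(nbrs & adj[j])  # common neighbors of i and j
    return total // 6
```

The set intersections are the irregular-memory-access hot spot that makes such workloads a useful stress test for building-block libraries.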
Sampling Large Graphs for Anticipatory Analytics
Lauren Edwards, Luke Johnson, Maja Milosavljevic, Vijay
Gadepally, Benjamin A. Miller, MIT Lincoln Laboratory
The characteristics of Big Data - often dubbed the
3Vs for volume, velocity, and variety - will continue to
outpace the ability of computational systems to
process, store, and transmit meaningful results.
Traditional techniques for dealing with large datasets
often include the purchase of larger systems, greater
human-in-the-loop involvement, or more complex
algorithms. We are investigating the use of sampling
to mitigate these challenges, specifically sampling
large graphs. Often, large datasets can be
represented as graphs where data entries may be
edges and vertices may be attributes of the data. In
particular, we present the results of sampling for the
task of link prediction. Link prediction estimates
the probability of a new edge forming
between two vertices of a graph, and has numerous
application areas in understanding social or
biological networks. In this paper we propose a
series of techniques for the sampling of large
datasets. In order to quantify the effect of these
techniques, we present the quality of link prediction
tasks on sampled graphs, and the time saved in
calculating link prediction statistics on these sampled
graphs.
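The pipeline the abstract describes, sample the graph, then compute a link-prediction statistic on the smaller graph, can be sketched as follows. Uniform edge sampling and the common-neighbors score are standard choices used here for illustration; the paper's specific sampling techniques may differ:

```python
import random

def sample_edges(edges, p, seed=0):
    """Keep each edge independently with probability p."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]

def common_neighbors_score(edges, u, v):
    """A standard link-prediction statistic: the number of neighbors
    shared by u and v. Higher scores suggest a likelier future edge."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return len(adj.get(u, set()) & adj.get(v, set()))
```

Computing the statistic on the sampled edge list rather than the full graph is where the time savings reported in the paper come from; the quality question is how much the scores degrade as p shrinks.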
Wednesday September 16