2018 IEEE High Performance
Extreme Computing Conference
(HPEC ’18)
Twenty-second Annual HPEC Conference
25 - 27 September 2018
Westin Hotel, Waltham, MA USA
Database Operations in D4M.jl
Lauren Milechin (MIT EAPS)*; Vijay Gadepally (MIT Lincoln Laboratory); Jeremy Kepner (MIT Lincoln Laboratory)
Each step in the data analytics pipeline is important, including database ingest and query. The D4M-Accumulo database connector
has allowed analysts to quickly and easily ingest to and query from Apache Accumulo using MATLAB®/GNU Octave syntax. D4M.jl,
a Julia implementation of D4M, provides much of the functionality of the original D4M implementation to the Julia community. In this
work, we extend D4M.jl to include many of the same database capabilities that the MATLAB®/GNU Octave implementation
provides. Here we describe the D4M.jl database connector, demonstrate how it can be used, and show that its performance is
comparable to or better than that of the original MATLAB®/GNU Octave implementation.
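The associative-array model that D4M exposes can be illustrated with a minimal sketch. The class and method names below are hypothetical Python stand-ins chosen for illustration; the real D4M.jl API is written in Julia and queries Apache Accumulo with MATLAB®-style indexing rather than the `query` method shown here.

```python
# Minimal sketch of the D4M associative-array model (hypothetical Python
# stand-in; the real D4M.jl uses Julia syntax backed by Accumulo tables).
class Assoc:
    """Sparse (row, col) -> value map supporting row/col subset queries."""
    def __init__(self, triples):
        self.data = {(r, c): v for r, c, v in triples}

    def query(self, rows=None, cols=None):
        # None plays the role of ':' in D4M indexing: match everything
        return Assoc((r, c, v) for (r, c), v in self.data.items()
                     if (rows is None or r in rows)
                     and (cols is None or c in cols))

    def triples(self):
        return sorted((r, c, v) for (r, c), v in self.data.items())

# Ingest a few document/word counts, then query one column
A = Assoc([("doc1", "word|apple", 2), ("doc1", "word|pear", 1),
           ("doc2", "word|apple", 3)])
hits = A.query(cols={"word|apple"})
print(hits.triples())  # -> [('doc1', 'word|apple', 2), ('doc2', 'word|apple', 3)]
```

In the actual connector, the same row/column query semantics are pushed down to Accumulo server-side scans rather than evaluated in memory.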
High-Performance Embedded Computing (HPEC) and Machine Learning Demonstrated in Flight Using Agile Condor®
Mark Barnell (Air Force Research Laboratory)*
ABSTRACT: For the first time ever, advanced machine learning (ML) compute architectures, techniques, and methods were
demonstrated in flight (in June-August 2017 and May 2018) on the recently invented high-performance embedded computing
(HPEC) architecture called Agile Condor (U.S. Patent Pending #5497944). The Air Force Research Laboratory (AFRL) Information
Directorate Advanced Computing and Communications Division continues to develop and demonstrate new computing
architectures, designed to provide HPEC ground and airborne (pod-based) solutions to meet operational and tactical, real-time
processing for intelligence, surveillance, and reconnaissance (ISR) mission needs. Agile Condor is a scalable system based on
open industry standards that continues to demonstrate the ability to increase, far beyond the current state-of-the-art, computational
capability within the restrictive size, weight and power (SWaP) constraints of unmanned aircraft systems’ external “pod” payloads.
This system is enabling the exploration and development of innovative system solutions to meet future Air Force real-time HPEC
needs; e.g., multi-mission and multi-function ISR processing and exploitation. The Agile Condor system innovations include: (1) a
cost-effective and flexible compute architecture, (2) support for multiple missions, (3) facilitating realistic, repeatable
experimentation, and (4) enabling related experimentation and applications for operational exploitation of a wide range of
information products. During the recent demonstration and data collection efforts, information was processed simultaneously
using two distinct ML approaches. Running both in parallel enabled real-time trade-space analyses and the ability
to immediately contrast and compare the approaches. The data processing also included the exploitation of data from multiple
sensors, such as optical, full-motion video (FMV), and radar. In this way, Agile Condor’s heterogeneous computing architecture
continues to accelerate the development of extreme computing technologies and ML algorithms necessary to exploit data on a
neuromorphic compute platform upstream, closer to the sensors. The ML techniques that can be utilized include, but are not limited
to, specialized deep neural networks (DNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs) that
support sequential/temporal data products and applications for exploitation, pattern recognition, and autonomous operation.
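The pattern of fanning one sensor frame out to two ML approaches at once, then immediately comparing their outputs, can be sketched as below. The detector functions are hypothetical stubs invented for illustration; the real system runs trained models on the pod's heterogeneous hardware, not these placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Two stand-in "ML approaches" scoring the same sensor frame (hypothetical
# stubs; in the fielded system these would be distinct trained models).
def detector_a(frame):
    return {"label": "vehicle", "score": 0.91}

def detector_b(frame):
    return {"label": "vehicle", "score": 0.84}

def process_frame(frame):
    # Dispatch the same frame to both approaches concurrently, then
    # compare the results to support an immediate trade-space analysis.
    with ThreadPoolExecutor(max_workers=2) as pool:
        fut_a = pool.submit(detector_a, frame)
        fut_b = pool.submit(detector_b, frame)
        a, b = fut_a.result(), fut_b.result()
    return {"agree": a["label"] == b["label"], "a": a, "b": b}

result = process_frame(frame={"pixels": None})
print(result["agree"])  # -> True
```

The side-by-side structure is what makes the contrast immediate: both outputs for a frame exist at the same time, so disagreement can be flagged in real time rather than in post-flight analysis.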
All-at-once Decomposition of Coupled Billion-scale Tensors in Apache Spark
Aditya Gudibanda (Reservoir Labs)*; Thomas Henretty (Reservoir Labs); Muthu M Baskaran (Reservoir Labs); James Ezick
(Reservoir Labs); Richard Lethin (Reservoir Labs)
As the scale of unlabeled data rises, it becomes increasingly valuable to perform scalable, unsupervised data analysis. Tensor
decompositions, which have been empirically successful at finding meaningful cross-dimensional patterns in multidimensional data,
are a natural candidate to test for scalability and meaningful pattern discovery in these massive real-world datasets. Furthermore,
the production of big data of different types necessitates the ability to mine patterns across disparate sources. The coupled tensor
decomposition framework captures this idea by supporting the decomposition of several tensors from different data sources
together. We present a scalable implementation of coupled tensor decomposition on Apache Spark. We introduce nonnegativity and
sparsity constraints, and perform all-at-once quasi-Newton optimization of all factor matrix parameters. We present results showing
the billion-scale scalability of this novel implementation and also demonstrate the high level of interpretability in the components
produced, suggesting that coupled, all-at-once tensor decompositions on Apache Spark represent a promising framework for large-
scale, unsupervised pattern discovery.
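The core idea of a coupled, all-at-once, nonnegativity-constrained decomposition can be shown on a toy problem. This sketch substitutes projected gradient descent for the paper's quasi-Newton optimizer, uses two small order-2 tensors (matrices) in NumPy rather than billion-scale tensors on Apache Spark, and all variable names are illustrative; it shows only the coupling structure, where one factor is shared across both datasets and all factors are updated simultaneously.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 30, 20, 25, 4

# Ground-truth nonnegative factors; A is shared across both datasets
A = rng.random((I, R)); B = rng.random((J, R)); C = rng.random((K, R))
X = A @ B.T   # dataset 1 (e.g., entity x attribute counts)
Y = A @ C.T   # dataset 2 from a different source, coupled to X via A

# Random nonnegative initialization of the factor estimates
Ah = rng.random((I, R)); Bh = rng.random((J, R)); Ch = rng.random((K, R))

def loss():
    return np.sum((X - Ah @ Bh.T) ** 2) + np.sum((Y - Ah @ Ch.T) ** 2)

initial_loss = loss()
lr = 1e-3
for _ in range(2000):
    Rx = Ah @ Bh.T - X
    Ry = Ah @ Ch.T - Y
    # All-at-once: gradients for every factor computed from the same
    # iterate, and the shared factor's gradient couples both datasets
    gA = 2 * (Rx @ Bh + Ry @ Ch)
    gB = 2 * Rx.T @ Ah
    gC = 2 * Ry.T @ Ah
    # Projection onto the nonnegative orthant enforces the constraint
    Ah = np.maximum(Ah - lr * gA, 0)
    Bh = np.maximum(Bh - lr * gB, 0)
    Ch = np.maximum(Ch - lr * gC, 0)

print(loss() < initial_loss)  # -> True
```

Scaling this scheme to billions of nonzeros is exactly where a distributed engine like Spark comes in: the residual and gradient computations are large sparse products that partition naturally across a cluster.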
Interactive Launch of 16,000 Microsoft Windows Instances on a Supercomputer
Michael S Jones (MIT Lincoln Laboratory)*; Jeremy Kepner (MIT Lincoln Laboratory)
Simulation, machine learning, and data analysis require a wide range of software, some of which depends on specific operating
systems such as Microsoft Windows. Running this software interactively on massively parallel supercomputers can present many
challenges. Traditional methods of scaling Microsoft Windows applications to run on thousands of processors have typically relied
on heavyweight virtual machines that can be inefficient and slow to launch on modern manycore processors. This paper describes a
unique approach using the Lincoln Laboratory LLMapReduce technology in combination with the Wine Windows compatibility layer
to rapidly and simultaneously launch and run Microsoft Windows applications on thousands of cores on a supercomputer.
Specifically, this work demonstrates launching 16,000 Microsoft Windows applications in 5 minutes running on 16,000 processor
cores. This capability significantly broadens the range of applications that can be run at large scale on a supercomputer.
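The shape of the approach can be sketched as a dry run that only constructs per-task command lines. LLMapReduce is MIT Lincoln Laboratory scheduler tooling not reproduced here, so the function below is a hypothetical stand-in that builds the kind of Wine invocations a scheduler would dispatch, one per core; the executable name, `--task-id` flag, and prefix paths are all invented for illustration.

```python
# Hypothetical sketch: build one Wine command line per task (dry run only;
# nothing is executed). A scheduler such as LLMapReduce would dispatch
# these across the allocated processor cores.
def wine_tasks(exe, n_tasks):
    # A per-task WINEPREFIX gives each launch its own isolated Wine state,
    # so thousands of simultaneous instances do not contend on one prefix.
    return [f"WINEPREFIX=/tmp/wine_{i} wine {exe} --task-id {i}"
            for i in range(n_tasks)]

cmds = wine_tasks("analysis.exe", 4)
for c in cmds:
    print(c)
```

Because each command is self-contained, launch time is dominated by how fast the scheduler can fan the list out, which is what makes a 5-minute launch of 16,000 instances plausible compared to booting 16,000 heavyweight virtual machines.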
Thursday, September 27, 2018
High Performance Data Analysis 2
1:00-2:40 in Eden Vale C1/C2
Chair: Sid Samsi / MIT