2019 IEEE High Performance Extreme Computing Conference (HPEC ’19)
Twenty-third Annual HPEC Conference
24 - 26 September 2019
Westin Hotel, Waltham, MA USA
Thursday, September 26, 2019

BRAIDS: Boosting Resilience through Artificial Intelligence and Decision Support 1
1:00-2:40 in Eden Vale A1/A2
Chair: Alexia Schulz / MIT-LL, Pierre Trepagnier / MIT-LL, Igor Linkov / ACE, Matthew Bates / ACE

Combining Tensor Decompositions and Graph Analytics to Provide Cyber Situational Awareness at HPC Scale
James Ezick, Tom Henretty, Muthu Baskaran, Richard Lethin (Reservoir Labs), John Feo (PNNL), Tai-Ching Tuan, Christopher Coley (Univ. Maryland), Leslie Leonard, Rajeev Agrawal, William Glodek, Ben Parsons (U.S. Army ERDC)
This paper describes MADHAT (Multidimensional Anomaly Detection fusing HPC, Analytics, and Tensors), an integrated workflow that demonstrates the applicability of HPC resources to the problem of maintaining cyber situational awareness. MADHAT combines two high-performance packages: ENSIGN for large-scale sparse tensor decompositions and HAGGLE for graph analytics. Tensor decompositions isolate coherent patterns of network behavior in ways that common clustering methods based on distance metrics cannot. Parallelized graph analysis then uses directed queries on a representation that combines the elements of identified patterns with other available information (such as additional log fields, domain knowledge, network topology, whitelists and blacklists, prior feedback, and published alerts) to confirm or reject a threat hypothesis, collect context, and raise alerts. MADHAT was developed using the collaborative HPC Architecture for Cyber Situational Awareness (HACSAW) research environment and evaluated on structured network sensor logs collected from Defense Research and Engineering Network (DREN) sites using HPC resources at the U.S. Army Engineer Research and Development Center DoD Supercomputing Resource Center (ERDC DSRC). To date, MADHAT has analyzed logs with over 650 million entries.
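As a rough illustration of what a tensor decomposition extracts from log data, the sketch below fits a single rank-1 component to a sparse three-way tensor using a generic alternating power iteration. It is not ENSIGN and does not use its API. The function names, the (source, destination, hour) mode layout, and the toy data are assumptions made for this example only.

```python
import math


def normalize(vector):
    """Return the unit-length version of a vector together with its norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector], norm


def rank1_cp(entries, shape, iterations=25):
    """Fit one rank-1 component, weight * (a outer b outer c), to a sparse tensor.

    entries maps (i, j, k) index triples to nonzero values, for example
    (source, destination, hour) -> connection count (a hypothetical layout).
    """
    size_i, size_j, size_k = shape
    a, _ = normalize([1.0] * size_i)
    b, _ = normalize([1.0] * size_j)
    c, _ = normalize([1.0] * size_k)
    weight = 0.0
    for _ in range(iterations):
        # Update each factor in turn by contracting the tensor with the other two.
        new_a = [0.0] * size_i
        for (i, j, k), value in entries.items():
            new_a[i] += value * b[j] * c[k]
        a, _ = normalize(new_a)

        new_b = [0.0] * size_j
        for (i, j, k), value in entries.items():
            new_b[j] += value * a[i] * c[k]
        b, _ = normalize(new_b)

        new_c = [0.0] * size_k
        for (i, j, k), value in entries.items():
            new_c[k] += value * a[i] * b[j]
        c, weight = normalize(new_c)
    return weight, a, b, c


if __name__ == "__main__":
    # Toy tensor built from one known pattern: 2 sources x 3 destinations x 2 hours.
    src, dst, hour = [1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0]
    toy = {
        (i, j, k): src[i] * dst[j] * hour[k]
        for i in range(2) for j in range(3) for k in range(2)
        if src[i] * dst[j] * hour[k] != 0.0
    }
    weight, a, b, c = rank1_cp(toy, (2, 3, 2))
    print(weight, a, b, c)
```

Each recovered factor vector scores the entries along one mode (which sources, which destinations, which hours) that take part in the same pattern. That grouping is the kind of coherent behavior the abstract contrasts with distance-based clustering. A production decomposition would fit many components at once and scale to far larger tensors.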

Proactive Cyber Situation Awareness via High Performance Computing
Allan Wollaber, Jaime Peña, Benjamin Blease, Leslie Shing, Kenneth Alperin, Serge Vilvovsky, Pierre Trepagnier (MIT-LL), Neal Wagner (STR), Leslie Leonard (U.S. Army ERDC)
Cyber situation awareness technologies have largely been focused on present-state conditions, with limited abilities to forward-project nominal conditions in a contested environment. We demonstrate an approach that uses data-driven, high performance computing (HPC) simulations of attacker/defender activities in a logically connected network environment that enables this capability for interactive, operational decision making in real time. Our contributions are three-fold: (1) we link live cyber data to inform the parameters of a cybersecurity model, (2) we perform HPC simulations and optimizations with a genetic algorithm to evaluate and recommend risk remediation strategies that inhibit attacker lateral movement, and (3) we provide a prototype platform to allow cyber defenders to assess the value of their own alternative risk reduction strategies on a relevant timeline. We present an overview of the data and software architectures, and results are presented that demonstrate operational utility alongside HPC-enabled runtimes.

[Best Student Paper Finalist]
A Survey of Attacks and Defenses of Edge-Deployed Neural Networks
Mihailo Isakov (Boston Univ.), Vijay Gadepally (MIT-LL), Karen M. Gettings (MIT-LL), Michel A. Kinsy (Boston Univ.)
Deep Neural Network (DNN) workloads are quickly moving from datacenters onto edge devices, for latency, privacy, or energy reasons. While datacenter networks can be protected using conventional cybersecurity measures, edge neural networks bring a host of new security challenges. Unlike classic IoT applications, edge neural networks are typically very compute and memory intensive, their execution is data-independent, and they are robust to noise and faults.
Neural network models may be very expensive to develop, and can potentially reveal information about the private data they were trained on, requiring special care in distribution. The hidden states and outputs of the network can also be used in reconstructing user inputs, potentially violating users' privacy. Furthermore, neural networks are vulnerable to adversarial attacks, which may cause misclassifications and violate the integrity of the output. These properties add challenges when securing edge-deployed DNNs, requiring new considerations, threat models, priorities, and approaches in securely and privately deploying DNNs to the edge. In this work, we cover the landscape of attacks on, and defenses of, neural networks deployed in edge devices and provide a taxonomy of attacks and defenses targeting edge DNNs.

Hypersparse Neural Network Analysis of Large-Scale Internet Traffic
Jeremy Kepner (MIT LLSC), Kenjiro Cho (Internet Initiative Japan), KC Claffy (UCSD), Vijay Gadepally (MIT LLSC), Peter Michaleas (MIT LLSC), Lauren Milechin (MIT EAPS)
The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data containing 50 billion packets. Utilizing a novel hypersparse neural network analysis of “video” streams of this traffic using 10,000 processors in the MIT SuperCloud reveals a new phenomenon: the importance of otherwise unseen leaf nodes and isolated links in Internet traffic. Our neural network approach further shows that a two-parameter modified Zipf-Mandelbrot distribution accurately describes a wide variety of source/destination statistics on moving sample windows ranging from 100,000 to 100,000,000 packets over collections that span years and continents.
The inferred model parameters distinguish different network streams and the model leaf parameter strongly correlates with the fraction of the traffic in different underlying network topologies. The hypersparse neural network pipeline is highly adaptable and different network statistics and training models can be incorporated with simple changes to the image filter functions.
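
To make the two-parameter model concrete, the sketch below assumes the common Zipf-Mandelbrot form, with probability proportional to 1/(d + delta)^alpha for a degree-like statistic d = 1, 2, ..., d_max. The abstract does not state the exact form, so the formula, the function names, and the grid-search fit are illustrative assumptions. They do not represent the authors' neural network pipeline.

```python
def zipf_mandelbrot_pmf(alpha, delta, d_max):
    """Normalized probabilities proportional to 1 / (d + delta) ** alpha, d = 1..d_max."""
    if d_max < 1:
        raise ValueError("d_max must be at least 1")
    if delta <= -1.0:
        raise ValueError("delta must be greater than -1 so that d + delta > 0")
    weights = [(d + delta) ** -alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def fit_by_grid_search(counts, alphas, deltas):
    """Pick the (alpha, delta) pair whose model best matches a histogram.

    counts[d - 1] is the observed number of items with statistic value d.
    The fit minimizes squared error between normalized counts and the model,
    a deliberately simple stand-in for a real fitting procedure.
    """
    total = float(sum(counts))
    observed = [c / total for c in counts]
    best = None
    for alpha in alphas:
        for delta in deltas:
            model = zipf_mandelbrot_pmf(alpha, delta, len(counts))
            error = sum((o - m) ** 2 for o, m in zip(observed, model))
            if best is None or error < best[0]:
                best = (error, alpha, delta)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic histogram generated from known parameters, then recovered by the fit.
    synthetic = [1_000_000 * p for p in zipf_mandelbrot_pmf(1.5, 0.5, 100)]
    print(fit_by_grid_search(synthetic, [1.0, 1.5, 2.0], [0.0, 0.5, 1.0]))
```

In this form, alpha sets how quickly the tail falls off and delta shifts the weight given to the smallest values of d. Two parameters are therefore enough to separate streams with different tail and low-degree behavior.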