(For a full list, go to Google Scholar.)

2020
March
PhGC: A Machine Learning Based Workflow for Phenotype-Genotype Co-analysis on Autism.
Safa Shubbar, Chen Fu, Zhi Liu, Anthony Wynshaw-Boris, and Qiang Guan.
Proceedings of the 12th International Conference on Bioinformatics and Computational Biology.
2019
November
CARE: Compiler-Assisted Recovery from Soft Failures.
Chao Chen, Greg Eisenhauer, Santosh Pande, and Qiang Guan.
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
2019
July
TSM2: optimizing tall-and-skinny matrix-matrix multiplication on GPUs.
Jieyang Chen, Nan Xiong, Xin Liang, Dingwen Tao, Sihuan Li, Kaiming Ouyang, Kai Zhao, Nathan DeBardeleben, Qiang Guan, and Zizhong Chen.
Proceedings of the ACM International Conference on Supercomputing (ICS).
2018
December
In situ TensorView: In situ Visualization of Convolutional Neural Networks.
Xinyu Chen, Qiang Guan, Li-Ta Lo, Simon Su, Zhengyong Ren, James Paul Ahrens, and Trilce Estrada.
2018 IEEE International Conference on Big Data (Big Data).
2018
December
Build and Execution Environment (BEE): an Encapsulated Environment Enabling HPC Applications Running Everywhere.
Jieyang Chen, Qiang Guan, Xin Liang, Paul Bryant, Patricia Grubel, Allen McPherson, Li-Ta Lo, Timothy Randles, Zizhong Chen, and James Paul Ahrens.
2018 IEEE International Conference on Big Data (Big Data).
2018
November
Fault tolerant one-sided matrix decompositions on heterogeneous systems with GPUs.
Jieyang Chen, Hongbo Li, Sihuan Li, Xin Liang, Panruo Wu, Dingwen Tao, Kaiming Ouyang, Yuanlai Liu, Kai Zhao, Qiang Guan, and others.
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC).
2018
October
UNS: A Portable, Mobile, and Exchangeable Namespace for Supporting Fetch-from-Anywhere Big Data Eco-Systems.
Hsing-bung Chen, Qiang Guan, and Song Fu.
2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech).
2018
September
Sensor Data Fusion Framework to Improve Holographic Object Registration Accuracy for a Shared Augmented Reality Mission Planning Scenario.
Simon Su, Vincent Perry, Qiang Guan, Andrew Durkee, Alexis R Neigel, and Sue Kase.
International Conference on Virtual, Augmented and Mixed Reality.
2018
June
BeeFlow: A Workflow Management System for In Situ Processing across HPC and Cloud Systems.
Jieyang Chen, Qiang Guan, Zhao Zhang, Xin Liang, Louis Vernon, Allen McPherson, Li-Ta Lo, Patricia Grubel, Tim Randles, Zizhong Chen, and others.
2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS).
2018
May
Modeling Application Resilience in Large-scale Parallel Execution.
Kai Wu, Wenqian Dong, Qiang Guan, Nathan DeBardeleben, and Dong Li.
Proceedings of the 47th International Conference on Parallel Processing.
2018
February
Using virtualization to quantify power conservation via near-threshold voltage reduction for inherently resilient applications.
Li Tan, Nathan DeBardeleben, Qiang Guan, Sean Blanchard, and Michael Lang.
Parallel Computing.
2017
December
Resilience Analysis of Top K Selection Algorithms.
Ryan Slechta, Laura Monroe, Nathan DeBardeleben, Qiang Guan, Joanne Wendelberger, and Sarah Michalak.
2017 13th European Dependable Computing Conference (EDCC).
2017
November
Lifetime memory reliability data from the field.
Taniya Siddiqua, Vilas Sridharan, Steven E Raasch, Nathan DeBardeleben, Kurt B Ferreira, Scott Levy, Elisabeth Baseman, and Qiang Guan.
2017 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT).
2017
November
TensorView: visualizing the training of convolutional neural network using ParaView.
Xinyu Chen, Qiang Guan, Xin Liang, Li-Ta Lo, Simon Su, Trilce Estrada, and James Ahrens.
Proceedings of the 1st Workshop on Distributed Infrastructures for Deep Learning.
2017
August
Silent data corruption resilient two-sided matrix factorizations.
Panruo Wu, Nathan DeBardeleben, Qiang Guan, Sean Blanchard, Jieyang Chen, Dingwen Tao, Xin Liang, Kaiming Ouyang, and Zizhong Chen.
ACM HPDC.
2017
August
LetGo: A lightweight continuous framework for HPC applications under failures.
Bo Fang, Qiang Guan, Nathan DeBardeleben, Karthik Pattabiraman, and Matei Ripeanu.
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing.
2017
August
RSVP: Soft Error Resilient Power Savings at Near-Threshold Voltage using Register Vulnerability.
Li Tan, Nathan DeBardeleben, Qiang Guan, Sean Blanchard, and Michael Lang.
2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W).
2017
April
Addressing statistical significance of fault injection: Empirical studies of the soft error susceptibility.
Qiang Guan, Nathan DeBardeleben, Sean Blanchard, and Song Fu.
International Journal of High Performance Computing and Networking.
2016
October
Improving DRAM fault characterization through machine learning.
Elisabeth Baseman, Nathan DeBardeleben, Kurt Ferreira, Scott Levy, Steven Raasch, Vilas Sridharan, Taniya Siddiqua, and Qiang Guan.
2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop (DSN-W).
2016
October
Design, use and evaluation of P-FSEFI: A parallel soft error fault injection framework for emulating soft errors in parallel applications.
Qiang Guan, Nathan DeBardeleben, Panruo Wu, Stephan Eidenbenz, Sean Blanchard, Laura Monroe, Elisabeth Baseman, and Li Tan.
Proceedings of the 9th EAI International Conference on Simulation Tools and Techniques.
2016
September
On the inherent resilience of integer operations.
Laura Monroe, William M Jones, Scott R Lavigne, Claude H Davis, Qiang Guan, and Nathan DeBardeleben.
European Conference on Parallel Processing.
2016
August
SDC is in the eye of the beholder: A survey and preliminary study.
Bo Fang, Panruo Wu, Qiang Guan, Nathan DeBardeleben, Laura Monroe, Sean Blanchard, Zizhong Chen, Karthik Pattabiraman, and Matei Ripeanu.
2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop (DSN-W).
2016
July
Towards practical algorithm based fault tolerance in dense linear algebra.
Panruo Wu, Qiang Guan, Nathan DeBardeleben, Sean Blanchard, Dingwen Tao, Xin Liang, Jieyang Chen, and Zizhong Chen.
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing.
2016
March
Analyzing the Robustness of HPC Applications Using a Fine-Grained Soft Error Fault Injection Tool.
Qiang Guan, Nathan DeBardeleben, Sean Blanchard, Song Fu, Claude H Davis IV, and William M Jones.
Innovative Research and Applications in Next-Generation High Performance Computing.
2015
December
Differentiated Failure Remediation with Action Selection for Resilient Computing.
Song Huang, Song Fu, Nathan DeBardeleben, Qiang Guan, and Cheng-Zhong Xu.
2015 IEEE 21st Pacific Rim International Symposium on Dependable Computing (PRDC).
2015
September
Towards building resilient scientific applications: Resilience analysis on the impact of soft error and transient error tolerance with the CLAMR hydrodynamics mini-app.
Qiang Guan, Nathan DeBardeleben, Brian Atkinson, Robert Robey, and William M Jones.
2015 IEEE International Conference on Cluster Computing.
2015
July
Empirical studies of the soft error susceptibility of sorting algorithms to statistical fault injection.
Qiang Guan, Nathan DeBardeleben, Sean Blanchard, and Song Fu.
Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale.
2014
December
Fault injection experiments with the CLAMR hydrodynamics mini-app.
Brian Atkinson, Nathan DeBardeleben, Qiang Guan, Robert Robey, and William M Jones.
2014 IEEE International Symposium on Software Reliability Engineering Workshops.
2014
October
F-SEFI: A fine-grained soft error fault injection tool for profiling application vulnerability.
Qiang Guan, Nathan DeBardeleben, Sean Blanchard, and Song Fu.
2014 IEEE 28th International Parallel and Distributed Processing Symposium.
2013
November
Wavelet-based multi-scale anomaly identification in cloud computing systems.
Qiang Guan and Song Fu.
2013 IEEE Global Communications Conference (GLOBECOM).
2013
April
Adaptive anomaly identification by exploring metric subspace in cloud computing infrastructures.
Qiang Guan and Song Fu.
2013 IEEE 32nd International Symposium on Reliable Distributed Systems.
2013
February
Exploring Time and Frequency Domains for Accurate and Automated Anomaly Detection in Cloud Computing Systems.
Qiang Guan, Song Fu, Nathan DeBardeleben, and Sean Blanchard.
2013 IEEE 19th Pacific Rim International Symposium on Dependable Computing (PRDC).
2012
December
CDA: A cloud dependability analysis framework for characterizing system dependability in cloud computing infrastructures.
Qiang Guan, Chi-Chen Chiu, and Song Fu.
2012 IEEE 18th Pacific Rim International Symposium on Dependable Computing.
2012
December
An adaptive power management framework for autonomic resource configuration in cloud computing infrastructures.
Ziming Zhang, Qiang Guan, and Song Fu.
2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC).
2012
October
AFD: Adaptive failure detection system for cloud computing infrastructures.
Husanbir S Pannu, Jianguo Liu, Qiang Guan, and Song Fu.
2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC).
2012
August
Efficient and accurate anomaly identification using reduced metric space in utility clouds.
Qiang Guan, Chi-Chen Chiu, Ziming Zhang, and Song Fu.
2012 IEEE Seventh International Conference on Networking, Architecture, and Storage.
2012
February
A failure detection and prediction mechanism for enhancing dependability of data centers.
Qiang Guan, Ziming Zhang, and Song Fu.
International Journal of Computer Theory and Engineering.
2012
January
Ensemble of Bayesian predictors and decision trees for proactive failure management in cloud computing systems.
Qiang Guan, Ziming Zhang, and Song Fu.
Journal of Communications.
2011
October
Ensemble of Bayesian predictors for autonomic failure management in cloud computing.
Qiang Guan, Ziming Zhang, and Song Fu.
2011 Proceedings of 20th International Conference on Computer Communications and Networks (ICCCN).
2011
August
Experimental framework for injecting logic errors in a virtual machine to profile applications for soft error resilience.
Nathan DeBardeleben, Sean Blanchard, Qiang Guan, Ziming Zhang, and Song Fu.
European Conference on Parallel Processing.
2011
June
Proactive failure management by integrated unsupervised and semi-supervised learning for dependable cloud systems.
Qiang Guan, Ziming Zhang, and Song Fu.
2011 Sixth International Conference on Availability, Reliability and Security.
2010
December
auto-AID: A data mining framework for autonomic anomaly identification in networked computer systems.
Qiang Guan and Song Fu.
International Performance Computing and Communications Conference.
2010
October
Anomaly detection in large-scale coalition clusters for dependability assurance.
Qiang Guan, Derek Smith, and Song Fu.
2010 International Conference on High Performance Computing.
2010
August
An anomaly detection framework for autonomic management of compute cloud systems.
Derek Smith, Qiang Guan, and Song Fu.
2010 IEEE 34th Annual Computer Software and Applications Conference Workshops.
2009
October
Research on optimization of process bus in IEC 61850-based substation communication network.
Yang Liu, Qiang Guan, Seung-Soo Han, Myeon-Song Choi, and Seung-Jae Lee.
The International Conference on Electrical Engineering.
2009
June
Reliability and Dependability Analysis for Agent-Based Reliability Enhancement Technology (ARET) System.
Qiang Guan and Seung-Soo Han.
2009 International Conference on Electronic Computer Technology.