Shane Canon

Biographical Sketch
Shane Canon is the acting group lead for the Data Science Engagement Group. He joined NERSC in 2000 to serve as a system administrator for the PDSF cluster. While working with PDSF, he gained experience in cluster administration, batch systems, parallel file systems, and the Linux kernel. In 2005, Shane left Lawrence Berkeley National Laboratory (Berkeley Lab) to take a position as group leader at Oak Ridge National Laboratory (ORNL). One of his more significant accomplishments at ORNL was architecting the 10-petabyte Spider File System. In 2008, Shane returned to NERSC to lead the Data Systems Group. In 2009, he transitioned to leading the newly created Technology Integration Group to focus on the Magellan Project and other strategic areas. More recently, Shane has focused on enabling data-intensive applications on HPC platforms and engaging with bioinformatics applications. He joined the Data & Analytics Services group in 2016 to pursue these topics. Shane is also involved in several projects outside of NERSC. He is the Production Lead on the KBase project, which is developing a platform to enable predictive biology. Shane holds a Ph.D. in Physics from Duke University and a B.S. in Physics from Auburn University.
Journal Articles
KB Antypas, DJ Bard, JP Blaschke, RS Canon, B Enders, et al., "Enabling discovery data science through cross-facility workflows", Institute of Electrical and Electronics Engineers (IEEE), December 2021, 3671-3680, doi: 10.1109/bigdata52589.2021.9671421
Abe Singer, Shane Canon, Rebecca Hartman-Baker, Kelly L. Rowland, David Skinner, Craig Lant, "What Deploying MFA Taught Us About Changing Infrastructure", HPCSYSPROS19: HPC System Professionals Workshop, November 2019, doi: 10.5281/zenodo.3525375
Adam P Arkin, Robert W Cottingham, Christopher S Henry, Nomi L Harris, et al., "KBase: The United States Department of Energy Systems Biology Knowledgebase", Nature Biotechnology, July 6, 2018, 36(7), doi: 10.1038/nbt.4163
Jason R. Lee et al, "Enhancing supercomputing with software defined networking", IEEE Conference on Information Networking (ICOIN), January 10, 2018,
Conference Papers
Alex Gittens et al, "Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies", 2016 IEEE International Conference on Big Data, July 1, 2017,
Shane Canon, Doug Jacobsen, "Shifter: Containers for HPC", Cray User Group, London, England, May 13, 2016,
Jialin Liu, Evan Racah, Quincey Koziol, Richard Shane Canon, Alex Gittens, et al., "H5Spark: Bridging the I/O Gap between Spark and Scientific Data Formats on HPC Systems", Cray User Group, May 13, 2016,
Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, et al., "Cori - A System to Support Data-Intensive Computing", Cray User Group Meeting 2016, London, England, May 2016,
- Download File: Cori-CUG2016.pdf (pdf: 4.4 MB)
Doug Jacobsen, Shane Canon, "Contain This, Unleashing Docker for HPC", Cray User Group 2015, April 23, 2015,
Justin Blair, Richard S. Canon, Jack Deslippe, Abdelilah Essiari, et al., "High performance data management and analysis for tomography", Proc. SPIE 9212, Developments in X-Ray Tomography IX, September 12, 2014,
S. Parete-Koon, B. Caldwell, S. Canon, E. Dart, J. Hick, J. Hill, C.… et al., "HPC's Pivot to Data", Conference, May 5, 2014,
Jay Srinivasan, Richard Shane Canon, "Evaluation of A Flash Storage Filesystem on the Cray XE-6", CUG 2013, May 2013,
You-Wei Cheah, Richard Canon, Beth Plale, Lavanya Ramakrishnan, "Milieu: Lightweight and Configurable Big Data Provenance for Science", IEEE Big Data Congress, 2013, 46-53,
Elif Dede, Fadika, Hartog, Govindaraju, Ramakrishnan, Gunter, Shane Richard Canon, "MARISSA: MApReduce Implementation for Streaming Science Applications", eScience, October 8, 2012, 1-8,
Zacharia Fadika, Madhusudhan Govindaraju, Shane Richard Canon, Lavanya Ramakrishnan, "Evaluating Hadoop for Data-Intensive Scientific Operations", IEEE Cloud 2012, June 24, 2012,
Jay Srinivasan, Richard Shane Canon, Lavanya Ramakrishnan, "My Cray can do that? Supporting Diverse Workloads on the Cray XE-6", CUG 2012, May 2012,
Devarshi Ghoshal, Richard Shane Canon, Lavanya Ramakrishnan, "Understanding I/O Performance of Virtualized Cloud Environments", The Second International Workshop on Data Intensive Computing in the Clouds (DataCloud-SC11), 2011,
- Download File: ioperformance.pdf (pdf: 174 KB)
Lavanya Ramakrishnan, Richard Shane Canon, Krishna Muriki, Iwona… et al., "Evaluating Interconnect and Virtualization Performance for High Performance Computing", Proceedings of 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS11), 2011,
- Download File: pmbs11.pdf (pdf: 441 KB)
Lavanya Ramakrishnan, Piotr T. Zbiegel, Scott Campbell, Rick Bradshaw, et al., "Magellan: Experiences from a Science Cloud", Proceedings of the 2nd International Workshop on Scientific Cloud Computing, ACM ScienceCloud '11, Boulder, Colorado, and New York, NY, 2011, 49-58,
- Download File: P1871.pdf (pdf: 318 KB)
Neal Master, Matthew Andrews, Jason Hick, Shane Canon, Nicholas J. Wright, "Performance Analysis of Commodity and Enterprise Class Flash Devices", Petascale Data Storage Workshop (PDSW), November 2010,
Kesheng Wu, Kamesh Madduri, Shane Canon, "Multi-Level Bitmap Indexes for Flash Memory Storage", IDEAS '10: Proceedings of the Fourteenth International Database Engineering and Applications Symposium, Montreal, QC, Canada, 2010,
Lavanya Ramakrishnan, R. Jackson, Canon, Cholia, John Shalf, "Defining future platform requirements for e-Science clouds", SoCC, 2010, 101-106,
Keith R. Jackson, Ramakrishnan, Muriki, Canon, Cholia, Shalf, J. Wasserman, Nicholas J. Wright, "Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud", CloudCom, January 1, 2010, 159-168,
Book Chapters
N.J. Wright, S. S. Dosanjh, A. K. Andrews, K. Antypas, B. Draney, R.S… et al., "Cori: A Pre-Exascale Computer for Big Data and HPC Applications", Big Data and High Performance Computing 26 (2015): 82, (June 2015), doi: 10.3233/978-1-61499-583-8-82
Sudip Dosanjh, Shane Canon, Jack Deslippe, Kjiersten Fagnan, Richard… et al., "Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center", Proceedings of International Conference on Parallel Programming – ParCo 2013, (March 26, 2014)
Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane… et al., "Magellan - A Testbed to Explore Cloud Computing for Science", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2013)
Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane… et al., "CAMP", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: January 1, 2013)
Presentations/Talks
Tina Declerck, Katie Antypas, Deborah Bard, Wahid Bhimji, Shane Canon, et al., Cori - A System to Support Data-Intensive Computing, Cray User Group Meeting 2016, London, England, May 12, 2016,
Doug Jacobsen, Shane Canon, Contain This, Unleashing Docker for HPC, NERSC Webcast, May 15, 2015,
David Skinner and Shane Canon, NERSC and High Throughput Computing, February 12, 2013,
- Download File: NUG2013-HTC-at-NERSC.pdf (pdf: 5.8 MB)
Richard Shane Canon, Magellan Project: Clouds for Science?, Coalition for Academic Scientific Computation, February 29, 2012,
Richard Shane Canon, Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities, Metagenomics Informatics Challenges Workshop, October 12, 2011,
Lavanya Ramakrishnan & Shane Canon, NERSC, Hadoop and Pig Overview, October 2011,
Shane Canon, Debunking Some Common Misconceptions of Science in the Cloud, ScienceCloud 2011, June 29, 2011,
Richard Shane Canon, Cosmic Computing: Supporting the Science of the Planck Space Based Telescope, LISA 2009, November 5, 2009,
Reports
GK Lockwood, D Hazen, Q Koziol, RS Canon, K Antypas, J Balewski, N… et al., "Storage 2020: A Vision for the Future of HPC Storage", October 20, 2017, LBNL LBNL-2001072,
Katherine Yelick, Susan Coghlan, Brent Draney, Richard Shane Canon, et al., "The Magellan Report on Cloud Computing for Science", U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), December 2011,
- Download File: MagellanFinalReport.pdf (pdf: 10 MB)