TY - JOUR
T1 - Using MPI For Distributed Hyper-Parameter Optimization and Uncertainty Evaluation
AU - Pautsch, Erik
AU - Li, John
AU - Rizzi, Silvio
AU - Pantoja, Maria
AU - Thiruvathukal, George K.
N1 - Pautsch, Erik; Li, John; Rizzi, Silvio; Thiruvathukal, George K.; Pantoja, Maria (2023). Using MPI For Distributed Hyper-Parameter Optimization and Uncertainty Evaluation. EduHPC 2023 Peachy Assignments, https://doi.org/10.6084/m9.figshare.24329899
PY - 2023/11/13
Y1 - 2023/11/13
N2 - Deep Learning (DL) methods have recently dominated the field of Machine Learning (ML). Most DL models assume that the input data distribution is identical between training and testing, though it often is not. For example, if we train a traffic sign classifier, the model might confidently but incorrectly classify a graffitied stop sign as a speed limit sign. ML models often produce high-confidence (softmax) output for out-of-distribution input that should instead have been classified as "I don't know". By adding the capability of propagating uncertainty to our results, the model can provide not just a single prediction but a distribution over predictions, allowing the user to assess the model's reliability and decide whether the decision should be deferred to a human expert. Uncertainty estimation is computationally expensive; in this assignment, students learn to accelerate the calculations using common distributed-systems divide-and-conquer techniques. This assignment is part of an undergraduate Distributed Computing (DC) class, where most students have no experience in ML. We explain the ML concepts necessary to understand the problem, then show where in the code the independent tasks are generated and how they can be distributed among MPI ranks using the MPI4Py library.
AB - Deep Learning (DL) methods have recently dominated the field of Machine Learning (ML). Most DL models assume that the input data distribution is identical between training and testing, though it often is not. For example, if we train a traffic sign classifier, the model might confidently but incorrectly classify a graffitied stop sign as a speed limit sign. ML models often produce high-confidence (softmax) output for out-of-distribution input that should instead have been classified as "I don't know". By adding the capability of propagating uncertainty to our results, the model can provide not just a single prediction but a distribution over predictions, allowing the user to assess the model's reliability and decide whether the decision should be deferred to a human expert. Uncertainty estimation is computationally expensive; in this assignment, students learn to accelerate the calculations using common distributed-systems divide-and-conquer techniques. This assignment is part of an undergraduate Distributed Computing (DC) class, where most students have no experience in ML. We explain the ML concepts necessary to understand the problem, then show where in the code the independent tasks are generated and how they can be distributed among MPI ranks using the MPI4Py library.
KW - uncertainty quantification
KW - artificial intelligence
KW - MPI
KW - high-performance computing
UR - https://ecommons.luc.edu/cs_facpubs/350
UR - https://doi.org/10.6084/m9.figshare.24329899
M3 - Article
JO - Computer Science: Faculty Publications and Other Works
JF - Computer Science: Faculty Publications and Other Works
ER -