TY - JOUR
T1 - Evaluation of Novel AI Architectures for Uncertainty Estimation
AU - Pautsch, Erik
AU - Li, John
AU - Rizzi, Silvio
AU - Thiruvathukal, George K.
AU - Pantoja, Maria
N1 - Pautsch, E., Li, J., Rizzi, S., Thiruvathukal, G. K., & Pantoja, M. (2023). Evaluation of Novel AI Architectures for Uncertainty Estimation. In Proceedings of CARLA 2023. https://doi.org/10.6084/m9.figshare.24400984
PY - 2023/9/1
Y1 - 2023/9/1
AB - Deep learning (DL) has become a cornerstone of advancements in computer vision, yielding models capable of remarkable performance on complex visual tasks. Despite these achievements, there remains a critical need for accurate uncertainty estimation, especially when models encounter out-of-distribution (OOD) inputs. Addressing this, our research focuses on the implementation and evaluation of uncertainty estimation techniques in two prominent DL architectures: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). These architectures were applied to computer vision tasks, using the MNIST and ImageNet-1K datasets for our evaluations. High-Performance Computing (HPC) platforms, pivotal to this research, were employed to assess these techniques. The traditional Polaris supercomputer, equipped with AMD EPYC processors and NVIDIA A100 GPUs, was evaluated alongside two cutting-edge AI accelerators: the Cerebras CS-2 and the SambaNova DataScale. These platforms, with their distinct architectures, exhibited varying efficiencies when applying uncertainty estimation. Our findings elucidate the computational advantages and challenges associated with each, providing valuable insight for researchers. Furthermore, this paper offers practical considerations for employing HPC systems for uncertainty estimation in DL. From computational setup to architectural nuances, the insights garnered pave the way for future research aiming to integrate algorithms and hardware for robust model predictions in computer vision and other areas of DL.
KW - uncertainty
KW - deep learning
KW - ensembles
KW - evidential learning
UR - https://ecommons.luc.edu/cs_facpubs/353
M3 - Article
JO - Computer Science: Faculty Publications and Other Works
JF - Computer Science: Faculty Publications and Other Works
ER -