Evaluation of Novel AI Architectures for Uncertainty Estimation

Erik Pautsch; John Li; Silvio Rizzi; George K. Thiruvathukal; Maria Pantoja

doi:10.29375/25392115.5274

Erik Pautsch, Loyola University Chicago
John Li, University of California San Diego
Silvio Rizzi, Argonne National Laboratory
George K. Thiruvathukal, Loyola University Chicago
Maria Pantoja, California Polytechnic State University (ORCID: https://orcid.org/0000-0002-1942-9769)


Keywords: Uncertainty, Deep Learning, Ensembles, Evidential Learning, Artificial Intelligence


Abstract

Deep learning (DL) has advanced computer vision, delivering impressive performance on intricate visual tasks. Yet the need for accurate uncertainty estimates, particularly for out-of-distribution (OOD) inputs, persists. Our research evaluates uncertainty in Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) using the MNIST and ImageNet-1K datasets. Using High-Performance Computing (HPC) platforms, including the traditional Polaris supercomputer and AI accelerators such as the Cerebras CS-2 and SambaNova DataScale, we assessed the computational merits and bottlenecks of each platform. This paper delineates key considerations for using HPC for uncertainty estimation in DL, offering insights that guide the integration of algorithms and hardware for robust DL applications, especially in computer vision.


How to Cite

Pautsch, E., Li, J., Rizzi, S., Thiruvathukal, G. K., & Pantoja, M. (2024). Evaluation of Novel AI Architectures for Uncertainty Estimation. Revista Colombiana De Computación, 25(2), 23–34. https://doi.org/10.29375/25392115.5274
