An Empirical Study of Artifacts and Security Risks in the Pre-trained Model Supply Chain

Wenxin Jiang, Nicholas Synovic, Rohan Sethi, Aryan Indarapu, Matt Hyatt, Taylor R. Schorlemmer, George K. Thiruvathukal, James C. Davis

Research output: Contribution to journal › Article › peer-review

Abstract

Deep neural networks achieve state-of-the-art performance on many tasks, but require increasingly complex architectures and costly training procedures. Engineers can reduce costs by reusing a pre-trained model (PTM) and fine-tuning it for their own tasks. To facilitate software reuse, engineers collaborate around model hubs, collections of PTMs and datasets organized by problem domain. Although model hubs are now comparable in popularity and size to other software ecosystems, the associated PTM supply chain has not yet been examined from a software engineering perspective.

We present an empirical study of artifacts and security features in 8 model hubs. We indicate the potential threat models and show that the existing defenses are insufficient for ensuring the security of PTMs. We compare PTM and traditional supply chains, and propose directions for further measurements and tools to increase the reliability of the PTM supply chain.

Original language: American English
Journal: Computer Science: Faculty Publications and Other Works
State: Published - Nov 11 2022

Keywords

  • Empirical studies
  • Security and privacy
  • software engineering
  • artificial intelligence
  • machine learning

Disciplines

  • Artificial Intelligence and Robotics
  • Computer Sciences
  • Software Engineering
