February 15, 2023


Infusing Security into MLOps

Author

Hyrum Anderson

Hyrum is our CTO.

WannaCry. Heartbleed. Shellshock. Logjam.

Even the uninitiated will recognize the names of infamous security vulnerabilities that have emerged in widely used software packages. History is replete with examples of software vulnerabilities that compromise the organizations that rely on that software. Little explanation is needed to make the case that the software systems powering businesses and consumers must be secured.

But what of machine learning (ML)? Fundamentally, ML systems are also software systems, whose failure can impact companies, customers and communities. Who is responsible for ML security? Where, for ML systems, are the well-exercised security muscles that we have developed for software?

Where ML library vulnerabilities end, a new class of ML-specific vulnerabilities begins, one that attackers can exploit for financial gain. For example, in our joint published work, colleagues at Norton Research Group disclosed how savvy attackers tricked out phishing webpages to evade ML-based phishing detectors. Two individuals in China evaded live facial recognition authentication in 2018 to gain access to the tax system of a Chinese municipal government and collect $77M through fraudulent tax invoices. Similarly, in 2022 a New Jersey man fooled face-matching software at ID.me to obtain multiple “verified” accounts and file false unemployment claims totaling $900,000.

The fact remains that the rate of AI adoption is outpacing our ability to secure it. Today, ML development and deployment pipelines represent “unmanaged risk” to corporations and consumers. But this need not be the case. Building on the foundation of secure software development, we can infuse security discipline into every phase of the ML development lifecycle. This requires, in part, the “shift-left” mentality that MLSecOps brings.

MLSecOps for the ML Model Pipeline

To reduce the risk of vulnerabilities during software development, organizations have adopted DevSecOps to integrate security into every stage of the software development life cycle. With DevSecOps, security teams work alongside development and operations teams to identify and address security risks before they become critical issues. The key lesson of DevSecOps is that security can’t merely be brushed on. It must be baked in.

In 2017, colleague Eugene Neelou—who joined Zoltan Balazs and me in organizing the 2022 edition of the ML Security Evasion Competition that, ironically, featured algorithmic evasions of antiphishing and facial recognition models—coined the term MLSecOps. MLSecOps aims to ensure that machine learning models are secure, reliable, and trustworthy, from model training to deployment and management. The transition to MLSecOps is a response to the increasing use of machine learning in sensitive and critical applications. This includes ensuring that data used to train models are secure and protected, as well as ensuring that models are tested and validated to prevent security vulnerabilities, unintentional failures or intentional tampering.

What does this look like in practice? For a detailed look, I refer you to Eugene’s work. How can one get started? As your organization matures, you should begin implementing CI/CD pipelines that test the robustness of ML models and fail when robustness is insufficient, just as system integration tests do. You should also fill the ML security gaps that traditional security tooling doesn’t cover, for example, pickle file vulnerabilities in ML model files. Mature organizations can run AI red teaming exercises against pre-production and production models, as my former team at Microsoft has done.
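As a minimal sketch of such a pipeline gate, the Python below measures a model's accuracy on randomly perturbed inputs and raises an error when it falls below a threshold, which would fail the CI job. Everything here is an illustrative assumption rather than a method from Eugene's work or any particular tool: the names robustness_gate and RobustnessError, the uniform-noise perturbation, and the default threshold are all hypothetical. Random noise is also a much weaker test than a real adversarial attack, so treat this as the shape of the check, not its substance.

```python
import random
from typing import Callable, List, Sequence


class RobustnessError(Exception):
    """Raised when a model fails the robustness gate (hypothetical name)."""


def perturb(x: Sequence[float], epsilon: float, rng: random.Random) -> List[float]:
    # Add independent uniform noise in [-epsilon, epsilon] to every feature.
    return [v + rng.uniform(-epsilon, epsilon) for v in x]


def robustness_gate(
    model: Callable[[Sequence[float]], int],
    inputs: Sequence[Sequence[float]],
    labels: Sequence[int],
    epsilon: float = 0.1,
    min_accuracy: float = 0.9,
    seed: int = 0,
) -> float:
    """Return accuracy on perturbed inputs, or raise if it is below min_accuracy."""
    rng = random.Random(seed)  # fixed seed so CI runs are repeatable
    correct = 0
    for x, y in zip(inputs, labels):
        if model(perturb(x, epsilon, rng)) == y:
            correct += 1
    accuracy = correct / len(labels)
    if accuracy < min_accuracy:
        raise RobustnessError(
            f"accuracy under perturbation {accuracy:.2f} is below {min_accuracy:.2f}"
        )
    return accuracy


if __name__ == "__main__":
    # Toy stand-in for a real model: classify by the sign of the feature sum.
    toy_model = lambda x: int(sum(x) > 0)
    print(robustness_gate(toy_model, [[1.0, 1.0], [-1.0, -1.0]], [1, 0]))
```

In a real pipeline the uncaught exception gives the job a nonzero exit status, so the build fails the same way a broken integration test would.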

Vulnerabilities in the ML Supply Chain

MLSecOps has become important as corporations begin to rely more on models from third-party sources. Since ML models run on software, ML inherits the vulnerabilities of traditional software systems. These include vulnerabilities in the software tools required for model training or inference and arbitrary code execution in the files that store model weights.

The traditional software code that operates ML models can be analyzed by developers or automated tools to find and correct offending lines. This is a key reason for regular security updates and software patches. Although safer alternatives exist, most popular ML models are still persisted in fundamentally insecure serialization formats such as pickle or YAML. Most organizations simply ignore the risks inherent in ingesting files that can lead to arbitrary code execution and more. But organizations serious about security should adopt rigorous measures to reduce the risk these file formats expose.
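To make the pickle risk concrete, here is a small sketch that inspects a pickle byte stream without loading it. It uses the standard library's pickletools.genops to walk the opcodes and reports those that import or call objects during unpickling. The opcode set and the names find_suspicious_opcodes and DemoPayload are assumptions chosen for illustration. Legitimate model files also use these opcodes, so a practical scanner would pair this with an allowlist of expected imports rather than reject every hit.

```python
import pickle
import pickletools
from typing import List, Tuple

# Opcodes that can import names or invoke callables when a pickle is loaded.
# This set is an illustrative assumption, not an exhaustive policy.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "REDUCE", "BUILD",
}


def find_suspicious_opcodes(data: bytes) -> List[Tuple[int, str, object]]:
    """List (position, opcode name, argument) for opcodes worth reviewing.

    The stream is only parsed, never unpickled, so no embedded code runs.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            findings.append((pos, opcode.name, arg))
    return findings


class DemoPayload:
    """Benign stand-in for a malicious object: unpickling it would call print()."""

    def __reduce__(self):
        return (print, ("this ran during unpickling",))


if __name__ == "__main__":
    plain = pickle.dumps({"weights": [0.1, 0.2]})
    crafted = pickle.dumps(DemoPayload())  # dumping does not execute the payload
    print("plain data:", find_suspicious_opcodes(plain))
    print("crafted data:", find_suspicious_opcodes(crafted))
```

A plain dictionary of floats produces no findings, while the crafted object shows an import opcode followed by REDUCE, the pattern that lets a model file run code on load.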

Additionally, third-party models themselves must be tested. Developers of third-party models often report performance metrics only for the dataset or task the models were developed for. Even the few models that come with security tests should be verified by another source. Unlike source code, ML models are not written explicitly by humans, so neither a code scanner nor a careful inspection of the model weights can easily reveal their vulnerabilities. And even if model vulnerabilities are discovered, there are no editing tools to surgically correct them. Where a software engineer can isolate and correct a few lines of code, today’s machine learning engineer can’t force a model to unlearn a backdoor or poisoning vulnerability. In essence, ML’s bugs can’t be patched.

But, they can be detected.

Mitigating Vulnerabilities in the ML Supply Chain

A comprehensive set of security measures can dramatically reduce the risks inherent in the ML model supply chain:

Scan the software dependencies of a model for known vulnerabilities. This can be handled by existing software security tooling.
Verify that the model file that encodes the model weights does not contain unnecessary or unsafe content, such as embedded executable code. Today, this is not handled by traditional software security tooling.
Complete an independent assessment of the performance, fairness and security of the ML model on your data. These scans amount to dynamic analysis of the ML model to uncover any algorithmic vulnerabilities latent in the model.
Include post-deployment protection and monitoring of models from unintentional and intentional failure modes. As with the antiphishing and face detection examples, models can be tampered with post-deployment. Logging, monitoring and firewalling these assets are good security practice.
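The last measure above, logging and monitoring a deployed model, could be sketched as a thin wrapper around the model's prediction function. The wrapper below logs every prediction and warns when the share of low-confidence predictions in a recent window grows too large. The class name MonitoredModel, the confidence threshold, and the use of low-confidence rate as the warning signal are all assumptions made for illustration. A production system would track richer signals and route alerts to the security team.

```python
import logging
from collections import deque
from typing import Callable, Deque, Sequence

logger = logging.getLogger("model_monitor")


class MonitoredModel:
    """Wrap a predict_proba-style callable with logging and a simple alert."""

    def __init__(
        self,
        predict_proba: Callable[[object], Sequence[float]],
        window: int = 100,
        min_confidence: float = 0.6,
        max_low_conf_rate: float = 0.2,
    ) -> None:
        self._predict_proba = predict_proba
        self._min_confidence = min_confidence
        self._max_low_conf_rate = max_low_conf_rate
        # True for each recent prediction whose confidence was below threshold.
        self._recent: Deque[bool] = deque(maxlen=window)

    def low_confidence_rate(self) -> float:
        if not self._recent:
            return 0.0
        return sum(self._recent) / len(self._recent)

    def predict(self, x: object) -> int:
        probs = self._predict_proba(x)
        label = max(range(len(probs)), key=lambda i: probs[i])
        confidence = probs[label]
        self._recent.append(confidence < self._min_confidence)
        logger.info("prediction label=%d confidence=%.3f", label, confidence)
        if self.low_confidence_rate() > self._max_low_conf_rate:
            logger.warning(
                "low-confidence rate %.2f exceeds %.2f; inputs may be drifting or probing",
                self.low_confidence_rate(),
                self._max_low_conf_rate,
            )
        return label


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Toy probability function standing in for a deployed classifier.
    model = MonitoredModel(lambda x: [0.9, 0.1] if x > 0 else [0.45, 0.55])
    model.predict(1)
    model.predict(-1)
```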

ML Security is a Process

As with DevSecOps, MLSecOps doesn't aim to turn data scientists and ML engineers into security experts, but rather to educate them in best practices that promote more secure development processes. It promotes secure ML standards and provides automated and repeatable testing. It promotes practices that continuously monitor the environment for security threats and provides visible governance metrics for both security teams and data science orgs.

With the increasing use of machine learning models in sensitive and critical applications, it is imperative that organizations implement MLSecOps to ensure the security, reliability, and trustworthiness of their machine learning models. By incorporating security into every stage of the machine learning development process, organizations can minimize the risk of security vulnerabilities and attacks. In the same way that DevSecOps has become an integral part of software development, MLSecOps must become an integral part of machine learning development. Organizations that embrace MLSecOps will be better prepared to protect themselves against security threats and ensure the security and privacy of their data and applications.

To learn more about new vulnerabilities that ML brings, check out Not with a Bug, But with a Sticker: Attacks on Machine Learning Systems and What To Do About Them by Ram Shankar Siva Kumar and Hyrum Anderson. All author proceeds are donated to charities Black in AI and Bountiful Children’s Foundation.

