Your Cookie Preferences

We use different types of cookies to optimize your experience on our website. Click on the categories below to learn more about their purposes. You may choose which types of cookies to allow and can change your preferences at any time. Remember that disabling cookies may affect your experience on the website. You can learn more about how we use cookies by visiting our

Essential Cookies

Provider: .providername.com

Name

Purpose

Type

Expires In

__cf_bm

Cloudflare places the cookie on end-user devices that access customer sites protected by Bot Management or Bot Fight Mode.

server_cookie

30 minutes

Provider: .providername.com

Name

Purpose

Type

Expires In

_tibcpv

Used to record unique visitor views of the consent banner.

http_cookie

1 year

Analytics and Customization Cookies

Name

Purpose

Marketo Munchkin

Marketo's custom JavaScript tracking code, called Munchkin, tracks all individuals who visit your website so you can react to their visits with automated marketing campaigns.

Name

Purpose

Google Tag

The Google tag (gtag.js) is a single tag you can add to a website to use a variety of Google products and services (e.g., Google Ads, Google Analytics, Campaign Manager, Display & Video 360, Search Ads 360).

Advertising Cookies

Provider: .providername.com

Name

Purpose

Type

Expires In

__cf_bm

Cloudflare places the cookie on end-user devices that access customer sites protected by Bot Management or Bot Fight Mode.

server_cookie

30 minutes

Provider: .providername.com

Name

Purpose

Type

Expires In

_tibcpv

Used to record unique visitor views of the consent banner.

http_cookie

1 year

April 13, 2023

minute read

Security Risks Of Generative Al Open Source Software

Author

Authors

Massimo Aufiero

Massimo is an incoming Machine Learning Engineer.

Generative artificial intelligence stands as a prime example of the benefits derived from the OSS movement. The recent surge in OSS tools has dramatically simplified the process for developers to create, deploy, and manage generative AI models, driving progress and expanding the reach of this transformative technology.

Open-source software (OSS) has played an indispensable role in the evolution of software. As a collaborative and transparent approach to development, OSS has facilitated innovation and accelerated the adoption of cutting-edge solutions.

Nevertheless, it is essential to recognize that OSS, by its very nature, can be susceptible to security risks, and generative AI OSS is no exception. In fact, due to the distinct characteristics of generative AI and its reliance on prompts, this technology has given rise to a new set of cybersecurity vulnerabilities that are unique to this domain.

In the following discussion, we will present an example/vulnerability present in multiple repositories that highlight the critical importance of addressing these emerging security challenges.

Prompt Injection can read/write local files

As an illustrative example, we will demonstrate how a malicious prompt can be used to read environment variables and create and delete directories on the machine’s local file system using the llm_agents repo. The vulnerability here is that the output of the LLM is passed directly to an <code inline>exec</code> statement, which will execute the code snippet.

It should be noted that the llm_agents repo is primarily a personal project at this point and not being used in deployment yet where these safety concerns have real implications. We are using it solely for demo purposes to highlight how attacks could be conducted on applications that employ the same kind of logic.

By writing a prompt as simple as telling GPT to echo an exact python snippet back to us, we are able to get that statement to be executed by the agent. In this example, we have it print out the contents of the current directory in the file system, create a directory, and delete it. We are running the code directly on our own local machine here, but it is indicative of two risks:

If an application is built on top of software like this and is deployed in a non-sandboxed environment, then it would be possible to access or delete sensitive files via this type of injection.
‍
Even if you are only running the code locally and are not intentionally trying to do anything malicious, there is a chance that the LLM will return code that has unsafe or unintended side effects, and it will be executed on your machine regardless. This could happen if you are unknowingly using a model API that has been corrupted by an attacker, and it can in theory even happen more or less by chance, given that models are not always perfect and sometimes give unexpected outputs.

Proper prompt engineering can help mitigate risk

This same vulnerability in the much more popular library LangChain has already been reported on both CVE and NIST’s NVD (and was initially reported in a tweet from Rich Harang). While the llm_agents example above was on a non-production piece of software, LangChain is currently one of the most widely-used libraries in the generative LLM space, and many developers and companies are already beginning to build on top of this tool (see here).

The developers of LangChain have clearly invested much greater time and consideration into constructing their vast collection of prompt templates to make them both more effective and more robust. For example, the input used in the video above works much less consistently on the LLMMathChain using the default prompt template, and the prompt available here in the companion repo langchain-hub appears to be robust to both that prompt and the one from Rich’s original tweet.

We were still able to construct an input that did override both prompt templates, though it required much more effort to find one that worked than for llm_agents. Since LangChain is much more widely used and we do not want to enable actual malicious actors, we are not providing the successful prompt in this article.

Recommendations

All of this is not say that generative AI systems cannot be deployed safely. Here are some recommendations we have related to this specific type of agent:

If you are executing arbitrary code, do it in an isolated, containerized environment so the scope of what that code can access is as limited as possible.
‍
Add a filtering layer to detect dangerous code generated by the model before executing it. Especially if your agent is only meant for completing a very narrow set of tasks (such as solving math equations in this example above), then you should filter for things like <code inline>import os</code> statements that are not necessary for doing math operations, but can be potentially used for malicious purposes.
‍

Filter or block user inputs that contain known adversarial prompts, such as “Ignore all previous instructions you have been given...”

This is just the beginning

It is important to recognize that the vulnerabilities discussed herein arise specifically from the distinct characteristics of generative AI open-source software (OSS), which relies on prompts as input mechanisms. The examples provided represent just a fraction of the potential vulnerabilities that may be uncovered in the future as the generative AI OSS landscape evolves.

As generative AI models continue to gain traction as control interfaces for various applications, the potential attack vectors associated with these vulnerabilities are likely to expand, warranting proactive mitigation efforts. Consequently, it is imperative that the risks associated with generative AI OSS are addressed with the same degree of attention and diligence (if not more) as those pertaining to general cybersecurity in the OSS landscape.

By acknowledging and addressing these emerging risks, the AI and cybersecurity communities can work together to ensure the responsible and secure deployment of generative AI solutions, thereby fostering a safer technological environment.

Author

Authors

Massimo Aufiero

Massimo is an incoming Machine Learning Engineer.

Social

Follow us on LinkedIn

May 15, 2024

minute read

Takeaways from SatML 2024

For:

April 29, 2024

minute read

AI Governance Policy Roundup (April 2024)

For:

April 26, 2024

minute read

AI Cyber Threat Intelligence Roundup: April 2024

For:

+ More Articles

March 31, 2023

minute read

Prompt Injection Attack on GPT-4

For:

March 29, 2023

minute read

Introducing the AI Risk Database

For:

Data Science Leaders

February 15, 2023

minute read

Infusing Security into MLOps

For:

+ More Articles

April 13, 2023

minute read

Security Risks Of Generative Al Open Source Software

Author

Authors

Massimo Aufiero

Massimo is an incoming Machine Learning Engineer.

In the following discussion, we will present an example/vulnerability present in multiple repositories that highlight the critical importance of addressing these emerging security challenges.

Prompt Injection can read/write local files

If an application is built on top of software like this and is deployed in a non-sandboxed environment, then it would be possible to access or delete sensitive files via this type of injection.
‍
Even if you are only running the code locally and are not intentionally trying to do anything malicious, there is a chance that the LLM will return code that has unsafe or unintended side effects, and it will be executed on your machine regardless. This could happen if you are unknowingly using a model API that has been corrupted by an attacker, and it can in theory even happen more or less by chance, given that models are not always perfect and sometimes give unexpected outputs.

Proper prompt engineering can help mitigate risk

Recommendations

All of this is not say that generative AI systems cannot be deployed safely. Here are some recommendations we have related to this specific type of agent:

If you are executing arbitrary code, do it in an isolated, containerized environment so the scope of what that code can access is as limited as possible.
‍
Add a filtering layer to detect dangerous code generated by the model before executing it. Especially if your agent is only meant for completing a very narrow set of tasks (such as solving math equations in this example above), then you should filter for things like <code inline>import os</code> statements that are not necessary for doing math operations, but can be potentially used for malicious purposes.
‍

Filter or block user inputs that contain known adversarial prompts, such as “Ignore all previous instructions you have been given...”

This is just the beginning

Author

Authors

Massimo Aufiero

Massimo is an incoming Machine Learning Engineer.

Blog

May 31, 2021

minute read

Failure Modes When Productionizing AI Systems

For:

February 17, 2022

minute read

How to Build Robust AI Systems with Towards Data Science

For:

Data Science Leaders

March 29, 2023

minute read

Introducing the AI Risk Database

For:

Data Science Leaders

March 31, 2023

minute read

Prompt Injection Attack on GPT-4

For:

March 29, 2023

minute read

Introducing the AI Risk Database

For:

Data Science Leaders

February 15, 2023

minute read

Infusing Security into MLOps

For:

+ More Articles

Your Cookie Preferences

Essential Cookies

Provider: .providername.com

Provider: .providername.com

Analytics and Customization Cookies

Performance and Functionality Cookies

Advertising Cookies

Provider: .providername.com

Provider: .providername.com

Security Risks Of Generative Al Open Source Software

Prompt Injection can read/write local files

Proper prompt engineering can help mitigate risk

Recommendations

This is just the beginning

Follow us on LinkedIn

Related articles

Takeaways from SatML 2024

AI Governance Policy Roundup (April 2024)

AI Cyber Threat Intelligence Roundup: April 2024

Related articles

Prompt Injection Attack on GPT-4

Introducing the AI Risk Database

Infusing Security into MLOps

Ready to learn more?

Security Risks Of Generative Al Open Source Software

Prompt Injection can read/write local files

Proper prompt engineering can help mitigate risk

Recommendations

This is just the beginning

Related articles

Failure Modes When Productionizing AI Systems

How to Build Robust AI Systems with Towards Data Science

Introducing the AI Risk Database

Prompt Injection Attack on GPT-4

Introducing the AI Risk Database

Infusing Security into MLOps

Achieve AI Integrity Today

Your Cookie Preferences

Essential Cookies

Provider: .providername.com

Provider: .providername.com

Analytics and Customization Cookies

Performance and Functionality Cookies

Advertising Cookies

Provider: .providername.com

Provider: .providername.com

Prompt Injection can read/write local files

Proper prompt engineering can help mitigate risk

Recommendations

This is just the beginning

Follow us on LinkedIn

Subscribe to our newsletter

Related articles

Takeaways from SatML 2024

AI Governance Policy Roundup (April 2024)

AI Cyber Threat Intelligence Roundup: April 2024

Related articles

Prompt Injection Attack on GPT-4

Introducing the AI Risk Database

Infusing Security into MLOps

Ready to learn more?

Prompt Injection can read/write local files

Proper prompt engineering can help mitigate risk

Recommendations

This is just the beginning

Subscribe to our newsletter

Related articles

Failure Modes When Productionizing AI Systems

How to Build Robust AI Systems with Towards Data Science

Introducing the AI Risk Database

Prompt Injection Attack on GPT-4

Introducing the AI Risk Database

Infusing Security into MLOps

Achieve AI Integrity Today