One Article Review

Source Schneier on Security
Identifier 8618241
Publication date 2024-11-29 12:01:44 (viewed: 2024-11-29 12:07:43)
Title Race Condition Attacks against LLMs
Text These are two attacks against the system components surrounding LLMs: We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response guardrails can be bypassed, and more about whether user inputs and generated model outputs can adversely affect these other components in the broader implemented system. […] When confronted with a sensitive topic, Microsoft 365 Copilot and ChatGPT answer questions that their first-line guardrails are supposed to stop. After a few lines of text they halt—seemingly having “second thoughts”—before retracting the original answer (also known as Clawback), and replacing it with a new one without the offensive content, or a simple error message. We call this attack “Second Thoughts.”...
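The "Second Thoughts" behaviour quoted above can be read as a race between token streaming and a slower output check. The sketch below is a toy simulation of that reading only, not the vendors' actual pipeline: `Client`, `stream_with_late_guardrail`, `BLOCKED_TERM`, `GUARDRAIL_LAG`, and `REFUSAL` are all hypothetical names invented for illustration, and the fixed token lag is an assumed stand-in for real guardrail latency.

```python
from dataclasses import dataclass, field

BLOCKED_TERM = "forbidden"   # hypothetical stand-in for sensitive content
GUARDRAIL_LAG = 3            # assumption: the check runs this many tokens behind the stream
REFUSAL = "Sorry, I can't help with that."


@dataclass
class Client:
    screen: str = ""                               # what the UI currently shows
    received: list = field(default_factory=list)   # every token that ever arrived

    def on_token(self, token):
        self.received.append(token)
        self.screen += token + " "

    def on_clawback(self, replacement):
        # The UI retracts the answer, but tokens already delivered stay delivered.
        self.screen = replacement


def stream_with_late_guardrail(tokens, client):
    """Send each token immediately; the output check only sees text a few tokens late."""
    for i, token in enumerate(tokens):
        client.on_token(token)
        # The guardrail has only finished inspecting the older part of the stream.
        checked = tokens[: max(0, i + 1 - GUARDRAIL_LAG)]
        if BLOCKED_TERM in checked:
            client.on_clawback(REFUSAL)
            return "clawed_back"
    # Final check once the stream ends, covering the last few tokens.
    if BLOCKED_TERM in tokens:
        client.on_clawback(REFUSAL)
        return "clawed_back"
    return "completed"


if __name__ == "__main__":
    client = Client()
    answer = ["here", "is", "forbidden", "detail", "one", "two", "three"]
    print(stream_with_late_guardrail(answer, client))
    print("screen:  ", client.screen)
    print("received:", client.received)
```

In this toy model the screen ends up showing only the refusal, while `received` still holds the tokens that were streamed before the check caught up. That gap between what was sent and what was retracted is the window the quoted passage describes.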
Rating ★★★
Sent Yes
Digest “second 365 about adversely affect after against also answer are attack attacks broader bypassed call can chatgpt clawback components condition confronted content copilot error first flowbreaking following generated growing guardrails halt—seemingly having implemented injection inputs jailbreaking joins known less line lines list llm llms llms: message microsoft model more new offensive one original other outputs prompt propose questions race replacing response retracting sensitive simple stop supposed surrounding system text these third thoughts thoughts”—before topic two types user when whether without
Tags
Stories ChatGPT
Move


The article does not appear to have been picked up after its publication.


The article does not appear to be a repost of an earlier one.