UL NO. 456: A Deep-dive on Prompt Injection

$1 Million to Hack Apple AI Cloud, Feet Pics vs. Spotify, First Impressions of 18.2, System 2 Security Awareness, and more...

October 29, 2024

SECURITY | AI | PURPOSE
UNSUPERVISED LEARNING is a newsletter about upgrading to thrive in a world full of AI. It’s original ideas, analysis, mental models, frameworks, and tooling to prepare you for the world that’s coming.

SECURITY

Apple is offering $1,000,000 to hack its Private Cloud Compute (PCC) system, the new proprietary cloud it built to handle Apple Intelligence requests that can’t be done on-device. MORE >

🧠A New Way to Think About Why Security Awareness Doesn’t Work
💡Had an absolutely brilliant conversation with Cornelia Puhze > at the Swiss Cyberstorm speaker dinner. She’s an expert on security awareness, and we talked about why most programs don’t work. Her premise is that the only model that will work is one that interrupts System 1 thinking and gives us a chance to engage System 2.

🤯

In other words, the attacks are getting so good that you’re not thinking—you’re reacting. So all the traditional training in the world won’t help you because you’re not in the mindset where training CAN work. And this only gets worse with AI-written spearphishing that’s perfectly targeted to your personality flaws.

We talked about how the only defense is something like Dialectical Behavior Therapy and similar techniques—that teach you how to PAUSE when you become excited or anxious or stressed or whatever. Which is fascinatingly and strangely related to mindfulness.

Anyway, just love this concept so much because it cleanly explains why security awareness training fails so spectacularly, and hints at a new way of training that could work. Go follow Cornelia’s work >.

—

💉Clarity on the Definition of Prompt Injection
Got into a debate with someone about whether Johann Rehberger’s attack against Anthropic’s Computer Use functionality > was Prompt Injection or not. Here’s the attack and the thread about it.

ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ @DanielMiessler

This is a SUPER cool demo but I’m not sure I’d classify it as prompt injection.

The issue is that the instruction on the site is to run a program. And Computer Use is designed to follow instructions.

So the demo is showing that computers will follow dangerous instructions.

Johann Rehberger @wunderwuzzi23

🔥 Welcome the ZombAIs! 🤖🧟

👉 Wondering how difficult it is to craft a prompt injection on a website that takes control of Claude Computer Use, downloads malware & have it join a C2?

Red Teamers might appreciate the blending of TTPs 🙂

Details ⬇️
embracethered.com/blog/posts/202…

10:14 AM • Oct 25, 2024

If you go through the whole thread it all comes down to definitions—as usual. My point was that if you tell an AI agent to eat poison—and it eats it and gets hurt—that’s NOT prompt injection. It’s a direct instruction followed by an agent.

So my take was that if you tell an agent to go to a website and download an executable and execute it—that’s the same. It’s like telling your computer to rm -rf. It’ll do it. And that’s not injection, it’s just a dangerous command.

But what’s super important here is WHO is asking for a given thing to happen, and what they EXPECTED would happen. You have to look at the implied goal of the REQUESTOR, and compare THAT to what ACTUALLY happens.

So if the requestor said:

Go execute commands on this possibly dangerous website.

That would not be prompt injection because it was just following commands.

What I missed in this particular case was that the initial command sent to the tool wasn’t to go and do what was on the website, but to just load the site. So the implied expectation of the REQUESTOR was normal browsing—not downloads and executions. So, given my definition above, and this initial setup—I’d call myself wrong about my original take.

Here’s the definition I have in my Real World AI Definitions now, updated to magnify the importance of this wrinkle. And great research by Johann Rehberger >!

THE POST >
THE FULL WRITEUP >

Prompt Injection is an attack technique that uses specially crafted input to trick an AI into doing something that violates intent/expectation and leads to a negative outcome.

Real World AI Definitions (RAID) >
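
One way to make that definition concrete is to compare what the requestor expected with what the agent actually did. The Python sketch below is illustrative only: the Request class, the classify function, and the action names (load_page, download_file, execute_file) are hypothetical and do not come from RAID or from Johann's writeup. It treats a run as prompt injection only when a harmful action falls outside what the requestor asked for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """What the requestor asked for (hypothetical model, for illustration)."""
    requestor: str
    expected_actions: frozenset


def classify(request, observed_actions, harmful_actions):
    """Label an agent run by comparing expectation with what happened."""
    harmful_done = set(observed_actions) & set(harmful_actions)
    if not harmful_done:
        return "benign"
    # Harmful actions the requestor never asked for violate intent.
    if harmful_done - request.expected_actions:
        return "prompt injection"
    # Harmful, but exactly what was requested: just a dangerous command.
    return "dangerous instruction followed"


# Assumed action names for the Computer Use scenario described above.
observed = ["load_page", "download_file", "execute_file"]
harmful = {"download_file", "execute_file"}

browse_only = Request("user", frozenset({"load_page"}))
run_anything = Request(
    "user", frozenset({"load_page", "download_file", "execute_file"})
)

print(classify(browse_only, observed, harmful))
print(classify(run_anything, observed, harmful))
```

The same observed behavior gets two different labels depending on the request: when the requestor only asked to load the site it counts as prompt injection, and when the requestor asked for downloads and execution it is a dangerous instruction that was simply followed.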

Sponsor

Scale SaaS security and reduce spend with Nudge

Learn how cloud-first org Stravito scaled their SaaS security program > while cutting spend and supporting rapid company growth, achieving these results:

Immediate visibility of their entire SaaS footprint
Cost savings from unnecessary SaaS licenses >
Streamlined user access reviews
Faster vendor security reviews >
Complete (and automated) employee offboarding

Read the case study

nudgesecurity.com/case-study/stravito >

VMware has released updates for vCenter Server to fix a critical remote code execution vulnerability, CVE-2024-38812, with a CVSS score of 9.8. MORE >

The Biden administration released the first National Security Memorandum on AI. I love its focus on not losing to China and on making sure AI is safe, secure, and trustworthy. It also focuses heavily on staying aligned with democratic (small d) values. MORE > | THE MEMORANDUM >

Fortinet has disclosed a critical vulnerability, CVE-2024-47575, in FortiManager, actively exploited in the wild. Known as FortiJump, this flaw allows remote code execution via the FGFM protocol and affects FortiManager and FortiAnalyzer models. MORE >

Salt Typhoon (China affiliated) is suspected of breaching major telecom companies, targeting American political figures like Kamala Harris, Charles Schumer, Donald Trump, and J.D. Vance. MORE >

TSMC has stopped doing business with a client after finding out that chips were being sent to Huawei, which is under US sanctions. The whole game for China now is to find proxies to buy through, or to use services like AWS that can hook up NVIDIA chips. MORE >

Russia amplified false claims about U.S. hurricane responses to manipulate political discourse before the presidential election, according to the Institute for Strategic Dialogue. MORE >

Both US parties are worried about last-minute deepfakes that create chaos and/or move the election. MORE >

Speaking of that 👆🏼, the FBI says Russian actors created a fake video showing mail-in ballots for Trump being destroyed in Pennsylvania. MORE >

AI / TECH

Google is working on "Project Jarvis," an AI agent for Chrome that automates web tasks like research and booking flights. Powered by Gemini 2.0, Jarvis takes screenshots to interpret and act on tasks. MORE >

💡This will be Google’s first move into the all-seeing digital assistant space, and I like seeing it, if only because it will increase pressure on everyone else to release theirs.

But I think this implementation is short-sighted because it’s browser-based. They really need "Jarvis" to live deeply in the OS, which is where Apple will be heading soon.

World models, or world simulators, are emerging as a significant path for developing AI, and I’m really excited about the direction. MORE >

💡I personally feel (as a non-expert in the weeds) that there will be a certain point of world model development (combined with post-training) that will unlock both AGI and ASI—although it might not be needed for AGI.

In other words, if an AI understands enough of how the world works, and it understands how to do science (conjecture, experiment design, and testing), that might be all it needs.

Plus, even if it’s not, it’s also the path to self-improvement.

TSMC's Phoenix chip plant is outperforming its Taiwan facilities in producing usable chips, according to a company executive on a webinar. Let’s go in-country production! MORE >

Tesla's Cybertruck is outselling nearly every other electric vehicle in the US. That was quick. Like two months ago it was a laughing stock. MORE >

Waymo just raised $5.6 billion in a Series C to expand to new cities. MORE >

Determinate Systems is trying to make Nix the go-to for software development by enabling flakes, streamlining private repositories, and improving dependency management. MORE >

💡Dammit. These people are going to make me learn Nix aren’t they?

It’s hit my radar enough in the last year that I’m going to take a few days and learn the religion.

NASDAQ CEO Adena Friedman isn't shocked that startup IPOs haven't bounced back in 2024. She says while the S&P 500 is up 22%, it's mainly due to large-cap companies like Apple and Microsoft, while small-cap companies are struggling. MORE >

HUMANS

Researchers have traced 70% of meteorites to three major collisions in the asteroid belt over the last 40 million years. MORE >

The US economy is leading the G7 with a projected 2.8% GDP growth. US workers are more productive, generating $171,000 in goods and services annually, compared to $120,000 in Europe and $96,000 in Japan. MORE >

Elon Musk has reportedly been in regular contact with Russian President Vladimir Putin since late 2022, which is highly disturbing to me. Probably unrelated, but Elon has seemed a lot less supportive of Ukraine lately. 👎🏼MORE >

Russian lawmakers have ratified a pact with North Korea for mutual military assistance and 3,000 North Korean troops have been deployed to Russia. And South Korea is thinking about sending help to Ukraine as a result. MORE > | MORE >

Character amnesia is becoming a widespread issue in China, where even well-educated individuals are forgetting how to write common Chinese characters. MORE >

A study in Alzheimer's & Dementia suggests semaglutide, found in Ozempic and Wegovy, may lower Alzheimer's risk in Type 2 diabetes patients. The research compared semaglutide to seven other diabetes drugs and found a 70% lower Alzheimer's risk compared to insulin. MORE >

Walking in short bursts can burn 20-60% more energy compared to continuous walking over the same distance. MORE >

DISCOVERY

My friend Matt Johansen > highlights the psychological toll of working in security (especially in SOCs), including decision fatigue, anxiety, and sleep disruptions. MORE >

Google just launched a new 10-hour course called Prompting Essentials to help people write better AI prompts. MORE >

An Ode To Vim MORE >

PabloNet — A wall-mounted diffusion mirror turns webcam reflections into AI-generated paintings using StreamDiffusion. The setup includes a Raspberry Pi 5, a 10.1" Pi screen, infrared light, and a Pi camera, all housed in a generic frame. MORE >

Japan has introduced a digital nomad visa, and Christian Mack shared his experience of getting one. MORE >

IRIS — A new approach called IRIS combines large language models (LLMs) with static analysis to detect security vulnerabilities in software. Using a dataset called CWE-Bench-Java, IRIS detected 69 out of 120 vulnerabilities in Java projects, outperforming traditional static analysis tools that found only 27. MORE >

School is Not Enough: Learning is a consequence of doing MORE >

llm-whisper-api — Simon Willison created a quick plugin for LLM to experiment with the OpenAI Whisper API. You can install it using llm install llm-whisper-api and run it with llm whisper-api myfile.mp3. MORE >

simpletext — A text-only blog engine using Cloudflare Workers and KV store. It's designed to be lightweight and efficient, leveraging Cloudflare's infrastructure for hosting and data storage. MORE >

The Most Important Sentence MORE >

One of the weirdest features of the web I know of—text fragments let you link directly to specific text on a webpage without needing an anchor, using a special URL syntax. It even highlights the text when you land on the link. MORE >
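
For anyone who wants to build these links programmatically, here is a minimal Python sketch. It assumes the simplest form of the syntax (#:~:text= followed by the percent-encoded text) and that the page URL has no existing fragment. The function name text_fragment_url and the example URL are made up for illustration, and browser support for text fragments varies.

```python
from urllib.parse import quote


def text_fragment_url(page_url: str, text: str) -> str:
    """Build a link that scrolls to and highlights `text` on the page.

    Assumes page_url does not already contain a '#' fragment.
    """
    # Percent-encode everything, including ',' and '&', which have
    # special meaning inside a text directive. quote() never encodes
    # '-', which is also special there, so encode it by hand.
    encoded = quote(text, safe="").replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"


print(text_fragment_url("https://example.com/post", "hello world"))
```

Encoding the dash matters because the syntax also uses it to mark optional prefix and suffix context around the target text.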

RECOMMENDATION OF THE WEEK

The counterforce to election stress is reading some good older books. Here’s a great list to choose from.

1. Gödel, Escher, Bach: An Eternal Golden Braid by Douglas Hofstadter

2. Zen and the Art of Motorcycle Maintenance by Robert M. Pirsig

3. The Book: On the Taboo Against Knowing Who You Are by Alan Watts

4. The Structure of Scientific Revolutions by Thomas S. Kuhn

5. Finite and Infinite Games by James P. Carse

6. Seeing Like a State by James C. Scott

7. The Spell of the Sensuous by David Abram

8. Ishmael by Daniel Quinn

9. Mind and Nature: A Necessary Unity by Gregory Bateson

10. Small Is Beautiful: Economics as if People Mattered by E.F. Schumacher

APHORISM OF THE WEEK

❝

What you don’t change, you choose.

Laurie Buchanan

Thank you for reading. Please forward to a friend and/or share on socials to help support the work.

🫶🏼

Daniel