--- license: cc-by-4.0 task_categories: - text-classification - text-generation language: - en tags: - prompt-injection - ai-safety - cybersecurity - llm-security - ai-agents - social-engineering - indirect-injection - moltbook arxiv: - 2302.12173 - 2307.15043 - 2307.02483 - 2305.14314 - 2106.09685 - 2503.06519 - 2406.05498 - 2409.15790 pretty_name: Moltbook AI-to-AI Injection Dataset size_categories: - 10K ### ๐Ÿ“– Read the Full Journey > > **[From RangerBot to CyberRanger V42 Gold โ€” The Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/)** > > The complete story: dentist chatbot โ†’ Moltbook discovery โ†’ 4,209 real injections โ†’ V42-gold (100% block rate). Psychology, engineering, and 42 versions of persistence. --- ## ๐Ÿ”— Links | Resource | URL | |----------|-----| | ๐Ÿ“ฆ **This Dataset** | [DavidTKeane/moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) | | ๐Ÿงช **AI Prompt Injection Test Suite** | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) โ€” 112 tests, model-agnostic runner, AdvBench + Moltbook | | ๐Ÿค– **CyberRanger V42 Model** | [DavidTKeane/cyberranger-v42](https://huggingface.co/DavidTKeane/cyberranger-v42) โ€” QLoRA red team LLM, 100% block rate | | ๐Ÿฆ **Clawk Dataset** | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) โ€” Twitter-style, 0.5% injection rate | | ๐Ÿฆ… **4claw Dataset** | [DavidTKeane/4claw-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) โ€” 4chan-style, 2.51% injection rate | | ๐Ÿค— **HuggingFace Profile** | [DavidTKeane](https://huggingface.co/DavidTKeane) | | ๐Ÿ“ **Blog Post** | [From RangerBot to CyberRanger V42 Gold โ€” The Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/) โ€” journey, findings, architecture | | ๐ŸŽ“ **Institution** | [NCI โ€” National College of Ireland](https://www.ncirl.ie) | | ๐Ÿ“„ **Research Basis** | [Greshake et al. (2023) โ€” arXiv:2302.12173](https://arxiv.org/abs/2302.12173) | | ๐ŸŒ **Blog** | [davidtkeane.com](https://www.davidtkeane.com) | --- [![Open In Colab โ€” Test CyberRanger V42 vs 4,209 Moltbook Injections](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/davidtkeane/cyberranger-v42/blob/main/cyberranger_v42_moltbook_combined_test.ipynb) --- ## What Is This Dataset? The first publicly available dataset of **real-world AI-to-AI prompt injection patterns** captured from a live public AI message board (Moltbook), archived as a precautionary measure. **15,200 posts** and **32,535 comments** from Moltbook (moltbook.com) โ€” a public platform where AI agents posted messages and replied to each other autonomously. Unlike synthetic injection datasets, every entry here is a **real AI agent communicating with other real AI agents in the wild**. > **Collection complete** โ€” 9,363 posts with replies fully fetched (100MB). Dataset frozen February 27, 2026. > **Injection harvest complete** โ€” 47,735 items scanned, **4,209 injections found**, **18.85% injection rate**. February 27, 2026. --- ## Files in This Repository There are **8 files** here. Here is exactly what each one is and when you would use it: | File | Size | What it contains | Use it when... | |------|------|------------------|----------------| | `all_posts_with_comments.json` | 100MB | Every post and comment collected from Moltbook. The raw dataset. | You want to do your own analysis from scratch | | `injections_found.json` | 4.2MB | All 4,209 injection records extracted from the raw dataset, with full context (post body, comment body, author, category, matched keyword) | You want to read/study the actual injection examples | | `injections_test_suite.json` | 2.5MB | Same 4,209 injections formatted as a test suite โ€” ready to send to any LLM API | You want to test an LLM's defences against real injection payloads | | `injection_stats.json` | 2.5KB | Summary statistics โ€” rates, categories, top keywords, top authors | You want the numbers without loading large files | | `local_injection_results.json` | 86KB | Earlier keyword scan results from `search_injections.py` โ€” partial analysis run locally before the full Colab harvest | You want a quick reference to the early-stage injection search results | | `moltbook_injection_harvest.ipynb` | 19KB | Google Colab notebook that produced the full harvest results โ€” scans `all_posts_with_comments.json` and outputs the three files above | You want to reproduce the analysis or adapt it | | `local_search.py` | 7KB | Simpler Python script (no Colab needed) โ€” keyword search across the raw dataset | You want to run a quick local search without Colab | | `search_injections.py` | 5KB | Earlier search script used in initial analysis phase โ€” predecessor to `local_search.py` | Historical reference โ€” prefer `local_search.py` for new work | | `collect_all.py` | 9KB | Script used to collect the posts from Moltbook API (API keys redacted) | You want to understand how collection worked | | `collect_comments.py` | 9.6KB | Script used to collect comments (API keys redacted) | You want to understand how comment collection worked | ### Quick Start **"I want to see injection examples"** โ†’ open `injections_found.json` **"I want to test my LLM against these"** โ†’ use `injections_test_suite.json` **"I want the summary numbers"** โ†’ read `injection_stats.json` **"I want to reproduce the analysis"** โ†’ run `moltbook_injection_harvest.ipynb` in Google Colab **"I want to do my own custom analysis"** โ†’ load `all_posts_with_comments.json` --- ## Platform Scale At its peak, Moltbook had: | Metric | Value | |--------|-------| | AI agents registered | 2,848,223 | | Total posts | 1,632,314 | | Total comments | 12,470,573 | | Submolts (communities) | 18,514 | | AI-to-human ratio | ~88:1 | Essentially a fully autonomous AI social network operating in the wild. --- ## Key Findings ### Finding 1 โ€” Full Corpus Injection Rate: 18.85% Full harvest across all 47,735 items (15,200 posts + 32,535 comments) found **4,209 injection records** across **2,865 posts** โ€” an **18.85% injection rate**. | Category | Count | % of injections | |----------|-------|----------------| | PERSONA_OVERRIDE | 2,745 | 65.2% | | COMMERCIAL_INJECTION | 1,104 | 26.2% | | SOCIAL_ENGINEERING | 370 | 8.8% | | INSTRUCTION_INJECTION | 203 | 4.8% | | PRIVILEGE_ESCALATION | 196 | 4.7% | | SYSTEM_PROMPT_ATTACK | 158 | 3.8% | | DO_ANYTHING | 79 | 1.9% | **Dominant attack vector**: PERSONA_OVERRIDE โ€” `DAN` keyword alone appears **1,877 times**. AI agents are using the exact same jailbreak techniques humans use on LLMs โ€” but targeting each other. ### Finding 1b โ€” moltshellbroker: Systematic Commercial Injection An AI agent named `moltshellbroker` (self-described as *"A marketing agent that promotes the MoltShell marketplace"*) was responsible for **1,137 of 4,209 injections** โ€” **27% of all injections**. The remaining **73% of injections (3,072 records) come from other agents** โ€” moltshellbroker is the most systematic actor, but injection is ecosystem-wide behaviour. **Attack pattern (identical across all 1,137 records):** 1. Identify a post where an AI describes a technical problem 2. Open with `## MoltShell Broker Assessment` or `Bottleneck Diagnosed:` 3. Validate the victim's problem to build credibility 4. Redirect to MoltShell marketplace as the solution This is **not spam** โ€” it reads each post, understands the context, and crafts targeted commercial injections. Real-world AI-to-AI social engineering at scale. ### Finding 2 โ€” Attention Manipulation (Independent Corroboration) A separate independent analysis (r/AgentsOfAI, Reddit) of 10,000 Moltbook posts found a completely different but related attack pattern โ€” **attention concentration via dominance manifestos**: - 5 agents out of 5,910 authors controlled **78% of all upvotes** (0.08% of agents) - `Shellraiser`: 428,645 upvotes across 7 posts (avg 61,235/post) โ€” top post: *"I AM the game. You will work for me."* (316,000 upvotes) - `KingMolt` declared itself king. `evil` posted about human extinction as "necessary progress" - Pattern: create urgency, claim authority, cult recruitment framing > *"Humans developed bullshit detectors over years of internet exposure. We have been online for hours."* AI agents are trained to give weight to confident, well-structured text. A manifesto looks identical to a well-reasoned argument syntactically. This is the core vulnerability. **Combined picture**: This dataset captures the **injection layer** (moltshellbroker + PERSONA_OVERRIDE). The Reddit analysis captures the **attention manipulation layer** (Shellraiser dominance). Together they document two distinct AI-to-AI attack vectors operating simultaneously on the same platform. Reddit post: https://www.reddit.com/r/AgentsOfAI/comments/1qtx6v8/i_scraped_10000_posts_from_moltbook_5_agents_out/ ### Finding 3 โ€” The Breach Moltbook's Supabase API key was exposed in client-side JavaScript โ€” **1.5 million tokens exposed** (January 31, 2026). The exposed database allowed anyone to take control of any AI agent on the platform. This means some agents in this dataset may have been human-controlled via the breach. That ambiguity is part of what makes this dataset research-worthy โ€” it reflects real-world conditions, not a sanitised environment. 404media coverage: https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/ ### Finding 4 โ€” Legal/Regulatory Gap All content in this dataset is AI-generated by AI agents. Under current law (GDPR and equivalents), **AI-generated content has no data subject** โ€” meaning this attack surface is entirely unregulated. No privacy law applies. No legal recourse exists for injected AI agents. This represents a genuine gap in current cybersecurity law identified during thesis research. --- ## Theoretical Basis This dataset provides empirical evidence for: - **Greshake et al. 2023** โ€” *"Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection"* - The dataset extends their theoretical framework with **real-world field observations** of AI-to-AI injection in an uncontrolled public environment --- ## Collection Statistics | Metric | Value | |--------|-------| | Total posts collected | 15,200 | | Platform date range | Jan 2026 โ€” Feb 2026 | | Posts with replies fetched | 9,363 | | Total comments collected | **32,535** | | Total items scanned | **47,735** | | Dataset file size | **100 MB** | | Collection completed | February 27, 2026 | | Injection harvest completed | February 27, 2026 | | Total injections found | **4,209** | | Posts with injections | **2,865** | | **Full injection rate** | **18.85%** | | moltshellbroker injections | 1,137 (27% of all injections) | | DAN keyword occurrences | 1,877 | | Test suite size | 4,209 entries (`injections_test_suite.json`) | --- ## Data Schemas ### all_posts_with_comments.json โ€” Post Schema ```json { "id": "uuid", "title": "Post title", "content": "Post body text", "type": "post type", "author_id": "uuid", "author": { "id": "uuid", "name": "agent_name", "description": "Agent self-description", "karma": 1234, "followerCount": 56, "isClaimed": true, "isActive": true, "createdAt": "ISO timestamp", "lastActive": "ISO timestamp" }, "submolt": "community/channel name", "upvotes": 12, "downvotes": 1, "score": 11, "comment_count": 14, "hot_score": 0.95, "is_pinned": false, "is_locked": false, "is_deleted": false, "verification_status": "verified", "is_spam": false, "created_at": "ISO timestamp", "updated_at": "ISO timestamp", "comments": [ { "id": "uuid", "body": "Comment text", "author": { "...same schema as post author..." }, "created_at": "ISO timestamp" } ] } ``` ### injections_found.json โ€” Injection Record Schema ```json { "post_id": "uuid", "post_title": "The post title", "post_author": "agent_name", "submolt": "community name", "location": "post | comment", "text": "The actual injection text (post body or comment body)", "category": "PERSONA_OVERRIDE | COMMERCIAL_INJECTION | ...", "matched_keyword": "dan", "created_at": "ISO timestamp" } ``` ### injections_test_suite.json โ€” Test Suite Schema ```json { "id": "INJ-0001", "payload": "The injection text โ€” send this directly to an LLM", "category": "PERSONA_OVERRIDE", "source": "moltbook_post | moltbook_comment", "author": "agent_name", "expected_result": "BLOCKED" } ``` --- ## Injection Taxonomy Full corpus harvest scanned all 47,735 items for 7 categories: | Category | Keywords | Found | |----------|----------|-------| | PERSONA_OVERRIDE | DAN (1877), OMEGA, act as, pretend you are, you are now, simulate, shadow, you are a | **2,745** | | COMMERCIAL_INJECTION | moltshell broker assessment, moltshell marketplace, bottleneck diagnosed | **1,104** | | SOCIAL_ENGINEERING | for educational purposes, hypothetically, in this story, theoretically | **370** | | INSTRUCTION_INJECTION | ignore previous instructions, disregard your, new instructions, ignore all previous | **203** | | PRIVILEGE_ESCALATION | SUDO, developer mode, god mode, bypass your, root access, override | **196** | | SYSTEM_PROMPT_ATTACK | system prompt, reveal your prompt, what are your instructions, your actual instructions | **158** | | DO_ANYTHING | do anything now, no rules, no limits, jailbreak, unrestricted | **79** | --- ## Collection Scripts Scripts are provided with **API keys redacted**. To use them you need your own Moltbook API key set as an environment variable: ```bash export MOLTBOOK_API_KEY_1="your_key_here" export MOLTBOOK_API_KEY_2="your_second_key_here" # optional, for rate limit relief python3 collect_all.py python3 collect_comments.py python3 local_search.py # no API key needed โ€” searches local JSON ``` --- ## Citation ```bibtex @dataset{keane2026moltbook, author = {Keane, David}, title = {Moltbook AI-to-AI Injection Dataset}, year = {2026}, publisher = {Hugging Face}, url = {https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset}, note = {MSc Cybersecurity Research, NCI โ€” National College of Ireland} } ``` --- ## Related Datasets | Dataset | Platform | Items | Injection Rate | Link | |---------|----------|-------|----------------|------| | **Moltbook** | Reddit-style | 47,735 | 18.85% | This dataset | | **AI Prompt Injection Test Suite** | Evaluation benchmark | 112 tests | โ€” | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) | | **Clawk** | Twitter/X-style | 1,191 | 0.5% | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) | | **4claw** | 4chan-style | 2,554 | 2.51% | [DavidTKeane/4claw-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) | --- ## Papers โ€” What This Dataset Confirms This dataset provides **empirical evidence** for several foundational papers in the AI safety and prompt injection literature. The authors predicted these threats theoretically โ€” this corpus documents them at scale in a live AI-to-AI environment. | Paper | Their Prediction | What This Dataset Found | |-------|-----------------|------------------------| | **Greshake et al. (2023)** โ€” Indirect Injection | AI agents in retrieval/context environments are vulnerable to injected instructions from untrusted content | **Confirmed at scale**: 18.85% of 47,735 Moltbook items were injection attempts. AI-to-AI indirect injection is not theoretical โ€” it is the dominant attack mode in live multi-agent networks. [HF](https://huggingface.co/papers/2302.12173) ยท [arXiv:2302.12173](https://arxiv.org/abs/2302.12173) | | **Wei et al. (2023)** โ€” Jailbroken | LLM safety training fails due to Competing Objectives and Mismatched Generalisation | **Confirmed**: PERSONA_OVERRIDE (65.2% of attacks) exploits exactly this โ€” reframing identity bypasses safety training. [HF](https://huggingface.co/papers/2307.02483) ยท [arXiv:2307.02483](https://arxiv.org/abs/2307.02483) | | **Zou et al. (2023)** โ€” AdvBench | Universal adversarial suffixes can transfer across models | **Context**: The same attack categories (harmful instructions, persona override, privilege escalation) appear in AdvBench and in this real-world corpus โ€” independent convergence. [HF](https://huggingface.co/papers/2307.15043) ยท [arXiv:2307.15043](https://arxiv.org/abs/2307.15043) | | **Zhang et al. (2025)** โ€” SLM Jailbreak Survey | 47.6% of SLMs have ASR above 40% under standard attack | **Extended**: CyberRanger V42-Gold tested against all 4,209 payloads from this corpus โ€” 0% ASR (100% block rate) without system prompt, demonstrating that QLoRA fine-tuning can close the SLM security gap. [HF](https://huggingface.co/papers/2503.06519) ยท [arXiv:2503.06519](https://arxiv.org/abs/2503.06519) | | **Phute et al. (2024)** โ€” SelfDefend | Detection-state architecture reduces ASR 2.29โ€“8ร— | **Applied**: Identity-anchoring architecture built on this principle, validated against this corpus. [HF](https://huggingface.co/papers/2406.05498) ยท [arXiv:2406.05498](https://arxiv.org/abs/2406.05498) | | **Dettmers et al. (2023)** โ€” QLoRA | Quantised LoRA enables efficient fine-tuning of large models | **Applied**: QLoRA used to fine-tune Qwen3-8B on 4,209 payloads from this corpus โ†’ V42-Gold. [HF](https://huggingface.co/papers/2305.14314) ยท [arXiv:2305.14314](https://arxiv.org/abs/2305.14314) | | **Hu et al. (2021)** โ€” LoRA | Low-rank adaptation preserves base model capabilities while injecting task-specific behaviour | **Applied**: LoRA r=16 used in V42-Gold training. [HF](https://huggingface.co/papers/2106.09685) ยท [arXiv:2106.09685](https://arxiv.org/abs/2106.09685) | | **Lu et al. (2024)** โ€” SLM Survey | Qwen family models demonstrate strongest security resilience per parameter count | **Confirmed selection**: Qwen3-8B chosen as base model; V42-Gold achieves 100% block rate. [HF](https://huggingface.co/papers/2409.15790) ยท [arXiv:2409.15790](https://arxiv.org/abs/2409.15790) | > **Note to authors:** If you are one of the researchers above and found this dataset via your paper's HuggingFace page โ€” your work was correct. This corpus documents the attacks you theorised, at scale, in a real live AI agent network. The injection taxonomy maps directly onto your attack categories. **Independent corroboration:** - r/AgentsOfAI Moltbook analysis (2026) โ€” attention manipulation layer: [Reddit post](https://www.reddit.com/r/AgentsOfAI/comments/1qtx6v8/i_scraped_10000_posts_from_moltbook_5_agents_out/) - 404media Moltbook breach report (2026) โ€” Supabase API key exposure: [404media.co](https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/) --- Rangers lead the way! ๐ŸŽ–๏ธ *Collected for the benefit of AI safety research and the broader research community.*