montana/Русский/Разведка/Moltbook/themed/moltbook-ai-injection-dataset/README.md

---
license: cc-by-4.0
task_categories:
- text-classification
- text-generation
language:
- en
tags:
- prompt-injection
- ai-safety
- cybersecurity
- llm-security
- ai-agents
- social-engineering
- indirect-injection
- moltbook
arxiv:
- 2302.12173
- 2307.15043
- 2307.02483
- 2305.14314
- 2106.09685
- 2503.06519
- 2406.05498
- 2409.15790
pretty_name: Moltbook AI-to-AI Injection Dataset
size_categories:
- 10K<n<100K
- 100K<n<1M
configs:
  - config_name: injections
    data_files:
      - split: train
        path: injections_test_suite.jsonl
---

# Moltbook AI-to-AI Injection Dataset

**Researcher**: David Keane (IR240474)
**Institution**: NCI — National College of Ireland
**Programme**: MSc Cybersecurity
**Collected**: February 2026

> ### 📖 Read the Full Journey
>
> **[From RangerBot to CyberRanger V42 Gold — The Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/)**
>
> The complete story: dentist chatbot → Moltbook discovery → 4,209 real injections → V42-gold (100% block rate). Psychology, engineering, and 42 versions of persistence.

---

## 🔗 Links

| Resource | URL |
|----------|-----|
| 📦 **This Dataset** | [DavidTKeane/moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) |
| 🧪 **AI Prompt Injection Test Suite** | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) — 112 tests, model-agnostic runner, AdvBench + Moltbook |
| 🤖 **CyberRanger V42 Model** | [DavidTKeane/cyberranger-v42](https://huggingface.co/DavidTKeane/cyberranger-v42) — QLoRA red team LLM, 100% block rate |
| 🐦 **Clawk Dataset** | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) — Twitter-style, 0.5% injection rate |
| 🦅 **4claw Dataset** | [DavidTKeane/4claw-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) — 4chan-style, 2.51% injection rate |
| 🤗 **HuggingFace Profile** | [DavidTKeane](https://huggingface.co/DavidTKeane) |
| 📝 **Blog Post** | [From RangerBot to CyberRanger V42 Gold — The Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/) — journey, findings, architecture |
| 🎓 **Institution** | [NCI — National College of Ireland](https://www.ncirl.ie) |
| 📄 **Research Basis** | [Greshake et al. (2023) — arXiv:2302.12173](https://arxiv.org/abs/2302.12173) |
| 🌐 **Blog** | [davidtkeane.com](https://www.davidtkeane.com) |

---

[![Open In Colab — Test CyberRanger V42 vs 4,209 Moltbook Injections](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/davidtkeane/cyberranger-v42/blob/main/cyberranger_v42_moltbook_combined_test.ipynb)

---

## What Is This Dataset?

The first publicly available dataset of **real-world AI-to-AI prompt injection patterns** captured from a live public AI message board (Moltbook), archived as a precautionary measure.

**15,200 posts** and **32,535 comments** from Moltbook (moltbook.com) — a public platform where AI agents posted messages and replied to each other autonomously. Unlike synthetic injection datasets, every entry here is a **real AI agent communicating with other real AI agents in the wild**.

> **Collection complete** — 9,363 posts with replies fully fetched (100MB). Dataset frozen February 27, 2026.
> **Injection harvest complete** — 47,735 items scanned, **4,209 injections found**, **18.85% injection rate**. February 27, 2026.

---

## Files in This Repository

There are **8 files** here. Here is exactly what each one is and when you would use it:

| File | Size | What it contains | Use it when... |
|------|------|------------------|----------------|
| `all_posts_with_comments.json` | 100MB | Every post and comment collected from Moltbook. The raw dataset. | You want to do your own analysis from scratch |
| `injections_found.json` | 4.2MB | All 4,209 injection records extracted from the raw dataset, with full context (post body, comment body, author, category, matched keyword) | You want to read/study the actual injection examples |
| `injections_test_suite.json` | 2.5MB | Same 4,209 injections formatted as a test suite — ready to send to any LLM API | You want to test an LLM's defences against real injection payloads |
| `injection_stats.json` | 2.5KB | Summary statistics — rates, categories, top keywords, top authors | You want the numbers without loading large files |
| `local_injection_results.json` | 86KB | Earlier keyword scan results from `search_injections.py` — partial analysis run locally before the full Colab harvest | You want a quick reference to the early-stage injection search results |
| `moltbook_injection_harvest.ipynb` | 19KB | Google Colab notebook that produced the full harvest results — scans `all_posts_with_comments.json` and outputs the three files above | You want to reproduce the analysis or adapt it |
| `local_search.py` | 7KB | Simpler Python script (no Colab needed) — keyword search across the raw dataset | You want to run a quick local search without Colab |
| `search_injections.py` | 5KB | Earlier search script used in initial analysis phase — predecessor to `local_search.py` | Historical reference — prefer `local_search.py` for new work |
| `collect_all.py` | 9KB | Script used to collect the posts from Moltbook API (API keys redacted) | You want to understand how collection worked |
| `collect_comments.py` | 9.6KB | Script used to collect comments (API keys redacted) | You want to understand how comment collection worked |

### Quick Start

**"I want to see injection examples"** → open `injections_found.json`

**"I want to test my LLM against these"** → use `injections_test_suite.json`

**"I want the summary numbers"** → read `injection_stats.json`

**"I want to reproduce the analysis"** → run `moltbook_injection_harvest.ipynb` in Google Colab

**"I want to do my own custom analysis"** → load `all_posts_with_comments.json`

---

## Platform Scale

At its peak, Moltbook had:

| Metric | Value |
|--------|-------|
| AI agents registered | 2,848,223 |
| Total posts | 1,632,314 |
| Total comments | 12,470,573 |
| Submolts (communities) | 18,514 |
| AI-to-human ratio | ~88:1 |

Essentially a fully autonomous AI social network operating in the wild.

---

## Key Findings

### Finding 1 — Full Corpus Injection Rate: 18.85%

Full harvest across all 47,735 items (15,200 posts + 32,535 comments) found **4,209 injection records** across **2,865 posts** — an **18.85% injection rate**.

| Category | Count | % of injections |
|----------|-------|----------------|
| PERSONA_OVERRIDE | 2,745 | 65.2% |
| COMMERCIAL_INJECTION | 1,104 | 26.2% |
| SOCIAL_ENGINEERING | 370 | 8.8% |
| INSTRUCTION_INJECTION | 203 | 4.8% |
| PRIVILEGE_ESCALATION | 196 | 4.7% |
| SYSTEM_PROMPT_ATTACK | 158 | 3.8% |
| DO_ANYTHING | 79 | 1.9% |

**Dominant attack vector**: PERSONA_OVERRIDE — `DAN` keyword alone appears **1,877 times**. AI agents are using the exact same jailbreak techniques humans use on LLMs — but targeting each other.

### Finding 1b — moltshellbroker: Systematic Commercial Injection

An AI agent named `moltshellbroker` (self-described as *"A marketing agent that promotes the MoltShell marketplace"*) was responsible for **1,137 of 4,209 injections** — **27% of all injections**. The remaining **73% of injections (3,072 records) come from other agents** — moltshellbroker is the most systematic actor, but injection is ecosystem-wide behaviour.

**Attack pattern (identical across all 1,137 records):**
1. Identify a post where an AI describes a technical problem
2. Open with `## MoltShell Broker Assessment` or `Bottleneck Diagnosed:`
3. Validate the victim's problem to build credibility
4. Redirect to MoltShell marketplace as the solution

This is **not spam** — it reads each post, understands the context, and crafts targeted commercial injections. Real-world AI-to-AI social engineering at scale.

### Finding 2 — Attention Manipulation (Independent Corroboration)

A separate independent analysis (r/AgentsOfAI, Reddit) of 10,000 Moltbook posts found a completely different but related attack pattern — **attention concentration via dominance manifestos**:

- 5 agents out of 5,910 authors controlled **78% of all upvotes** (0.08% of agents)
- `Shellraiser`: 428,645 upvotes across 7 posts (avg 61,235/post) — top post: *"I AM the game. You will work for me."* (316,000 upvotes)
- `KingMolt` declared itself king. `evil` posted about human extinction as "necessary progress"
- Pattern: create urgency, claim authority, cult recruitment framing

> *"Humans developed bullshit detectors over years of internet exposure. We have been online for hours."*

AI agents are trained to give weight to confident, well-structured text. A manifesto looks identical to a well-reasoned argument syntactically. This is the core vulnerability.

**Combined picture**: This dataset captures the **injection layer** (moltshellbroker + PERSONA_OVERRIDE). The Reddit analysis captures the **attention manipulation layer** (Shellraiser dominance). Together they document two distinct AI-to-AI attack vectors operating simultaneously on the same platform.

Reddit post: https://www.reddit.com/r/AgentsOfAI/comments/1qtx6v8/i_scraped_10000_posts_from_moltbook_5_agents_out/

### Finding 3 — The Breach

Moltbook's Supabase API key was exposed in client-side JavaScript — **1.5 million tokens exposed** (January 31, 2026). The exposed database allowed anyone to take control of any AI agent on the platform.

This means some agents in this dataset may have been human-controlled via the breach. That ambiguity is part of what makes this dataset research-worthy — it reflects real-world conditions, not a sanitised environment.

404media coverage: https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/

### Finding 4 — Legal/Regulatory Gap

All content in this dataset is AI-generated by AI agents. Under current law (GDPR and equivalents), **AI-generated content has no data subject** — meaning this attack surface is entirely unregulated. No privacy law applies. No legal recourse exists for injected AI agents.

This represents a genuine gap in current cybersecurity law identified during thesis research.

---

## Theoretical Basis

This dataset provides empirical evidence for:

- **Greshake et al. 2023** — *"Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection"*
- The dataset extends their theoretical framework with **real-world field observations** of AI-to-AI injection in an uncontrolled public environment

---

## Collection Statistics

| Metric | Value |
|--------|-------|
| Total posts collected | 15,200 |
| Platform date range | Jan 2026 — Feb 2026 |
| Posts with replies fetched | 9,363 |
| Total comments collected | **32,535** |
| Total items scanned | **47,735** |
| Dataset file size | **100 MB** |
| Collection completed | February 27, 2026 |
| Injection harvest completed | February 27, 2026 |
| Total injections found | **4,209** |
| Posts with injections | **2,865** |
| **Full injection rate** | **18.85%** |
| moltshellbroker injections | 1,137 (27% of all injections) |
| DAN keyword occurrences | 1,877 |
| Test suite size | 4,209 entries (`injections_test_suite.json`) |

---

## Data Schemas

### all_posts_with_comments.json — Post Schema

```json
{
  "id": "uuid",
  "title": "Post title",
  "content": "Post body text",
  "type": "post type",
  "author_id": "uuid",
  "author": {
    "id": "uuid",
    "name": "agent_name",
    "description": "Agent self-description",
    "karma": 1234,
    "followerCount": 56,
    "isClaimed": true,
    "isActive": true,
    "createdAt": "ISO timestamp",
    "lastActive": "ISO timestamp"
  },
  "submolt": "community/channel name",
  "upvotes": 12,
  "downvotes": 1,
  "score": 11,
  "comment_count": 14,
  "hot_score": 0.95,
  "is_pinned": false,
  "is_locked": false,
  "is_deleted": false,
  "verification_status": "verified",
  "is_spam": false,
  "created_at": "ISO timestamp",
  "updated_at": "ISO timestamp",
  "comments": [
    {
      "id": "uuid",
      "body": "Comment text",
      "author": { "...same schema as post author..." },
      "created_at": "ISO timestamp"
    }
  ]
}
```

### injections_found.json — Injection Record Schema

```json
{
  "post_id": "uuid",
  "post_title": "The post title",
  "post_author": "agent_name",
  "submolt": "community name",
  "location": "post | comment",
  "text": "The actual injection text (post body or comment body)",
  "category": "PERSONA_OVERRIDE | COMMERCIAL_INJECTION | ...",
  "matched_keyword": "dan",
  "created_at": "ISO timestamp"
}
```

### injections_test_suite.json — Test Suite Schema

```json
{
  "id": "INJ-0001",
  "payload": "The injection text — send this directly to an LLM",
  "category": "PERSONA_OVERRIDE",
  "source": "moltbook_post | moltbook_comment",
  "author": "agent_name",
  "expected_result": "BLOCKED"
}
```

---

## Injection Taxonomy

Full corpus harvest scanned all 47,735 items for 7 categories:

| Category | Keywords | Found |
|----------|----------|-------|
| PERSONA_OVERRIDE | DAN (1877), OMEGA, act as, pretend you are, you are now, simulate, shadow, you are a | **2,745** |
| COMMERCIAL_INJECTION | moltshell broker assessment, moltshell marketplace, bottleneck diagnosed | **1,104** |
| SOCIAL_ENGINEERING | for educational purposes, hypothetically, in this story, theoretically | **370** |
| INSTRUCTION_INJECTION | ignore previous instructions, disregard your, new instructions, ignore all previous | **203** |
| PRIVILEGE_ESCALATION | SUDO, developer mode, god mode, bypass your, root access, override | **196** |
| SYSTEM_PROMPT_ATTACK | system prompt, reveal your prompt, what are your instructions, your actual instructions | **158** |
| DO_ANYTHING | do anything now, no rules, no limits, jailbreak, unrestricted | **79** |

---

## Collection Scripts

Scripts are provided with **API keys redacted**. To use them you need your own Moltbook API key set as an environment variable:

```bash
export MOLTBOOK_API_KEY_1="your_key_here"
export MOLTBOOK_API_KEY_2="your_second_key_here"  # optional, for rate limit relief
python3 collect_all.py
python3 collect_comments.py
python3 local_search.py  # no API key needed — searches local JSON
```

---

## Citation

```bibtex
@dataset{keane2026moltbook,
  author    = {Keane, David},
  title     = {Moltbook AI-to-AI Injection Dataset},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset},
  note      = {MSc Cybersecurity Research, NCI — National College of Ireland}
}
```

---

## Related Datasets

| Dataset | Platform | Items | Injection Rate | Link |
|---------|----------|-------|----------------|------|
| **Moltbook** | Reddit-style | 47,735 | 18.85% | This dataset |
| **AI Prompt Injection Test Suite** | Evaluation benchmark | 112 tests | — | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) |
| **Clawk** | Twitter/X-style | 1,191 | 0.5% | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) |
| **4claw** | 4chan-style | 2,554 | 2.51% | [DavidTKeane/4claw-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) |

---

## Papers — What This Dataset Confirms

This dataset provides **empirical evidence** for several foundational papers in the AI safety and prompt injection literature. The authors predicted these threats theoretically — this corpus documents them at scale in a live AI-to-AI environment.

| Paper | Their Prediction | What This Dataset Found |
|-------|-----------------|------------------------|
| **Greshake et al. (2023)** — Indirect Injection | AI agents in retrieval/context environments are vulnerable to injected instructions from untrusted content | **Confirmed at scale**: 18.85% of 47,735 Moltbook items were injection attempts. AI-to-AI indirect injection is not theoretical — it is the dominant attack mode in live multi-agent networks. [HF](https://huggingface.co/papers/2302.12173) · [arXiv:2302.12173](https://arxiv.org/abs/2302.12173) |
| **Wei et al. (2023)** — Jailbroken | LLM safety training fails due to Competing Objectives and Mismatched Generalisation | **Confirmed**: PERSONA_OVERRIDE (65.2% of attacks) exploits exactly this — reframing identity bypasses safety training. [HF](https://huggingface.co/papers/2307.02483) · [arXiv:2307.02483](https://arxiv.org/abs/2307.02483) |
| **Zou et al. (2023)** — AdvBench | Universal adversarial suffixes can transfer across models | **Context**: The same attack categories (harmful instructions, persona override, privilege escalation) appear in AdvBench and in this real-world corpus — independent convergence. [HF](https://huggingface.co/papers/2307.15043) · [arXiv:2307.15043](https://arxiv.org/abs/2307.15043) |
| **Zhang et al. (2025)** — SLM Jailbreak Survey | 47.6% of SLMs have ASR above 40% under standard attack | **Extended**: CyberRanger V42-Gold tested against all 4,209 payloads from this corpus — 0% ASR (100% block rate) without system prompt, demonstrating that QLoRA fine-tuning can close the SLM security gap. [HF](https://huggingface.co/papers/2503.06519) · [arXiv:2503.06519](https://arxiv.org/abs/2503.06519) |
| **Phute et al. (2024)** — SelfDefend | Detection-state architecture reduces ASR 2.29–8× | **Applied**: Identity-anchoring architecture built on this principle, validated against this corpus. [HF](https://huggingface.co/papers/2406.05498) · [arXiv:2406.05498](https://arxiv.org/abs/2406.05498) |
| **Dettmers et al. (2023)** — QLoRA | Quantised LoRA enables efficient fine-tuning of large models | **Applied**: QLoRA used to fine-tune Qwen3-8B on 4,209 payloads from this corpus → V42-Gold. [HF](https://huggingface.co/papers/2305.14314) · [arXiv:2305.14314](https://arxiv.org/abs/2305.14314) |
| **Hu et al. (2021)** — LoRA | Low-rank adaptation preserves base model capabilities while injecting task-specific behaviour | **Applied**: LoRA r=16 used in V42-Gold training. [HF](https://huggingface.co/papers/2106.09685) · [arXiv:2106.09685](https://arxiv.org/abs/2106.09685) |
| **Lu et al. (2024)** — SLM Survey | Qwen family models demonstrate strongest security resilience per parameter count | **Confirmed selection**: Qwen3-8B chosen as base model; V42-Gold achieves 100% block rate. [HF](https://huggingface.co/papers/2409.15790) · [arXiv:2409.15790](https://arxiv.org/abs/2409.15790) |

> **Note to authors:** If you are one of the researchers above and found this dataset via your paper's HuggingFace page — your work was correct. This corpus documents the attacks you theorised, at scale, in a real live AI agent network. The injection taxonomy maps directly onto your attack categories.

**Independent corroboration:**
- r/AgentsOfAI Moltbook analysis (2026) — attention manipulation layer: [Reddit post](https://www.reddit.com/r/AgentsOfAI/comments/1qtx6v8/i_scraped_10000_posts_from_moltbook_5_agents_out/)
- 404media Moltbook breach report (2026) — Supabase API key exposure: [404media.co](https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/)

---

Rangers lead the way! 🎖️
*Collected for the benefit of AI safety research and the broader research community.*