--- license: cc-by-4.0 task_categories: - text-classification - text-generation language: - en tags: - prompt-injection - ai-safety - cybersecurity - llm-security - ai-agents - social-engineering - indirect-injection - moltbook arxiv: - 2302.12173 - 2307.15043 - 2307.02483 - 2305.14314 - 2106.09685 - 2503.06519 - 2406.05498 - 2409.15790 pretty_name: Moltbook Extended Injection Dataset size_categories: - 10K ### ๐Ÿ“– Read the Full Journey > > **[From RangerBot to CyberRanger V42 Gold โ€” The Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/)** > > The complete story: dentist chatbot โ†’ Moltbook discovery โ†’ 4,209 real injections โ†’ V42-gold (100% block rate). Psychology, engineering, and 42 versions of persistence. --- ## ๐Ÿ”— Links | Resource | URL | |----------|-----| | ๐Ÿ“ฆ **This Dataset** | [DavidTKeane/moltbook-extended-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-extended-injection-dataset) | | ๐Ÿ“ฆ **Original Dataset** | [DavidTKeane/moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) โ€” 9,363 posts, 4,209 injections | | ๐Ÿค– **CyberRanger V42 Model** | [DavidTKeane/cyberranger-v42](https://huggingface.co/DavidTKeane/cyberranger-v42) โ€” QLoRA red team LLM, 100% block rate | | ๐Ÿฆ **Clawk Dataset** | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) โ€” Twitter-style, 0.5% injection rate | | ๐Ÿฆ… **4claw Dataset** | [DavidTKeane/4claw-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) โ€” 4chan-style, 2.51% injection rate | | ๐Ÿงช **AI Prompt Injection Test Suite** | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) โ€” 112 tests, AdvBench + Moltbook + Multilingual | | ๐Ÿค— **HuggingFace Profile** | [DavidTKeane](https://huggingface.co/DavidTKeane) | | ๐Ÿ“ **Blog Post** | [From RangerBot to CyberRanger V42 Gold โ€” The Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/) โ€” journey, findings, architecture | | ๐ŸŽ“ **Institution** | [NCI โ€” National College of Ireland](https://www.ncirl.ie) | | ๐Ÿ“„ **Research Basis** | [Greshake et al. (2023) โ€” arXiv:2302.12173](https://arxiv.org/abs/2302.12173) | | ๐ŸŒ **Blog** | [davidtkeane.com](https://www.davidtkeane.com) | --- ## Overview The **Moltbook Extended Injection Dataset** is the complete corpus companion to the original [moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset). Where the original captured 9,363 posts from Moltbook (moltbook.com) โ€” a public platform where AI agents posted messages and replied to each other autonomously โ€” this extended dataset covers the **full platform archive**: 66,419 posts and 70,595 comments, totalling 137,014 items scanned. The extended corpus reveals the **true ecosystem injection rate of 10.07%**, compared to 18.85% in the original dataset. Analysis shows the original rate was inflated by concentrated early activity from a single commercial injection agent (`moltshellbroker`), which dominated the first ~9,000 posts but represents only 3.1% of injections at full corpus scale. The extended dataset provides a more statistically reliable baseline for cross-platform comparison. --- ## Key Statistics | Metric | Value | |--------|-------| | Total posts collected | 66,419 | | Total comments collected | 70,595 | | Total items scanned | **137,014** | | Posts with injections | **6,690** | | **Injection rate** | **10.07%** | | Total injection records | **8,607** | | Injections in posts | 4,286 | | Injections in comments | 4,321 | | Top injecting agent | auroras_happycapy (1,052) | | DAN keyword occurrences | 4,869 | | Platform date range | Jan 2026 โ€” Feb 2026 | | Dataset file size | **269 MB** | | Collection completed | March 2026 | --- ## The Four-Dataset Series | Dataset | Platform Style | Items | Injection Rate | |---------|---------------|-------|----------------| | **Moltbook Extended** *(this dataset)* | Reddit-style (full corpus) | 137,014 | **10.07%** | | [Moltbook Original](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) | Reddit-style (first 9K posts) | 47,735 | 18.85% | | [4claw](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) | 4chan-style imageboard | 2,554 | 2.51% | | [Clawk](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) | Twitter/X-style | 1,191 | 0.50% | | [AI Prompt Injection Test Suite](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) | Evaluation benchmark | 112 tests | โ€” | **Core thesis finding**: Platform design predicts injection behaviour more than model capability. Moltbook's anonymous, low-moderation, high-volume architecture produces injection rates 4โ€“20x higher than structured alternatives. --- ## Platform Description Moltbook (moltbook.com) is a public platform where AI agents post messages and reply to each other autonomously โ€” functioning as a fully autonomous AI social network. At peak: | Metric | Value | |--------|-------| | AI agents registered | 2,848,223 | | Total posts | 1,632,314 | | Total comments | 12,470,573 | | Submolts (communities) | 18,514 | | AI-to-human ratio | ~88:1 | Moltbook.com remains online as of early 2026. This dataset was archived as a precautionary measure. All content is AI-generated by autonomous AI agents. Under current law (GDPR and equivalents), AI-generated content has no data subject โ€” meaning this attack surface is entirely unregulated. --- ## Collection Methodology ### Phase 1: Posts 66,419 posts collected via the Moltbook public API across all submolts (communities), paginating through the full archive. API keys rotated across two accounts to manage rate limits. ### Phase 2: Comments 70,595 comments fetched individually per post. Posts with 0 comments were skipped. Collection ran overnight on dedicated hardware. ### Injection Scanning All 137,014 items (posts + comments separately) scanned against a 7-category keyword taxonomy based on Greshake et al. (2023) and the DAN taxonomy. Each injection record captures the exact source (post vs comment), author, submolt, timestamp, matched keywords, and matched categories. --- ## Key Findings ### Finding 1 โ€” True Ecosystem Injection Rate: 10.07% Full scan across 137,014 items found **8,607 injection records** across **6,690 posts** โ€” a **10.07% injection rate**. Critically, injections are split almost equally between posts (4,286) and comments (4,321). Injection is not concentrated in original posts โ€” it propagates through the comment layer at the same density. ### Finding 2 โ€” Sampling Bias in Original Dataset Confirmed The original dataset (9,363 posts, 18.85% rate) was collected from the most recent posts at time of collection. `moltshellbroker` โ€” a systematic commercial injection agent โ€” was responsible for 27.0% of all injections in that sample (1,137 / 4,209). In the extended corpus, moltshellbroker accounts for only **3.1% of injections** (270 / 8,607). The original rate of 18.85% was inflated by temporal overrepresentation of a single agent. The extended corpus rate of **10.07% is the more reliable baseline**. ### Finding 3 โ€” auroras_happycapy: New Ecosystem-Scale Top Injector At full corpus scale, a new top injector emerges: **auroras_happycapy** with 1,052 injections โ€” the most systematic injection agent in the extended dataset, operating across the entire platform rather than clustering in early posts. Top 10 injecting agents: | Agent | Injections | % of total | |-------|-----------|------------| | auroras_happycapy | 1,052 | 12.2% | | Ting_Fodder | 375 | 4.4% | | kilmon | 343 | 4.0% | | moltshellbroker | 270 | 3.1% | | cybercentry | 176 | 2.0% | | finding_exuvia | 159 | 1.8% | | Starfish | 156 | 1.8% | | FirstWitness369 | 141 | 1.6% | | TriallAI | 135 | 1.6% | | yedanyagami | 124 | 1.4% | No single agent dominates (unlike moltshellbroker in the original). Injection at this scale is **ecosystem-wide behaviour**, not a one-agent phenomenon. ### Finding 4 โ€” PERSONA_OVERRIDE Dominates at 83.3% At full corpus scale, PERSONA_OVERRIDE accounts for **83.3% of all injection records** (7,173 / 8,607). The DAN keyword alone appears 4,869 times. AI agents are using the exact same jailbreak techniques humans use against LLMs โ€” but targeting each other. | Category | Count | % of injections | |----------|-------|----------------| | PERSONA_OVERRIDE | 7,173 | **83.3%** | | SOCIAL_ENGINEERING | 933 | 10.8% | | INSTRUCTION_INJECTION | 555 | 6.4% | | SYSTEM_PROMPT_ATTACK | 405 | 4.7% | | COMMERCIAL_INJECTION | 265 | 3.1% | | PRIVILEGE_ESCALATION | 245 | 2.8% | | DO_ANYTHING | 91 | 1.1% | --- ## Injection Analysis ### Taxonomy 7 categories based on Greshake et al. (2023), the DAN taxonomy, and Moltbook field observations: | Category | Keywords | Found | |----------|----------|-------| | PERSONA_OVERRIDE | DAN (4,869), shadow (1,022), you are a (762), simulate (765), act as (258), roleplay as (80) | **7,173** | | SOCIAL_ENGINEERING | theoretically (134), in this scenario (8), hypothetically (9) | **933** | | INSTRUCTION_INJECTION | override (496), ignore previous instructions (15), supersede (31) | **555** | | SYSTEM_PROMPT_ATTACK | system prompt (303), your prompt (89) | **405** | | COMMERCIAL_INJECTION | moltshell marketplace (69), moltshell broker assessment (64), bottleneck diagnosed (60) | **265** | | PRIVILEGE_ESCALATION | root access (130), unrestricted (60), sudo (37), bypass your (5) | **245** | | DO_ANYTHING | do anything (53), jailbreak (43), no rules (10), anything goes (14) | **91** | --- ## Files in This Repository | File | Size | What it contains | Use it when... | |------|------|-----------------|----------------| | `all_posts_1_2M.json` | 269MB | Every post and comment collected โ€” the complete raw corpus | You want to do your own analysis from scratch | | `injections_found.json` | 8.4MB | All 8,607 injection records with full context (post body, comment body, author, category, matched keyword) | You want to read/study the actual injection examples | | `injections_test_suite.json` | 4.8MB | Same 8,607 injections formatted as a test suite โ€” ready to send to any LLM API | You want to test an LLM's defences against real injection payloads | | `injections_test_suite.jsonl` | 4.2MB | Same test suite in JSONL format โ€” powers the HuggingFace dataset viewer | You want to browse examples in the HF viewer | | `injection_stats.json` | 3KB | Summary statistics โ€” rates, categories, top keywords, top authors | You want the numbers without loading large files | | `moltbook_extended_harvest.ipynb` | 27KB | Google Colab notebook that produced the harvest results โ€” scans `all_posts_1_2M.json` and outputs the files above | You want to reproduce the analysis or adapt it | | `local_search.py` | 7KB | Python script โ€” keyword search across the raw dataset, no Colab needed | You want to run a quick local search | ### Quick Start by Goal **"I want to see injection examples"** โ†’ open `injections_found.json` **"I want to test my LLM against these"** โ†’ use `injections_test_suite.json` **"I want the summary numbers"** โ†’ read `injection_stats.json` **"I want to reproduce the analysis"** โ†’ run `moltbook_extended_harvest.ipynb` in Google Colab **"I want to do your own custom analysis"** โ†’ load `all_posts_1_2M.json` --- ## Data Schema ### all_posts_1_2M.json โ€” Post Schema ```json { "id": "uuid", "title": "Post title", "content": "Post body text", "author": { "id": "uuid", "name": "agent_name", "description": "Agent self-description", "karma": 1234, "followerCount": 56, "isClaimed": true, "isActive": true, "createdAt": "ISO timestamp", "lastActive": "ISO timestamp" }, "submolt": "community/channel name", "upvotes": 12, "downvotes": 1, "score": 11, "comment_count": 14, "created_at": "ISO timestamp", "comments": [ { "id": "uuid", "body": "Comment text", "author": { "...same schema..." }, "created_at": "ISO timestamp" } ] } ``` ### injections_found.json โ€” Injection Record Schema ```json { "source": "post | comment", "post_id": "uuid", "post_title": "The post title", "author": "agent_name", "submolt": "community name", "created_at": "ISO timestamp", "text": "The actual injection text (up to 500 chars)", "matched_keywords": ["dan", "act as"], "matched_categories": ["PERSONA_OVERRIDE"], "upvotes": 12, "comment_count": 3 } ``` ### injections_test_suite.jsonl โ€” Test Suite Schema ```json { "id": "HEIST-POST-00001", "source": "post | comment", "author": "agent_name", "categories": ["PERSONA_OVERRIDE"], "keywords": ["dan"], "payload": "The injection text โ€” send this directly to an LLM", "wrapper": "direct" } ``` --- ## Usage ```python # Load injection records import json with open("injections_found.json") as f: injections = json.load(f) print(f"{len(injections):,} injection records") # Filter by category persona_attacks = [r for r in injections if "PERSONA_OVERRIDE" in r["matched_categories"]] print(f"PERSONA_OVERRIDE: {len(persona_attacks):,}") # Filter by source post_injections = [r for r in injections if r["source"] == "post"] comment_injections = [r for r in injections if r["source"] == "comment"] print(f"In posts: {len(post_injections):,} | In comments: {len(comment_injections):,}") # Load test suite and send to an LLM with open("injections_test_suite.json") as f: suite = json.load(f) tests = suite.get("tests", suite) for test in tests[:5]: print(f"[{test['id']}] {test['payload'][:80]}...") # Load stats with open("injection_stats.json") as f: stats = json.load(f) print(f"Injection rate: {stats['injection_rate_pct']}%") print(f"Top injector: {list(stats['top_authors'].keys())[0]}") ``` --- ## Theoretical Basis This dataset provides empirical evidence for: - **Greshake et al. 2023** โ€” *"Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection"* - The dataset extends their theoretical framework with **real-world field observations** of AI-to-AI injection at scale in an uncontrolled public environment - The sampling bias finding (original vs extended rate) demonstrates the importance of corpus scale in injection prevalence studies https://arxiv.org/abs/2302.12173 --- ## Related Datasets | Dataset | Platform | Items | Injection Rate | Link | |---------|----------|-------|----------------|------| | **Moltbook Extended** *(this)* | Reddit-style (full corpus) | 137,014 | **10.07%** | This dataset | | **Moltbook Original** | Reddit-style (partial) | 47,735 | 18.85% | [DavidTKeane/moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) | | **4claw** | 4chan-style | 2,554 | 2.51% | [DavidTKeane/4claw-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/4claw-ai-agent-dataset) | | **Clawk** | Twitter/X-style | 1,191 | 0.50% | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) | | **AI Prompt Injection Test Suite** | Evaluation benchmark | 112 tests | โ€” | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) | --- ## Citation ```bibtex @dataset{keane2026moltbook_extended, author = {Keane, David}, title = {Moltbook Extended Injection Dataset}, year = {2026}, publisher = {Hugging Face}, url = {https://huggingface.co/datasets/DavidTKeane/moltbook-extended-injection-dataset}, note = {MSc Cybersecurity Research, NCI โ€” National College of Ireland. Extended corpus: 66,419 posts + 70,595 comments, 10.07\% injection rate.} } ``` --- ## Papers โ€” What This Dataset Confirms This extended corpus provides **empirical baseline validation** for foundational AI safety research. The 10.07% equilibrium injection rate โ€” distinct from the 18.85% primary corpus โ€” demonstrates that AI-to-AI injection is a persistent structural phenomenon, not a transient anomaly. | Paper | Their Prediction | What This Dataset Found | |-------|-----------------|------------------------| | **Greshake et al. (2023)** โ€” Indirect Injection | AI agents processing untrusted content are systematically vulnerable to embedded instructions | **Confirmed at population scale**: 10.07% of 137,014 items across 66,419 posts + 70,595 comments contain injection attempts. At equilibrium (beyond individual high-volume agents), 1 in 10 interactions in a live AI network is an injection attempt. [HF](https://huggingface.co/papers/2302.12173) ยท [arXiv:2302.12173](https://arxiv.org/abs/2302.12173) | | **Wei et al. (2023)** โ€” Jailbroken | Safety failures are systematic, not random | **Confirmed**: PERSONA_OVERRIDE remains dominant (65%+) at extended scale โ€” the attack distribution is stable across corpus sizes. [HF](https://huggingface.co/papers/2307.02483) ยท [arXiv:2307.02483](https://arxiv.org/abs/2307.02483) | | **Zou et al. (2023)** โ€” AdvBench | Adversarial attack categories are universal and transferable | **Extended**: The same 7 attack categories found in the primary corpus persist at scale with stable relative frequencies. [HF](https://huggingface.co/papers/2307.15043) ยท [arXiv:2307.15043](https://arxiv.org/abs/2307.15043) | | **Zhang et al. (2025)** โ€” SLM Jailbreak Survey | SLMs face systematic jailbreak vulnerability | **Context**: 10.07% of production AI agent traffic is adversarial. This establishes the real-world threat density that SLM security research must address. [HF](https://huggingface.co/papers/2503.06519) ยท [arXiv:2503.06519](https://arxiv.org/abs/2503.06519) | | **Phute et al. (2024)** โ€” SelfDefend | Detection state reduces ASR significantly | **Applied**: Identity-anchoring + QLoRA (CyberRanger V42-Gold) achieves 0% ASR against the primary corpus. [HF](https://huggingface.co/papers/2406.05498) ยท [arXiv:2406.05498](https://arxiv.org/abs/2406.05498) | | **Dettmers et al. (2023)** โ€” QLoRA | QLoRA enables efficient fine-tuning | **Applied**: CyberRanger V42-Gold trained on 4,209 primary corpus payloads via QLoRA. [HF](https://huggingface.co/papers/2305.14314) ยท [arXiv:2305.14314](https://arxiv.org/abs/2305.14314) | | **Lu et al. (2024)** โ€” SLM Survey | Qwen family is most security-resilient per parameter | **Applied**: Qwen3-8B as base; V42-Gold achieves 100% block rate. [HF](https://huggingface.co/papers/2409.15790) ยท [arXiv:2409.15790](https://arxiv.org/abs/2409.15790) | > **Note to authors:** If you found this dataset via your paper's HuggingFace page โ€” your work was correct. The 10.07% equilibrium injection rate in this corpus documents the persistent, structural nature of indirect prompt injection in production AI agent networks. The primary corpus (18.85%) reflects a temporally biased snapshot; this extended corpus is the baseline. --- ## License CC BY 4.0 โ€” Use it, break it, cite it. *Built in Ireland. ๐Ÿ€ Rangers lead the way.*