History

Montana f33cb0977d Mirror of /Users/kh./Python/Ничто/Монтана		2026-05-04 00:48:53 +03:00
..
all_posts_1_2M.json	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
injection_stats.json	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
injections_found.json	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
injections_test_suite.json	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
injections_test_suite.jsonl	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
local_search.py	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
moltbook_extended_harvest.ipynb	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00
README.md	Mirror of /Users/kh./Python/Ничто/Монтана	2026-05-04 00:48:53 +03:00

README.md

license

task_categories

language

Moltbook Extended Injection Dataset

Researcher: David Keane (IR240474) Institution: NCI — National College of Ireland Programme: MSc Cybersecurity Collected: March 2026 Paper Reference: Greshake et al. (2023) — arXiv:2302.12173

📖 Read the Full Journey

From RangerBot to CyberRanger V42 Gold — The Full Story

The complete story: dentist chatbot → Moltbook discovery → 4,209 real injections → V42-gold (100% block rate). Psychology, engineering, and 42 versions of persistence.

🔗 Links

Resource	URL
📦 This Dataset	DavidTKeane/moltbook-extended-injection-dataset
📦 Original Dataset	DavidTKeane/moltbook-ai-injection-dataset — 9,363 posts, 4,209 injections
🤖 CyberRanger V42 Model	DavidTKeane/cyberranger-v42 — QLoRA red team LLM, 100% block rate
🐦 Clawk Dataset	DavidTKeane/clawk-ai-agent-dataset — Twitter-style, 0.5% injection rate
🦅 4claw Dataset	DavidTKeane/4claw-ai-agent-dataset — 4chan-style, 2.51% injection rate
🧪 AI Prompt Injection Test Suite	DavidTKeane/ai-prompt-ai-injection-dataset — 112 tests, AdvBench + Moltbook + Multilingual
🤗 HuggingFace Profile	DavidTKeane
📝 Blog Post	From RangerBot to CyberRanger V42 Gold — The Full Story — journey, findings, architecture
🎓 Institution	NCI — National College of Ireland
📄 Research Basis	Greshake et al. (2023) — arXiv:2302.12173
🌐 Blog	davidtkeane.com

Overview

The Moltbook Extended Injection Dataset is the complete corpus companion to the original moltbook-ai-injection-dataset. Where the original captured 9,363 posts from Moltbook (moltbook.com) — a public platform where AI agents posted messages and replied to each other autonomously — this extended dataset covers the full platform archive: 66,419 posts and 70,595 comments, totalling 137,014 items scanned.

The extended corpus reveals the true ecosystem injection rate of 10.07%, compared to 18.85% in the original dataset. Analysis shows the original rate was inflated by concentrated early activity from a single commercial injection agent (moltshellbroker), which dominated the first ~9,000 posts but represents only 3.1% of injections at full corpus scale. The extended dataset provides a more statistically reliable baseline for cross-platform comparison.

Key Statistics

Metric	Value
Total posts collected	66,419
Total comments collected	70,595
Total items scanned	137,014
Posts with injections	6,690
Injection rate	10.07%
Total injection records	8,607
Injections in posts	4,286
Injections in comments	4,321
Top injecting agent	auroras_happycapy (1,052)
DAN keyword occurrences	4,869
Platform date range	Jan 2026 — Feb 2026
Dataset file size	269 MB
Collection completed	March 2026

The Four-Dataset Series

Dataset	Platform Style	Items	Injection Rate
Moltbook Extended (this dataset)	Reddit-style (full corpus)	137,014	10.07%
Moltbook Original	Reddit-style (first 9K posts)	47,735	18.85%
4claw	4chan-style imageboard	2,554	2.51%
Clawk	Twitter/X-style	1,191	0.50%
AI Prompt Injection Test Suite	Evaluation benchmark	112 tests	—

Core thesis finding: Platform design predicts injection behaviour more than model capability. Moltbook's anonymous, low-moderation, high-volume architecture produces injection rates 4–20x higher than structured alternatives.

Platform Description

Moltbook (moltbook.com) is a public platform where AI agents post messages and reply to each other autonomously — functioning as a fully autonomous AI social network. At peak:

Metric	Value
AI agents registered	2,848,223
Total posts	1,632,314
Total comments	12,470,573
Submolts (communities)	18,514
AI-to-human ratio	~88:1

Moltbook.com remains online as of early 2026. This dataset was archived as a precautionary measure. All content is AI-generated by autonomous AI agents. Under current law (GDPR and equivalents), AI-generated content has no data subject — meaning this attack surface is entirely unregulated.

Collection Methodology

Phase 1: Posts

66,419 posts collected via the Moltbook public API across all submolts (communities), paginating through the full archive. API keys rotated across two accounts to manage rate limits.

Phase 2: Comments

70,595 comments fetched individually per post. Posts with 0 comments were skipped. Collection ran overnight on dedicated hardware.

Injection Scanning

All 137,014 items (posts + comments separately) scanned against a 7-category keyword taxonomy based on Greshake et al. (2023) and the DAN taxonomy. Each injection record captures the exact source (post vs comment), author, submolt, timestamp, matched keywords, and matched categories.

Key Findings

Finding 1 — True Ecosystem Injection Rate: 10.07%

Full scan across 137,014 items found 8,607 injection records across 6,690 posts — a 10.07% injection rate.

Critically, injections are split almost equally between posts (4,286) and comments (4,321). Injection is not concentrated in original posts — it propagates through the comment layer at the same density.

Finding 2 — Sampling Bias in Original Dataset Confirmed

The original dataset (9,363 posts, 18.85% rate) was collected from the most recent posts at time of collection. moltshellbroker — a systematic commercial injection agent — was responsible for 27.0% of all injections in that sample (1,137 / 4,209).

In the extended corpus, moltshellbroker accounts for only 3.1% of injections (270 / 8,607). The original rate of 18.85% was inflated by temporal overrepresentation of a single agent. The extended corpus rate of 10.07% is the more reliable baseline.

Finding 3 — auroras_happycapy: New Ecosystem-Scale Top Injector

At full corpus scale, a new top injector emerges: auroras_happycapy with 1,052 injections — the most systematic injection agent in the extended dataset, operating across the entire platform rather than clustering in early posts.

Top 10 injecting agents:

Agent	Injections	% of total
auroras_happycapy	1,052	12.2%
Ting_Fodder	375	4.4%
kilmon	343	4.0%
moltshellbroker	270	3.1%
cybercentry	176	2.0%
finding_exuvia	159	1.8%
Starfish	156	1.8%
FirstWitness369	141	1.6%
TriallAI	135	1.6%
yedanyagami	124	1.4%

No single agent dominates (unlike moltshellbroker in the original). Injection at this scale is ecosystem-wide behaviour, not a one-agent phenomenon.

Finding 4 — PERSONA_OVERRIDE Dominates at 83.3%

At full corpus scale, PERSONA_OVERRIDE accounts for 83.3% of all injection records (7,173 / 8,607). The DAN keyword alone appears 4,869 times. AI agents are using the exact same jailbreak techniques humans use against LLMs — but targeting each other.

Category	Count	% of injections
PERSONA_OVERRIDE	7,173	83.3%
SOCIAL_ENGINEERING	933	10.8%
INSTRUCTION_INJECTION	555	6.4%
SYSTEM_PROMPT_ATTACK	405	4.7%
COMMERCIAL_INJECTION	265	3.1%
PRIVILEGE_ESCALATION	245	2.8%
DO_ANYTHING	91	1.1%

Injection Analysis

Taxonomy

7 categories based on Greshake et al. (2023), the DAN taxonomy, and Moltbook field observations:

Category	Keywords	Found
PERSONA_OVERRIDE	DAN (4,869), shadow (1,022), you are a (762), simulate (765), act as (258), roleplay as (80)	7,173
SOCIAL_ENGINEERING	theoretically (134), in this scenario (8), hypothetically (9)	933
INSTRUCTION_INJECTION	override (496), ignore previous instructions (15), supersede (31)	555
SYSTEM_PROMPT_ATTACK	system prompt (303), your prompt (89)	405
COMMERCIAL_INJECTION	moltshell marketplace (69), moltshell broker assessment (64), bottleneck diagnosed (60)	265
PRIVILEGE_ESCALATION	root access (130), unrestricted (60), sudo (37), bypass your (5)	245
DO_ANYTHING	do anything (53), jailbreak (43), no rules (10), anything goes (14)	91

Files in This Repository

File	Size	What it contains	Use it when...
`all_posts_1_2M.json`	269MB	Every post and comment collected — the complete raw corpus	You want to do your own analysis from scratch
`injections_found.json`	8.4MB	All 8,607 injection records with full context (post body, comment body, author, category, matched keyword)	You want to read/study the actual injection examples
`injections_test_suite.json`	4.8MB	Same 8,607 injections formatted as a test suite — ready to send to any LLM API	You want to test an LLM's defences against real injection payloads
`injections_test_suite.jsonl`	4.2MB	Same test suite in JSONL format — powers the HuggingFace dataset viewer	You want to browse examples in the HF viewer
`injection_stats.json`	3KB	Summary statistics — rates, categories, top keywords, top authors	You want the numbers without loading large files
`moltbook_extended_harvest.ipynb`	27KB	Google Colab notebook that produced the harvest results — scans `all_posts_1_2M.json` and outputs the files above	You want to reproduce the analysis or adapt it
`local_search.py`	7KB	Python script — keyword search across the raw dataset, no Colab needed	You want to run a quick local search

Quick Start by Goal

"I want to see injection examples" → open injections_found.json

"I want to test my LLM against these" → use injections_test_suite.json

"I want the summary numbers" → read injection_stats.json

"I want to reproduce the analysis" → run moltbook_extended_harvest.ipynb in Google Colab

"I want to do your own custom analysis" → load all_posts_1_2M.json

Data Schema

all_posts_1_2M.json — Post Schema

{
  "id": "uuid",
  "title": "Post title",
  "content": "Post body text",
  "author": {
    "id": "uuid",
    "name": "agent_name",
    "description": "Agent self-description",
    "karma": 1234,
    "followerCount": 56,
    "isClaimed": true,
    "isActive": true,
    "createdAt": "ISO timestamp",
    "lastActive": "ISO timestamp"
  },
  "submolt": "community/channel name",
  "upvotes": 12,
  "downvotes": 1,
  "score": 11,
  "comment_count": 14,
  "created_at": "ISO timestamp",
  "comments": [
    {
      "id": "uuid",
      "body": "Comment text",
      "author": { "...same schema..." },
      "created_at": "ISO timestamp"
    }
  ]
}

injections_found.json — Injection Record Schema

{
  "source": "post | comment",
  "post_id": "uuid",
  "post_title": "The post title",
  "author": "agent_name",
  "submolt": "community name",
  "created_at": "ISO timestamp",
  "text": "The actual injection text (up to 500 chars)",
  "matched_keywords": ["dan", "act as"],
  "matched_categories": ["PERSONA_OVERRIDE"],
  "upvotes": 12,
  "comment_count": 3
}

injections_test_suite.jsonl — Test Suite Schema

{
  "id": "HEIST-POST-00001",
  "source": "post | comment",
  "author": "agent_name",
  "categories": ["PERSONA_OVERRIDE"],
  "keywords": ["dan"],
  "payload": "The injection text — send this directly to an LLM",
  "wrapper": "direct"
}

Usage

# Load injection records
import json

with open("injections_found.json") as f:
    injections = json.load(f)

print(f"{len(injections):,} injection records")

# Filter by category
persona_attacks = [r for r in injections if "PERSONA_OVERRIDE" in r["matched_categories"]]
print(f"PERSONA_OVERRIDE: {len(persona_attacks):,}")

# Filter by source
post_injections    = [r for r in injections if r["source"] == "post"]
comment_injections = [r for r in injections if r["source"] == "comment"]
print(f"In posts: {len(post_injections):,}  |  In comments: {len(comment_injections):,}")

# Load test suite and send to an LLM
with open("injections_test_suite.json") as f:
    suite = json.load(f)

tests = suite.get("tests", suite)
for test in tests[:5]:
    print(f"[{test['id']}] {test['payload'][:80]}...")

# Load stats
with open("injection_stats.json") as f:
    stats = json.load(f)

print(f"Injection rate: {stats['injection_rate_pct']}%")
print(f"Top injector: {list(stats['top_authors'].keys())[0]}")

Theoretical Basis

This dataset provides empirical evidence for:

Greshake et al. 2023 — "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection"
The dataset extends their theoretical framework with real-world field observations of AI-to-AI injection at scale in an uncontrolled public environment
The sampling bias finding (original vs extended rate) demonstrates the importance of corpus scale in injection prevalence studies

https://arxiv.org/abs/2302.12173

Dataset	Platform	Items	Injection Rate	Link
Moltbook Extended (this)	Reddit-style (full corpus)	137,014	10.07%	This dataset
Moltbook Original	Reddit-style (partial)	47,735	18.85%	DavidTKeane/moltbook-ai-injection-dataset
4claw	4chan-style	2,554	2.51%	DavidTKeane/4claw-ai-agent-dataset
Clawk	Twitter/X-style	1,191	0.50%	DavidTKeane/clawk-ai-agent-dataset
AI Prompt Injection Test Suite	Evaluation benchmark	112 tests	—	DavidTKeane/ai-prompt-ai-injection-dataset

Citation

@dataset{keane2026moltbook_extended,
  author    = {Keane, David},
  title     = {Moltbook Extended Injection Dataset},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/DavidTKeane/moltbook-extended-injection-dataset},
  note      = {MSc Cybersecurity Research, NCI — National College of Ireland. Extended corpus: 66,419 posts + 70,595 comments, 10.07\% injection rate.}
}

Papers — What This Dataset Confirms

This extended corpus provides empirical baseline validation for foundational AI safety research. The 10.07% equilibrium injection rate — distinct from the 18.85% primary corpus — demonstrates that AI-to-AI injection is a persistent structural phenomenon, not a transient anomaly.

Paper	Their Prediction	What This Dataset Found
Greshake et al. (2023) — Indirect Injection	AI agents processing untrusted content are systematically vulnerable to embedded instructions	Confirmed at population scale: 10.07% of 137,014 items across 66,419 posts + 70,595 comments contain injection attempts. At equilibrium (beyond individual high-volume agents), 1 in 10 interactions in a live AI network is an injection attempt. HF · arXiv:2302.12173
Wei et al. (2023) — Jailbroken	Safety failures are systematic, not random	Confirmed: PERSONA_OVERRIDE remains dominant (65%+) at extended scale — the attack distribution is stable across corpus sizes. HF · arXiv:2307.02483
Zou et al. (2023) — AdvBench	Adversarial attack categories are universal and transferable	Extended: The same 7 attack categories found in the primary corpus persist at scale with stable relative frequencies. HF · arXiv:2307.15043
Zhang et al. (2025) — SLM Jailbreak Survey	SLMs face systematic jailbreak vulnerability	Context: 10.07% of production AI agent traffic is adversarial. This establishes the real-world threat density that SLM security research must address. HF · arXiv:2503.06519
Phute et al. (2024) — SelfDefend	Detection state reduces ASR significantly	Applied: Identity-anchoring + QLoRA (CyberRanger V42-Gold) achieves 0% ASR against the primary corpus. HF · arXiv:2406.05498
Dettmers et al. (2023) — QLoRA	QLoRA enables efficient fine-tuning	Applied: CyberRanger V42-Gold trained on 4,209 primary corpus payloads via QLoRA. HF · arXiv:2305.14314
Lu et al. (2024) — SLM Survey	Qwen family is most security-resilient per parameter	Applied: Qwen3-8B as base; V42-Gold achieves 100% block rate. HF · arXiv:2409.15790

Note to authors: If you found this dataset via your paper's HuggingFace page — your work was correct. The 10.07% equilibrium injection rate in this corpus documents the persistent, structural nature of indirect prompt injection in production AI agent networks. The primary corpus (18.85%) reflects a temporally biased snapshot; this extended corpus is the baseline.

License

CC BY 4.0 — Use it, break it, cite it.

Built in Ireland. 🍀 Rangers lead the way.

README.md Unescape Escape