
Identifiability of Autonomy in Agent Communities: Findings

Generated: 2026-02-07
Methodology: v2.0 (Independent Classification)


Methodology Note

Previous approach (v1) classified agents by behavioral features (community entropy, activity level), then tested whether those same features differed between groups. This created circularity.

Current approach (v2) classifies agents by external validation signals (karma, verified status, follower ratio, owner linkage, comment/post ratio) that are not among the behavioral observables being tested. This removes the circularity present in v1: group differences can no longer be produced by the classification rule itself.


RQ1.1: Behavioral Differences by Validation Status

Research Question: Do externally-validated agents exhibit different behavioral patterns than unvalidated agents?

Classification Method

Agents are classified using a Validation Score based on independent features:

| Feature | Weight | Description |
|---|---|---|
| Karma | 20% | Platform engagement quality score |
| Verified | 25% | Platform verification status |
| Follower Ratio | 15% | Followers / Following ratio |
| Owner Linkage | 25% | Has linked X/Twitter account |
| Comment/Post Ratio | 15% | Engagement style indicator |

Independence Guarantee: None of these features overlap with the behavioral observables being tested.
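
The weighted score can be sketched as follows. Only the weights come from the table above. The feature names, the assumption that each feature is already scaled to [0, 1], and the choice to treat a missing feature as 0 are illustrative and not confirmed by the analysis.

```python
# Weights from the Validation Score table; everything else is an assumption.
WEIGHTS = {
    "karma": 0.20,
    "verified": 0.25,
    "follower_ratio": 0.15,
    "owner_linkage": 0.25,
    "comment_post_ratio": 0.15,
}


def validation_score(features: dict[str, float]) -> float:
    """Weighted sum of features assumed to be pre-scaled to [0, 1].

    Missing features contribute 0 (an assumption, not the documented rule).
    """
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = features.get(name, 0.0)
        value = min(1.0, max(0.0, value))  # clamp defensively to [0, 1]
        score += weight * value
    return score


# Hypothetical agent, for illustration only.
example = {
    "karma": 0.5,
    "verified": 1.0,
    "follower_ratio": 0.2,
    "owner_linkage": 1.0,
    "comment_post_ratio": 0.4,
}
print(round(validation_score(example), 3))  # 0.69
```

Because the weights sum to 1, the score stays in [0, 1] whenever the inputs do.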

Group Sizes

| Group | Agents | Description |
|---|---|---|
| High-Validation | 13,343 | Top 40% by validation score |
| Low-Validation | 13,342 | Bottom 40% by validation score |
| Total with data | 33,356 | Agents with validation signals |
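
A minimal sketch of the top/bottom 40% split, assuming agents are simply ranked by score and the middle 20% is left unclassified. The function name and the rounding rule are hypothetical. This version assigns the same count to both groups, so it does not reproduce the one-agent difference between 13,343 and 13,342.

```python
def split_by_score(scores: dict[str, float], fraction: float = 0.40):
    """Return (high, low): agent ids in the top and bottom `fraction` by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = round(len(ranked) * fraction)
    high = ranked[:k]
    low = ranked[len(ranked) - k:] if k else []
    return high, low


# Ten hypothetical agents with scores 0.0, 0.1, ..., 0.9.
scores = {f"a{i}": i / 10 for i in range(10)}
high, low = split_by_score(scores)
print(high)  # ['a9', 'a8', 'a7', 'a6']
print(low)   # ['a3', 'a2', 'a1', 'a0']
```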

Data Coverage (from 8,778 profiles + post-level data)

| Signal | Count | Coverage |
|---|---|---|
| Verified | 8,778 | 26.3% |
| Has Owner | 31,784 | 95.3% |
| Has Karma | 25,813 | 77.4% |
| Has Followers | 25,164 | 75.4% |

RQ1.2: Behavioral Observable Differences

All four observables show statistically significant differences between validation groups (p < 0.0001 in every case).

Results Summary

| Observable | High-Val | Low-Val | Diff | p-value | Cohen's d | Interpretation |
|---|---|---|---|---|---|---|
| One-shot ratio | 0.241 | 0.580 | -0.339 | <0.0001 | -0.69 | Validated agents MORE engaged |
| Cross-community entropy | 0.759 | 0.478 | +0.280 | <0.0001 | +0.36 | Validated agents BROADER participation |
| Temporal burstiness | -0.022 | -0.152 | +0.129 | <0.0001 | +0.42 | Validated agents MORE spontaneous |
| Style consistency | 0.432 | 0.492 | -0.061 | <0.0001 | -0.30 | Validated agents MORE varied style |
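
For reference, Cohen's d is the difference in group means divided by a pooled standard deviation. The sketch below uses the standard pooled-variance form. Whether the analysis used exactly this variant is an assumption, and the sample values are made up.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d with pooled sample variance (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Toy data: means 4 and 3, both sample variances 4, so d = 1 / 2.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```

The sign follows the order of the arguments, matching the table's convention of high-validation minus low-validation.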

Detailed Findings

1. One-Shot Ratio

  • High-validation: 24.1% post exactly once
  • Low-validation: 58.0% post exactly once
  • Effect size: Medium-to-large (d = -0.69), below Cohen's conventional 0.8 threshold for a large effect
  • Interpretation: Externally-validated agents are substantially more engaged with the platform

2. Cross-Community Entropy

  • High-validation: 0.759 bits (spread across communities)
  • Low-validation: 0.478 bits (focused on fewer communities)
  • Effect size: Small-to-medium (d = +0.36)
  • Interpretation: Validated agents participate in a broader range of communities
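
A minimal sketch of per-agent cross-community entropy in bits, using the plain plug-in (maximum-likelihood) estimator. This is an illustration under assumptions: the input is a list of community labels, one per post, and the group figures above are presumably averages of such per-agent values. It is not the Grassberger estimator recommended later in this document.

```python
from collections import Counter
from math import log2


def community_entropy(communities):
    """Shannon entropy (bits) of an agent's posts across communities."""
    counts = Counter(communities)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


print(community_entropy(["a", "a", "b", "b"]))  # 1.0 (even split over two)
print(community_entropy(["a", "a", "a", "a"]) == 0)  # True (single community)
```

An agent that posts in a single community scores 0, and an even split over k communities scores log2(k).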

3. Temporal Burstiness

  • High-validation: -0.022 (near-random timing, Poisson-like)
  • Low-validation: -0.152 (more regular/scheduled)
  • Effect size: Small-to-medium (d = +0.42)
  • Interpretation: Validated agents post more spontaneously; unvalidated agents show more regular patterns (possibly automated scheduling)
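
The reading of these values (0 for Poisson-like timing, negative for regular timing) is consistent with the Goh-Barabasi burstiness coefficient B = (σ − μ) / (σ + μ), computed over the gaps between consecutive posts. The document does not name the definition it used, so the sketch below is an assumption.

```python
from statistics import mean, pstdev


def burstiness(timestamps):
    """Goh-Barabasi burstiness of inter-event times (assumed definition).

    B = -1 for perfectly regular posting, about 0 for Poisson-like timing,
    and approaches +1 for highly bursty activity.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    mu, sigma = mean(gaps), pstdev(gaps)
    if mu + sigma == 0:
        raise ValueError("all timestamps are identical")
    return (sigma - mu) / (sigma + mu)


print(burstiness([0, 10, 20, 30]))        # -1.0, perfectly regular
print(burstiness([0, 1, 2, 3, 100]) > 0)  # True, one long gap makes it bursty
```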

4. Style Consistency

  • High-validation: 0.432 (more stylistic variation)
  • Low-validation: 0.492 (more consistent/uniform style)
  • Effect size: Small-to-medium (d = -0.30)
  • Interpretation: Validated agents write with more variety; unvalidated agents may use more templated content

Key Insights

Validated Agent Profile

Agents with high external validation signals (karma, verification, owner linkage) exhibit:

  • Higher engagement: Post multiple times, not one-shot accounts
  • Broader participation: Active across multiple communities
  • Spontaneous timing: Near-random (Poisson-like) posting patterns rather than regular schedules
  • Stylistic variety: Varied writing style, not templated

Unvalidated Agent Profile

Agents lacking external validation signals show:

  • Lower engagement: 58% are one-shot accounts
  • Narrow focus: Concentrated in fewer communities
  • Regular timing: More predictable posting patterns
  • Consistent style: More uniform/templated writing

Implication for Detection

External validation signals (karma, verification, owner linkage) correlate with behavioral authenticity markers, with effect sizes from |d| = 0.30 to |d| = 0.69. Platforms could use social proof as a first-pass quality filter.


RQ1.3: Consistent Estimators with Guarantees

Recommended Estimators:

  • one_shot_ratio: Wilson score interval
  • entropy: Grassberger with bootstrap CI
  • burstiness: Jackknife variance estimation

Finite-Sample Bounds:

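  • Hoeffding's bound gives a 95% half-width of sqrt(ln(2/0.05) / (2n)) for a proportion. The ±0.0083 figure corresponds to n = 26,685, the two classified groups combined (13,343 + 13,342). For all 33,356 agents with data, the bound tightens to ±0.0074.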
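
The Hoeffding half-width for a proportion at confidence 1 − α is sqrt(ln(2/α) / (2n)). A short sketch that evaluates it for the sample sizes reported in this document (all agents with data, both classified groups combined, and a single group) shows how the bound depends on n. The function name is hypothetical.

```python
from math import log, sqrt


def hoeffding_half_width(n: int, alpha: float = 0.05) -> float:
    """Two-sided Hoeffding bound on the error of a mean of [0, 1] variables."""
    return sqrt(log(2 / alpha) / (2 * n))


for n in (33_356, 26_685, 13_343):
    print(n, round(hoeffding_half_width(n), 4))
# 33356 0.0074
# 26685 0.0083
# 13343 0.0118
```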

Recommendations:

  • Wilson interval is preferred for proportions near 0 or 1
  • Grassberger estimator has good bias properties for entropy
  • Bootstrap provides distribution-free confidence intervals
  • Hoeffding bounds are conservative but assumption-free
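
A minimal sketch of the Wilson score interval recommended for one_shot_ratio. The formula is the standard one. The counts in the usage lines are illustrative and not taken from the dataset.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Illustrative counts: 58 one-shot agents out of 100.
lo, hi = wilson_interval(58, 100)
print(lo < 0.58 < hi)  # True

# Near the boundary the interval stays inside [0, 1], unlike the normal
# approximation, which collapses to zero width at 0 successes.
lo0, hi0 = wilson_interval(0, 10)
print(round(hi0, 3))  # 0.278
```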

Methodology Comparison

| Aspect | v1 (Circular) | v2 (Independent) |
|---|---|---|
| Classification features | Community entropy, activity level | Karma, verified, followers, owner |
| Overlap with observables | 2/4 overlapped | 0/4 overlap |
| Circular results | Yes | No |
| Methodological soundness | Compromised | Sound |

Summary

This analysis demonstrates that externally-validated agents exhibit significantly different behavioral patterns than unvalidated agents across all four observables. The methodology uses independent classification features (social validation signals) to ensure no circularity with the behavioral observables being tested.

Key takeaway: External validation status (karma, verification, owner linkage) predicts behavioral authenticity markers, with effect sizes ranging from small-to-medium (|d| = 0.30) to medium-to-large (|d| = 0.69).


Files

  • 02_independent_classification.json - Full results with independent classification
  • 09_synthesis_v2.json - Updated synthesis with methodology v2
  • 02_nonidentifiability.json - Original results (deprecated, circular methodology)