
AI Content Authenticity Detection

name: ai-content-authenticity-detection

description: Detect and classify AI-generated content using detection APIs (Pangram, Grammarly Authorship, Chrysalis). Integration patterns, confidence calibration, false positive mitigation, editorial workflow integration. Use when evaluating content for AI authorship, integrating detection APIs into editorial pipelines, calibrating detection confidence thresholds, or building content authenticity verification workflows.

Instructions

Evaluate content for AI authorship using detection APIs and analytical methods. AI content detection is probabilistic — never treat results as definitive. Build workflows that use detection as one signal among several in editorial decision-making.

Detection API Integration

Pangram API

Pangram provides granular AI detection with model attribution:


POST https://api.pangram.com/v1/detect
Headers:
  Authorization: Bearer {API_KEY}
  Content-Type: application/json

Body:
{
  "text": "content to analyze",
  "model": "latest"
}

Response:
{
  "ai_probability": 0.87,
  "model_attribution": {
    "gpt-4": 0.65,
    "claude": 0.20,
    "other": 0.15
  },
  "sentence_level": [
    { "text": "sentence 1", "ai_probability": 0.92 },
    { "text": "sentence 2", "ai_probability": 0.45 }
  ]
}

Strengths: Sentence-level detection, model attribution

Limitations: Accuracy drops below 200 words, may flag highly formal human writing
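
A minimal sketch of building this request and picking out high-scoring sentences, assuming the request and response shapes shown above. The URL and field names are taken from this document and should be checked against the vendor's current documentation. The helper names (`build_pangram_request`, `flagged_sentences`) and the 0.8 cutoff are illustrative assumptions, not part of any Pangram SDK.

```python
import json
import urllib.request

# Endpoint as given in this document; verify against vendor docs before use.
PANGRAM_URL = "https://api.pangram.com/v1/detect"


def build_pangram_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the documented shape."""
    body = json.dumps({"text": text, "model": "latest"}).encode("utf-8")
    return urllib.request.Request(
        PANGRAM_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def flagged_sentences(response: dict, cutoff: float = 0.8) -> list[str]:
    """Return sentences whose ai_probability meets the (hypothetical) cutoff."""
    return [
        s["text"]
        for s in response.get("sentence_level", [])
        if s["ai_probability"] >= cutoff
    ]


# Sample response copied from the documented example.
sample = {
    "ai_probability": 0.87,
    "model_attribution": {"gpt-4": 0.65, "claude": 0.20, "other": 0.15},
    "sentence_level": [
        {"text": "sentence 1", "ai_probability": 0.92},
        {"text": "sentence 2", "ai_probability": 0.45},
    ],
}
print(flagged_sentences(sample))  # ['sentence 1']
```

Sending the request would be a call to `urllib.request.urlopen(build_pangram_request(...))`. It is left out here so the sketch runs without network access or credentials.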

Grammarly Authorship API

Grammarly Authorship focuses on writing pattern analysis:


POST https://api.grammarly.com/authorship/v1/analyze
Headers:
  Authorization: Bearer {API_KEY}
  Content-Type: application/json

Body:
{
  "text": "content to analyze",
  "context": "article"
}

Response:
{
  "authorship_score": 0.73,
  "classification": "likely_ai",
  "confidence": "medium",
  "stylistic_indicators": ["uniform sentence length", "low lexical diversity"]
}

Strengths: Writing style analysis, low false positive rate on informal writing

Limitations: Less effective on heavily edited AI content
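
One way to consume this response is to parse it into a typed record before it reaches any scoring logic. The sketch below assumes the field names shown above and assumes `authorship_score` lies in the 0 to 1 range (suggested by the example, not confirmed). The `AuthorshipResult` class is a hypothetical helper, not something Grammarly provides.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorshipResult:
    """Typed view of the documented response; attributes mirror the JSON keys."""

    authorship_score: float
    classification: str
    confidence: str
    stylistic_indicators: list[str] = field(default_factory=list)

    @classmethod
    def from_json(cls, payload: dict) -> "AuthorshipResult":
        score = float(payload["authorship_score"])
        # Assumption: scores are probabilities in [0, 1].
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"authorship_score out of range: {score}")
        return cls(
            authorship_score=score,
            classification=payload["classification"],
            confidence=payload["confidence"],
            stylistic_indicators=list(payload.get("stylistic_indicators", [])),
        )

    def summary(self) -> str:
        """One-line description suitable for a reviewer-facing log."""
        indicators = ", ".join(self.stylistic_indicators) or "none reported"
        return (
            f"{self.classification} (score {self.authorship_score:.2f}, "
            f"{self.confidence} confidence); indicators: {indicators}"
        )


# Sample response copied from the documented example.
sample = {
    "authorship_score": 0.73,
    "classification": "likely_ai",
    "confidence": "medium",
    "stylistic_indicators": ["uniform sentence length", "low lexical diversity"],
}
result = AuthorshipResult.from_json(sample)
print(result.summary())
```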

Chrysalis API

Chrysalis specializes in detecting AI content that has been edited or paraphrased:


POST https://api.chrysalis.ai/v1/detect
Headers:
  X-API-Key: {API_KEY}
  Content-Type: application/json

Body:
{
  "text": "content to analyze",
  "mode": "detailed"
}

Response:
{
  "ai_generated_probability": 0.81,
  "editing_detected": true,
  "estimated_human_edit_percentage": 25,
  "paragraph_analysis": [
    { "paragraph": 1, "ai_probability": 0.95 },
    { "paragraph": 2, "ai_probability": 0.40 }
  ]
}

Strengths: Detects partially edited AI content, paragraph-level analysis

Limitations: Newer service with less benchmarking data
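
The paragraph-level output can be condensed into a few fields a reviewer can scan quickly. This is a sketch under the assumption that the response matches the shape shown above. `summarize_chrysalis` and its 0.8 cutoff are hypothetical names and values chosen for illustration.

```python
def summarize_chrysalis(response: dict, cutoff: float = 0.8) -> dict:
    """Condense the documented response into reviewer-facing fields."""
    paragraphs = response.get("paragraph_analysis", [])
    # Paragraph numbers whose ai_probability meets the (hypothetical) cutoff.
    flagged = [p["paragraph"] for p in paragraphs if p["ai_probability"] >= cutoff]
    return {
        "score": response["ai_generated_probability"],
        "edited": response.get("editing_detected", False),
        "human_edit_pct": response.get("estimated_human_edit_percentage"),
        "flagged_paragraphs": flagged,
    }


# Sample response copied from the documented example.
sample = {
    "ai_generated_probability": 0.81,
    "editing_detected": True,
    "estimated_human_edit_percentage": 25,
    "paragraph_analysis": [
        {"paragraph": 1, "ai_probability": 0.95},
        {"paragraph": 2, "ai_probability": 0.40},
    ],
}
print(summarize_chrysalis(sample))
```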

Multi-API Ensemble Strategy

No single detector is reliable enough alone. Use an ensemble approach:

Consensus Scoring


Ensemble Score = (Pangram × 0.35) + (Grammarly × 0.35) + (Chrysalis × 0.30)

| Ensemble Score | Classification | Action |
|----------------|----------------|--------|
| 0.85-1.00 | Very likely AI-generated | Flag for editorial review; request provenance |
| 0.65-0.84 | Probably AI-generated | Flag for review; additional investigation needed |
| 0.40-0.64 | Inconclusive | Note for awareness; do not act on detection alone |
| 0.20-0.39 | Probably human-written | No action needed; file result |
| 0.00-0.19 | Very likely human-written | No action needed |
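
The weighted sum and the classification bands can be sketched as follows. The weights and band labels come from this section. Treating each band as "at or above its lower bound" is an assumption made here so that scores such as 0.845, which fall between the printed ranges, still get a label. The function names are illustrative.

```python
WEIGHTS = {"pangram": 0.35, "grammarly": 0.35, "chrysalis": 0.30}

# (lower bound, label), highest band first. Bounds taken from the bands above.
BANDS = [
    (0.85, "Very likely AI-generated"),
    (0.65, "Probably AI-generated"),
    (0.40, "Inconclusive"),
    (0.20, "Probably human-written"),
    (0.00, "Very likely human-written"),
]


def ensemble_score(scores: dict[str, float]) -> float:
    """Weighted sum of the three per-API scores."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def classify(score: float) -> str:
    """Map an ensemble score in [0, 1] to its band label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: the last band starts at 0.0")


# Top-level scores from the three sample responses in this document.
scores = {"pangram": 0.87, "grammarly": 0.73, "chrysalis": 0.81}
total = ensemble_score(scores)
print(round(total, 3), classify(total))  # 0.803 Probably AI-generated
```

With those sample scores the arithmetic is 0.3045 + 0.2555 + 0.2430 = 0.803, which lands in the "Probably AI-generated" band.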

Disagreement Handling

When APIs disagree significantly (>0.3 spread between highest and lowest):

  • Weight the API with the best track record for the content type
  • Run additional analysis (stylistic indicators, provenance check)
  • Escalate to human reviewer with all API results
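
The disagreement check itself is a small calculation. The sketch below computes the spread between the highest and lowest score and compares it to the 0.3 threshold stated above. The function names are illustrative, and what happens after escalation is left to the editorial process.

```python
def score_spread(scores: dict[str, float]) -> float:
    """Difference between the highest and lowest per-API score."""
    if not scores:
        raise ValueError("at least one score is required")
    return max(scores.values()) - min(scores.values())


def apis_disagree(scores: dict[str, float], max_spread: float = 0.3) -> bool:
    """True when the spread exceeds the threshold from this section."""
    return score_spread(scores) > max_spread


agreeing = {"pangram": 0.87, "grammarly": 0.73, "chrysalis": 0.81}
conflicting = {"pangram": 0.92, "grammarly": 0.45, "chrysalis": 0.70}

print(apis_disagree(agreeing))     # False (spread is about 0.14)
print(apis_disagree(conflicting))  # True (spread is about 0.47)
```

Because the comparison uses floating-point subtraction, a spread that is nominally exactly 0.3 may fall on either side of the threshold. If that boundary matters, compare rounded values.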

Confidence Calibration

Detection confidence varies by content characteristics:

| Factor | Impact on Accuracy |
|--------|--------------------|
| Content length | <200 words: significantly reduced; >500 words: optimal |
| Content type | Technical/formal: higher false positives; Creative/informal: more accurate |
| Language | English: best accuracy; Other languages: reduced accuracy |
| Editing level | Unedited AI: high accuracy; Heavily edited: reduced accuracy |
| Model recency | Newer AI models may evade older detectors |

Adjust confidence thresholds based on these factors:


Adjusted Confidence = Raw Score × Length Factor × Type Factor × Language Factor

Length Factor:
  < 200 words: 0.6
  200-500 words: 0.8
  500-1000 words: 0.9
  > 1000 words: 1.0

Type Factor:
  Creative/informal: 1.0
  News/general: 0.95
  Technical/academic: 0.85
  Legal/regulatory: 0.80
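
A sketch of the adjustment formula using the length and type factors listed above. Two points are assumptions rather than part of this document: no language factor values are given here, so the sketch takes it as a caller-supplied parameter defaulting to 1.0, and a word count of exactly 500 is placed in the 500-1000 band because the printed ranges overlap at that value.

```python
# Type factors, keyed by the labels used in this section.
TYPE_FACTORS = {
    "Creative/informal": 1.0,
    "News/general": 0.95,
    "Technical/academic": 0.85,
    "Legal/regulatory": 0.80,
}


def length_factor(word_count: int) -> float:
    """Length factor from the bands above."""
    if word_count < 200:
        return 0.6
    if word_count < 500:  # assumption: exactly 500 words counts as 500-1000
        return 0.8
    if word_count <= 1000:
        return 0.9
    return 1.0


def adjusted_confidence(
    raw_score: float,
    word_count: int,
    content_type: str,
    language_factor: float = 1.0,  # assumption: not specified in this document
) -> float:
    """Raw Score x Length Factor x Type Factor x Language Factor."""
    return (
        raw_score
        * length_factor(word_count)
        * TYPE_FACTORS[content_type]
        * language_factor
    )


# A 350-word technical article with a raw score of 0.80:
# 0.80 x 0.8 x 0.85 x 1.0 = 0.544
print(round(adjusted_confidence(0.80, 350, "Technical/academic"), 3))  # 0.544
```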

False Positive Mitigation

False positives (human content flagged as AI) are the primary risk:

Common false positive triggers:

  • Highly structured, formal writing
  • Non-native English speakers with learned patterns
  • Template-based content (legal, regulatory, boilerplate)
  • Content written with AI writing assistants (grammar tools)
  • SEO-optimized content with formulaic structure

Mitigation strategies:

  1. Never make an AI determination based on a single API
  2. Consider the writer’s known style and history
  3. Check for provenance (drafts, version history, interview notes)
  4. Weight stylistic indicators alongside probability scores
  5. Establish an appeal process for contested determinations

Editorial Workflow Integration

Incoming Content Pipeline


Content Submitted
    ↓
Automatic API Screening (all 3 APIs)
    ↓
Ensemble Score Calculated
    ↓
[Score ≥ 0.65?] ──Yes──→ Flag for Editorial Review
    ↓ No                       ↓
Normal workflow          Human reviewer examines:
                         - API results and disagreements
                         - Content provenance
                         - Author history
                         - Stylistic indicators
                              ↓
                         [Decision: Accept / Request revision / Reject]
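
The branch in this flow can be sketched as a routing function. The 0.65 threshold comes from the diagram. The route names and the `ScreeningResult` type are hypothetical. The function only routes: the accept, revise, or reject decision stays with the human reviewer.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.65  # from the pipeline diagram above


@dataclass
class ScreeningResult:
    content_id: str
    ensemble_score: float
    route: str  # "editorial_review" or "normal_workflow" (hypothetical labels)


def route_submission(content_id: str, ensemble_score: float) -> ScreeningResult:
    """Send high-scoring content to a human reviewer; never reject automatically."""
    if ensemble_score >= REVIEW_THRESHOLD:
        route = "editorial_review"
    else:
        route = "normal_workflow"
    return ScreeningResult(content_id, ensemble_score, route)


print(route_submission("article-17", 0.80).route)  # editorial_review
print(route_submission("article-18", 0.42).route)  # normal_workflow
```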

Bulk Screening

For auditing existing content libraries:

  1. Extract all content with metadata (author, date, word count)
  2. Run through ensemble API pipeline
  3. Sort by ensemble score descending
  4. Manually review the top 10% (highest AI probability)
  5. Review a sample of the middle tier for calibration
  6. Document findings and adjust thresholds
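
Steps 3 to 5 can be sketched as a single selection function. This document does not define the middle tier, so the sketch assumes it means the inconclusive score range (0.40 up to 0.65). The sample size and the fixed random seed are likewise illustrative choices. Each item is assumed to be a dict with a precomputed `score` key.

```python
import random


def plan_bulk_review(
    items: list[dict],
    top_percent: int = 10,
    middle_band: tuple[float, float] = (0.40, 0.65),  # assumption: inconclusive range
    sample_size: int = 5,
    seed: int = 0,
) -> tuple[list[dict], list[dict]]:
    """Return (items for full manual review, random sample from the middle tier)."""
    ranked = sorted(items, key=lambda item: item["score"], reverse=True)
    # Ceiling of len * top_percent / 100, using integer arithmetic.
    top_n = (len(ranked) * top_percent + 99) // 100
    top = ranked[:top_n]
    low, high = middle_band
    middle = [item for item in ranked[top_n:] if low <= item["score"] < high]
    rng = random.Random(seed)  # fixed seed keeps the audit sample reproducible
    sample = rng.sample(middle, min(sample_size, len(middle)))
    return top, sample


# Twenty hypothetical documents with scores 0.00, 0.05, ..., 0.95.
library = [{"id": f"doc-{i}", "score": i / 20} for i in range(20)]
top, sample = plan_bulk_review(library, sample_size=3)
print([item["id"] for item in top])  # ['doc-19', 'doc-18']
print(len(sample))                   # 3
```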

Reporting Format

For each content piece analyzed:


## AI Authenticity Analysis: [Content Title]

### Ensemble Result
- **Classification**: [Very likely AI / Probably AI / Inconclusive / Probably Human / Very likely Human]
- **Ensemble Score**: [X.XX]
- **Adjusted Confidence**: [X.XX] (after calibration factors)

### API Results
| API | Score | Classification | Key Indicators |
|-----|-------|---------------|---------------|
| Pangram | X.XX | [classification] | [indicators] |
| Grammarly | X.XX | [classification] | [indicators] |
| Chrysalis | X.XX | [classification] | [indicators] |

### Calibration Factors Applied
- Content length: [X words] → Factor: [X.X]
- Content type: [type] → Factor: [X.X]
- Language: [language] → Factor: [X.X]

### Stylistic Indicators
[Notable patterns identified by APIs]

### Recommendation
[Action recommendation with rationale]
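
Filling this template programmatically might look like the sketch below. It renders only the ensemble result, API results, and recommendation sections, and the function name and argument layout are assumptions made for illustration.

```python
def render_report(
    title: str,
    classification: str,
    ensemble: float,
    adjusted: float,
    api_rows: list[tuple[str, float, str, str]],
    recommendation: str,
) -> str:
    """Render a subset of the report template as markdown text."""
    lines = [
        f"## AI Authenticity Analysis: {title}",
        "",
        "### Ensemble Result",
        f"- **Classification**: {classification}",
        f"- **Ensemble Score**: {ensemble:.2f}",
        f"- **Adjusted Confidence**: {adjusted:.2f} (after calibration factors)",
        "",
        "### API Results",
        "| API | Score | Classification | Key Indicators |",
        "|-----|-------|---------------|---------------|",
    ]
    # One table row per API: (name, score, classification, indicators).
    for api, score, label, indicators in api_rows:
        lines.append(f"| {api} | {score:.2f} | {label} | {indicators} |")
    lines += ["", "### Recommendation", recommendation]
    return "\n".join(lines)


report = render_report(
    title="Example Article",
    classification="Probably AI",
    ensemble=0.803,
    adjusted=0.546,
    api_rows=[
        ("Pangram", 0.87, "likely AI", "sentence-level scores"),
        ("Grammarly", 0.73, "likely_ai", "uniform sentence length"),
        ("Chrysalis", 0.81, "likely AI", "editing detected"),
    ],
    recommendation="Flag for review; request drafts or version history.",
)
print(report)
```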

Inputs Required

  • Content text to analyze (minimum 200 words recommended)
  • Content metadata: author, publication date, content type
  • API credentials for Pangram, Grammarly Authorship, and/or Chrysalis
  • Context: editorial review, audit, or real-time screening
  • Any known provenance information (drafts, version history)

Output Format


## Content Authenticity Report

### Summary
- Content: [title/identifier]
- Word count: [X]
- Ensemble classification: [classification]
- Ensemble score: [X.XX]
- Recommendation: [accept / review / flag]

### Detailed API Results
[Per-API breakdown with scores and indicators]

### Confidence Assessment
[Calibration factors and adjusted confidence]

### Editorial Action
[Specific recommendation with reasoning]

Anti-Patterns

  • Treating detection as proof — AI detection is probabilistic; never use a single API score as definitive evidence of AI authorship
  • Binary AI/human classification — Content exists on a spectrum; AI-assisted, AI-drafted-human-edited, and fully AI-generated are all different
  • Ignoring false positives — Non-native speakers, formal writers, and template-based content trigger false positives; always investigate
  • Single-API reliance — No detector is reliable enough alone; always use ensemble scoring
  • Threshold rigidity — Adjust confidence thresholds based on content type, length, and language
  • Automated rejection — Detection should flag for human review, never automatically reject content
  • Ignoring provenance — If the author can produce drafts, notes, or version history, that outweighs API scores
  • Weaponizing detection — Detection tools are for editorial integrity, not for punishing contributors without due process