How Fabula Works

Stories Are Context Graphs. We Treat Them That Way.

Most AI tools read scripts the way autocomplete reads sentences—left to right, no memory, no structure. Fabula decomposes narrative into a typed knowledge graph with five core node types, dual temporal indices, and hallucination-guarded entity extraction from unstructured sources at scale. The architecture is the argument.

Engineered over 18 months. Tested on 60+ episodes across Star Trek TNG, The West Wing, Doctor Who, and more.

The Foundation

Five Node Types. Every Story Ever Told.

Every narrative, from a sitcom pilot to a seven-season epic, reduces to the same five primitives. The insight isn't that these entities exist—it's how they connect. We separate canonical identity from transient participation state. A Character is who someone is. An EventParticipation is who they are in this scene—their goals, their emotional state, what they did. Most tools flatten this distinction. That's why they lose the plot. This separation enables narrative state prediction—forecasting character arc trajectories based on accumulated participation history.

Character

Canonical identity

Event

Narrative action

Scene

Temporal container

Location

Spatial anchor

Object

Narrative prop

When other tools extract “Picard was angry in this scene,” they overwrite his previous state. When Fabula extracts it, the anger is scoped to an EventParticipation—his canonical node retains the full arc. This is the difference between a search index and a knowledge graph.
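The separation between canonical identity and scoped participation can be sketched with plain dataclasses. Field and class names here are illustrative assumptions for this page, not Fabula's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Character:
    name: str                                   # canonical identity
    aliases: list[str] = field(default_factory=list)

@dataclass
class EventParticipation:
    character: str                              # who they are
    event_id: str
    emotional_state: str                        # who they are *in this scene*
    goals: list[str] = field(default_factory=list)

# The canonical node never changes; each scene adds a scoped participation.
picard = Character("Jean-Luc Picard", aliases=["The Captain", "Jean-Luc"])
p1 = EventParticipation("Jean-Luc Picard", "ev-001", "angry",
                        goals=["protect the crew"])
p2 = EventParticipation("Jean-Luc Picard", "ev-002", "composed")
```

The anger in `p1` is scoped to one event; `picard` retains the full arc.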

EventParticipation · BDI Model
Jed Bartlet · The West Wing S2 · Schema-validated JSON

{
  "incarnation_identifier": "as the President Concealing a Chronic Illness",
  "emotional_state_at_event": "Resolute but conflicted, carrying the weight of public deception against personal integrity",
  "goals_at_event": [
    "Reveal the MS diagnosis on his own terms before it becomes a scandal",
    "Protect Abbey from the fallout of the concealment"
  ],
  "beliefs_at_event": [
    "The American people deserve honesty from their President",
    "Disclosure now, while he controls the narrative, is less damaging than disclosure later"
  ],
  "importance_to_event": "primary"
}

Not Just “Character Was Present”

Every EventParticipation captures the character's Beliefs, Desires, and Intentions in that specific moment. The incarnation identifier tracks how the character presents in this scene versus their canonical identity. This is what makes queries like “scenes where Bartlet's beliefs conflict with his goals” answerable.

The BDI model is extracted per-character, per-event, producing a temporal stack of psychological states that can be traversed, compared, and queried across the full series.
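A minimal sketch of that temporal stack, assuming participation records shaped like the JSON example above (the record contents here are invented for illustration):

```python
participations = [
    {"character": "Jed Bartlet", "event": "e1",
     "beliefs": ["the public deserves honesty"],
     "goals": ["conceal the diagnosis"]},
    {"character": "Jed Bartlet", "event": "e2",
     "beliefs": ["disclosure now is less damaging"],
     "goals": ["reveal the diagnosis on his own terms"]},
]

def bdi_trajectory(records, character):
    """Ordered stack of per-event BDI states for one character."""
    return [r for r in records if r["character"] == character]

stack = bdi_trajectory(participations, "Jed Bartlet")
# Comparing stack[0] and stack[1] event-by-event surfaces the arc shift
# from concealment to controlled disclosure.
```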

The Architecture

Compositional Cognitive Architecture. Decision Lineage at Every Layer.

Fabula doesn't send your script to an LLM with “extract the characters.” It runs a four-phase entity extraction pipeline where each stage has its own schema constraints, validation gates, and error-recovery paths. Base personas combine with specialist modifiers and style templates—each independently testable and versionable.

1

Structural Decomposition

Parse the screenplay into acts, scenes, and beats. Identify dialogue vs. action. Map the temporal skeleton before touching content.

2

Entity Synthesis

Extract characters, locations, objects. Resolve aliases against the existing graph. “The Captain” = “Picard” = “Jean-Luc.” One canonical node.

3

Event Enrichment with BDI Model

For every event, extract Beliefs, Desires, and Intentions of each participant. Not just what happened—but what each character wanted, believed, and did about it.

4

Graph Construction with Dual Timelines

Build the knowledge graph with two temporal indices: fabula (story-world chronology) and syuzhet (narrative presentation order). Flashbacks get both timestamps. This dual-index architecture functions as a world model for narrative—maintaining both chronological state and presentation state.
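A sketch of the dual index: each scene carries both a story-world timestamp (fabula) and a presentation-order index (syuzhet), so a flashback sorts differently depending on which index you traverse. The timestamp values are invented:

```python
scenes = [
    {"id": "s1", "fabula": 10, "syuzhet": 1},   # opening scene
    {"id": "s2", "fabula": 2,  "syuzhet": 2},   # flashback: early in story-world time
    {"id": "s3", "fabula": 11, "syuzhet": 3},
]

story_order  = sorted(scenes, key=lambda s: s["fabula"])   # chronological state
screen_order = sorted(scenes, key=lambda s: s["syuzhet"])  # presentation state

assert [s["id"] for s in story_order]  == ["s2", "s1", "s3"]
assert [s["id"] for s in screen_order] == ["s1", "s2", "s3"]
```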

The key architectural insight: base personas + specialist modifiers + style templates. Each component is independently testable and versionable. When the BDI extractor improves, we deploy it without touching structural decomposition. When a new show has unusual formatting, we swap one style template.

Why This Matters

Monolithic prompts are brittle. Change one thing and three others break. A compositional architecture means each stage has its own contract, its own tests, its own failure modes. It's how you build systems that improve reliably over time.

Context Engineering

Every Function Gets Exactly the Context It Needs. Nothing More.

The AI industry is waking up to what Andrej Karpathy calls “the delicate art and science of filling the context window with just the right information for the next step.” We've been doing it since day one. Every function in our pipeline receives a dynamically constructed payload—optimised for that specific task, stripped of everything irrelevant, scoped to prevent drift. This is the difference between prompting a model and engineering a system.

Payload Scoping

The entity synthesiser sees only the current scene excerpt and the relevant slice of the existing graph—not the entire screenplay, not the full database. Each function receives a context window assembled for its specific task.

Adaptive Token Budgeting

Content duration and complexity determine sampling density. Short scenes get dense context; long sequences get intelligently compressed. Frame counts, token limits, and output caps are all set dynamically—not by a fixed prompt template.

Context Rot Prevention

Chroma's research shows LLM performance degrades well before stated token limits. We never let it get there. Scope constraints, negative instructions, and temporal boundaries keep each call focused. The model can't hallucinate about scenes it never sees.

Prior State Injection

Each extraction call is anchored to previous work. The draft screenplay, the existing entity graph, the resolved aliases—injected as constraints, not conversation history. The model builds on verified state, not its own prior outputs.

Most AI applications dump everything into a giant context window and hope for the best. That's prompt engineering. Context engineering is the opposite: you build a system that dynamically constructs exactly the right payload for each function call. The model sees only what it needs, formatted how it needs it, with explicit constraints on what it should ignore.
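A hedged sketch of per-function payload assembly, combining scoping, prior-state injection, and a budget gate. The function and section names are illustrative, and the word count is a crude stand-in for a real tokenizer:

```python
def build_payload(scene_text, graph_slice, resolved_aliases, max_tokens=8000):
    sections = [
        ("SCENE", scene_text),                       # only the current excerpt
        ("KNOWN_ENTITIES", ", ".join(graph_slice)),  # relevant graph slice, not the DB
        ("ALIASES", "; ".join(f"{a}={c}" for a, c in resolved_aliases.items())),
        ("CONSTRAINTS", "Extract only from SCENE. Ignore events not shown."),
    ]
    payload, used = [], 0
    for label, body in sections:                     # sections listed in priority order
        cost = len(body.split())                     # stand-in for real token counting
        if used + cost > max_tokens:
            break                                    # budget gate: drop the rest
        payload.append(f"[{label}]\n{body}")
        used += cost
    return "\n\n".join(payload)

p = build_payload("INT. BRIDGE - Picard confronts Q.",
                  ["Jean-Luc Picard", "Q"],
                  {"The Captain": "Jean-Luc Picard"})
```

The model receives only the assembled payload: verified state injected as constraints, everything irrelevant already stripped.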

Why This Is the Economics Layer

Dynamic context engineering is the foundation of cost control and quality assurance.

Cost control at the token level

Every token costs money. Scoped payloads mean we send thousands of tokens per call, not tens of thousands. The savings compound across sixty episodes.

Smaller models, same quality

A focused 8K-token payload to a smaller model outperforms a bloated 128K-token dump to a frontier model. We route tasks to the cheapest model that can handle the scoped context.

Hallucination prevention at source

Context poisoning, context distraction, context confusion—the failure modes Drew Breunig catalogued—all stem from sending the model too much, too irrelevant, or too contradictory information. We eliminate them structurally.

Model-agnostic by design

Because context is engineered per-function, we can swap models without rewriting prompts. OpenAI for vision, Anthropic for reasoning, open-source for commodity tasks. The context layer is the stable interface.

The Guardrails

Every Claim Requires Evidence. Every Entity Earns Its Place.

LLMs hallucinate. This is not a philosophical problem—it's an engineering one. We treat hallucination the way databases treat corruption: with schema constraints, validation layers, and evidence requirements at every extraction boundary.

BAML Schema Enforcement

Every LLM output is validated against a typed schema before it enters the graph. Wrong types, missing fields, malformed relationships—rejected at the boundary.

Evidence Grounding

Every extracted entity and relationship must cite the scene, dialogue line, or action description that supports it. No citation, no node.

Confidence Gating at 0.7

Extractions below our confidence threshold are flagged for human review, not silently committed. The graph stays clean; the human stays in the loop.

Contrastive Entity Sharpening

When the model is unsure whether two mentions refer to the same entity, it generates arguments for and against—then resolves with evidence, not probability.
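The first three guardrails can be sketched as a single gate at the extraction boundary. The threshold comes from the text (0.7); the field check is a minimal stand-in for BAML schema validation, not the real validator:

```python
REQUIRED_FIELDS = {"name", "type", "evidence", "confidence"}
CONFIDENCE_THRESHOLD = 0.7

def gate(extraction, committed, review_queue):
    if not REQUIRED_FIELDS <= extraction.keys():
        return "rejected"                    # schema violation: never enters the graph
    if not extraction["evidence"]:
        return "rejected"                    # no citation, no node
    if extraction["confidence"] < CONFIDENCE_THRESHOLD:
        review_queue.append(extraction)      # flagged: human in the loop
        return "flagged"
    committed.append(extraction)             # clean extraction enters the graph
    return "committed"

committed, review = [], []
gate({"name": "Q", "type": "Character",
     "evidence": "S1E1 bridge scene", "confidence": 0.94}, committed, review)
gate({"name": "Q?", "type": "Character",
     "evidence": "unclear mention", "confidence": 0.55}, committed, review)
```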

Results: What the Guardrails Deliver

Duplicate entity flags

66% fewer

Harmonization time

62% faster

API calls per episode

43% fewer

Adjudication

Graph-Native Decision Adjudication. Not a Similarity Threshold.

When semantic search surfaces potential duplicates, most systems apply a cosine similarity threshold and hope for the best. Fabula uses multi-level LLM adjudication: a compositional prompt system that reasons through evidence, weighs competing interpretations, and documents every decision. The system doesn't just resolve entities—it gets smarter with each pass.

1

Candidate Discovery

ChromaDB semantic search surfaces plausible matches. Mathematical filtering reduces O(n²) comparisons to a tractable candidate set—only entities that could plausibly be the same.

2

LLM Adjudication

Each candidate pair goes to an LLM with full narrative context, existing descriptions, aliases, and a decision framework. The model reasons through evidence for and against merging—then commits with a confidence score.

3

Graph Update + Sharpening

Merges create a refined canonical entity with synthesised descriptions. Keep decisions trigger contrastive entity sharpening—enhancing both entities' definitions so they're never flagged as duplicates again.
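The two-stage flow above can be sketched as cheap similarity filtering followed by adjudication of the survivors. The embedding vectors and the adjudicator are stubs; the real system uses ChromaDB for discovery and a compositional prompt for the decision:

```python
from itertools import combinations
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

entities = {
    "Picard":      [0.90, 0.10, 0.20],
    "The Captain": [0.88, 0.12, 0.18],
    "Q":           [0.10, 0.90, 0.30],
}

# Stage 1: candidate discovery keeps only pairs that could plausibly match.
candidates = [(a, b) for a, b in combinations(entities, 2)
              if cosine(entities[a], entities[b]) > 0.95]

# Stage 2: each surviving pair goes to the LLM with full narrative context;
# a stub stands in for the adjudicator's merge/keep decision here.
def adjudicate(pair):
    return {"pair": pair, "decision": "merge", "confidence": 0.9}

decisions = [adjudicate(p) for p in candidates]
```

Only one of the three possible pairs survives filtering, so only one adjudication call is made.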

Entity Resolution

Zero Entity Drift. From Pilot to Series Finale.

The hardest problem in narrative extraction isn't finding entities—it's keeping them consistent across sixty episodes and five years of production. We handle this with a hybrid GraphRAG architecture: Neo4j for structural queries and relationship traversal, ChromaDB for semantic similarity and fuzzy matching.

Before Entity Resolution

“The Captain” → separate node
“Picard” → separate node
“Jean-Luc” → separate node

After Entity Resolution

Jean-Luc Picard (one canonical node)
Aliases: The Captain · Picard · Jean-Luc · Number One (by Lwaxana)

Hundreds · Canonical Entities
0 · Entity Drift
<100ms · Query Latency
Typed Narrative Edges
connections.yaml Schema-validated
- connection_type: CAUSAL
  strength: strong
  description: "Q's abrupt appearance and assertion of authority
    on the bridge leads directly to his ultimatum commanding
    humanity's retreat, provoking Picard's demand for Q's
    identity and Conn's readiness to fight."

- connection_type: CHARACTER_CONTINUITY
  strength: strong
  description: "Picard's immediate reaction to Conn being frozen
    — administering orders for medical aid and confronting Q —
    reflects his steadfast leadership and moral resolve."

- connection_type: THEMATIC_PARALLEL
  strength: medium
  description: "Both events stage a confrontation between
    institutional authority and individual moral conviction,
    with the bridge serving as contested ground."

Every edge carries a type, a strength, and a narrative claim explaining why these events connect—not just that they do. This is what makes the graph queryable.
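Because each edge carries type and strength, queries can filter on narrative semantics rather than raw adjacency. A minimal sketch, using edge records shaped like the connections.yaml example above:

```python
edges = [
    {"connection_type": "CAUSAL", "strength": "strong",
     "description": "Q's ultimatum provokes Picard's demand for his identity."},
    {"connection_type": "THEMATIC_PARALLEL", "strength": "medium",
     "description": "Institutional authority vs. individual moral conviction."},
]

# "Find the strong causal chain" becomes a structural filter, not a text search.
strong_causal = [e for e in edges
                 if e["connection_type"] == "CAUSAL" and e["strength"] == "strong"]
```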

Performance

83 Minutes Per Episode. Down from Five Hours.

Performance isn't a feature—it's the line between a research prototype and production-grade software. We've achieved 3.6x improvement through architectural discipline, not hardware scaling.

5h → 83min

Prep / Async / Save

The three-step pattern: prepare all data before LLM calls, run extraction phases concurrently where possible, batch-write results. Simple discipline, dramatic results.
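The pattern reads as three lines of structure, sketched here with a stub standing in for the real LLM call:

```python
import asyncio

async def extract(scene):                # placeholder for an LLM extraction call
    await asyncio.sleep(0)               # yields control, like real network I/O
    return {"scene": scene, "entities": []}

async def process_episode(scenes):
    prepared = [s.strip() for s in scenes]            # 1. prep: no I/O yet
    results = await asyncio.gather(                   # 2. async: concurrent calls
        *(extract(s) for s in prepared))
    return {"batch": results}                         # 3. save: one batched write

out = asyncio.run(process_episode([" Scene 1 ", " Scene 2 "]))
```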

9,993 → 254

Smart Duplicate Filtering

Mathematical filtering reduces pairwise comparisons from O(n²) to a tractable set. We don't compare every entity against every other—we compare candidates that could plausibly match.
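One standard way to get this reduction is blocking: group entities by a cheap key so comparisons only happen within blocks. A sketch under that assumption (the real filtering criteria are Fabula's own):

```python
from collections import defaultdict
from itertools import combinations

entities = [("Picard", "Character"), ("The Captain", "Character"),
            ("Q", "Character"), ("Enterprise", "Location"), ("Bridge", "Location")]

blocks = defaultdict(list)
for name, etype in entities:
    blocks[etype].append(name)           # cheap key: entity type

# Only within-block pairs are compared: 4 pairs instead of the naive 10.
pairs = [p for names in blocks.values() for p in combinations(names, 2)]
```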

18x faster

Batch Graph Operations

Neo4j UNWIND operations replace individual node-by-node writes. One batch operation where we used to make thousands of individual calls.
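A sketch of what that batching looks like: one parameterised UNWIND statement carrying a row list. Only the query and parameters are built here, with no driver connection; the label and properties are illustrative:

```python
rows = [
    {"name": "Jean-Luc Picard", "aliases": ["The Captain", "Jean-Luc"]},
    {"name": "Q", "aliases": []},
]

CYPHER = """
UNWIND $rows AS row
MERGE (c:Character {name: row.name})
SET c.aliases = row.aliases
"""

# With the official Neo4j driver, the whole batch is one round trip:
#   session.run(CYPHER, rows=rows)
```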

Open Formats

Netflix Built a Walled Garden. We Built the Railway.

Fabula accepts industry-standard screenplay formats and exports to any format your team needs—from Excel for production coordinators to GraphQL for developers. Open formats don't dilute the moat. They create the moat by enabling network effects.

Import: Industry Screenplay Formats

Final Draft (FDX), Fountain (plain text), and PDF screenplays. No proprietary formats, no manual data entry. Your scripts go in as they are.

Export: Any Format You Need

GraphQL API for real-time queries. JSON-LD for linked data. RDF for semantic web. Cypher for direct Neo4j access. CSV for production spreadsheets. PDF for printable story bibles. Markdown for version control.

vs. Netflix EKG: Walled Garden

Netflix's Entertainment Knowledge Graph has no export formats, no API access, and no third-party integrations. It's proprietary, siloed, and internal-only. Fabula is the opposite: full data portability, zero vendor lock-in, open APIs that every tool can integrate with.

vs. Notion Templates: Manual Entry

Notion-based story bibles require manual entry, offer limited CSV/PDF export, and provide generic API access with no narrative intelligence. Fabula automates the extraction and exports structured, queryable knowledge graphs.

Security

Your Scripts. Your Servers. Your Graph.

We Never

  • Share your scripts with other customers
  • Use your scripts to train AI models
  • Store scripts on shared infrastructure
  • Allow cross-customer data access

You Control

  • Where your data is stored (our servers or yours)
  • Which AI providers process your scripts
  • Who has access to your knowledge graph
  • When data gets deleted (deletion is permanent)

Cloud

We host everything. You upload scripts, we handle servers, backups, updates.

Self-Hosted

You run it on your infrastructure. Your scripts never leave your network.

Hybrid

Process scripts on your servers. Use our cloud for search and visualization.

Built For

Who Builds on the Context Graph

Fabula serves teams where narrative complexity creates real operational cost—and where structural understanding of story changes what’s possible.

Production Teams

Every entity across every episode, automatically maintained, permanently queryable. Institutional knowledge that doesn’t depend on any individual’s memory.

Development Executives

Narrative complexity metrics, character arc density, relationship networks, thematic coverage—the structural fingerprint behind the gut instinct.

Game Studios

Structured character data, canonical relationship networks, and timeline-consistent descriptions exported in any format your engine requires. Months of manual cataloguing replaced by a query.

Investors

Production-validated context graph infrastructure for entertainment IP. Entity extraction, resolution, and graph construction tested across five series and 60+ episodes. Horizontal applicability with entertainment as the proving ground.

You’ve Read the Architecture. Now Walk the Graph.

The architecture is running. Explore the live catalog and see what compositional extraction, dual temporal indices, and contrastive entity sharpening produce when applied to real television. Or tell us about your project.