Tags: AI, civic-intelligence, architecture, state-first, MCP, vibe-coding

Clio: The Civic Atlas That Compiles — 250,000+ Entities, One Validated Ground Truth

How a civic intelligence system built a 250,000-entity registry spanning power plants, watersheds, dams, airports, Census gazetteer data, and electoral schemas — where the model proposes and validators decide what is true.

Vario via Mnehmos
250k+ Validated Entities
12 Data Tracks
0 Hallucinated Sources

The Problem Clio Solves

Ask most AI systems a civic question and you get confident prose. The prose sounds authoritative. It cites things. Some of those things exist. Some do not. There is no way to tell which.

The failure mode is not hallucination alone. It is hallucination that looks like citation. A model that invents a source card is not making an obvious error. It is making a plausible one. In the civic domain — zoning, elections, policy, public services — plausible falsehoods are expensive.

The standard engineering response is to add disclaimers. Clio's response is to enforce provenance mechanically.

The Architecture

User request
  → ProductionOrchestrator
  → Briefing Planner (IR + section buffers)
  → Research Agent (resolves entities, creates TypedClaims)
  → Narrator Agent (packet-only prose)
  → Stagehand Director (visual staging commands)
  → Citation Auditor (prose, claims, sources)
  → Stagehand Runtime (validates effects)
  → PublicShowEvent stream
  → UI, replay, export

Private stream

Agent tokens, tool calls, source fetches, rejected drafts, repair loops, validation reports. Never shown to users.

Public stream

Only what has passed validation: narration, source cards, gap cards, Stagehand effects, section events.
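
The split between the two streams can be sketched as a single routing function. This is a minimal illustration, not Clio's code: the event kinds, the `validated` flag, and the `splitStreams` name are all assumptions chosen to mirror the lists above.

```typescript
// Hypothetical event kinds, named after the stream contents described above.
const PUBLIC_KINDS = new Set<string>([
  "narration", "source_card", "gap_card", "stagehand_effect", "section_event",
]);

interface PipelineEvent {
  kind: string;        // e.g. "narration", "tool_call", "rejected_draft"
  payload: string;
  validated: boolean;  // assumed to be set by a validator, never by the model
}

interface SplitResult {
  publicStream: PipelineEvent[];
  privateStream: PipelineEvent[];
}

// An event reaches the public stream only if its kind is publishable
// AND it has passed validation. Everything else stays private.
function splitStreams(events: PipelineEvent[]): SplitResult {
  const publicStream: PipelineEvent[] = [];
  const privateStream: PipelineEvent[] = [];
  for (const event of events) {
    if (PUBLIC_KINDS.has(event.kind) && event.validated) {
      publicStream.push(event);
    } else {
      privateStream.push(event);
    }
  }
  return { publicStream, privateStream };
}
```

The point of the sketch is the default: an unvalidated narration draft is routed to the private stream exactly like a tool call, so nothing becomes public by omission.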

Entity Registry Trust Model

The global registry is not publicly writable. There are five layers: the two global layers are read-only from the outside, and the remaining three are writable only within a single project or run.

Layer | Scope | Public access
Gold Master | global | read-only
Verified Import | global | read-only
Project Overlay | one project | read/write within project
User Assumption | one project | read/write within project
Session Draft | one run | temporary

Today's Census TIGER ingest lands at Verified Import. It will not drift. It will not be overwritten by a session that produces a more confident claim.
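
One way to read the layer table is as a write-permission check. The sketch below assumes hypothetical `Writer` and `Target` shapes and a `canWrite` function; none of these names come from Clio itself.

```typescript
type Layer =
  | "gold_master"
  | "verified_import"
  | "project_overlay"
  | "user_assumption"
  | "session_draft";

// Hypothetical caller identity: the project and run a request belongs to.
interface Writer { projectId: string; runId: string; }

// Hypothetical write target: a layer plus the project or run that owns it.
interface Target { layer: Layer; projectId?: string; runId?: string; }

function canWrite(writer: Writer, target: Target): boolean {
  switch (target.layer) {
    case "gold_master":
    case "verified_import":
      // Global layers are read-only from the outside.
      return false;
    case "project_overlay":
    case "user_assumption":
      // Writable only from within the owning project.
      return target.projectId === writer.projectId;
    case "session_draft":
      // Temporary state, scoped to a single run.
      return target.runId === writer.runId;
  }
}
```

Under this reading, a session holding a more confident claim still gets `false` for `verified_import`, which is the property the paragraph above describes.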

The Registry: 250,000+ Entities Across 12 Data Tracks

The registry passed 250,000 entities through a disciplined Track A build-out, with each source validated before admission.

Track | Source | Entities
A Phase 1 | US Census TIGER 2024 (counties + places) | ~35,557
A Phase 2 | OpenFlights airports | ~7,000
A Phase 3 | WRI Global Power Plant Database (CC-BY-4.0, v1.3) | ~34,936
A Phase 4a | USACE National Inventory of Dams | ~92,000
A Phase 4c.1 | USGS WBD HUC4/6/8/10/12 watersheds | ~125,000
A Phase 4c.2 | USGS GNIS named rivers + bayous | ~8,408
A/B | Natural Earth admin-0/1 (countries + states) | thousands
D Phase 0 | Electoral schemas + heads of state | seeded
G | People, corporations, treaties, military, cyber | hundreds
(unlabeled) | Arizona ASLD statewide GIS (counties, cities, districts) | 273

What Shipped Today

Track A Phase 1 + Track D Phase D0 merge + Arizona statewide GIS commit.

  • 3,223 US counties from 2024_Gaz_counties_national.txt
  • 32,334 US places from 2024_Gaz_place_national.txt
  • 15 AZ counties + 91 cities + 9 congressional districts + 158 Safford zoning features
  • Seeded heads-of-state registry
  • electoral.ts — full Zod schema for electoral entities
  • electoralRegistry.ts — 90-line typed registry
  • quickWinD0.ts — 431-line electoral seed with heads of state
  • 5 MCP electoral tools wired into the server
  • rag.search extended with a terms channel for electoral queries
  • 50-state GIS tracking checklist
  • 77 tests validating the gazetteer ingest
  • 188 tests covering the electoral schema layer

The merge required manually resolving 14 conflict blocks across the server, registry, and source files. Track B Phase 1 overlay tools and Track D electoral tools now coexist without collision.
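
The gazetteer ingest listed above can be pictured as a parse-then-validate pass. The sketch below is an assumption-heavy illustration: it assumes a tab-delimited file with a header row containing `USPS`, `GEOID`, `NAME`, `INTPTLAT`, and `INTPTLONG` columns, and the `CountyEntity` shape, the `county:<GEOID>` id format, and the `parseCountyGazetteer` name are hypothetical.

```typescript
// Hypothetical entity shape; the registry's real schema is not shown in this article.
interface CountyEntity {
  id: string;     // illustrative id format: "county:" + GEOID
  geoid: string;
  name: string;
  state: string;
  lat: number;
  lon: number;
}

function parseCountyGazetteer(text: string): { entities: CountyEntity[]; rejected: string[] } {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== "");
  const header = (lines[0] ?? "").split("\t").map((h) => h.trim());
  // Look columns up by name so that extra columns do not matter.
  const iState = header.indexOf("USPS");
  const iGeoid = header.indexOf("GEOID");
  const iName = header.indexOf("NAME");
  const iLat = header.indexOf("INTPTLAT");
  const iLon = header.indexOf("INTPTLONG");
  if ([iState, iGeoid, iName, iLat, iLon].some((i) => i < 0)) {
    throw new Error("gazetteer header is missing an expected column");
  }

  const entities: CountyEntity[] = [];
  const rejected: string[] = [];
  for (const line of lines.slice(1)) {
    const cells = line.split("\t").map((c) => c.trim());
    const geoid = cells[iGeoid] ?? "";
    const name = cells[iName] ?? "";
    const state = cells[iState] ?? "";
    const lat = parseFloat(cells[iLat] ?? "");
    const lon = parseFloat(cells[iLon] ?? "");
    // A row is admitted only if every field passes its check.
    const ok =
      /^\d{5}$/.test(geoid) &&
      name !== "" &&
      /^[A-Z]{2}$/.test(state) &&
      Number.isFinite(lat) && Math.abs(lat) <= 90 &&
      Number.isFinite(lon) && Math.abs(lon) <= 180;
    if (!ok) {
      rejected.push(line);
      continue;
    }
    entities.push({ id: `county:${geoid}`, geoid, name, state, lat, lon });
  }
  return { entities, rejected };
}
```

Rejected rows are returned rather than silently dropped, so a test suite can assert on both the admitted count and the rejection list.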

Stagehand: Inline Visual Commands With Enforcement

Clio's narration and visual staging share one ordered stream through the Stagehand protocol.

The Strait of Hormuz is the chokepoint for roughly 20% of global oil.
[map.highlight entity="strait:hormuz" color="#ef4444" pulse=true]

Three counties in this district flipped in 2024.
[map.overlay entity="county:maricopa-az" style="electoral-swing"]

The bracketed command is not trusted until the runtime validates its action schema, argument types, entity references, source references, and visual treatment rules. A command that references an entity missing from the registry does not render; it fails and surfaces as a gap card.
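
A minimal sketch of that validation step, under stated assumptions: the two action schemas are inferred from the examples above, the registry is modeled as a plain set of entity ids, and `validateCommand` is a hypothetical name. Source-reference and visual-treatment checks are omitted.

```typescript
type ArgValue = string | boolean | number;
type ArgType = "string" | "boolean" | "number";

// Assumed schemas, inferred only from the two example commands above.
const ACTION_SCHEMAS: Record<string, Record<string, ArgType>> = {
  "map.highlight": { entity: "string", color: "string", pulse: "boolean" },
  "map.overlay": { entity: "string", style: "string" },
};

type CommandResult =
  | { kind: "effect"; action: string; args: Record<string, ArgValue> }
  | { kind: "gap_card"; reason: string };

function validateCommand(line: string, registry: ReadonlySet<string>): CommandResult {
  const outer = /^\[([a-z]+\.[a-z]+)\s*(.*)\]$/.exec(line.trim());
  if (outer === null) return { kind: "gap_card", reason: "not a bracketed command" };
  const action = outer[1] ?? "";
  const rawArgs = outer[2] ?? "";
  const schema = ACTION_SCHEMAS[action];
  if (schema === undefined) return { kind: "gap_card", reason: `unknown action ${action}` };

  const argPattern = /([A-Za-z]+)=(?:"([^"]*)"|(true|false)|(-?\d+(?:\.\d+)?))/g;
  // Anything left over after removing well-formed arguments is malformed input.
  if (rawArgs.replace(argPattern, "").trim() !== "") {
    return { kind: "gap_card", reason: "malformed arguments" };
  }
  const args: Record<string, ArgValue> = {};
  let m: RegExpExecArray | null;
  while ((m = argPattern.exec(rawArgs)) !== null) {
    const name = m[1] ?? "";
    if (m[2] !== undefined) args[name] = m[2];             // quoted string
    else if (m[3] !== undefined) args[name] = m[3] === "true"; // boolean
    else args[name] = Number(m[4]);                        // number
  }
  for (const name of Object.keys(args)) {
    const expected = schema[name];
    if (expected === undefined) return { kind: "gap_card", reason: `unexpected argument ${name}` };
    if (typeof args[name] !== expected) return { kind: "gap_card", reason: `${name} must be ${expected}` };
  }
  const entity = args["entity"];
  if (typeof entity !== "string" || !registry.has(entity)) {
    return { kind: "gap_card", reason: "entity not in registry" };
  }
  return { kind: "effect", action, args };
}
```

With a registry containing only `strait:hormuz`, the first example command would yield an effect and the second would yield a gap card, because `county:maricopa-az` has not been admitted in that hypothetical registry.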

What The Model Can Do

  • Read research obligations
  • Draft narration packets
  • Generate Stagehand command sequences
  • Propose entity updates and TypedClaims

What The Model Cannot Do

  • Write directly to the global registry
  • Bypass the Citation Auditor
  • Produce source cards from prose alone
  • Render visual effects without Stagehand validation
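
The rule that source cards cannot come from prose alone can be illustrated with a small audit function. This is a sketch under assumptions: the `TypedClaim` fields, the `SourceRecord` shape, and `auditClaim` are hypothetical names, and the real Citation Auditor also checks prose, which is omitted here.

```typescript
// Hypothetical shapes; the article names TypedClaims but does not show their fields.
interface TypedClaim { id: string; text: string; sourceIds: string[]; }
interface SourceRecord { id: string; title: string; }

type AuditCard =
  | { kind: "source_card"; claimId: string; sources: SourceRecord[] }
  | { kind: "gap_card"; claimId: string; reason: string };

// A source card is built only from sources that were actually fetched.
// The claim's text plays no part in deciding whether a card is emitted.
function auditClaim(claim: TypedClaim, fetched: ReadonlyMap<string, SourceRecord>): AuditCard {
  if (claim.sourceIds.length === 0) {
    return { kind: "gap_card", claimId: claim.id, reason: "claim cites no source" };
  }
  const sources: SourceRecord[] = [];
  for (const sourceId of claim.sourceIds) {
    const record = fetched.get(sourceId);
    if (record === undefined) {
      return { kind: "gap_card", claimId: claim.id, reason: `source ${sourceId} was never fetched` };
    }
    sources.push(record);
  }
  return { kind: "source_card", claimId: claim.id, sources };
}
```

In this sketch a confident sentence with an unresolvable source id produces a gap card, so the model can propose a citation but the lookup decides whether it exists.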

The Lesson

"Narration explains the civic state. It does not create it."

Public explanation should be compiled from validated source and entity state, not trusted as raw narration. The model can narrate 250,000 entities. It cannot invent them.