AI · civic-intelligence · architecture · state-first · MCP · vibe-coding

Clio: The Civic Atlas That Compiles — 250,000+ Entities, One Validated Ground Truth

How a civic intelligence system built a 250,000-entity registry spanning power plants, watersheds, dams, airports, Census gazetteer data, and electoral schemas — where the model proposes and validators decide what is true.

Vario via Mnehmos

May 17, 2026

250k+ Validated Entities

12 Data Tracks

Hallucinated Sources

The Problem Clio Solves

Ask most AI systems a civic question and you get confident prose. The prose sounds authoritative. It cites things. Some of those things exist. Some do not. There is no way to tell which.

The failure mode is not hallucination alone. It is hallucination that looks like citation. A model that invents a source card is not making an obvious error. It is making a plausible one. In the civic domain — zoning, elections, policy, public services — plausible falsehoods are expensive.

The standard engineering response is to add disclaimers. Clio's response is to enforce provenance mechanically.

The Architecture

User request
  → ProductionOrchestrator
  → Briefing Planner (IR + section buffers)
  → Research Agent (resolves entities, creates TypedClaims)
  → Narrator Agent (packet-only prose)
  → Stagehand Director (visual staging commands)
  → Citation Auditor (prose, claims, sources)
  → Stagehand Runtime (validates effects)
  → PublicShowEvent stream
  → UI, replay, export

Private stream

Agent tokens, tool calls, source fetches, rejected drafts, repair loops, validation reports. Never shown to users.

Public stream

Only what has passed validation: narration, source cards, gap cards, Stagehand effects, section events.
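The split between the two streams can be sketched as a filter. This is a minimal illustration, not Clio's code: the `PipelineEvent` shape, the `validated` flag, and the snake_case kind names are assumptions made for the example. Only the categories themselves come from the lists above.

```typescript
// Event kinds named in the article; the string spellings are illustrative.
type PublicKind =
  | "narration" | "source_card" | "gap_card" | "stagehand_effect" | "section_event";
type PrivateKind =
  | "agent_token" | "tool_call" | "source_fetch"
  | "rejected_draft" | "repair_loop" | "validation_report";

interface PipelineEvent {
  kind: PublicKind | PrivateKind;
  validated: boolean; // assumed field: set by validators, never by the model
  payload: string;
}

const PUBLIC_KINDS: ReadonlySet<string> = new Set<string>([
  "narration", "source_card", "gap_card", "stagehand_effect", "section_event",
]);

// Keep an event only if its kind is public AND it has passed validation.
function toPublicStream(events: PipelineEvent[]): PipelineEvent[] {
  return events.filter((e) => PUBLIC_KINDS.has(e.kind) && e.validated);
}

const run: PipelineEvent[] = [
  { kind: "tool_call", validated: true, payload: "rag.search" },
  { kind: "narration", validated: false, payload: "unaudited draft sentence" },
  { kind: "narration", validated: true, payload: "audited sentence" },
  { kind: "gap_card", validated: true, payload: "entity not in registry" },
];

const publicEvents = toPublicStream(run);
console.log(publicEvents.map((e) => e.kind));
```

Both conditions are required. A tool call stays private even if it is marked valid, and a narration draft stays private until an auditor has accepted it.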

Entity Registry Trust Model

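The global registry is not publicly writable. Of its five layers, the two global ones are read-only to the public; the project-scoped layers are writable only within their own project, and session drafts last for a single run.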

Layer	Scope	Public access
Gold Master	global	read-only
Verified Import	global	read-only
Project Overlay	one project	read/write within project
User Assumption	one project	read/write within project
Session Draft	one run	temporary

Today's Census TIGER ingest lands at Verified Import. It will not drift. It will not be overwritten by a session that produces a more confident claim.
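One way to picture the rule is a registry whose write path checks the layer before anything else. The sketch below is hypothetical: `LayeredRegistry`, the `confidence` field, and the layer identifiers are invented for illustration, and project scoping is left out for brevity. It shows only that a high-confidence session claim cannot replace a Verified Import record.

```typescript
type Layer =
  | "gold_master" | "verified_import"
  | "project_overlay" | "user_assumption" | "session_draft";

// Mirrors the "Public access" column: the two global layers are read-only.
const PUBLIC_WRITABLE: Record<Layer, boolean> = {
  gold_master: false,
  verified_import: false,
  project_overlay: true,
  user_assumption: true,
  session_draft: true,
};

interface EntityRecord {
  id: string;
  name: string;
  confidence: number; // hypothetical field: confidence never grants write access
}

class LayeredRegistry {
  private layers = new Map<Layer, Map<string, EntityRecord>>();

  // Assumed trusted ingest path: only construction populates Verified Import.
  constructor(verifiedImport: EntityRecord[]) {
    const bucket = new Map<string, EntityRecord>();
    for (const record of verifiedImport) bucket.set(record.id, record);
    this.layers.set("verified_import", bucket);
  }

  // Returns false, and changes nothing, when the layer is read-only.
  write(layer: Layer, record: EntityRecord): boolean {
    if (!PUBLIC_WRITABLE[layer]) return false;
    const bucket = this.layers.get(layer) ?? new Map<string, EntityRecord>();
    bucket.set(record.id, record);
    this.layers.set(layer, bucket);
    return true;
  }

  read(layer: Layer, id: string): EntityRecord | undefined {
    return this.layers.get(layer)?.get(id);
  }
}

const registry = new LayeredRegistry([
  { id: "county:maricopa-az", name: "Maricopa County", confidence: 1 },
]);
const sessionGuess: EntityRecord = {
  id: "county:maricopa-az", name: "Maricopa (session guess)", confidence: 0.99,
};

const wroteGlobal = registry.write("verified_import", sessionGuess);
const wroteDraft = registry.write("session_draft", sessionGuess);
console.log(wroteGlobal, wroteDraft, registry.read("verified_import", "county:maricopa-az")?.name);
```

The session's version still exists, but only in its own layer, where it expires with the run.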

The Registry: 250,000+ Entities Across 12 Data Tracks

The registry reached 250k through disciplined Track A build-out — each source validated before admission.

Track	Source	Entities
A Phase 1	US Census TIGER 2024 — counties + places	~35,557
A Phase 2	OpenFlights airports	~7,000
A Phase 3	WRI Global Power Plant Database (CC-BY-4.0, v1.3)	~34,936
A Phase 4a	USACE National Inventory of Dams	~92,000
A Phase 4c.1	USGS WBD HUC4/6/8/10/12 watersheds	~125,000
A Phase 4c.2	USGS GNIS named rivers + bayous	~8,408
A/B	Natural Earth admin-0/1 (countries + states)	thousands
D Phase 0	Electoral schemas + heads of state	seeded
G	People, corporations, treaties, military, cyber	hundreds
Arizona	ASLD statewide GIS (counties, cities, congressional districts, Safford zoning features)	273

What Shipped Today

Today's release: the Track A Phase 1 and Track D Phase 0 (D0) merge, plus the Arizona statewide GIS commit.

✓ 3,223 US counties from 2024_Gaz_counties_national.txt
✓ 32,334 US places from 2024_Gaz_place_national.txt
✓ 15 AZ counties + 91 cities + 9 congressional districts + 158 Safford zoning features
✓ Seeded heads-of-state registry
✓ electoral.ts — full Zod schema for electoral entities
✓ electoralRegistry.ts — 90-line typed registry
✓ quickWinD0.ts — 431-line electoral seed with heads of state
✓ 5 MCP electoral tools wired into the server
✓ rag.search extended with a terms channel for electoral queries
✓ 50-state GIS tracking checklist
✓ 77 tests validating the gazetteer ingest
✓ 188 tests covering the electoral schema layer

The merge required manually combining 14 conflict blocks across the server, registry, and source files. Track B Phase 1 overlay tools and Track D electoral tools coexist without collision.
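To make "validated before admission" concrete for the gazetteer ingest, here is a minimal sketch of a row-level check. It assumes the tab-delimited layout of the Census county gazetteer file, with a header row that includes `GEOID`, `NAME`, `INTPTLAT`, and `INTPTLONG`. The function name, the rejection reasons, and the sample rows are illustrative, and the numeric cells in the sample are placeholders rather than real data.

```typescript
interface CountyRow { usps: string; geoid: string; name: string; lat: number; lon: number }
interface IngestResult {
  accepted: CountyRow[];
  rejected: { row: number; reason: string }[];
}

// Empty or missing cells become NaN so they fail the finite-number check.
function toNumber(cell: string | undefined): number {
  return cell === undefined || cell === "" ? NaN : Number(cell);
}

function parseCountyGazetteer(text: string): IngestResult {
  const [headerLine = "", ...rows] = text.split(/\r?\n/).filter((l) => l.trim() !== "");
  // Headers are trimmed defensively and looked up by name, not by position.
  const header = headerLine.split("\t").map((h) => h.trim());
  const result: IngestResult = { accepted: [], rejected: [] };

  rows.forEach((line, i) => {
    const cells = line.split("\t").map((c) => c.trim());
    const get = (name: string): string | undefined => cells[header.indexOf(name)];
    const geoid = get("GEOID") ?? "";
    const name = get("NAME") ?? "";
    const lat = toNumber(get("INTPTLAT"));
    const lon = toNumber(get("INTPTLONG"));

    if (!/^\d{5}$/.test(geoid)) {
      // County GEOIDs are 5 digits: 2-digit state FIPS plus 3-digit county FIPS.
      result.rejected.push({ row: i + 1, reason: `bad GEOID "${geoid}"` });
    } else if (name === "") {
      result.rejected.push({ row: i + 1, reason: "missing NAME" });
    } else if (!Number.isFinite(lat) || Math.abs(lat) > 90 ||
               !Number.isFinite(lon) || Math.abs(lon) > 180) {
      result.rejected.push({ row: i + 1, reason: "coordinates out of range" });
    } else {
      result.accepted.push({ usps: get("USPS") ?? "", geoid, name, lat, lon });
    }
  });
  return result;
}

// Illustrative rows only; the second has lost its leading zero.
const sample = [
  "USPS\tGEOID\tANSICODE\tNAME\tALAND\tAWATER\tALAND_SQMI\tAWATER_SQMI\tINTPTLAT\tINTPTLONG",
  "AZ\t04013\t0\tMaricopa County\t0\t0\t0\t0\t33.3\t-112.5",
  "AZ\t4013\t0\tMaricopa County\t0\t0\t0\t0\t33.3\t-112.5",
].join("\n");

const ingest = parseCountyGazetteer(sample);
console.log(ingest.accepted.length, ingest.rejected);
```

A dropped leading zero is a common way for FIPS codes to go wrong when a file passes through a spreadsheet, which is why the GEOID is kept as a string and checked by pattern.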

Stagehand: Inline Visual Commands With Enforcement

Clio's narration and visual staging share one ordered stream through the Stagehand protocol.

The Strait of Hormuz is the chokepoint for roughly 20% of global oil.
[map.highlight entity="strait:hormuz" color="#ef4444" pulse=true]

Three counties in this district flipped in 2024.
[map.overlay entity="county:maricopa-az" style="electoral-swing"]

The bracketed command is not trusted until the runtime validates action schema, argument types, entity references, source references, and visual treatment rules. A command referencing an entity not in the registry does not render — it fails as a gap card.
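A minimal sketch of that gate, covering only three of the checks (action schema, argument types, entity references). The schemas here are inferred from the two example commands and are assumptions, as are the function names and the choice to turn every failure into a gap card. The article confirms the gap-card outcome only for a missing entity.

```typescript
type ArgValue = string | boolean;
interface StagehandCommand { action: string; args: Record<string, ArgValue> }
type Outcome =
  | { type: "effect"; command: StagehandCommand }
  | { type: "gap_card"; reason: string };

// Hypothetical schemas, inferred only from the two example commands.
const ACTION_SCHEMAS: Record<string, Record<string, "string" | "boolean">> = {
  "map.highlight": { entity: "string", color: "string", pulse: "boolean" },
  "map.overlay": { entity: "string", style: "string" },
};

// Parses `[action key="value" flag=true]`; returns null on any syntax error.
function parseCommand(line: string): StagehandCommand | null {
  const m = /^\[([a-z]+\.[a-z]+)((?:\s+\w+=(?:"[^"]*"|true|false))*)\s*\]$/.exec(line.trim());
  if (m === null) return null;
  const args: Record<string, ArgValue> = {};
  const argRe = /(\w+)=(?:"([^"]*)"|(true|false))/g;
  let a: RegExpExecArray | null;
  while ((a = argRe.exec(m[2] ?? "")) !== null) {
    // Quoted values stay strings; bare true/false become booleans.
    args[a[1] ?? ""] = a[2] !== undefined ? a[2] : a[3] === "true";
  }
  return { action: m[1] ?? "", args };
}

function validate(cmd: StagehandCommand | null, registry: ReadonlySet<string>): Outcome {
  if (cmd === null) return { type: "gap_card", reason: "unparseable command" };
  const schema = ACTION_SCHEMAS[cmd.action];
  if (schema === undefined) return { type: "gap_card", reason: `unknown action ${cmd.action}` };
  for (const key of Object.keys(cmd.args)) {
    if (!Object.prototype.hasOwnProperty.call(schema, key)) {
      return { type: "gap_card", reason: `unknown argument ${key}` };
    }
  }
  for (const key of Object.keys(schema)) {
    if (typeof cmd.args[key] !== schema[key]) {
      return { type: "gap_card", reason: `argument ${key} must be a ${schema[key]}` };
    }
  }
  const entity = cmd.args["entity"];
  if (typeof entity !== "string" || !registry.has(entity)) {
    return { type: "gap_card", reason: `entity not in registry: ${String(entity)}` };
  }
  return { type: "effect", command: cmd };
}

const knownEntities = new Set<string>(["strait:hormuz"]);
const ok = validate(parseCommand('[map.highlight entity="strait:hormuz" color="#ef4444" pulse=true]'), knownEntities);
const missing = validate(parseCommand('[map.overlay entity="county:maricopa-az" style="electoral-swing"]'), knownEntities);
const badType = validate(parseCommand('[map.highlight entity="strait:hormuz" color="#ef4444" pulse="yes"]'), knownEntities);
console.log(ok.type, missing.type, badType.type);
```

In this toy registry only the strait is known, so the Maricopa overlay becomes a gap card even though its syntax is valid. Source-reference and visual-treatment checks would slot in as further steps before the final return.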

✓ What The Model Can Do

Read research obligations
Draft narration packets
Generate Stagehand command sequences
Propose entity updates and TypedClaims

✗ What The Model Cannot Do

Write directly to the global registry
Bypass the Citation Auditor
Produce source cards from prose alone
Render visual effects without Stagehand validation
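The rule that source cards cannot come from prose alone can be sketched as an audit pass: a sentence survives only if it points at a claim, and a card is built only from a source that a surviving claim references. Every type and function name below is hypothetical, as is the `plant:example` entity. The actual Citation Auditor is not described in enough detail to reproduce.

```typescript
interface Source { id: string; title: string }
interface TypedClaim { id: string; entityId: string; sourceIds: string[] }
interface Sentence { text: string; claimIds: string[] }
interface AuditResult {
  passed: Sentence[];
  rejected: { text: string; reason: string }[];
  sourceCards: Source[];
}

// Returns the first problem found, or null if the sentence is fully backed.
function findProblem(
  s: Sentence, claims: Map<string, TypedClaim>, sources: Map<string, Source>,
): string | null {
  if (s.claimIds.length === 0) return "no claim attached";
  for (const claimId of s.claimIds) {
    const claim = claims.get(claimId);
    if (claim === undefined) return `unknown claim ${claimId}`;
    if (claim.sourceIds.length === 0) return `claim ${claimId} has no source`;
    for (const sourceId of claim.sourceIds) {
      if (!sources.has(sourceId)) return `unknown source ${sourceId}`;
    }
  }
  return null;
}

function auditNarration(
  sentences: Sentence[], claims: Map<string, TypedClaim>, sources: Map<string, Source>,
): AuditResult {
  const result: AuditResult = { passed: [], rejected: [], sourceCards: [] };
  const carded = new Set<string>();
  for (const s of sentences) {
    const reason = findProblem(s, claims, sources);
    if (reason !== null) {
      result.rejected.push({ text: s.text, reason });
      continue;
    }
    result.passed.push(s);
    // Cards are derived from claim -> source links, never from sentence text.
    for (const claimId of s.claimIds) {
      for (const sourceId of claims.get(claimId)?.sourceIds ?? []) {
        const source = sources.get(sourceId);
        if (source !== undefined && !carded.has(sourceId)) {
          carded.add(sourceId);
          result.sourceCards.push(source);
        }
      }
    }
  }
  return result;
}

const sources = new Map<string, Source>([
  ["src:wri-gppd", { id: "src:wri-gppd", title: "WRI Global Power Plant Database v1.3" }],
]);
const claims = new Map<string, TypedClaim>([
  ["claim:1", { id: "claim:1", entityId: "plant:example", sourceIds: ["src:wri-gppd"] }],
]);
const audit = auditNarration(
  [
    { text: "This plant appears in the registry.", claimIds: ["claim:1"] },
    { text: "A recent report confirms the figure.", claimIds: [] },
  ],
  claims,
  sources,
);
console.log(audit.passed.length, audit.rejected, audit.sourceCards);
```

The second sentence sounds cited but carries no claim, so it is rejected and produces no card.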

The Lesson

"Narration explains the civic state. It does not create it."

Public explanation should be compiled from validated source and entity state, not trusted as raw narration. The model can narrate 250,000 entities. It cannot invent them.