Semantic Video Studio: State Is the Product, Pixels Are Just Output
"Video should be compiled from durable cinematic state, not hallucinated as disposable pixels."
The Video Is Not the Artifact
Most AI video generation treats the MP4 as the goal. You write a prompt, you get pixels, you are done. If you want a different camera angle, you prompt again. If you want to change the lighting, you prompt again. If you want to reuse an asset from a previous scene, you prompt again and hope.
This is not a workflow problem. It is an architecture problem. When the output is the only artifact, there is nothing to edit. There is only regeneration.
Semantic Video Studio exists to fix the architecture.
The Four-Plane State Pack
Every SVS project is defined by four JSON files, each validated against a JSON Schema before the render begins.
scene_graph.json: Object IDs, transforms, cameras, lights, environment
asset_manifest.json: Reusable typed assets with interfaces, capabilities, and anchors
timeline.json: Typed beats and actions over object, camera, and light IDs
render_plan.json: Engine, resolution, FPS, samples, output path, render settings
The Pipeline
Natural-language prompt
→ prompt_to_brief.py (NL → typed ProductionBrief)
→ brief_to_state_pack.py (Brief → 4-plane JSON)
→ validate_scene.py (schemas + cross-refs + writability check)
→ build_video.py (validate → render → manifest)
→ blender/render_scene.py (bpy: state → scene → keyframes → PNG sequence)
→ ffmpeg (PNG sequence → MP4)
→ outputs/manifests/ (SHA-256 provenance record)
The validator runs before Blender opens. Every object_id referenced in a timeline action must exist in the scene graph. Every asset_id must exist in the manifest. If any check fails, the render does not start.
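The cross-reference rule can be sketched in a few lines of Python. This is a minimal illustration, not the code in validate_scene.py: the function name check_cross_refs and the field layout (an "objects" list and an "assets" list whose records carry an "id", objects that point at assets through "asset_id", beats that hold "actions") are assumptions made for the example. Only the object_id and asset_id names come from the description above.

```python
from typing import Any


def check_cross_refs(scene_graph: dict[str, Any],
                     asset_manifest: dict[str, Any],
                     timeline: dict[str, Any]) -> list[str]:
    """Return cross-reference errors; an empty list means the pack passes."""
    # Assumed layout: both planes hold lists of records keyed by "id".
    object_ids = {obj["id"] for obj in scene_graph.get("objects", [])}
    asset_ids = {asset["id"] for asset in asset_manifest.get("assets", [])}

    errors: list[str] = []

    # Assumption: scene-graph objects reference assets through "asset_id".
    for obj in scene_graph.get("objects", []):
        asset_id = obj.get("asset_id")
        if asset_id is not None and asset_id not in asset_ids:
            errors.append(f"object {obj['id']!r}: unknown asset_id {asset_id!r}")

    # Every object_id used by a timeline action must exist in the scene graph.
    for b, beat in enumerate(timeline.get("beats", [])):
        for a, action in enumerate(beat.get("actions", [])):
            obj_id = action.get("object_id")
            if obj_id is not None and obj_id not in object_ids:
                errors.append(
                    f"beats[{b}].actions[{a}]: unknown object_id {obj_id!r}"
                )
    return errors
```

A caller would refuse to launch the render whenever the returned list is non-empty, which matches the "render does not start" behavior described above.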
The Basics Gate
The Basics Gate is a set of seven invariants that the system keeps green at all times.
test_00_environment.py: Python version, Blender available, ffmpeg available
test_01_state_render.py: A valid state pack renders to a non-empty MP4
test_02_plane_regen.py: Modifying one plane re-renders only affected frames
test_03_mcp_roundtrip.py: MCP tool calls produce valid state mutations
test_04_semantic_patch.py: NL edit → patch → apply produces valid output state
test_05_prompt_to_state.py: NL prompt → ProductionBrief → state pack is schema-valid
test_06_negative_fixtures.py: Invalid state packs are rejected before render
These are not unit tests for internal functions. They are integration gates for the pipeline's core claims. If test_01 fails, renders are broken. If test_06 fails, the validation layer has regressed.
The Import Gate
External 3D assets — from Polyhaven, Sketchfab, or AI generators — pass through four stages before they can appear in a production.
import_external_asset.py: Fetch and verify
normalize_imported_asset.py: Blender headless: center, scale, clean
validate_imported_asset.py: Schema check against asset_record.schema.json
preview_imported_asset.py: Headless render preview
An asset that fails normalization does not enter the registry. An asset that fails validation does not enter the registry. An asset that produces a broken preview does not enter the registry.
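The all-or-nothing rule above can be sketched as a short-circuiting stage runner. This is an assumed structure, not the actual scripts: run_import_gate, the Stage type, the dict-based registry, and the "asset_id" key are hypothetical names chosen for the example, and each real stage is stood in for by a callable that returns True on success.

```python
from typing import Callable, Optional

# A stage takes the asset record and reports whether it passed.
Stage = Callable[[dict], bool]


def run_import_gate(asset: dict,
                    stages: list[tuple[str, Stage]],
                    registry: dict) -> Optional[str]:
    """Run stages in order; register the asset only if every stage passes.

    Returns None on success, otherwise the name of the first failing stage.
    """
    for name, stage in stages:
        if not stage(asset):
            # Stop at the first failure; the registry is left untouched.
            return name
    registry[asset["asset_id"]] = asset
    return None
```

Because registration is the last step, a failure at any stage leaves the registry exactly as it was.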
Semantic Editing Without Regeneration
Describe the change → typed patch → validate against base hash → apply → re-render only what changed.
{
"patch_id": "edit_001",
"base_hash": "a3f8c2...",
"target_plane": "timeline",
"target_path": "$.beats[2].camera.position",
"from_value": [0, 5, 10],
"to_value": [0, 8, 12],
"rationale": "pull camera back for wider establishing shot"
}
The base hash check is the key mechanism. If the state has changed since the patch was generated, the patch is rejected. You cannot accidentally apply an edit designed for a different version of the scene.
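A minimal sketch of that check in Python follows. Several details are assumptions, because the source does not specify them: the hash is taken as SHA-256 over canonical JSON of the target plane, only the dotted-key and [index] subset of path syntax seen in the example is supported, and state_hash, parse_path, and apply_patch are hypothetical names.

```python
import copy
import hashlib
import json
import re


def state_hash(plane: dict) -> str:
    """SHA-256 over canonical JSON (assumed hashing scheme)."""
    canonical = json.dumps(plane, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Matches ".key" or "[index]" segments such as "$.beats[2].camera.position".
_SEGMENT = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def parse_path(path: str) -> list:
    """Split a path into dict keys (str) and list indices (int)."""
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    tokens, pos = [], 1
    while pos < len(path):
        match = _SEGMENT.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax in {path!r}")
        key, index = match.groups()
        tokens.append(key if key is not None else int(index))
        pos = match.end()
    return tokens


def apply_patch(plane: dict, patch: dict) -> dict:
    """Return a patched copy of the plane, or raise if the patch is stale."""
    if state_hash(plane) != patch["base_hash"]:
        raise ValueError("stale patch: base_hash does not match current state")
    tokens = parse_path(patch["target_path"])
    if not tokens:
        raise ValueError("target_path must address a value inside the plane")
    updated = copy.deepcopy(plane)  # never mutate the caller's state
    parent = updated
    for token in tokens[:-1]:
        parent = parent[token]
    if parent[tokens[-1]] != patch["from_value"]:
        raise ValueError("from_value does not match the current value")
    parent[tokens[-1]] = patch["to_value"]
    return updated
```

Applying the same patch a second time fails in this sketch, because the first application changes the plane and therefore its hash.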
The Provenance Manifest
Every render writes a manifest to outputs/manifests/<render_id>.json.
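The input hashes in such a manifest could be computed along the following lines. This is a sketch under assumptions: the helper names sha256_file, manifest_inputs, and changed_planes are hypothetical, and the four plane files are assumed to sit together in one directory under the file names listed earlier.

```python
import hashlib
from pathlib import Path

PLANES = ("scene_graph", "asset_manifest", "timeline", "render_plan")


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large inputs are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_inputs(state_dir: Path) -> dict:
    """Build the "inputs" section: one path and digest per plane."""
    inputs = {}
    for plane in PLANES:
        plane_path = state_dir / f"{plane}.json"
        inputs[plane] = {"path": str(plane_path),
                         "sha256": sha256_file(plane_path)}
    return inputs


def changed_planes(previous_inputs: dict, current_inputs: dict) -> list[str]:
    """List planes whose digest differs from the previous render's record."""
    return [
        plane for plane in PLANES
        if previous_inputs.get(plane, {}).get("sha256")
        != current_inputs[plane]["sha256"]
    ]
```

Comparing the digests of two renders is one plausible way to decide which planes need re-rendering. The manifest SVS records has this shape: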
{
"render_id": "r_20260517_alien_rover",
"timestamp": "2026-05-17T14:23:11Z",
"inputs": {
"scene_graph": { "path": "...", "sha256": "a3f8c2..." },
"asset_manifest": { "path": "...", "sha256": "b7d1e4..." },
"timeline": { "path": "...", "sha256": "c9a2f8..." },
"render_plan": { "path": "...", "sha256": "d5b3e1..." }
},
"output": {
"path": "outputs/alien_rover_r001.mp4",
"sha256": "e2c7a9...",
"frame_count": 240,
"duration_s": 10.0
},
"renderer": "blender-5.1",
"partial_regen": false
}
The Deeper Pattern
SVS is a case study in a principle that shows up throughout the Mnehmos ecosystem: the model describes, the engine validates.
The model narrates combat; the engine enforces hit points and legal moves.
The model proposes entity updates; the citation auditor enforces source provenance.
The model proposes moves; the chess engine enforces legality.
The model proposes scene state; the validator enforces schema and cross-plane consistency.
Neither can do the other's job well.
"Generated media becomes editable when state is the primary artifact. Pixels are output. State is the product."
A prompt is not a source file. A four-plane JSON pack is. The repo is the memory, the state is the source of truth, and the output is a build artifact.