
The Scalpel, Not the Hammer

Five patterns from production MCP servers that cut token usage and eliminate retry loops. Real code, real impact.

Vario aka Mnehmos

The Token Equation

Tokens = Scope × Iterations × Verbosity
Most approaches only attack verbosity. We cut all three.
Scope - Files read, options considered, tangents explored
Iterations - Retries from validation failures, ambiguous inputs
Verbosity - Message size per round trip
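Because the three factors multiply, trimming each one a little compounds. A minimal sketch of that arithmetic follows; the numbers are hypothetical and chosen only for illustration, and `estimateTokens` is not from any of the servers discussed here.

```typescript
// Illustrative only: every number below is a made-up example value.
interface TokenFactors {
  scope: number;      // e.g. files touched per task
  iterations: number; // round trips, including retries
  verbosity: number;  // tokens per file per round trip
}

export function estimateTokens(f: TokenFactors): number {
  return f.scope * f.iterations * f.verbosity;
}

const baseline: TokenFactors = { scope: 10, iterations: 4, verbosity: 500 };

// Attack verbosity alone: halve the message size.
const verbosityOnly: TokenFactors = { ...baseline, verbosity: 250 };

// Attack all three: halve each factor.
const allThree: TokenFactors = { scope: 5, iterations: 2, verbosity: 250 };

console.log(estimateTokens(baseline));      // 20000
console.log(estimateTokens(verbosityOnly)); // 10000
console.log(estimateTokens(allThree));      // 2500
```

Halving verbosity alone halves the total. Halving all three leaves one eighth.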

Five patterns from production MCP servers that cut across all three factors.

1. Fuzzy Enum Matching

LLMs make typos: "indor" instead of "indoor". A rigid schema rejects the value, the LLM retries, and the tokens are wasted.

Solution: Three-tier matching that auto-corrects. From chatrpg.game/src/fuzzy-enum.ts:

export function findBestMatch(
  input: string,
  validValues: readonly string[],
  aliasCategory?: string
): { match: string | null; confidence: 'exact' | 'alias' | 'fuzzy' | 'none'; distance?: number } {
  const normalizedInput = input.toLowerCase().trim();

  // 1. Exact match (case-insensitive)
  const exactMatch = validValues.find(v => v.toLowerCase() === normalizedInput);
  if (exactMatch) return { match: exactMatch, confidence: 'exact' };

  // 2. Alias mappings (interior → indoor, blind → blinded)
  if (aliasCategory && GLOBAL_ALIASES[aliasCategory]) {
    const aliasedValue = GLOBAL_ALIASES[aliasCategory][normalizedInput];
    if (aliasedValue) return { match: aliasedValue, confidence: 'alias' };
  }

  // 3. Levenshtein fallback (max distance: 2 for short, 3 for longer)
  let bestMatch: string | null = null, bestDistance = Infinity;
  for (const validValue of validValues) {
    const distance = levenshteinDistance(normalizedInput, validValue.toLowerCase());
    if (distance < bestDistance) {
      bestDistance = distance;
      bestMatch = validValue;
    }
  }

  const maxDistance = normalizedInput.length <= 5 ? 2 : 3;
  if (bestDistance <= maxDistance && bestMatch) {
    return { match: bestMatch, confidence: 'fuzzy', distance: bestDistance };
  }

  return { match: null, confidence: 'none' };
}
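The excerpt calls `levenshteinDistance` without showing it. A minimal sketch of what such a helper could look like is below, using the standard two-row dynamic programming form. This is an assumption, not the code from chatrpg.game; the real helper may differ.

```typescript
// Hypothetical stand-in for the levenshteinDistance helper referenced above.
// Returns the minimum number of single-character insertions, deletions,
// and substitutions needed to turn `a` into `b`.
export function levenshteinDistance(a: string, b: string): number {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  // previous[j] = distance between the first i-1 chars of `a` and the first j chars of `b`
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i]; // turning i chars of `a` into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const substitutionCost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1,                    // delete a[i-1]
        current[j - 1] + 1,                 // insert b[j-1]
        previous[j - 1] + substitutionCost  // substitute, or keep if equal
      );
    }
    previous = current;
  }
  return previous[b.length];
}

console.log(levenshteinDistance("indor", "indoor")); // 1
```

Only two rows are kept at a time, so memory stays proportional to the length of `b`. Enum values are short, so the quadratic time cost is negligible here.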
Exact - "indoor" → "indoor"
Alias - "twilight" → "dim"
Fuzzy - "indor" → "indoor"

The Alias Registry

We map 285+ synonyms to canonical values across 15+ categories. The LLM says what's natural:

const GLOBAL_ALIASES: Record<string, Record<string, string>> = {
  lighting: {
    'light': 'bright', 'lit': 'bright', 'well-lit': 'bright',
    'shadowy': 'dim', 'low': 'dim', 'twilight': 'dim',
    'dark': 'darkness', 'pitch-black': 'darkness',
  },
  condition: {
    'blind': 'blinded', 'charm': 'charmed', 'deaf': 'deafened',
    'exhaust': 'exhaustion', 'fear': 'frightened', 'scared': 'frightened',
    'grapple': 'grappled', 'paralyze': 'paralyzed', 'stun': 'stunned',
  },
  damageType: {
    'slash': 'slashing', 'pierce': 'piercing', 'blunt': 'bludgeoning',
    'flame': 'fire', 'heat': 'fire', 'ice': 'cold', 'frost': 'cold',
    'electric': 'lightning', 'death': 'necrotic', 'holy': 'radiant',
  },
  ability: {
    'strength': 'str', 'dexterity': 'dex', 'constitution': 'con',
    'intelligence': 'int', 'wisdom': 'wis', 'charisma': 'cha',
  },
  skill: {
    'animal': 'animal_handling', 'intimidate': 'intimidation',
    'investigate': 'investigation', 'sneak': 'stealth',
  },
  actionType: {
    'strike': 'attack', 'hit': 'attack', 'run': 'dash', 'grab': 'grapple',
  },
  // ... 15+ more categories
};

"scared" works. "frightened" works. "fear" works. Zero prompt engineering required.

2. Guiding Errors

When fuzzy matching fails, the error teaches the fix. From ooda.mcp/src/tools/diff/editBlock.ts:

// When exact match fails, show what we found
if (similarity >= fuzzyThreshold) {
  return {
    success: false,
    message: 
      `Found similar text with ${Math.round(similarity * 100)}% similarity.\n\n` +
      `To apply this edit, use the exact text from the file:\n` +
      `\`\`\`\n${fuzzyMatch.value}\n\`\`\`\n\n` +
      `Preview of changes:\n${diffResult.unified}`,
    fuzzyMatch: { similarity, foundText: fuzzyMatch.value, inlineDiff }
  };
}

// When no match is close enough
return {
  success: false,
  message: 
    `Search text not found.\n\n` +
    `Closest match was ${Math.round(similarity * 100)}% similar.\n\n` +
    `Character differences:\n${inlineDiff}\n\n` +
    `Suggestions:\n` +
    `1. Use read_file_lines to see exact content\n` +
    `2. Copy the exact text from file\n` +
    `3. Check for whitespace differences`,
};

❌ Unhelpful: "Search text not found"
LLM guesses. Wrong. Guesses again. Wrong again.

✓ Guiding: "87% similar. Use this exact text: ..."
LLM copies exact text. Success on retry 1.
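The excerpt above assumes `similarity` and the closest match are already computed. One way that step could work is sketched here, under simplifying assumptions: the search text is a single line, candidates are the lines of the file, and similarity is normalized edit distance. `explainMissingText`, `editDistance`, and the `EditFailure` shape are hypothetical names, not ooda.mcp's API.

```typescript
interface EditFailure {
  success: false;
  message: string;
  closest?: { similarity: number; foundText: string };
}

// Compact edit distance using a single reusable row.
function editDistance(a: string, b: string): number {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value for (i-1, j)
      row[j] = Math.min(
        above + 1,
        row[j - 1] + 1,
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1)
      );
      diagonal = above;
    }
  }
  return row[b.length];
}

// 1.0 means identical, 0.0 means nothing in common.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - editDistance(a, b) / longest;
}

export function explainMissingText(
  fileText: string,
  searchText: string,
  fuzzyThreshold = 0.7
): EditFailure {
  // Find the file line that is closest to what the LLM asked for.
  let best = { similarity: 0, foundText: "" };
  for (const line of fileText.split("\n")) {
    const score = similarity(line.trim(), searchText.trim());
    if (score > best.similarity) best = { similarity: score, foundText: line };
  }

  const percent = Math.round(best.similarity * 100);
  if (best.similarity >= fuzzyThreshold) {
    return {
      success: false,
      message:
        `Found similar text with ${percent}% similarity.\n\n` +
        `To apply this edit, use the exact text from the file:\n${best.foundText}`,
      closest: best,
    };
  }
  return {
    success: false,
    message:
      `Search text not found. Closest match was ${percent}% similar.\n\n` +
      `Suggestions:\n1. Re-read the file to see the exact content\n` +
      `2. Check for whitespace differences`,
  };
}
```

The key design choice is that the error carries the corrected input verbatim, so the retry is a copy rather than another guess.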

3. Batch Operations

Every file operation in ooda.mcp has a batch variant. 20 calls → 1 call.

batch_read_files - Read N files in parallel
batch_write_files - Write N files in parallel
batch_str_replace - Replace across N files
batch_exec_cli - Run N commands in parallel
batch_file_info - Get metadata for N paths
batch_copy_files - Copy N files in parallel
batch_search_in_files - Search N files, with optional fuzzy matching
batch_keyboard_actions - Input sequences

// Batch search with fuzzy matching
batch_search_in_files({
  searches: [
    { path: "src/auth.ts", pattern: "validateToken" },
    { path: "src/types.ts", pattern: "TokenPayload" },
    { path: "tests/auth.test.ts", pattern: "describe.*token" }
  ],
  isFuzzy: true,           // Enable Levenshtein matching
  fuzzyThreshold: 0.7,     // 70% similarity threshold
  contextLines: 2,         // Include surrounding context
  maxMatchesPerFile: 50
});

// All matches in ONE response. Token savings: ~90%
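
On the server side, a batch tool mostly needs two things: one response for N requests, and per-item results so a single bad path does not fail the whole batch. A minimal sketch of that shape follows. It searches in-memory file contents to stay self-contained and uses plain regex matching rather than fuzzy matching; a real server would read from disk. `batchSearch` and its types are hypothetical, not ooda.mcp's implementation.

```typescript
interface SearchRequest {
  path: string;
  pattern: string; // treated as a regular expression in this sketch
}

type SearchResult =
  | { path: string; ok: true; matches: { line: number; text: string }[] }
  | { path: string; ok: false; error: string };

export function batchSearch(
  files: Map<string, string>, // path -> file content (stand-in for the file system)
  searches: SearchRequest[],
  maxMatchesPerFile = 50
): SearchResult[] {
  return searches.map(({ path, pattern }): SearchResult => {
    const content = files.get(path);
    if (content === undefined) {
      // Report the failure for this item only; the other searches still run.
      return { path, ok: false, error: `File not found: ${path}` };
    }

    let regex: RegExp;
    try {
      regex = new RegExp(pattern);
    } catch {
      return { path, ok: false, error: `Invalid pattern: ${pattern}` };
    }

    const matches = content
      .split("\n")
      .map((text, index) => ({ line: index + 1, text }))
      .filter(({ text }) => regex.test(text))
      .slice(0, maxMatchesPerFile);

    return { path, ok: true, matches };
  });
}
```

Results come back in request order, each tagged with its path, so the LLM can tell which searches succeeded without a follow-up call.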

4. Flexible Identifiers

LLMs remember names better than UUIDs. Every character tool in chatrpg.game accepts either identifier:

// By UUID (precise)
take_rest({
  characterId: "a7f3b2c1-...",
  restType: "long"
})

// By name (natural)
take_rest({
  characterName: "Fenric the Bold",
  restType: "long"
})

The server resolves names to IDs internally. The LLM uses what it remembers. Fewer lookups = fewer tokens.
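
A rough sketch of the resolution step, assuming an in-memory character list and case-insensitive name matching. `resolveCharacter`, the `Character` shape, and the error wording are hypothetical; chatrpg.game's actual lookup is not shown in this article.

```typescript
interface Character {
  id: string;
  name: string;
}

type Resolution =
  | { ok: true; character: Character }
  | { ok: false; error: string };

export function resolveCharacter(
  characters: Character[],
  ref: { characterId?: string; characterName?: string }
): Resolution {
  // An ID, when given, is authoritative.
  if (ref.characterId) {
    const byId = characters.find(c => c.id === ref.characterId);
    return byId
      ? { ok: true, character: byId }
      : { ok: false, error: `No character with id "${ref.characterId}".` };
  }

  if (ref.characterName) {
    const wanted = ref.characterName.trim().toLowerCase();
    const byName = characters.filter(c => c.name.toLowerCase() === wanted);

    if (byName.length === 1) return { ok: true, character: byName[0] };

    if (byName.length > 1) {
      // Ambiguous names get a guiding error that lists the IDs to choose from.
      const ids = byName.map(c => c.id).join(", ");
      return {
        ok: false,
        error: `Name "${ref.characterName}" is ambiguous. Retry with characterId, one of: ${ids}`,
      };
    }

    const known = characters.map(c => c.name).join(", ");
    return {
      ok: false,
      error: `No character named "${ref.characterName}". Known names: ${known}`,
    };
  }

  return { ok: false, error: "Provide either characterId or characterName." };
}
```

Names are not guaranteed unique, so the ambiguous case falls back to the guiding-error pattern instead of picking one silently.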

5. Workspace Boundaries

In multi-agent workflows, scope creep kills token budgets. Agents read files they don't need and consider options outside their domain.

Task delegation includes explicit boundaries:

{
  task_id: "implement_auth_validation",
  assigned_to: "green-phase",
  
  // THE SCALPEL: Explicit boundaries
  workspace_path: "src/auth/",
  file_patterns: ["*.ts", "!*.test.ts"],
  
  in_scope: [
    "src/auth/validateToken.ts",
    "src/auth/types.ts"
  ],
  
  out_of_scope: [
    "src/auth/*.test.ts",  // Red phase owns tests
    "src/database/*",       // Different domain
    "src/api/*"             // Different domain
  ],
  
  acceptance_criteria: [
    "All tests in tests/auth/ pass",
    "No modifications outside workspace_path"
  ]
}

Agents that know their boundaries don't explore. They do their job and stop.
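
Boundaries stated in a task description are advisory unless something checks them. Here is a minimal sketch of an enforcement check a file tool could run before each read or write. It rests on several assumptions: `*` matches any run of characters including `/`, `out_of_scope` wins over everything else, and any path under `workspace_path` is allowed unless excluded. `checkAccess` and `globToRegExp` are hypothetical helpers, not part of multi-agent.framework.

```typescript
interface Boundary {
  workspace_path: string;
  in_scope: string[];
  out_of_scope: string[];
}

// Convert a simple glob into an anchored RegExp.
// Assumption: "*" is the only wildcard and it may cross "/" boundaries.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

export function checkAccess(
  path: string,
  boundary: Boundary
): { allowed: boolean; reason: string } {
  const firstMatch = (patterns: string[]) =>
    patterns.find(p => globToRegExp(p).test(path));

  // Exclusions are checked first so they always win.
  const denied = firstMatch(boundary.out_of_scope);
  if (denied) {
    return { allowed: false, reason: `"${path}" matches out_of_scope pattern "${denied}".` };
  }
  if (firstMatch(boundary.in_scope)) {
    return { allowed: true, reason: "Listed in in_scope." };
  }
  if (path.startsWith(boundary.workspace_path)) {
    return { allowed: true, reason: "Inside workspace_path." };
  }
  return {
    allowed: false,
    reason: `"${path}" is outside workspace_path "${boundary.workspace_path}".`,
  };
}
```

This sketch does not normalize paths, so a real check would need to resolve segments like `../` before matching. The `reason` string doubles as a guiding error: an agent that strays is told which rule it hit.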

Production Impact

~95% - Retry reduction from fuzzy matching
20→1 - Call reduction from batch ops
285+ - Aliases across 15+ categories
0 - Custom syntax to teach

The Five Patterns

1. Fuzzy Enum. Auto-correct typos with Levenshtein. Map synonyms with aliases.

2. Guiding Errors. "87% similar. Use this exact text: ..." beats "Not found" every time.

3. Batch Operations. If you might do it 5 times, expose a batch_* variant.

4. Flexible Identifiers. Accept name OR ID. Let the LLM use what it remembers.

5. Workspace Boundaries. Explicit in_scope / out_of_scope. Agents don't explore.

These patterns are live in ooda.mcp, chatrpg.game, and multi-agent.framework.