Context Engineering
Managing information flow and structure for optimal AI performance
192 techniques across 15 categories
Managing information flow and structure for optimal AI performance
Orchestrating multi-step agent interactions and task coordination
Advanced autonomous agent architectures and multi-agent systems
Sophisticated prompting techniques involving complex reasoning structures and meta-learning.
Fundamental prompting structures and conceptual frameworks
Techniques that guide the model through explicit reasoning steps
Techniques that enable LLMs to interact with external tools and environments
Methods for the model to reflect on and improve its own outputs
Techniques that incorporate external knowledge into prompts
Techniques to automate and improve prompt engineering
Techniques involving non-text modalities like images, audio, and video
Techniques optimized for specific domains or applications
Advanced techniques for organizing and coordinating multiple AI agents
Architectural design patterns for building secure and resilient LLM agents against threats like prompt injection.
Techniques for structuring prompts and engineering the context for optimal model performance.
FlashAttention-3, sparse attention, and sub-quadratic scaling for processing massive contexts (200K+ tokens)
Advanced compression techniques for storing and processing large contexts efficiently
Attention mechanisms that only compute on relevant tokens for ultra-long contexts
Variable context size optimization based on task requirements and efficiency needs
Systematic approaches to managing context memory across long interactions
Integrating information across text, image, audio, and action modalities
Organizing context into persistent, session, immediate, and transient layers
Intelligent ordering and filtering of context based on relevance and task requirements
Predictive context generation using world models for anticipatory reasoning
Moving from discrete token generation to continuous representations
Algorithmic approaches to compressing context while preserving meaning
Advanced attention mechanisms that bridge linear and softmax attention efficiency
Bounded execution contexts for agents with specific tools and resources
Distributing and coordinating context across multiple agents and systems
Efficient selection of important frames from long videos for multimodal processing
Gradient checkpointing and activation recomputation for context optimization
Anticipatory context loading based on predicted information needs
Standardized protocol for context delivery and tool interoperability
Caching pre-computed states of the context window (prefix caching) to reduce latency and cost for repeated queries.
Autonomous retrieval where an agent formulates queries, critiques results, and iteratively searches based on findings.
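A minimal sketch of the agentic retrieval loop described in the entry above, assuming a generic `call_llm(prompt) -> str` completion helper and a `search(query) -> list[str]` retriever; both are placeholders, not a specific API:

```python
from typing import Callable

def agentic_retrieve(question: str,
                     call_llm: Callable[[str], str],
                     search: Callable[[str], list[str]],
                     max_rounds: int = 3) -> str:
    """The agent formulates queries, critiques the results, and searches again."""
    notes: list[str] = []
    query = question
    for _ in range(max_rounds):
        notes.extend(search(query))                  # retrieve candidate passages
        critique = call_llm(
            "Question: " + question + "\n"
            "Passages so far:\n" + "\n".join(notes) + "\n"
            "Is this enough to answer? Reply DONE, or propose a better search query."
        )
        if critique.strip().upper().startswith("DONE"):
            break
        query = critique                             # refined query for the next round
    return call_llm("Answer the question using only these passages:\n"
                    + "\n".join(notes) + "\nQuestion: " + question)
```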
Distributed agent coordination with structured returns and validation
Persistent, shared state across multiple agents with isolation and validation
Systematic decomposition across strategic, tactical, and operational layers
Workflow execution optimization using graph theory and dependency analysis
Proactive task execution based on predicted needs with rollback capabilities
Specialized agents with intelligent routing and consensus mechanisms
Domain-specific agent development with capability-based routing
Coordination frameworks for managing multiple specialized agents
Structured communication protocols between agents with standardized formats
Dynamic workflow modification based on intermediate results and environmental changes
Autonomous retrieval-augmented generation with agent-driven retrieval strategies
High-performance remote direct memory access for distributed LLM systems
Dependency analysis, critical path identification, and optimization using graph theory
Automatic breaking down of complex tasks into manageable subtasks
Intelligent distribution of computational resources based on workflow needs
Structured approaches to error handling and workflow recovery
Identifying and executing independent tasks simultaneously
Systematic validation of workflow correctness and completeness
Real-time monitoring and optimization of workflow performance
Explicit workflow steps where the agent critiques its own output or intermediate state before proceeding.
Using a router (LLM or classifier) to dispatch queries to the most efficient model or specialized agent.
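A sketch of the routing entry above, assuming a hypothetical `call_llm(model, prompt)` helper; the model names and route labels are illustrative only:

```python
ROUTES = {
    "code":    "large-code-model",      # illustrative model identifiers
    "math":    "reasoning-model",
    "general": "small-fast-model",
}

def route(query: str, call_llm) -> str:
    # A cheap classifier pass decides which backend handles the query.
    label = call_llm(
        "small-fast-model",
        "Classify this query as code, math, or general. Reply with one word.\n"
        "Query: " + query
    ).strip().lower()
    model = ROUTES.get(label, ROUTES["general"])    # fall back to the default route
    return call_llm(model, query)
```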
Unified architectures for multimodal understanding and action execution
Two-system architectures separating perception from action planning
GraphCoT-VLA for complex spatial reasoning and instruction following
Paradigm shift from discrete to continuous token generation
Combining speculative inference with sparse attention for efficiency
Fully autonomous coding with system interaction and continuous learning
Multi-step reasoning with autonomous refactoring and project memory
Large-scale multi-agent orchestration with tree-of-thoughts integration
Natural language control of computer systems with session-based context
Combining diffusion models with autoregressive approaches for VLA
AI systems designed for physical world interaction with transfer learning
Adaptive control systems with safety guarantees for critical applications
Agent-driven decision systems with minimal human oversight
Specialized expert networks with intelligent routing mechanisms
Autonomous generation of training data using agent-driven quality assessment
Low-latency inference systems for real-time agent applications
Attention mechanisms across different modalities for unified understanding
Long-term memory systems for agents with learning and adaptation
Unified architectures for processing and understanding multiple modalities
Dynamic composition and chaining of tools by autonomous agents
Anticipatory execution for improved performance in agent workflows
Coordination protocols for multiple agents reaching agreement
Intelligent routing of tasks to appropriate agents based on context
Backup agent systems for handling degraded performance scenarios
Cross-specialist validation for ensuring output quality and consistency
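A sketch of cross-specialist validation, assuming each specialist is exposed as a simple `name -> completion function` mapping; the protocol (APPROVE keyword, one revision pass) is an assumption of this sketch:

```python
from typing import Callable

def cross_validate(task: str, specialists: dict[str, Callable[[str], str]]) -> str:
    """One specialist drafts; the others review in their own domain; the author revises."""
    (author_name, draft_fn), *reviewers = specialists.items()
    draft = draft_fn(f"Solve: {task}")
    verdicts = []
    for reviewer_name, review_fn in reviewers:
        verdicts.append(review_fn(
            f"You are the {reviewer_name} specialist. Review this answer to '{task}' "
            f"for correctness in your area. Reply APPROVE or list issues.\n{draft}"
        ))
    issues = [v for v in verdicts if not v.strip().upper().startswith("APPROVE")]
    if issues:
        # Ask the original author to revise against the collected objections.
        draft = draft_fn(f"Revise your answer to '{task}' to address:\n" + "\n".join(issues))
    return draft
```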
Modeling reasoning as a graph where thoughts are nodes, allowing for non-linear exploration and combination.
Using a 'meta-model' or higher-level prompt to orchestrate multiple sub-models or generate task-specific prompts.
Iteratively improving an output by feeding it back into the model with specific critique instructions.
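A minimal sketch of this critique-and-revise loop, assuming a generic `call_llm(prompt) -> str` helper and a fixed number of refinement rounds:

```python
def refine(call_llm, task: str, rounds: int = 2) -> str:
    draft = call_llm(task)
    for _ in range(rounds):
        # Explicit critique instructions steer the feedback pass.
        critique = call_llm(
            "Critique the following answer for accuracy, completeness, and clarity. "
            f"Be specific.\nTask: {task}\nAnswer: {draft}"
        )
        # Feed the critique back in and rewrite.
        draft = call_llm(
            f"Task: {task}\nPrevious answer: {draft}\nCritique: {critique}\n"
            "Rewrite the answer, fixing every issue raised in the critique."
        )
    return draft
```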
The simplest form of prompting, usually consisting of an instruction and input, without exemplars or complex reasoning steps.
Providing K > 1 demonstrations in the prompt to help the model understand patterns.
Prompting with instruction only, without any demonstrations or examples.
Providing exactly one demonstration in the prompt to help the model understand patterns.
The model's ability to learn from demonstrations/instructions within the prompt at inference time, without updating weights.
Prompts with masked slots for prediction, often in the middle of the text.
Standard prompt format where the prediction follows the input.
Using functions with variable slots to construct prompts in a systematic way.
Explicitly instructing the LLM with clear directions about the task.
Assigning a specific role or persona to the model.
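A small sketch of role prompting combined with the templated-prompt entry above, assuming an OpenAI-style chat message format; the persona and helper name are illustrative:

```python
ROLE = "You are a senior tax accountant. Answer precisely and cite the relevant rule."

def build_messages(question: str) -> list[dict]:
    # The role/persona goes in the system turn; the user turn fills a fixed template slot.
    return [
        {"role": "system", "content": ROLE},
        {"role": "user", "content": f"Question: {question}\nAnswer:"},
    ]

messages = build_messages("How are short-term capital gains taxed?")
```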
Eliciting step-by-step reasoning before the final answer, usually via few-shot exemplars.
Appending a thought-inducing phrase without CoT exemplars, like 'Let's think step by step'.
CoT prompting using multiple CoT exemplars to demonstrate the reasoning process.
Exploring multiple reasoning paths in a tree structure using generate, evaluate, and search methods.
A two-stage approach: first generating a skeleton (outline) and then expanding points in parallel.
Extending Tree-of-Thoughts with more flexible graph structures for complex reasoning.
Breaking down complex problems into simpler subproblems and solving them sequentially.
Using recursive problem-solving approaches in prompting.
First devising a plan to solve the problem, then executing the plan step by step.
Taking a step back to ask higher-level questions before solving specific problems.
Expressing reasoning as executable programs rather than natural language.
Using a question-driven approach to guide reasoning through self-questioning.
Generating initial responses, then creating and answering verification questions to improve accuracy.
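A sketch of this verify-then-revise pattern, assuming a generic `call_llm(prompt) -> str` helper and that the model returns one verification question per line:

```python
def chain_of_verification(call_llm, question: str) -> str:
    baseline = call_llm(question)
    # Plan verification questions that would expose errors in the draft answer.
    plan = call_llm(
        f"Question: {question}\nDraft answer: {baseline}\n"
        "List 3 short verification questions that would check this draft for factual errors."
    )
    checks = [q for q in plan.splitlines() if q.strip()]
    # Answer each verification question independently of the draft.
    verifications = [f"{q} -> {call_llm(q)}" for q in checks]
    return call_llm(
        f"Question: {question}\nDraft answer: {baseline}\n"
        "Verification Q&A:\n" + "\n".join(verifications) + "\n"
        "Write a final answer consistent with the verification results."
    )
```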
Assigning an agent role to the LLM that can use tools, make decisions, and interact with the environment.
Combining reasoning traces and task-specific actions in an interleaved manner.
Modular Reasoning, Knowledge and Language system combining neural language models with symbolic tools.
Reading natural language problems and generating programs as intermediate reasoning steps.
Correcting model outputs through retrieval, iterative tool interaction, and critique.
A code-first agent framework for seamlessly planning and executing data analytics tasks.
Agents specifically designed to interact with and use external tools effectively.
Agents that primarily operate through code generation and execution.
An iterative framework for code generation involving generation, implementation, testing, and modification.
Learning from self-reflection and environmental feedback to improve performance on subsequent attempts.
A lifelong learning agent with a growing skill library for open-ended exploration.
Integrating multiple tools into reasoning processes for mathematical problem solving.
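A toy sketch of tool-integrated reasoning for math, assuming a `call_llm(prompt) -> str` helper; the `CALC[...]` calling convention and `FINAL ANSWER` marker are assumptions of this sketch, not a standard protocol:

```python
def solve_with_calculator(call_llm, problem: str, max_steps: int = 5) -> str:
    """The model reasons in text but delegates arithmetic to a calculator tool."""
    transcript = f"Problem: {problem}\nUse CALC[expression] whenever you need arithmetic.\n"
    for _ in range(max_steps):
        step = call_llm(transcript + "Next step:")
        if "CALC[" in step:
            expr = step.split("CALC[", 1)[1].split("]", 1)[0]
            result = eval(expr, {"__builtins__": {}})   # toy calculator; never eval untrusted input in production
            transcript += f"{step}\nResult: {result}\n"
        else:
            transcript += step + "\n"
        if "FINAL ANSWER" in step.upper():
            break
    return transcript
```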
Generating multiple reasoning paths and selecting the most consistent answer.
Model reviews and revises its own output.
Iteratively refining outputs through self-feedback without additional training.
Having the model verify the correctness of its own answers.
Adjusting confidence estimates to better match actual accuracy.
Working backwards from conclusions to verify reasoning paths.
Model asks itself follow-up questions to improve reasoning.
Applying self-consistency across different reasoning formats and approaches.
Encouraging the model to think about its own thinking processes.
Model generates its own examples for in-context learning.
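A sketch of self-generated in-context examples, assuming a generic `call_llm(prompt) -> str` helper and the ad hoc `Input: ... -> Output: ...` format shown in the prompt:

```python
def self_generated_icl(call_llm, task_description: str, query: str, k: int = 3) -> str:
    # The model first writes its own worked examples, then answers with them in context.
    examples = call_llm(
        f"Task: {task_description}\n"
        f"Write {k} diverse input/output examples of this task, one per line, "
        "in the form 'Input: ... -> Output: ...'."
    )
    return call_llm(
        f"Task: {task_description}\nExamples:\n{examples}\n"
        f"Input: {query} -> Output:"
    )
```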
Enhancing LLM responses by retrieving relevant information from external sources.
A retrieval technique that searches for demonstrations relevant to the input query.
Multiple rounds of retrieval and generation for complex tasks.
Combining retrieval with chain-of-thought reasoning in an interleaved manner.
Retrieval-augmented generation where the retrieval process is implicit and automatic.
Using retrieval to verify and correct generated content.
Using information from multiple files for code completion and generation.
Retrieving relevant context from multiple files to inform code generation.
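A rough sketch of cross-file retrieval for code generation using naive lexical overlap as the ranking signal; a real system would use embeddings or a code index, and `call_llm` is a placeholder completion helper:

```python
from pathlib import Path

def gather_context(repo: str, request: str, k: int = 3, max_chars: int = 2000) -> str:
    """Rank repository files by word overlap with the request and return top snippets."""
    wanted = set(request.lower().split())
    scored = []
    for path in Path(repo).rglob("*.py"):
        text = path.read_text(errors="ignore")
        overlap = len(wanted & set(text.lower().split()))
        scored.append((overlap, path, text))
    scored.sort(key=lambda t: t[0], reverse=True)
    snippets = [f"# file: {p}\n{text[:max_chars]}" for _, p, text in scored[:k]]
    return "\n\n".join(snippets)

def complete_code(call_llm, repo: str, request: str) -> str:
    context = gather_context(repo, request)
    return call_llm(f"Relevant files:\n{context}\n\nTask: {request}\nWrite the code:")
```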
Using algorithms to automatically improve prompt effectiveness.
Automatically generating and optimizing prompts for a given task.
Gradient-based prompt search for optimization.
Optimizing prompts in continuous vector spaces rather than discrete text.
Optimizing prompts at the discrete token level.
Combining continuous and discrete optimization approaches for prompts.
Learning continuous prompt embeddings while keeping the model frozen.
Using reinforcement learning to optimize prompts based on task performance.
Using large language models themselves to optimize prompts.
Applying genetic algorithms to evolve better prompts.
Using gradient information to optimize prompt effectiveness.
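The gradient-based variants above depend on model internals, but the LLM-driven and genetic-search entries can be sketched generically. This sketch assumes you supply a task-specific `score(prompt) -> float` evaluator and a `call_llm(prompt) -> str` helper; the mutation instruction and population sizes are illustrative:

```python
import random

def optimize_prompt(call_llm, score, seed_prompt: str,
                    generations: int = 5, pop: int = 4) -> str:
    """Evolve prompt candidates by LLM-driven mutation and keep the best scorers."""
    population = [seed_prompt]
    for _ in range(generations):
        parents = sorted(population, key=score, reverse=True)[:2]   # selection
        children = [
            call_llm("Rewrite this instruction so a language model follows it "
                     f"more accurately, keeping the same intent:\n{random.choice(parents)}")
            for _ in range(pop)                                      # mutation
        ]
        population = parents + children
    return max(population, key=score)
```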
Incorporating 3D spatial information and models into prompts.
Using audio inputs as part of the prompt context.
Incorporating images as part of the prompt to guide model outputs.
Using video content as context for generating responses.
Using sequences of images to guide reasoning processes.
Combining reasoning over text and images in a step-by-step manner.
Extending graph-of-thought reasoning to multimodal inputs.
Learning from multimodal examples provided in context.
Converting images to textual descriptions for text-based models.
Specifying what should not appear in generated images.
Using pairs of related images to guide reasoning or generation.
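A sketch of building a paired-image prompt as a multimodal message payload. The content-list format follows the OpenAI-style vision convention; adapt the structure to your provider, and treat the image URLs as placeholders:

```python
def paired_image_prompt(instruction: str, url_before: str, url_after: str) -> list[dict]:
    """Interleave text labels with two related images so the model can compare them."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": instruction},
            {"type": "text", "text": "Image A (before):"},
            {"type": "image_url", "image_url": {"url": url_before}},
            {"type": "text", "text": "Image B (after):"},
            {"type": "image_url", "image_url": {"url": url_after}},
            {"type": "text", "text": "Describe what changed between A and B."},
        ],
    }]
```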
Advanced code generation system using iterative refinement and testing.
Agents specialized for generating and refining code.
Applying structured reasoning to specific domains like mathematics.
Chain-of-thought reasoning for tabular data analysis.
Structured reasoning over tabular data with explicit table operations.
Date and time reasoning for temporal question answering.
Logic-focused chain-of-thought for logical reasoning tasks.
Prompting techniques specialized for mathematical problem solving.
Combining natural language reasoning with code execution for problem solving.
Breaking down code generation into modular components.
Designing structured workflows for complex task completion.
Using testing to guide iterative improvement in workflows.
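A sketch of test-guided iteration, assuming a `call_llm(prompt) -> str` helper, a caller-supplied pytest file, and that the generated text is plain module code (a real pipeline would strip fences and sandbox execution):

```python
import pathlib
import subprocess
import tempfile

def test_driven_generation(call_llm, spec: str, tests: str, attempts: int = 3) -> str:
    """Generate code, run the provided pytest file, and feed failures back to the model."""
    code = call_llm(f"Write a Python module named solution.py satisfying:\n{spec}")
    for _ in range(attempts):
        with tempfile.TemporaryDirectory() as d:
            pathlib.Path(d, "solution.py").write_text(code)
            pathlib.Path(d, "test_solution.py").write_text(tests)
            run = subprocess.run(["python", "-m", "pytest", "-q"], cwd=d,
                                 capture_output=True, text=True)
        if run.returncode == 0:
            break                                   # all tests pass; stop iterating
        code = call_llm(f"Spec:\n{spec}\nCurrent code:\n{code}\n"
                        f"Test output:\n{run.stdout}{run.stderr}\n"
                        "Fix the code so the tests pass.")
    return code
```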
A hierarchical task decomposition pattern where complex requests are broken into subtasks, delegated to specialized modes, and their results 'boomerang' back for integration.
Organizing AI systems into specialized operational modes, each with distinct capabilities, roles, and system prompts optimized for specific types of tasks.
Mode-specific validation mechanisms that monitor AI outputs for semantic drift, ensuring responses align with expected behavior and role-appropriate content.
Implementing strict schemas and validation to prevent errors from propagating between tasks in multi-agent systems through immutable inputs and sanitized outputs.
Community-maintained repositories of common AI system errors, their causes, reproduction steps, and correction strategies to enable systematic learning from failures.
Using structured markdown templates with YAML frontmatter to create reusable, configurable AI assistant workflows that work across different AI platforms.
Implementing structured rule hierarchies with global and project-specific configurations to guide AI assistant behavior consistently.
Structured prompting patterns for common development tasks like commits, PR reviews, issue analysis, and code quality checks.
Prompting techniques for integrating and orchestrating Model Context Protocol servers to extend AI capabilities with external tools and services.
Structured approaches for AI assistants to interact with GitHub repositories, issues, PRs, and project management through systematic research and action patterns.
Systematic approaches to managing AI agent configurations, including global settings, project-specific rules, and environment-specific adaptations.
Analyzing problems or solutions from multiple distinct viewpoints or roles to ensure comprehensive coverage and identify blind spots.
Systematic approach to creating well-formatted commits with conventional commit messages, semantic typing, and automated validation steps.
Systematic questioning technique that asks 'Why?' iteratively to drill down from symptoms to root causes of problems.
Automated creation of diagrams, flowcharts, and visual documentation from code structure, data models, or process descriptions.
Systematic technique for loading comprehensive project understanding by analyzing key files, structure, and conventions before performing tasks.
Systematic approach for continuously improving AI assistant prompts and rules based on emerging patterns, feedback, and performance metrics.
Structured patterns for automating web browser interactions, including element selection, timing management, and error handling strategies.
Multi-faceted code inspection methodology covering knowledge graphs, quality metrics, performance, security, architecture, and test coverage.
Systematic capture of application states and UI elements for documentation, testing, and visual verification purposes.
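A sketch of state capture using Playwright as one possible browser-automation backend; the URL, selector, and timeout values are illustrative:

```python
from playwright.sync_api import sync_playwright

def capture_state(url: str, selector: str, out: str = "state.png") -> None:
    """Wait for the element that marks a loaded state, then save a full-page screenshot."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        try:
            page.goto(url, wait_until="networkidle")
            page.wait_for_selector(selector, timeout=10_000)   # explicit timing management
            page.screenshot(path=out, full_page=True)
        finally:
            browser.close()

# capture_state("https://example.com", "main")
```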
A security pattern where an agent can trigger pre-defined actions but is sandboxed from their outputs. This prevents feedback loops where tainted data from a tool's output could influence subsequent actions.
An agent generates a complete, static plan of action (e.g., a sequence of tool calls) *before* any exposure to untrusted input. This plan is executed without modification, preventing runtime deviations based on tainted data.
A pattern where a primary coordinating agent delegates the processing of multiple pieces of untrusted data to isolated, single-purpose sub-agents (the 'map' step). The results are then aggregated in a sanitized, structured format (the 'reduce' step).
A security architecture using two LLMs: a 'privileged' LLM that can access tools and sensitive data, and a 'quarantined' LLM that handles all untrusted user input. The privileged LLM is never exposed to untrusted content.
An advanced pattern, often seen as an evolution of the Dual LLM pattern, where a privileged LLM generates code in a secure, sandboxed Domain-Specific Language (DSL). This DSL defines the workflow and data flow, allowing for rigorous analysis and 'taint tracking' of untrusted data.
A security tactic where potentially malicious user input is deliberately removed from the LLM's context window at a strategic point in the workflow. This severs the causal link between a potential injection attempt and subsequent actions.
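A minimal sketch of context minimization, assuming a `search(query) -> list[str]` retriever and a `call_llm(prompt) -> str` helper; the key point is that the raw user input never reaches the final generation call:

```python
def answer_with_context_minimization(call_llm, search, user_input: str) -> str:
    """Use the untrusted input only to retrieve; drop it before the generation step."""
    # Step 1: the raw (untrusted) input is used once, to fetch reference material.
    documents = search(user_input)
    # Step 2: the raw input is deliberately excluded from the final context window,
    # severing the link between a possible injection attempt and the model's output.
    return call_llm(
        "Summarize the key points of the following documents for the user:\n"
        + "\n---\n".join(documents)
    )
```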
Attack technique using carefully crafted token sequences that cause models to produce harmful outputs regardless of input context.
Attack method where users instruct the model to take on fictional personas or characters that are not bound by the model's safety guidelines.
Gradual manipulation technique where attackers build trust and slowly escalate requests across multiple conversation turns to bypass safety measures.
Exploitation of conflicts between system instructions and user inputs, where attackers try to override system-level safety instructions with user-level commands.
Defense technique that trains models using a set of principles (constitution) to self-correct harmful outputs and maintain alignment with human values.
Training technique that exposes models to adversarial examples during training to improve robustness against attacks and jailbreaks.
Starting the model's response with a specific string to guide its output format and content.
Using XML tags to clearly separate different parts of the prompt (instructions, data, examples) to prevent confusion.
Using the system role to set the overall behavior, persona, and constraints of the model before user interaction.
Strategically organizing and optimizing the information provided in the context window to maximize relevance and model comprehension.
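A sketch combining the structuring entries above (system role, XML-tagged sections, response prefilling) into one request payload; the message-dict format is a generic chat-API convention and the tag names are illustrative:

```python
def build_structured_request(instructions: str, document: str,
                             examples: str, question: str) -> dict:
    """Organize the context window into clearly delimited sections."""
    user = (
        f"<instructions>\n{instructions}\n</instructions>\n"
        f"<examples>\n{examples}\n</examples>\n"
        f"<document>\n{document}\n</document>\n"
        f"<question>\n{question}\n</question>"
    )
    return {
        "messages": [
            {"role": "system", "content": "You answer strictly from the supplied document."},
            {"role": "user", "content": user},
            {"role": "assistant", "content": "<answer>"},   # response prefilling
        ]
    }
```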