Managing Context-Aware Tools in DSPy for Training vs. Inference

The Problem

When building agentic systems with DSPy’s dspy.ReAct, you often face a dilemma: the tools you use during training/optimization are different from the tools you use during production inference.

  1. Training/Optimization: You might want to use mock tools, sandboxed environments, or simplified APIs to avoid side effects, reduce costs, or speed up the optimization loop (e.g., BootstrapFewShot, MIPRO).
  2. Inference: You need the real, production-grade tools that interact with live databases, external APIs, or user-facing systems.
  3. Code Separation: Your inference tools often live in a different repository or deployment environment from your training code, so your training modules shouldn’t hardcode imports of production tools.
  4. Persistence: When you compile and save a DSPy program with save_program=True, the tool implementations bound to it are typically serialized along with it. If you load this program in a different environment, it may call the training-time tools or fail because their dependencies are missing.

Standard dspy.ReAct initialization binds tools permanently to the module instance:

# Traditional approach (problematic)
import dspy

class MyAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        # Tools are hardcoded at init time!
        self.react = dspy.ReAct("q -> a", tools=[real_tool_1, real_tool_2])

The Solution: Context-Aware Tool Registry

We can solve this by decoupling tool definition (names and signatures) from tool implementation. We create a RegistryReAct factory that produces a ReAct module whose tools dictionary is a “smart proxy”. At runtime, this proxy looks up the actual tool implementation in the active dspy.context, read through dspy.settings.
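To make the proxy idea concrete before the DSPy version, here is a dependency-free sketch of the same pattern. It assumes a plain contextvars.ContextVar as a stand-in for the tool_registry entry in dspy.context; the names tool_context and LazyToolProxy are hypothetical and exist only for this illustration.

```python
from collections import UserDict
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical stand-in for the "tool_registry" entry in dspy.context.
_tool_registry = ContextVar("tool_registry", default=None)


@contextmanager
def tool_context(registry):
    """Temporarily install a name -> callable mapping as the active registry."""
    token = _tool_registry.set(registry)
    try:
        yield
    finally:
        _tool_registry.reset(token)


class LazyToolProxy(UserDict):
    """Stores only the allowed names; implementations are resolved on each lookup."""

    def __init__(self, allowed_names):
        super().__init__()
        self.allowed_names = set(allowed_names)

    def __getitem__(self, key):
        if key not in self.allowed_names:
            raise KeyError(f"Tool '{key}' is not allowed for this module.")
        registry = _tool_registry.get()
        if not registry or key not in registry:
            raise KeyError(f"Tool '{key}' not found in the active registry.")
        return registry[key]


proxy = LazyToolProxy(["stock_price"])

# The same proxy object resolves to whichever implementation is active.
with tool_context({"stock_price": lambda ticker: 1.0}):
    mock_result = proxy["stock_price"]("AAPL")

with tool_context({"stock_price": lambda ticker: 2.0}):
    real_result = proxy["stock_price"]("AAPL")
```

The proxy never holds an implementation itself, so whatever gets saved with it carries only the list of allowed names.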

Implementation

Save the following code in a utility file (e.g., dspy_utils.py). This code must be available to both your training and inference environments.

import dspy
from collections import UserDict
from dspy.adapters.types.tool import Tool

class ContextToolRegistry(UserDict):
    """
    A proxy dictionary that looks up tools in the global dspy.context at runtime.
    
    It enforces strict boundaries: a module can only access tools listed in its 
    allowed_tool_names, ensuring modules don't accidentally use tools intended 
    for other parts of the system.
    """
    def __init__(self, allowed_tool_names, default_finish_tool=None):
        # Initialize empty. We don't store tools internally.
        super().__init__()
        self.allowed_tool_names = set(allowed_tool_names)
        self.finish_tool = default_finish_tool

    def __getitem__(self, key):
        # 1. Handle 'finish' tool specially
        # If 'finish' is explicitly in context, use it. 
        # Otherwise fallback to the default one created during ReAct initialization.
        if key == "finish":
            registry = dspy.settings.get("tool_registry")
            if registry:
                tool = self._find_tool_in_registry(registry, "finish")
                if tool:
                    return tool
            
            if self.finish_tool:
                return self.finish_tool
            raise KeyError("Tool 'finish' not found in context or default fallback.")

        # 2. Enforce allowed tools constraint
        # Modules cannot request tools they didn't declare during init.
        if key not in self.allowed_tool_names:
            raise KeyError(f"Tool '{key}' is not in the allowed list for this module.")

        # 3. Look up in active context
        registry = dspy.settings.get("tool_registry")
        if not registry:
            raise ValueError(f"Tool '{key}' requested, but no 'tool_registry' found in dspy.context.")

        tool = self._find_tool_in_registry(registry, key)
        if tool:
            return tool
        
        raise KeyError(f"Tool '{key}' required by module but not found in active context tool_registry.")
    
    def _find_tool_in_registry(self, registry, key):
        """Helper to find tool by name in dict or list registry."""
        if isinstance(registry, dict):
            if key in registry:
                t = registry[key]
                return t if isinstance(t, Tool) else Tool(t)
        elif isinstance(registry, list):
            for t in registry:
                t_obj = t if isinstance(t, Tool) else Tool(t)
                if t_obj.name == key:
                    return t_obj
        return None

def RegistryReAct(signature, tool_names, max_iters=10):
    """
    Factory that creates a dspy.ReAct instance connected to the context registry.
    
    Args:
        signature: The DSPy signature for the task.
        tool_names: List of strings (tool names) this module is allowed to use.
                    These tools MUST be present in context during initialization
                    to build the correct prompt signature.
        max_iters: Maximum ReAct iterations.
    """
    # 1. Get tools from context just for initialization
    context_registry = dspy.settings.get("tool_registry")
    
    if not context_registry:
        raise ValueError(
            "RegistryReAct initialization requires 'tool_registry' in dspy.context "
            "to build the initial signature."
        )
    
    # 2. Filter context tools to find the ones required by this module
    # We need actual tool objects to initialize ReAct so it can generate descriptions.
    init_tools = []
    
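    # Build a temporary lookup for the tools currently in context.
    # Wrap plain callables in Tool so names are derived the same way as in
    # _find_tool_in_registry (callable objects may not have a __name__).
    if isinstance(context_registry, list):
        lookup = {(t if isinstance(t, Tool) else Tool(t)).name: t for t in context_registry}
    else:
        lookup = context_registry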

    missing = []
    for name in tool_names:
        if name in lookup:
            init_tools.append(lookup[name])
        else:
            missing.append(name)
            
    if missing:
        raise ValueError(f"Tools {missing} specified for module but not found in context registry.")

    # 3. Initialize ReAct
    # This builds the signature using the descriptions of the tools found in context.
    react = dspy.ReAct(signature, tools=init_tools, max_iters=max_iters)
    
    # 4. Replace the internal tools dict with our Proxy
    # Capture the 'finish' tool ReAct created so we can fallback to it.
    default_finish = react.tools.get("finish")
    
    # The proxy will now intercept all tool lookups.
    react.tools = ContextToolRegistry(tool_names, default_finish_tool=default_finish)
    
    return react

Usage

1. Defining Your Modules

Your modules declare intent (names of tools) rather than holding implementations.

import dspy
from dspy_utils import RegistryReAct

class FinancialAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        # Declares intent to use "stock_price" and "company_news"
        # This module will ONLY have access to these two tools from the registry.
        self.react = RegistryReAct(
            "question -> answer", 
            tool_names=["stock_price", "company_news"]
        )

    def forward(self, **kwargs):
        return self.react(**kwargs)

2. Training Time (Mocks)

Inject mock tools during training. The optimizer works normally, compiling prompts against these mocks.

from my_training_tools import MockStockPrice, MockCompanyNews

# Each tool's resolved name must match a name declared in the module
# ("stock_price", "company_news")
training_tools = [MockStockPrice(), MockCompanyNews()]

with dspy.context(tool_registry=training_tools):
    # Init works because tools are in context (needed for signature building)
    agent = FinancialAgent()
    
    # Compile
    optimizer = dspy.teleprompt.BootstrapFewShot(...)
    compiled_agent = optimizer.compile(agent, ...)
    
    # Save full program (architecture + state)
    # The saved program holds the ContextToolRegistry proxy, NOT the mock tools.
    compiled_agent.save("./saved_agent/", save_program=True)
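The my_training_tools module is not shown in this post. Below is a minimal sketch of what it might contain, assuming plain functions instead of the MockStockPrice and MockCompanyNews classes imported above. The function bodies and canned values are hypothetical. With plain functions, the tool name is expected to be the function's own name, so it lines up with the names declared in the module without extra configuration.

```python
# Hypothetical contents of my_training_tools.py, written as plain functions.

def stock_price(ticker: str) -> float:
    """Return the latest share price for a ticker symbol."""
    # Canned values keep optimization runs deterministic and free of API calls.
    canned = {"AAPL": 190.0, "MSFT": 410.0}
    return canned.get(ticker.upper(), 100.0)


def company_news(ticker: str) -> list[str]:
    """Return recent headlines for a ticker symbol."""
    return [f"{ticker.upper()} announces quarterly results (mock headline)"]


# A list registry, in the shape passed to dspy.context(tool_registry=...).
training_tools = [stock_price, company_news]

# The names the module must declare in tool_names.
tool_names = [tool.__name__ for tool in training_tools]
```

The docstrings are kept realistic here because tool descriptions are what the prompt is built from at initialization, so mocks and real tools should describe themselves the same way.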

3. Inference Time (Production)

Inject real tools at runtime. When the saved program is loaded, the ContextToolRegistry inside it will see your new context and execute the production tools.

from my_production_tools import RealStockPrice, RealCompanyNews

# Different implementations, same names ("stock_price", "company_news")
prod_tools = [RealStockPrice(), RealCompanyNews()]

# Load the saved program
loaded_agent = dspy.load("./saved_agent/")

# Run with production tools
with dspy.context(tool_registry=prod_tools):
    result = loaded_agent(question="What is Apple's current stock price?")
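Because the saved program was compiled against the mocks, the production tools need to expose the same names and parameters. One way to catch drift early is a small check run before deployment. This is a sketch assuming plain-function tools; interface_of, find_mismatches, and the two factory functions are hypothetical helpers, not part of DSPy.

```python
import inspect


def interface_of(tool):
    """Return (name, parameter names) for a plain-function tool."""
    return tool.__name__, tuple(inspect.signature(tool).parameters)


def find_mismatches(training_tools, production_tools):
    """Return sorted tool names that are missing on one side or differ in parameters."""
    train = dict(interface_of(t) for t in training_tools)
    prod = dict(interface_of(t) for t in production_tools)
    return sorted(n for n in train.keys() | prod.keys() if train.get(n) != prod.get(n))


def make_mock_tools():
    def stock_price(ticker):
        return 100.0

    def company_news(ticker):
        return ["mock headline"]

    return [stock_price, company_news]


def make_drifted_tools():
    # stock_price has gained a parameter and company_news is missing.
    def stock_price(ticker, currency):
        return 100.0

    return [stock_price]


same = find_mismatches(make_mock_tools(), make_mock_tools())
drifted = find_mismatches(make_mock_tools(), make_drifted_tools())
```

An empty list means the two tool sets are interchangeable as far as names and parameters go. It says nothing about whether the return values have the same shape.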