Rip the Tools Out - Ankur Kakroo

We glorify tool use. MCP servers, agents with twelve tools strapped on, multi-step chains doing backflips. It looks impressive in a demo. Feels like progress. But ship it to real users and watch what happens. They wait. And wait. And then they close the tab.

Built an agent at work this week. Internal use case. Started the way everyone starts. Gave it a bunch of tools, let it reason its way through the workflow. Textbook agentic pattern. The kind of thing that gets a standing ovation in a conference talk. In production? Brutal. Every tool call is latency. Every reasoning step is a user staring at a spinner. The agent was correct. It was also unusable. Correct and unusable is a special kind of failure because you can't even point at the output and say it's wrong. It's not wrong. It's just slow enough that nobody cares.

So I ripped the tools out. Made the API calls directly in my own code. Hardcoded the paths I knew were predictable and restricted tool use to the genuinely ambiguous cases where the model's judgment actually mattered. Less elegant? Sure. Faster? Dramatically. And the outputs got more predictable too, because fewer decision points mean fewer places for the model to wander off and get creative when you didn't ask it to.
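
Roughly what that split looks like, as a minimal sketch rather than the actual code from work. Every name here (`Request`, `fetch_order_status`, `fetch_invoice`, `run_agent_with_tools`, the intent labels) is hypothetical, and the handlers are stubs standing in for real API calls. The idea is just that known intents go straight to plain code, and only what falls through reaches the tool-using agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Request:
    # Hypothetical shape: intent is None when nothing upstream could classify it.
    intent: Optional[str]
    payload: dict


# Stub handlers standing in for direct API calls (assumed, not real services).
def fetch_order_status(payload: dict) -> str:
    return f"order {payload['order_id']}: shipped"


def fetch_invoice(payload: dict) -> str:
    return f"invoice for order {payload['order_id']}"


# Predictable paths: no model in the loop, no tool selection, no reasoning step.
HARDCODED_PATHS: Dict[str, Callable[[dict], str]] = {
    "order_status": fetch_order_status,
    "invoice": fetch_invoice,
}


def run_agent_with_tools(request: Request) -> str:
    # Placeholder for the slow path: the full agent loop with tool access.
    return "agent: handled ambiguous request"


def handle(request: Request) -> Tuple[str, str]:
    """Return (route_taken, output). Falls back to the agent only when needed."""
    handler = HARDCODED_PATHS.get(request.intent) if request.intent else None
    if handler is not None:
        return "direct", handler(request.payload)
    return "agent", run_agent_with_tools(request)


if __name__ == "__main__":
    print(handle(Request("order_status", {"order_id": 42})))
    print(handle(Request(None, {"text": "something vague"})))
```

The dictionary is the whole trick. Anything you can name in advance gets a key and skips the model entirely; the agent only sees what is left.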

The thing nobody says out loud: the LLM is the easiest part now. The models are absurdly capable. Model capability stopped being the bottleneck a while ago. What's hard is everything around it. The harness, the orchestration, the part where you figure out how to deliver speed, predictability, and quality simultaneously. Users don't care that your agent can use fourteen tools. Users care that the thing works fast and does what they expected. That's it. That's the entire product requirement.

There's a design instinct in this space that defaults to "give the model more capabilities." More tools, more context, more autonomy. And sometimes that's right. But sometimes the answer is the opposite. Constrain the model, take away options, make decisions on its behalf so it doesn't have to. Not because it can't. Because the round trip costs more than the decision is worth. Every tool call you eliminate is latency you delete. Every decision you make upfront is variance you remove. The craft isn't in how much you let the agent do. It's in knowing exactly where the agent's judgment is worth the cost, and handling everything else yourself.
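
A back-of-the-envelope way to see the round-trip cost. This is a sketch with made-up numbers, not measurements from the agent described above. It assumes the common loop where every tool call needs its own model turn to decide on it, plus one final turn to write the answer, and that everything runs sequentially. The function names and the millisecond figures are illustrative assumptions.

```python
def agent_loop_ms(tool_calls: int, model_turn_ms: int, tool_call_ms: int) -> int:
    """Sequential agent loop: one model turn per tool call, plus a final answer turn."""
    model_turns = tool_calls + 1
    return model_turns * model_turn_ms + tool_calls * tool_call_ms


def hardcoded_ms(api_calls: int, model_turn_ms: int, api_call_ms: int) -> int:
    """Same API calls made directly in code, followed by a single model turn."""
    return api_calls * api_call_ms + model_turn_ms


if __name__ == "__main__":
    # Assumed figures for illustration only: 1500 ms per model turn, 300 ms per API call.
    slow = agent_loop_ms(tool_calls=4, model_turn_ms=1500, tool_call_ms=300)
    fast = hardcoded_ms(api_calls=4, model_turn_ms=1500, api_call_ms=300)
    print(f"agent loop: {slow} ms, hardcoded: {fast} ms, saved: {slow - fast} ms")
```

Under these assumptions the API time is identical in both versions. What you delete is the model turns spent deciding to make calls you already knew you needed.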