As we incorporate more MCP usage with CC, some ideas for trimming down token consumption from MCPs are now trickling out from foundation model providers (heavy users have been relying on tricks like these for a while).

The problem with using MCPs blindly:
- All tool definitions are loaded upfront → potentially hundreds of thousands of tokens consumed before any user prompt is ingested
- Every intermediate result flows through the model context

A code-execution solution (sketched after the link below):
- MCP servers are exposed as filesystem APIs (progressive disclosure)
- Agents write TypeScript/JavaScript to orchestrate tasks
- Intermediate data stays in the execution environment
- Only final summaries return to the model context
How to reduce MCP token consumption with code execution
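In practice, the pattern looks something like the following. This is a minimal sketch assuming tool wrappers are generated as importable modules under a ./servers/ tree; the module paths, function names, and IDs here are illustrative stand-ins, not a real API.

```typescript
// Agent-authored orchestration script (illustrative; assumes generated
// wrappers under ./servers/ that proxy calls to MCP tools).
import { getDocument } from "./servers/gdrive/getDocument";
import { updateRecord } from "./servers/salesforce/updateRecord";

export async function syncMeetingNotes(): Promise<string> {
  // The full transcript stays here, in the execution environment.
  // It never enters the model's context window.
  const transcript = await getDocument({ documentId: "doc-123" }); // hypothetical ID
  await updateRecord({
    objectType: "SalesMeeting",
    recordId: "rec-456", // hypothetical ID
    fields: { notes: transcript },
  });
  // Only this short summary returns to the model.
  return `Synced transcript (${transcript.length} chars) to Salesforce.`;
}
```

The point of the shape: the tool definitions the script imports are the only ones loaded, and the only tokens the model pays for afterward are the one-line summary.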
More Relevant Posts
-
Cloudflare coined the term "Code Mode": letting LLMs write code to operate on MCP servers, blending classic software engineering patterns with agentic AI workflows. https://lnkd.in/gCMYF6Xi
-
https://lnkd.in/eJbCdXtb "Today developers routinely build agents with access to hundreds or thousands of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs. ... Tool descriptions occupy more context window space, increasing response time and costs. In cases where agents are connected to thousands of tools, they’ll need to process hundreds of thousands of tokens before reading a request."
-
Today we routinely build agents with access to hundreds of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs. As MCP usage scales, two common patterns increase agent cost and latency:
1. Tool definitions overload the context window.
2. Intermediate tool results consume additional tokens.
In this blog, Anthropic explores how code execution can enable agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens; a sketch of the second pattern follows below. #MCP #Agents #AIAgents #Cost #Tokens #Context https://lnkd.in/gGzjUKii
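To make the second pattern concrete, here is a minimal sketch of filtering a large intermediate result inside the execution sandbox; querySheet and its parameters are hypothetical stand-ins for a generated MCP tool wrapper.

```typescript
// Hypothetical wrapper around an MCP spreadsheet tool; only the filtered
// summary returned below ever re-enters the model's context.
import { querySheet } from "./servers/sheets/querySheet";

export async function findPendingOrders(): Promise<string> {
  // Thousands of rows flow into the sandbox, not into the context window.
  const rows = await querySheet({ sheetId: "orders-2024", range: "A1:F10000" });
  const pending = rows.filter((row: { status: string }) => row.status === "pending");
  // Return a count plus a five-row preview instead of the raw data.
  return JSON.stringify({ total: pending.length, preview: pending.slice(0, 5) });
}
```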
-
Agents scale better by writing code to call tools. Our latest blog discusses how code execution enables agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens. Read here: https://lnkd.in/eJCyW94v
-
Today developers routinely build agents with access to hundreds or thousands of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs. In this blog we'll explore how code execution can enable agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens. https://lnkd.in/gr5iSXSf
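The "progressive disclosure" half of the idea can be pictured as a generated file tree the agent explores on demand, reading only the tool definitions a task actually needs; the server and tool names here are illustrative:

```
servers/
├── gdrive/
│   ├── getDocument.ts    (loaded only when the task touches Drive)
│   └── index.ts
├── salesforce/
│   ├── updateRecord.ts
│   └── index.ts
└── sheets/
    └── querySheet.ts
```

Listing directories costs a handful of tokens; loading every definition upfront can cost hundreds of thousands.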
-
Interesting pattern for tool overload. The agent writes its own code to interact with MCP servers, and over time it builds a library of skills that it can reuse. I like that self-improvement pattern.

It also makes me think that if agents are great at writing their own MCP clients, they could also write their own API clients (provided a proper OpenAPI spec is available; see the sketch below), which leads to the thought: maybe we don't need MCP then? MCP was born as a standard protocol to plug and play stuff without having to write a custom integration. The key point was that the model could figure out the interfaces at run time. It worked beautifully. But it is proving to be token intensive (slow and expensive), and Anthropic themselves are now suggesting writing clients on the fly instead for efficiency.

Of course, this raises more security questions, as you'd need to allow the agent to execute its own code: code that is connected and authenticated against your sensitive data. What do you think? https://lnkd.in/gERFfYDe
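To illustrate the point, here is a sketch of the kind of client an agent could generate directly from an OpenAPI spec, skipping MCP entirely. The endpoint, types, and auth scheme are hypothetical examples of what a spec might describe.

```typescript
// Hypothetical client an agent might generate from an OpenAPI spec.
// Assumes a GET /v1/invoices/{id} operation with bearer-token auth.
interface Invoice {
  id: string;
  amountCents: number;
  status: "draft" | "open" | "paid";
}

export async function getInvoice(
  baseUrl: string,
  token: string,
  id: string
): Promise<Invoice> {
  const res = await fetch(`${baseUrl}/v1/invoices/${encodeURIComponent(id)}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`GET /v1/invoices/${id} failed: ${res.status}`);
  return (await res.json()) as Invoice;
}
```

Nothing here needed a protocol layer; the spec alone was enough to generate a typed, authenticated call, which is exactly what fuels the "do we still need MCP?" question.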
-
⚙️ Managing C++ Library Dependencies the Modern Way — with vcpkg

In most C++ projects, we depend on multiple third-party libraries — such as nlohmann-json, spdlog, OpenCV, libtorch, and Google Test, just to name a few. I've seen many developers still managing these dependencies manually — downloading libraries, linking them by hand in CMake, or adding them as Git submodules. While this works, it has clear disadvantages:
● It's difficult to keep dependencies up to date.
● It clutters CMakeLists with download logic and platform-specific hacks.
● It wastes build time, recompiling third-party libraries unnecessarily.
No wonder C++ developers are sometimes known for "reinventing the wheel" — maintaining dependencies shouldn't be this painful in modern development.

💡 A Better Way: Use vcpkg + CMake Presets + Git Submodules
I now recommend managing dependencies automatically using vcpkg, combined with CMake presets and Git submodules. vcpkg is a free, open-source C/C++ package manager maintained by Microsoft and the open-source community. It supports over 2,600 libraries, making setup and maintenance smooth and repeatable. Here's how to structure it (example manifest below):
● Add vcpkg as a Git submodule → No system-wide setup needed. Rebuilding on a new machine becomes effortless.
● Use vcpkg.json → Describe dependencies, versions, and platforms. Set your baseline in vcpkg-configuration.json.
● Set CMAKE_TOOLCHAIN_FILE → Point to vcpkg in your base configure preset (via CMake Presets).
● Run one CMake workflow command → That's it — everything builds and configures automatically.

🚀 Result
This setup is simple, portable, and easy to maintain. Adding, deleting, or upgrading libraries becomes just one command away — no more dependency chaos. If you're still managing dependencies manually, it's worth giving vcpkg a try — it can make your C++ workflow far smoother and more modern.
👉 Check out my example GitHub repo for reference. https://lnkd.in/gWwg5k4E
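For reference, a minimal manifest along these lines might look like the following; the project name and dependency list are placeholders, not taken from the linked repo.

```json
{
  "name": "my-app",
  "version": "0.1.0",
  "dependencies": [
    "nlohmann-json",
    "spdlog",
    "gtest"
  ]
}
```

A configure preset in CMakePresets.json would then set CMAKE_TOOLCHAIN_FILE to ${sourceDir}/vcpkg/scripts/buildsystems/vcpkg.cmake (the standard path when vcpkg is a submodule), while the baseline pin lives in vcpkg-configuration.json as the post describes; after that, a single cmake --preset invocation resolves and builds everything.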
-
🚫 Stop Escaped JSON in Middleware (Docker)

Shipping Docker logs to Middleware and seeing escaped JSON (tons of \")? Here are two easy fixes 👇

1️⃣ Use Docker's Fluentd logging driver
Point it directly at the Middleware agent (Fluent Forward).
✅ No escapes
✅ Fields parse automatically

2️⃣ Keep json-file + Fluent Bit
Tail *-json.log, use Decode_Field_As json log, then forward to the agent.
✅ Clean JSON in the Middleware UI

I've shared working configs + validation checks (Compose, Fluent Bit, parsers). Ping me if you want the Gist 🔗

TL;DR
● Escaping happens because Docker's default json-file driver wraps your JSON log inside another JSON envelope.
● Fix A (simplest): Switch the app's logging driver to fluentd and point it at the Middleware agent's Fluent Forward port — no file tailing, no escapes.
● Fix B (bridge): Keep json-file but run Fluent Bit to tail logs, unescape payloads, and forward them cleanly (sketch below).
🧭 Pick A when you can change the logging driver.
🧱 Pick B if infra policy or tooling requires json-file.
🔗 Full breakdown + troubleshooting (login required): https://lnkd.in/eZ_ApTmh

Credits: Thanks to the #middleware community and Middleware #devs who surfaced and validated the fluentd-driver approach. #dockerlogging #fluentbit #apm #observability #middlewareagent #k8slogging #otelcollector #devops #middleware
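Fix B in config form, as a minimal sketch. The log path is the Docker default; the agent host and port are assumptions to swap for your Middleware agent's actual Fluent Forward endpoint.

```
# parsers.conf: parse the Docker JSON envelope, then decode the inner
# escaped JSON carried in the "log" field.
[PARSER]
    Name            docker
    Format          json
    Time_Key        time
    Time_Format     %Y-%m-%dT%H:%M:%S.%L
    Decode_Field_As json log

# fluent-bit.conf: tail container logs and forward clean records onward.
[INPUT]
    Name   tail
    Path   /var/lib/docker/containers/*/*-json.log
    Parser docker

# Host/port below are placeholders for the Middleware agent's
# Fluent Forward listener; 24224 is the conventional default.
[OUTPUT]
    Name   forward
    Match  *
    Host   127.0.0.1
    Port   24224
```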
-
It's finally here! I'm thrilled to announce the 1.0 release of LibJuno, the C and C++ embedded micro-framework!

Many developers try to write the "Library of Everything". This is a library that claims to do it all. It can solve all your problems, with a slight catch: the developers assumed you'd conform to a specific use case. These libraries prescribe everything from how you'll run your software to how you'll exchange data. A "Library of Everything" assumes it knows your project and your project's requirements.

LibJuno assumes the opposite. You are the expert on your project and your project's requirements. You know how your project needs to run, which toolchains you would like to use, and the specific software mechanisms required to accomplish your goals. LibJuno provides design patterns and implementations that can be used on any target. It's flexible enough that you can use as little or as much of the library as you'd like, without lock-in to the framework.

LibJuno is designed with memory safety in mind and has zero dependencies, not even on the standard C/C++ libraries. It is lightweight, meaning it doesn't prescribe an executive or runtime, and it can be included within any project without committing to the entire framework. LibJuno has been utilized on cFS projects as well as custom runtime implementations. It is extremely versatile, capable of running anywhere from an embedded microprocessor running Linux to a bare-metal microcontroller.

LibJuno offers hash maps, queues, stacks, and heaps with memory safety and static memory guarantees in mind. It provides APIs for many functionalities to ensure consistent interfaces across platforms.

Check out LibJuno at: https://lnkd.in/gT-za3N2
-
🚀 Why We Replaced REST with gRPC (and What We Learned)

For years, our microservices talked over REST — simple, familiar, and well-supported. But as our system grew (hundreds of internal service-to-service calls per request), REST started showing cracks 👇

⚙️ The Problem
📉 High latency — JSON serialization + HTTP overhead was adding ~30–40ms per internal call.
🧩 Inconsistent contracts — REST endpoints drifted, breaking internal clients.
🕸️ Harder versioning — small schema changes led to major deploy coordination.

💡 The Switch to gRPC
We migrated critical internal services (User → Payment → Notification chain) to gRPC for internal communication; a hypothetical contract sketch follows below.

✅ What Improved
⚡ Performance: Latency dropped 40–60% thanks to binary Protobuf serialization.
🔒 Strict schema contracts: .proto files enforced consistent message formats across services.
🔁 Streaming support: Real-time updates over bidirectional streams simplified event-driven flows.
🧠 Language agnostic: The frontend gateway (Node.js) and backends (Java, Go) all generated code from the same contract.

⚠️ What Challenged Us
🧰 Debugging gRPC traffic isn't as straightforward as inspecting JSON REST calls.
🧪 We needed extra tooling (Postman → BloomRPC / grpcurl / Wireshark).
🔄 Rolling updates required stricter backward-compatibility planning.

🧭 Takeaway
➡️ REST is still great for external APIs (human-readable, cacheable).
➡️ gRPC shines for internal microservice communication — fast, strongly typed, and efficient.

If you're scaling a distributed system, it's worth experimenting with gRPC for high-frequency internal calls. Would you like me to share how we handled load balancing and observability with gRPC next? 👀

#JavaDeveloper #gRPC #Microservices #SpringBoot #SystemDesign #BackendArchitecture #APIDesign
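As an illustration of the contract-first point, a .proto for the Payment hop might look like this; the service, message, and field names are invented for the example, not taken from the post.

```protobuf
syntax = "proto3";

package internal.payment.v1;

// Hypothetical contract for the User -> Payment hop; clients generated
// for Node.js, Java, and Go all share these exact message shapes.
service PaymentService {
  // Unary call replacing a REST POST /payments endpoint.
  rpc CreatePayment(CreatePaymentRequest) returns (CreatePaymentResponse);
  // Bidirectional stream for real-time status updates.
  rpc StreamStatus(stream StatusRequest) returns (stream StatusUpdate);
}

message CreatePaymentRequest {
  string user_id = 1;
  int64 amount_cents = 2;
  string currency = 3;
}

message CreatePaymentResponse {
  string payment_id = 1;
  // New fields get new tags; old clients ignore them, which is what
  // makes rolling updates backward compatible.
  string status = 2;
}

message StatusRequest {
  string payment_id = 1;
}

message StatusUpdate {
  string payment_id = 1;
  string status = 2;
}
```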
-
article -> https://www.anthropic.com/engineering/code-execution-with-mcp