As we incorporate more MCP usage with CC, some ideas for trimming down token consumption from MCPs are now trickling out from foundation model providers (heavy users have been relying on tricks like these for a while).

The problem with using MCPs blindly:
- All tool definitions are loaded upfront → potentially hundreds of thousands of tokens consumed before any user prompt is ingested
- Every intermediate result flows through the model context

A code-execution solution (sketched after the link below):
- MCP servers are exposed as filesystem APIs (progressive disclosure)
- Agents write TypeScript/JavaScript to orchestrate tasks
- Intermediate data stays in the execution environment
- Only final summaries return to the model context
How to reduce MCP token consumption with code execution
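In practice, the pattern looks something like the following. This is a minimal sketch assuming tool wrappers are generated as importable modules under a ./servers/ tree; the module paths, function names, and IDs here are illustrative stand-ins, not a real API.

```typescript
// Agent-authored orchestration script (illustrative; assumes generated
// wrappers under ./servers/ that proxy calls to MCP tools).
import { getDocument } from "./servers/gdrive/getDocument";
import { updateRecord } from "./servers/salesforce/updateRecord";

export async function syncMeetingNotes(): Promise<string> {
  // The full transcript stays here, in the execution environment.
  // It never enters the model's context window.
  const transcript = await getDocument({ documentId: "doc-123" }); // hypothetical ID
  await updateRecord({
    objectType: "SalesMeeting",
    recordId: "rec-456", // hypothetical ID
    fields: { notes: transcript },
  });
  // Only this short summary returns to the model.
  return `Synced transcript (${transcript.length} chars) to Salesforce.`;
}
```

The point of the shape: the tool definitions the script imports are the only ones loaded, and the only tokens the model pays for afterward are the one-line summary.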
More Relevant Posts
-
Cloudflare coined the term "Code Mode": letting LLMs write code to operate on MCP servers, blending classic software engineering patterns with agentic AI workflows. https://lnkd.in/gCMYF6Xi
-
https://lnkd.in/eJbCdXtb "Today developers routinely build agents with access to hundreds or thousands of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs. ... Tool descriptions occupy more context window space, increasing response time and costs. In cases where agents are connected to thousands of tools, they’ll need to process hundreds of thousands of tokens before reading a request."
-
Today we routinely build agents with access to hundreds of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs. As MCP usage scales, two common patterns increase agent cost and latency:
1. Tool definitions overload the context window.
2. Intermediate tool results consume additional tokens.
In this blog, Anthropic explores how code execution can enable agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens; a sketch of the second pattern follows below. #MCP #Agents #AIAgents #Cost #Tokens #Context https://lnkd.in/gGzjUKii
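To make the second pattern concrete, here is a minimal sketch of filtering a large intermediate result inside the execution sandbox; querySheet and its parameters are hypothetical stand-ins for a generated MCP tool wrapper.

```typescript
// Hypothetical wrapper around an MCP spreadsheet tool; only the filtered
// summary returned below ever re-enters the model's context.
import { querySheet } from "./servers/sheets/querySheet";

export async function findPendingOrders(): Promise<string> {
  // Thousands of rows flow into the sandbox, not into the context window.
  const rows = await querySheet({ sheetId: "orders-2024", range: "A1:F10000" });
  const pending = rows.filter((row: { status: string }) => row.status === "pending");
  // Return a count plus a five-row preview instead of the raw data.
  return JSON.stringify({ total: pending.length, preview: pending.slice(0, 5) });
}
```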
-
Agents scale better by writing code to call tools. Our latest blog discusses how code execution enables agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens. Read here: https://lnkd.in/eJCyW94v
-
Today developers routinely build agents with access to hundreds or thousands of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs. In this blog we'll explore how code execution can enable agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens. https://lnkd.in/gr5iSXSf
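The "progressive disclosure" half of the idea can be pictured as a generated file tree the agent explores on demand, reading only the tool definitions a task actually needs; the server and tool names here are illustrative:

```
servers/
├── gdrive/
│   ├── getDocument.ts    (loaded only when the task touches Drive)
│   └── index.ts
├── salesforce/
│   ├── updateRecord.ts
│   └── index.ts
└── sheets/
    └── querySheet.ts
```

Listing directories costs a handful of tokens; loading every definition upfront can cost hundreds of thousands.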
-
Interesting pattern for tool overload. The agent writes its own code to interact with MCP servers, and over time it builds a library of skills that it can reuse. I like that self-improvement pattern.

It also makes me think that if agents are great at writing their own MCP clients, they could also write their own API clients (provided a proper OpenAPI spec is available; see the sketch below), which leads to the thought: maybe we don't need MCP then? MCP was born as a standard protocol to plug and play stuff without having to write a custom integration. The key point was that the model could figure out the interfaces at run time. It worked beautifully. But it is proving to be token intensive (slow and expensive), and Anthropic themselves are now suggesting writing clients on the fly instead for efficiency.

Of course, this raises more security questions, as you'd need to allow the agent to execute its own code: code that is connected and authenticated against your sensitive data. What do you think? https://lnkd.in/gERFfYDe
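To illustrate the point, here is a sketch of the kind of client an agent could generate directly from an OpenAPI spec, skipping MCP entirely. The endpoint, types, and auth scheme are hypothetical examples of what a spec might describe.

```typescript
// Hypothetical client an agent might generate from an OpenAPI spec.
// Assumes a GET /v1/invoices/{id} operation with bearer-token auth.
interface Invoice {
  id: string;
  amountCents: number;
  status: "draft" | "open" | "paid";
}

export async function getInvoice(
  baseUrl: string,
  token: string,
  id: string
): Promise<Invoice> {
  const res = await fetch(`${baseUrl}/v1/invoices/${encodeURIComponent(id)}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`GET /v1/invoices/${id} failed: ${res.status}`);
  return (await res.json()) as Invoice;
}
```

Nothing here needed a protocol layer; the spec alone was enough to generate a typed, authenticated call, which is exactly what fuels the "do we still need MCP?" question.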
-
⚙️ Managing C++ Library Dependencies the Modern Way — with vcpkg

In most C++ projects, we depend on multiple third-party libraries — such as nlohmann-json, spdlog, OpenCV, libtorch, and Google Test, just to name a few. I've seen many developers still managing these dependencies manually — downloading libraries, linking them by hand in CMake, or adding them as Git submodules. While this works, it has clear disadvantages:
● It's difficult to keep dependencies up to date.
● It clutters CMakeLists with download logic and platform-specific hacks.
● It wastes build time, recompiling third-party libraries unnecessarily.
No wonder C++ developers are sometimes known for "reinventing the wheel" — maintaining dependencies shouldn't be this painful in modern development.

💡 A Better Way: Use vcpkg + CMake Presets + Git Submodules
I now recommend managing dependencies automatically using vcpkg, combined with CMake presets and Git submodules. vcpkg is a free, open-source C/C++ package manager maintained by Microsoft and the open-source community. It supports over 2,600 libraries, making setup and maintenance smooth and repeatable. Here's how to structure it (example manifest below):
● Add vcpkg as a Git submodule → No system-wide setup needed. Rebuilding on a new machine becomes effortless.
● Use vcpkg.json → Describe dependencies, versions, and platforms. Set your baseline in vcpkg-configuration.json.
● Set CMAKE_TOOLCHAIN_FILE → Point to vcpkg in your base configure preset (via CMake Presets).
● Run one CMake workflow command → That's it — everything builds and configures automatically.

🚀 Result
This setup is simple, portable, and easy to maintain. Adding, deleting, or upgrading libraries becomes just one command away — no more dependency chaos. If you're still managing dependencies manually, it's worth giving vcpkg a try — it can make your C++ workflow far smoother and more modern.
👉 Check out my example GitHub repo for reference. https://lnkd.in/gWwg5k4E
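For reference, a minimal manifest along these lines might look like the following; the project name and dependency list are placeholders, not taken from the linked repo.

```json
{
  "name": "my-app",
  "version": "0.1.0",
  "dependencies": [
    "nlohmann-json",
    "spdlog",
    "gtest"
  ]
}
```

A configure preset in CMakePresets.json would then set CMAKE_TOOLCHAIN_FILE to ${sourceDir}/vcpkg/scripts/buildsystems/vcpkg.cmake (the standard path when vcpkg is a submodule), while the baseline pin lives in vcpkg-configuration.json as the post describes; after that, a single cmake --preset invocation resolves and builds everything.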
-
🚫 Stop Escaped JSON in Middleware (Docker)

Shipping Docker logs to Middleware and seeing escaped JSON (tons of \")? Here are two easy fixes 👇

1️⃣ Use Docker's Fluentd logging driver
Point it directly at the Middleware agent (Fluent Forward).
✅ No escapes
✅ Fields parse automatically

2️⃣ Keep json-file + Fluent Bit
Tail *-json.log, use Decode_Field_As json log, then forward to the agent.
✅ Clean JSON in the Middleware UI

I've shared working configs + validation checks (Compose, Fluent Bit, parsers). Ping me if you want the Gist 🔗

TL;DR
● Escaping happens because Docker's default json-file driver wraps your JSON log inside another JSON envelope.
● Fix A (simplest): Switch the app's logging driver to fluentd and point it at the Middleware agent's Fluent Forward port — no file tailing, no escapes.
● Fix B (bridge): Keep json-file but run Fluent Bit to tail logs, unescape payloads, and forward them cleanly (sketch below).
🧭 Pick A when you can change the logging driver.
🧱 Pick B if infra policy or tooling requires json-file.
🔗 Full breakdown + troubleshooting (login required): https://lnkd.in/eZ_ApTmh

Credits: Thanks to the #middleware community and Middleware #devs who surfaced and validated the fluentd-driver approach. #dockerlogging #fluentbit #apm #observability #middlewareagent #k8slogging #otelcollector #devops #middleware
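Fix B in config form, as a minimal sketch. The log path is the Docker default; the agent host and port are assumptions to swap for your Middleware agent's actual Fluent Forward endpoint.

```
# parsers.conf: parse the Docker JSON envelope, then decode the inner
# escaped JSON carried in the "log" field.
[PARSER]
    Name            docker
    Format          json
    Time_Key        time
    Time_Format     %Y-%m-%dT%H:%M:%S.%L
    Decode_Field_As json log

# fluent-bit.conf: tail container logs and forward clean records onward.
[INPUT]
    Name   tail
    Path   /var/lib/docker/containers/*/*-json.log
    Parser docker

# Host/port below are placeholders for the Middleware agent's
# Fluent Forward listener; 24224 is the conventional default.
[OUTPUT]
    Name   forward
    Match  *
    Host   127.0.0.1
    Port   24224
```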
-
It's finally here! I'm thrilled to announce the 1.0 release of LibJuno, the C and C++ embedded micro-framework!

Many developers try to write the "Library of Everything". This is a library that claims to do it all. It can solve all your problems, with a slight catch: the developers assumed you'd conform to a specific use case. These libraries prescribe everything from how you'll run your software to how you'll exchange data. A "Library of Everything" assumes it knows your project and your project's requirements.

LibJuno assumes the opposite. You are the expert on your project and your project's requirements. You know how your project needs to run, which toolchains you would like to use, and the specific software mechanisms required to accomplish your goals. LibJuno provides design patterns and implementations that can be used on any target. It's flexible enough that you can use as little or as much of the library as you'd like, without lock-in to the framework.

LibJuno is designed with memory safety in mind and has zero dependencies, not even on the standard C/C++ libraries. It is lightweight, meaning it doesn't prescribe an executive or runtime, and it can be included within any project without committing to the entire framework. LibJuno has been utilized on cFS projects as well as custom runtime implementations. It is extremely versatile, capable of running anywhere from an embedded microprocessor running Linux to a bare-metal microcontroller.

LibJuno offers hash maps, queues, stacks, and heaps with memory safety and static memory guarantees in mind. It provides APIs for many functionalities to ensure consistent interfaces across platforms.

Check out LibJuno at: https://lnkd.in/gT-za3N2
-
🚀 Why We Replaced REST with gRPC (and What We Learned)

For years, our microservices talked over REST — simple, familiar, and well-supported. But as our system grew (hundreds of internal service-to-service calls per request), REST started showing cracks 👇

⚙️ The Problem
📉 High latency — JSON serialization + HTTP overhead was adding ~30–40ms per internal call.
🧩 Inconsistent contracts — REST endpoints drifted, breaking internal clients.
🕸️ Harder versioning — small schema changes led to major deploy coordination.

💡 The Switch to gRPC
We migrated critical internal services (User → Payment → Notification chain) to gRPC for internal communication; a hypothetical contract sketch follows below.

✅ What Improved
⚡ Performance: Latency dropped 40–60% thanks to binary Protobuf serialization.
🔒 Strict schema contracts: .proto files enforced consistent message formats across services.
🔁 Streaming support: Real-time updates over bidirectional streams simplified event-driven flows.
🧠 Language agnostic: The frontend gateway (Node.js) and backends (Java, Go) all generated code from the same contract.

⚠️ What Challenged Us
🧰 Debugging gRPC traffic isn't as straightforward as inspecting JSON REST calls.
🧪 We needed extra tooling (Postman → BloomRPC / grpcurl / Wireshark).
🔄 Rolling updates required stricter backward-compatibility planning.

🧭 Takeaway
➡️ REST is still great for external APIs (human-readable, cacheable).
➡️ gRPC shines for internal microservice communication — fast, strongly typed, and efficient.

If you're scaling a distributed system, it's worth experimenting with gRPC for high-frequency internal calls. Would you like me to share how we handled load balancing and observability with gRPC next? 👀

#JavaDeveloper #gRPC #Microservices #SpringBoot #SystemDesign #BackendArchitecture #APIDesign
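As an illustration of the contract-first point, a .proto for the Payment hop might look like this; the service, message, and field names are invented for the example, not taken from the post.

```protobuf
syntax = "proto3";

package internal.payment.v1;

// Hypothetical contract for the User -> Payment hop; clients generated
// for Node.js, Java, and Go all share these exact message shapes.
service PaymentService {
  // Unary call replacing a REST POST /payments endpoint.
  rpc CreatePayment(CreatePaymentRequest) returns (CreatePaymentResponse);
  // Bidirectional stream for real-time status updates.
  rpc StreamStatus(stream StatusRequest) returns (stream StatusUpdate);
}

message CreatePaymentRequest {
  string user_id = 1;
  int64 amount_cents = 2;
  string currency = 3;
}

message CreatePaymentResponse {
  string payment_id = 1;
  // New fields get new tags; old clients ignore them, which is what
  // makes rolling updates backward compatible.
  string status = 2;
}

message StatusRequest {
  string payment_id = 1;
}

message StatusUpdate {
  string payment_id = 1;
  string status = 2;
}
```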
-
article -> https://www.anthropic.com/engineering/code-execution-with-mcp