How Can MCP Agents Run a Smart City Grid in Real Time?

Updated on June 1, 2026

Read — 7 minutes

Model Context Protocol agents can run parts of a smart city’s energy grid by reading from EMS, DERMS, weather feeds, and market platforms through standardised MCP servers, then proposing actions back into those systems through audited tool calls. The hard real-time control loop stays on physical controllers. The coordination loop, the part that today still relies on a human in the control room, is what shifts.

That distinction matters. Most utilities have spent the last decade automating individual subsystems and stitching them together with custom APIs. The agentic AI layer changes how those subsystems talk to each other. Anthropic released MCP as an open standard on 25 November 2024, and within a year, it became the default integration shape for production AI tooling. Bringing Model Context Protocol agents into the grid is the next architectural step, and the first European pilots are already running.

What Is an MCP-Enabled Agent Network in a Grid Context?

Five agents, one grid

An MCP-enabled agent network is a set of large language model agents that read from and write to grid systems through Model Context Protocol servers, with one orchestration agent planning across them. Each MCP server wraps a backend with a typed schema, an authorisation contract, and an audit trail.

The specification defines three server primitives, Tools, Resources, and Prompts, all callable through JSON-RPC. In twelve months, MCP moved from a developer convenience to the integration default for agentic AI, with pre-built servers for systems ranging from Postgres to GitHub.
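As a rough illustration of what "callable through JSON-RPC" means on the wire, the sketch below builds a `tools/call` request as a JSON-RPC 2.0 message. The tool name `read_meter` and its `meter_id` argument are hypothetical and chosen only for this example.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialise an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "read_meter", {"meter_id": "MTR-0001"})
decoded = json.loads(request)
print(decoded["method"], decoded["params"]["name"])
```

In practice an MCP client library would build and send this message, so the sketch only shows the shape of the envelope.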

This matters for a multi-agent system on the grid. REST APIs and Kafka topics already connect grid systems, but every new integration still needs a custom client, custom auth, and custom serialisation. Swap one vendor, and the wiring fractures. MCP fixes the shape of that wiring once, and any agent that speaks the protocol can call any tool.

The agents themselves are split by function. A pattern emerging in early pilots looks like this:

  • Forecasting agents take weather, calendar, and historical load data and produce day-ahead and intraday demand and generation profiles.
  • Dispatch agents read those forecasts and propose set points for distributed energy resources, battery storage, and demand response programmes.
  • Market agents talk to day-ahead and intraday platforms such as Nord Pool, EPEX SPOT, and Elexon BSC, and place flexibility bids.
  • Fault-isolation agents watch SCADA alarms, correlate them with topology and weather, and propose switching schemes.
  • A planner agent coordinates the others, brokers conflicts, and surfaces decisions for operator approval.

This is the same architectural move described in our piece on agents replacing REST controllers. What used to be a chatbot wrapper becomes the reasoning layer between systems, and a multi-agent system pattern carries the load that monolithic schedulers could not.

The Reference Architecture for Grid Agents

The reference architecture for an MCP-enabled grid agent network has three layers: MCP servers that wrap each backend, agents that reason and plan, and a coordination fabric that lets agents talk to each other.

MCP servers wrap the legacy stack

Every grid system gets one MCP server: SCADA over IEC 60870-5-104, ADMS such as Siemens Spectrum Power or GE Vernova PowerOn, DERMS, AMI head-end systems running DLMS/COSEM, weather and forecast feeds, and market platforms. Each server exposes typed Tools (read meter, send curtailment signal, query forecast) and Resources (network topology, customer flex contract terms). The wrapper is thin. The grid system itself does not change.
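A minimal sketch of what a "typed Tool" can look like, written in plain Python rather than with any MCP SDK. The `Tool` dataclass, the `read_meter` handler, and the dummy reading are all assumptions made for illustration. The input schema follows JSON Schema conventions, which is how MCP tool definitions describe their arguments.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    input_schema: dict            # JSON Schema describing the arguments
    handler: Callable[..., dict]
    writes: bool = False          # True if the tool changes backend state


def read_meter(meter_id: str) -> dict:
    # Stand-in for a call to the AMI head-end; the reading is a dummy value.
    return {"meter_id": meter_id, "kwh": 12.5}


READ_METER = Tool(
    name="read_meter",
    input_schema={
        "type": "object",
        "properties": {"meter_id": {"type": "string"}},
        "required": ["meter_id"],
    },
    handler=read_meter,
)


def call_tool(tool: Tool, arguments: dict) -> dict:
    """Check required arguments and string typing, then dispatch to the handler."""
    for key in tool.input_schema.get("required", []):
        if key not in arguments:
            raise ValueError(f"missing required argument: {key}")
    for key, value in arguments.items():
        spec = tool.input_schema["properties"].get(key)
        if spec is None:
            raise ValueError(f"unexpected argument: {key}")
        if spec["type"] == "string" and not isinstance(value, str):
            raise TypeError(f"{key} must be a string")
    return tool.handler(**arguments)


result = call_tool(READ_METER, {"meter_id": "MTR-0001"})
print(result)
```

The point of the sketch is that the backend is untouched: the wrapper only validates arguments and forwards the call.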

Agents reason, plan, and act

Orchestration frameworks such as LangGraph, AutoGen, and CrewAI host the agent population. Each agent uses a base model with grid-domain prompts and a constrained set of MCP tools. Token budgets stay small because each agent only sees the schemas it needs. We use the same pattern in agentic AI for utilities and other regulated industries, and it transfers well because the audit and authorisation primitives are already in place.
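One way to picture the "constrained set of MCP tools" is a per-agent allow-list applied before any schema reaches the model. The sketch below assumes hypothetical agent and tool names and is not tied to LangGraph, AutoGen, or CrewAI.

```python
# Hypothetical allow-list: which tool names each agent is permitted to see.
AGENT_TOOLS = {
    "forecasting": {"query_forecast", "read_weather"},
    "dispatch": {"query_forecast", "send_curtailment_signal"},
    "market": {"query_forecast", "submit_bid"},
}

# Hypothetical catalogue of tool schemas published by the MCP servers.
CATALOGUE = {
    "query_forecast": {"type": "object"},
    "read_weather": {"type": "object"},
    "send_curtailment_signal": {"type": "object"},
    "submit_bid": {"type": "object"},
}


def visible_tools(agent: str, catalogue: dict) -> dict:
    """Return only the schemas this agent may see; unknown agents get nothing."""
    allowed = AGENT_TOOLS.get(agent, set())
    return {name: schema for name, schema in catalogue.items() if name in allowed}


print(sorted(visible_tools("market", CATALOGUE)))
```

Filtering the catalogue this way keeps prompts short and also means an agent cannot call a write tool it was never shown.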

Agent-to-agent coordination

Cross-agent communication uses standardised patterns rather than ad-hoc message passing. Our colleagues covered the trade-offs in the A2A protocol for agent collaboration. In the grid context, A2A handles the negotiation between the dispatch agent (wants to charge the battery now) and the market agent (sees a profitable export window in two hours). The planner agent breaks ties using a policy that the operator can read in plain English.
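The tie-break can be sketched as a small, readable rule. The policy text, the `Proposal` fields, and the euro values below are assumptions for illustration and do not describe any specific planner implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    agent: str
    action: str
    expected_value_eur: float
    relieves_constraint: bool  # True if the action relieves a network constraint


# Plain-English policy the operator can read; the code below mirrors it.
POLICY = (
    "Prefer the proposal that relieves a network constraint; "
    "otherwise prefer the higher expected value."
)


def break_tie(a: Proposal, b: Proposal) -> Proposal:
    """Apply POLICY to two conflicting proposals and return the winner."""
    if a.relieves_constraint != b.relieves_constraint:
        return a if a.relieves_constraint else b
    return a if a.expected_value_eur >= b.expected_value_eur else b


charge_now = Proposal("dispatch", "charge battery now", 40.0, False)
export_later = Proposal("market", "hold charge, export in two hours", 120.0, False)
winner = break_tie(charge_now, export_later)
print(winner.agent)
```

Keeping the rule this small is a design choice: the operator should be able to check the code against the sentence in `POLICY` at a glance.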

Where MCP Agents Already Operate on the Grid

Several production-scale AI energy management platforms already run agentic workloads on European and US grids, and some were doing so before MCP existed. Full MCP-native deployments on transmission grids are still ahead. The architecture pattern is real, and four references show the shape of what is coming.

Google DeepMind on wind portfolios

DeepMind applied a neural network to 700 MW of Google’s wind farms in the US central plains, forecasting power output 36 hours ahead. The result was a roughly 20 per cent increase in the value of that wind energy, achieved by making the supply schedulable rather than purely intermittent. The forecasting layer is exactly the kind of work a forecasting agent will own in an MCP network.

Octopus Energy and the Kraken platform

Kraken is the AI-native platform that runs Octopus Energy’s retail and flexibility operations. The platform is now licensed to 40 utilities across 27 countries, covering more than 50 million customer accounts. Licensees include EDF Energy (5 million UK customers migrated), E.ON and npower (10 million combined), Good Energy, and Origin Energy in Australia. Kraken is the largest live example of agent-style automation in the European energy retail and balancing markets.

HEDGE-IoT, Horizon Europe

The HEDGE-IoT project runs from January 2024 to June 2027 with six large-scale demonstrators in Finland, Greece, Italy, the Netherlands, Portugal, and Slovenia. The framework deploys federated AI/ML at the grid edge and in the cloud, with explicit work on cross-system orchestration. Successor projects under Horizon Europe are already specifying MCP-style interfaces in their architecture documents.

Alliander and Siemens Gridscale X

The largest Dutch DSO, Alliander, signed a strategic partnership with Siemens to deploy Gridscale X as a digital twin of the LV network. The vendor estimates a 10 to 30 per cent uplift in usable grid capacity through flexibility orchestration. This is the platform layer that grid agents will sit on top of, not replace.

National Grid ESO (now the National Energy System Operator, NESO) is moving in the same direction with smaller pieces. The operator has partnered with Open Climate Fix on satellite-based solar nowcasting and with the Alan Turing Institute on machine learning for balancing.

How Real-Time Works Across the Grid Stack

Real-time in a smart city grid is not one tier. It is four, and only one of those tiers belongs to Model Context Protocol agents.

On a marketing slide, real time means now. The grid uses a much wider taxonomy.

  • Sub-cycle (under 4 ms): Protection relays, IEC 61850 GOOSE messaging, and fault clearing. Hard-wired physics. Not an agent decision.
  • A few seconds: Automatic Generation Control, primary frequency response. Hard-coded controllers run the loop. An agent may advise, never dictate.
  • 5 to 60 seconds: Voltage regulation, DER set point updates, and congestion response on the LV feeder. Agents propose. Controllers execute through MCP write tools, gated by policy.
  • Minutes to hours: Demand response activation, EV charging coordination, market bid submission, fault isolation planning, and day-ahead procurement. This is where MCP agents live.

The taxonomy decides what the agents are allowed to touch. The team behind our recent piece on AI agents processing industrial IoT streams walked through the same latency budgets for factory floors. Grids are stricter because the consequence of a wrong write is a blackout and a black start, not a misshipped pallet.

A working rule of thumb: if the response time is short enough that a human cannot reasonably intervene, agents stay out of the actuation path. They observe, they explain, they propose. Set point execution stays with the physical controller, and the agent layer never bypasses it.
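The four tiers can be expressed as a simple gate on the response window. The thresholds below are taken from the tiers above, except the 5-second boundary between "advise" and "propose", which is an assumption used to close the gap between "a few seconds" and "5 to 60 seconds".

```python
def agent_role(response_window_s: float) -> str:
    """Map a control loop's response window (in seconds) to the agent's permitted role."""
    if response_window_s < 0.004:
        return "none"        # sub-cycle protection: not an agent decision
    if response_window_s < 5:
        return "advise"      # AGC and primary frequency response: advise only
    if response_window_s <= 60:
        return "propose"     # controllers execute, agents propose under policy
    return "coordinate"      # minutes to hours: where MCP agents live


for window in (0.002, 2, 30, 900):
    print(window, agent_role(window))
```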

Governance, Compliance, and OT Cybersecurity

Any AI system used as a safety component in the management of electricity supply is classed as high-risk under the EU AI Act. That places almost every grid agent in scope.

Annex III, section 2, covers AI in the management and operation of critical digital infrastructure, including electricity. The European Commission has pushed the substantive compliance deadline for Annex III systems to 2 December 2027, but the requirements themselves are unchanged. Risk management, data governance, technical documentation, human oversight, conformity assessment, and registration in the EU database all apply. Penalties run to €15 million or 3 per cent of global turnover. Our piece on EU AI Act compliance for AI agents covers the documentation burden in detail.

NIS2 (Directive (EU) 2022/2555) sits alongside the AI Act. Energy operators are essential entities, so incident reporting, supply chain audits, and management accountability for cyber resilience all bite. The transposition deadline passed in October 2024, and several member states are now actively enforcing.

For the OT layer itself, IEC 62443 remains the baseline. MCP servers that touch SCADA must sit on the right side of the Purdue model boundary, with network segmentation, signed binaries, and zero implicit trust between agent layers and physical controllers. Our work on OT cybersecurity controls for utility clients tends to start with a Purdue model audit before any AI conversation begins.

Three practical implications fall out of this:

  • Every MCP tool call to a write endpoint is logged with the caller identity, timestamp, model version, and policy reference.
  • Human-in-the-loop approval is mandatory for any agent action that changes a physical set point. Our blog post on human-in-the-loop for high-stakes AI gives the rationale.
  • Agent model cards are stored in the same documentation lifecycle as the rest of the OT change management.
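The first two implications can be sketched together as an audit record that refuses to be created for a set point change without a named approver. The field names, identifiers, and the `record_write` helper are hypothetical and shown only to make the shape of the log entry concrete.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    caller: str                 # identity of the calling agent
    timestamp: str              # UTC time in ISO 8601 format
    model_version: str
    policy_ref: str
    tool: str
    approved_by: Optional[str]  # operator who approved, if approval was needed


def record_write(caller: str, model_version: str, policy_ref: str, tool: str,
                 changes_set_point: bool,
                 approved_by: Optional[str] = None) -> AuditRecord:
    """Build the audit entry for a write call, enforcing human approval first."""
    if changes_set_point and approved_by is None:
        raise PermissionError("set point changes need operator approval")
    return AuditRecord(
        caller=caller,
        timestamp=datetime.now(timezone.utc).isoformat(),
        model_version=model_version,
        policy_ref=policy_ref,
        tool=tool,
        approved_by=approved_by,
    )


# Hypothetical identifiers, for illustration only.
entry = record_write("dispatch-agent", "model-v1", "POL-007",
                     "send_curtailment_signal", changes_set_point=True,
                     approved_by="operator-17")
print(entry.tool, entry.approved_by)
```

A real deployment would write this entry to tamper-evident storage before the tool call is forwarded, which the sketch leaves out.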

Where Smart-Grid Pilots Usually Break

Smart grid AI pilots fail more often than presentations suggest. Five failure modes show up across the deployments we have reviewed.

Schema drift in legacy SCADA

Tag names change during a routine ADMS upgrade, and the MCP server returns the wrong field. The agent’s reasoning chain still produces plausible output, which makes the bug harder to spot than a hard crash.
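One possible guard is to validate every payload against the fields the MCP server expects, so a renamed tag fails loudly instead of flowing into the agent's reasoning. The field names below are hypothetical.

```python
# Hypothetical contract for one SCADA-backed resource: field name -> expected type.
EXPECTED_FIELDS = {"feeder_id": str, "voltage_kv": float, "breaker_closed": bool}


def check_payload(payload: dict) -> list:
    """Return a list of schema problems; an empty list means the payload matches."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected_type):
            problems.append(f"wrong type for {name}")
    for name in payload:
        if name not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {name}")
    return problems


# After a hypothetical upgrade, `voltage_kv` has been renamed to `volt_kv`.
drifted = {"feeder_id": "F-12", "volt_kv": 10.5, "breaker_closed": True}
print(check_payload(drifted))
```

Rejecting the payload at the server boundary turns a silent reasoning error into a visible integration error.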

Forecast confidence collapses into policy

A dispatch agent receives a p50 demand forecast and treats it as deterministic. Storms arrive. The agent has no learned posture for high-uncertainty regimes, and the operator gets a confident, wrong recommendation at the worst possible moment.
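A simple mitigation is to pass the forecast spread along with the p50 and escalate when the spread is wide. The quantile names and the 0.3 threshold in this sketch are assumptions, not values from any deployment.

```python
def recommendation_mode(p10_mw: float, p50_mw: float, p90_mw: float,
                        max_relative_spread: float = 0.3) -> str:
    """Decide whether the agent may recommend or must escalate to the operator."""
    if p50_mw <= 0:
        raise ValueError("p50 forecast must be positive")
    # Relative width of the p10-p90 band, used as a crude uncertainty measure.
    spread = (p90_mw - p10_mw) / p50_mw
    if spread > max_relative_spread:
        return "escalate_to_operator"
    return "recommend"


print(recommendation_mode(90.0, 100.0, 110.0))   # calm day, narrow band
print(recommendation_mode(60.0, 100.0, 150.0))   # storm, wide band
```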

Latency jitter under storm load

A coordination pattern that holds at 100 ms tail latency breaks at 800 ms. The fault-isolation agent times out, the planner picks a stale state, and the demand response activation lands ten minutes late.
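A staleness guard is one way to stop the planner acting on old state. The sketch below assumes each state snapshot carries a timestamp in seconds, and the 5-second limit is an illustrative value only.

```python
def plan_or_hold(snapshot_time_s: float, now_s: float, max_age_s: float = 5.0) -> str:
    """Return 'plan' if the state snapshot is fresh enough, otherwise 'hold'."""
    age_s = now_s - snapshot_time_s
    if age_s < 0:
        raise ValueError("snapshot timestamp is in the future")
    return "plan" if age_s <= max_age_s else "hold"


print(plan_or_hold(snapshot_time_s=100.0, now_s=100.8))  # 0.8 s old
print(plan_or_hold(snapshot_time_s=100.0, now_s=112.0))  # 12 s old
```

Holding and asking for a fresh snapshot costs a little time, while planning on stale state risks an activation that no longer matches the network.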

Operator trust collapses on the first bad call

One wrong recommendation during a real event, and the control room desk turns auto-mode off for the next two weeks. Trust is recovered slowly and only with transparent post-event reviews backed by operational dashboards for grid teams.

Cultural distance between data science and the control room

The model performs well on validation data and badly on the desk’s intuition. If the data team does not sit close enough to the operators to hear those reactions, the pilot dies politely.

Picture a Nordic DSO running 250,000 smart meters and a young agentic-AI pilot for low-voltage congestion. The model works on paper. On the third storm of the winter, the dispatch agent issues a curtailment recommendation that conflicts with a local flex contract the agent never saw. The operator overrides, calls the vendor, and the pilot enters a six-week governance review. None of these failure modes is surprising. All of them are avoidable with the right discovery work upfront.

What Leaders Should Remember

Model Context Protocol agents give grid operators a way to stop hand-writing integration code for every new AI use case. The agent network sits one layer above SCADA, ADMS, and DERMS, coordinating across them rather than displacing them.

Three working principles hold up across the early deployments. Slow loops sit with the agents, while fast ones stay with the controllers. Every write is audited, gated, and approved by a human at the boundary that matters. The architecture follows the existing OT stack, not the model.

Utilities that get this right in the next three years will spend less on integration, less on operator burnout during grid emergencies, and considerably less on the certification paperwork that the AI Act and NIS2 will eventually demand. Those that delay will be paying for both the legacy wiring and the agentic layer at the same time. If you want to test where your stack sits, start with a discovery session, and we will walk through the architecture with your team.

FAQ

What is the Model Context Protocol, and why does it matter for grid operators?

Model Context Protocol agents rely on an open standard released by Anthropic in November 2024 for connecting AI to data and tools through a typed JSON-RPC interface. For a grid operator, that protocol replaces dozens of bespoke integration jobs with one repeatable wrapper pattern, which is the difference between an AI roadmap that scales and one that lives on a single team’s shoulders.

Can MCP agents replace SCADA or DERMS?

No, and that is the point. SCADA, ADMS, and DERMS handle the deterministic, real-time control of the grid. Agents read from those systems and write back through audited tool calls, but never bypass them. Treating MCP as a coordination layer above existing OT is what keeps the architecture defensible during cyber audits and AI Act conformity assessments.

Is running AI agents on a live energy grid compliant with the EU AI Act?

It is compliant when the agents are documented as high-risk systems under Annex III, section 2, with risk management, human oversight, technical documentation, and EU database registration in place. The substantive deadline for Annex III is now 2 December 2027, but the documentation work takes the better part of a year. Start the conformity assessment well before the deadline.

What is the difference between MCP and A2A for grid systems?

MCP standardises how agents talk to tools and data sources. A2A standardises how agents talk to each other. A grid agent network uses both, with MCP wrapping every SCADA, DERMS, weather, and market backend, and A2A handling the negotiation between forecasting, dispatch, and market agents.

How long does it take to pilot an MCP agent network in a utility?

A meaningful pilot, scoped to one feeder or one flexibility programme, typically runs eight to fourteen weeks once the OT cybersecurity controls and AI Act documentation are agreed. Wider rollouts depend more on change management with the control room desk than on the engineering work itself.
