Authored by Nicholas Gould, XMPro Solutions Architect
Claude Code is genuinely impressive. In a few hours, a technical user can build an agent that observes data, reflects on patterns, creates plans, and takes actions. Add a vector database for memory, connect it to an LLM, and you have something that looks remarkably like what we have built in our XMPro Multi-Agent Generative System (MAGS).
So why would anyone buy a platform when they could build one?
MAGS is our framework for deploying AI agents in industrial environments, built on DataStreams, our real-time data orchestration platform with 150+ industrial connectors. These aren't chatbots or copilots. They're autonomous agents that monitor operations, reason about problems, coordinate with each other, and take actions within defined boundaries, 24/7, across hundreds of assets.
MAGS is "~90% business process intelligence and only ~10% LLM utility." That's a bold claim. This article examines what that 90% actually contains... and what you'd need to build if you went the DIY route.
I've spent the past year at XMPro working alongside the architects of MAGS, participating in implementation decisions, talking with partners and large industrial clients about their AI and agentic requirements. I've also built agent prototypes using Claude Code and similar tools.
That combination, seeing what clients actually need in production while knowing what's achievable with modern agentic tools, is what prompted this article.
The gap between a working demo and a production system running 24/7 across hundreds of assets is where the real engineering happens. That gap represents approximately 90% of what makes industrial AI agents actually work... and it's almost entirely invisible until you try to cross it.
Respecting the Tool
The capabilities of modern agentic coding tools deserve proper acknowledgment.
Here's what someone with a clear problem and strong technical knowledge can absolutely build in days or weeks:
- Cognitive loops that observe, reflect, plan, and act
- Memory systems using vector and graph databases
- Multi-agent coordination with message passing and shared state
- Tool integrations calling APIs, querying databases, triggering actions
- Sophisticated reasoning chains that handle complex decisions
- Working prototypes that genuinely impress stakeholders
This isn't hypothetical. If you've used Claude Code or similar tools to build anything of real complexity, and many of you have, you know what's achievable. The question isn't whether you can build industrial AI agents with modern tools.
The question is what happens on Day 2. And Day 200. And when the auditor arrives.
The Demo vs. Production Gap
It's easy to build an agent that works in demo conditions: selected data, predictable scenarios, controlled environment, known failure modes. The agent performs beautifully because it's operating in conditions designed for it to succeed.
Production is different.
Production means:
- 2am on a Saturday, when the OPC UA server drops connection mid-subscription
- Partial data, because three sensors are offline and the historian is running 47 seconds behind
- Conflicting objectives, when the energy optimisation agent and the throughput agent both want to control the same variable
- Audit requirements, where "the AI decided" isn't an acceptable answer
- Staff turnover, when the developer who built it leaves and nobody understands the memory retrieval logic
- Scale, when you need the same agent logic running on 400 pumps across 12 sites with different configurations
This is where the 90% lives.
The Five Layers of Hidden Complexity
XMPro's MAGS codebase contains over 30,000 lines of functional code. Less than 10% handle LLM integration. The remaining 90% implements business process capabilities that have nothing to do with language models.
What are those tens of thousands of lines doing?
1. Industrial Protocol Reality
Connecting to OPC UA isn't an API call. It's security modes, session management, subscription handling, browse paths across vendor-specific implementations, and error recovery when the server restarts mid-read. XMPro DataStreams supports 150+ connectors, each representing months of development and years of edge-case discovery.
You could build an OPC UA connector in a week. You'd spend the next six months discovering why it fails in production.
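To make the recovery problem concrete, here is a minimal sketch of the kind of reconnect logic a production connector needs everywhere a demo assumes a stable session. It is deliberately generic rather than tied to any specific OPC UA library; `connect` stands in for whatever session-establishment call your stack uses.

```python
import random
import time


def connect_with_backoff(connect, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry a flaky industrial connection with exponential backoff.

    `connect` is any callable that raises ConnectionError on failure
    and returns a session object on success. `sleep` is injectable so
    the logic can be tested without real delays.
    """
    for attempt in range(max_retries):
        try:
            return connect()
        except ConnectionError:
            # Exponential backoff with jitter avoids a thundering herd
            # of reconnects when many clients drop at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)
    raise ConnectionError(f"gave up after {max_retries} attempts")
```

Even this sketch omits most of what production demands: resubscribing monitored items after the session returns, detecting stale data during the outage, and distinguishing a server restart from a network partition.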
2. Separation of Control
"The agent can think, plan, and request... but the DataStream determines what actually happens."
This sounds simple. Implementing it so an agent genuinely cannot bypass the safety layer, regardless of prompt injection, memory manipulation, or tool abuse, is not. Proving that to a regulator is harder still.
XMPro puts this boundary at the infrastructure level. MAGS agents connect to DataStreams, which determine whether and how actions execute. The agent never touches execution code directly. This is the difference between "we have guardrails" and "we can demonstrate to an auditor that the AI cannot bypass controls."
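The shape of that boundary can be sketched in a few lines. This is an illustration of the pattern, not MAGS internals: the agent can only construct requests, while a gateway it never holds a reference into decides whether anything executes. The asset names and limits are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    asset: str
    setpoint: str
    value: float


class ExecutionGateway:
    """Infrastructure-level control boundary (sketch).

    Agents submit requests; only this gateway holds the executor,
    so no prompt injection or tool abuse can reach execution directly.
    """

    # Hypothetical authorised setpoints and safe operating ranges.
    LIMITS = {("pump-07", "speed_rpm"): (600.0, 1800.0)}

    def __init__(self, executor):
        self._executor = executor  # never exposed to agent code

    def submit(self, req: ActionRequest):
        key = (req.asset, req.setpoint)
        if key not in self.LIMITS:
            return ("rejected", "setpoint not authorised")
        lo, hi = self.LIMITS[key]
        if not (lo <= req.value <= hi):
            return ("rejected", f"value outside [{lo}, {hi}]")
        self._executor(req)
        return ("executed", None)
```

The point of the pattern is architectural: because the limits live outside the agent process, you can show an auditor the single choke point through which every action must pass.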
3. Consensus and Coordination
What happens when Agent A wants to reduce pump speed for energy efficiency while Agent B wants to increase it for throughput? What if Agent C's maintenance plan depends on Agent A's decision?
Production multi-agent systems require formal consensus protocols, Byzantine fault tolerance for safety-critical decisions, deadlock detection, and conflict resolution across draft plans. MAGS implements seven consensus protocols with specific use cases, from simple majority to Byzantine consensus requiring 3f+1 agents to tolerate f failures.
Basic voting might be an afternoon of development with Claude Code. Production-grade consensus that handles network partitions and agent failures is months of engineering and expertise.
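The 3f+1 arithmetic is worth seeing directly. A minimal sketch of the quorum check (not a full Byzantine protocol, which also needs message authentication, view changes, and partition handling): with up to f faulty agents, a value is only safe to commit when at least 2f+1 of n >= 3f+1 agents agree.

```python
from collections import Counter


def byzantine_quorum_decide(votes, f):
    """Return the agreed value, or None if no safe quorum exists.

    Tolerating f arbitrary (Byzantine) failures requires n >= 3f+1
    agents and a matching quorum of at least 2f+1 votes.
    """
    n = len(votes)
    if n < 3 * f + 1:
        raise ValueError(f"need at least {3 * f + 1} agents to tolerate {f} faults")
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= 2 * f + 1 else None
```

With four agents tolerating one fault, three matching votes commit a decision; a 2-2 split returns no decision, which is exactly the deadlock case a production system must then detect and resolve.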
4. Memory Systems
Every agent framework has "memory." Most implementations amount to storing embeddings and retrieving by similarity.
Production memory systems require significance calculation (not every observation deserves storage), memory decay (following the Ebbinghaus forgetting curve), separation of memory types (episodic, semantic, procedural), and synthetic memory generation for edge cases you can't safely replicate in production. MAGS uses polyglot persistence: vector databases for semantic search, graph databases for relationships, and time-series databases for temporal data, because different data types need different storage.
You might build basic memory (store embeddings, retrieve by similarity) in a day. A production memory system that calculates significance, manages decay, and maintains consistency across multiple databases is months of engineering and expertise.
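As one small piece of that, here is a sketch of forgetting-curve decay, using the standard Ebbinghaus form R = exp(-t/S), where S is a per-memory stability term. The memory record shape and eviction threshold are hypothetical illustrations, not the MAGS schema.

```python
import math


def retention(elapsed_hours, strength):
    """Ebbinghaus forgetting curve: R = exp(-t/S).

    Higher `strength` (e.g. from a significance calculation at storage
    time) makes a memory decay more slowly.
    """
    return math.exp(-elapsed_hours / strength)


def should_evict(memory, now_hours, threshold=0.05):
    # Hypothetical record shape: {"t": created_at_hours, "strength": S}.
    return retention(now_hours - memory["t"], memory["strength"]) < threshold
```

Decay is only one dimension; a production system also has to decide significance before storage and keep the vector, graph, and time-series stores consistent with each other.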
5. Governance That Satisfies Regulators
This is where many DIY approaches fall apart entirely.
MAGS implements Deontic Logic, a formal framework with five rule types: Obligation (must do), Permission (may do), Prohibition (must not do), Conditional, and Normative. These rules are enforced at runtime, not just documented. Every agent action is validated against the deontic framework before execution.
When an auditor asks "how do you ensure the AI doesn't exceed its authority?", the answer isn't "we trained it not to." The answer is: formal deontic rules validated at runtime with complete audit trails. That's what Bounded Autonomy actually means.
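The essential mechanics of runtime deontic validation can be sketched briefly. This is an illustration of the pattern, not the MAGS rule schema: prohibitions veto, explicit permissions allow, anything unmatched is denied by default, and every matched rule lands in an audit trail.

```python
from enum import Enum


class Modality(Enum):
    OBLIGATION = "must"
    PERMISSION = "may"
    PROHIBITION = "must_not"


def validate(action, rules):
    """Validate an action against deontic rules before execution.

    `rules` is a list of (modality, predicate, rule_id) triples
    (hypothetical shape). Returns (allowed, audit_trail).
    """
    matched = [(mod, rid) for mod, pred, rid in rules if pred(action)]
    audit = [rid for _, rid in matched]
    if any(mod is Modality.PROHIBITION for mod, _ in matched):
        return False, audit          # prohibitions always veto
    if any(mod is Modality.PERMISSION for mod, _ in matched):
        return True, audit           # explicitly permitted
    return False, audit              # deny by default, with evidence
```

The audit trail is the point: every decision, allowed or denied, can name the exact rules that produced it.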
The Operational Burden
Let's talk about what happens after deployment.
This table isn't about capability. It's about where you want to invest your engineering effort, and every hour spent maintaining connectors, consensus mechanisms, and governance frameworks is an hour not spent on your actual business problems.
When DIY Actually Makes Sense
I'm not arguing that you should never build your own agent systems. There are legitimate cases where DIY is the right choice:
Research and experimentation: When you're exploring what's possible, speed of iteration matters more than production hardening.
Truly unique use cases: If your problem genuinely has no existing patterns and requires novel approaches, building from scratch may be necessary.
Non-safety-critical applications: When wrong decisions don't cause equipment damage, safety incidents, or regulatory violations, the governance burden is lower.
Deep in-house expertise: If your organisation has both strong AI engineering capability AND deep industrial domain expertise, you can potentially build and maintain production systems.
Budget constraints: Sometimes no commercial platform fits the budget, and a leaner purpose-built solution is the only option.
Single-purpose, limited-scope applications: One agent, one asset, one use case, the complexity multiplication that comes with scale doesn't apply.
Be honest about which category you're in.
The Build vs. Buy Framework
When deciding whether to build or buy, ask yourself:
- Scale: How many assets, sites, and use cases will this eventually cover?
- Consequence: What happens if the system makes a wrong decision at 2am with no human oversight?
- Compliance: What does the auditor need to see, and how will you produce it?
- Continuity: Who maintains this when the original developer leaves?
- Core competency: Is building AI agent infrastructure your business, or is it applying AI agents to your business?
The answers usually make the decision clear.
Conclusion
We didn't start building MAGS last year when AI agents became fashionable. At XMPro, we've spent fifteen years building business process coordination platforms for industrial environments. The agent capabilities are built on top of DataStreams... a real-time data orchestration platform with 150+ connectors, proven at scale across mining, energy, and manufacturing.
Claude Code and similar tools have genuinely transformed what technical users can build quickly. A working agent demo is now achievable in days. But the gap between demo and production, between single-asset prototype and fleet-wide deployment, is where the real complexity lives.
That complexity doesn't disappear because you build it yourself. It just becomes your complexity to maintain.
The decision isn't really build vs. buy. It's: where do you want to invest your engineering effort?
