
9 min read
Explore the following topics in depth at o9’s two regional AI Summits for 400+ forward-thinking enterprise planning leaders and professionals: aim10x Europe in Amsterdam on June 4th and aim10x Americas in Chicago on September 23rd. Attendance is free but space is limited.
Let me describe a scene.
A planning team is shown a new AI-powered tool. The model behind it is impressive. It can ingest vast quantities of data, identify patterns no human could spot, and generate recommendations in seconds. The team is intrigued.
Then someone asks: "Why is it recommending this?" And the answer is, effectively, "Because the model says so."
That is the moment adoption dies. Not because the recommendation is wrong. It might be perfectly correct. But a supply chain planner who is accountable for millions of dollars in inventory cannot stake their reputation on a black box. A demand manager who must defend a forecast to the CFO needs to explain the logic behind the numbers. A procurement leader negotiating with suppliers needs to understand why the system is flagging a particular risk.
This scene plays out in organisations around the world every day, and it reflects a deeper problem. As organisations race to embed AI into their operations, the conversation has been dominated by benchmarks, parameters, and performance scores. Which model is most accurate? Which architecture scores highest on a given test? These are important questions, but they miss the one that ultimately determines whether AI gets adopted or abandoned: do the people who must act on AI's recommendations actually trust it?
I believe that trust, not marginal accuracy gains, will be the deciding factor in AI adoption. And the path to trust runs not through ever-larger language models alone but through explainability, consistency, and alignment with business logic.
The Trust Deficit
The evidence that enterprise AI has a trust problem is now hard to ignore. According to MIT's NANDA initiative, roughly 95% of generative AI pilot programmes fail to deliver measurable impact on the bottom line. Meanwhile, S&P Global research found that 42% of companies abandoned the majority of their AI initiatives before reaching production in 2025, up from just 17% the previous year. These are not failures of technology. The models are more capable than ever. They are failures of adoption.
And the hallucination problem makes it worse. Large language models (LLMs) are remarkably capable, but they carry a well-documented risk: they generate responses that sound authoritative and precise but are, in fact, fabricated. Industry research suggests that nearly half of enterprise AI users have made at least one major business decision based on hallucinated content. In a consumer context, a hallucination might be a minor inconvenience. In an enterprise context, where decisions cascade across supply chains, financial plans, and contractual commitments, the consequences can be severe.
Even when an LLM does not hallucinate, it cannot show its working. It cannot point to the specific data relationships, business rules, and causal chains that led to a given output. It produces answers without audit trails. For a planner whose job depends on being right, and being able to prove why, that is simply not good enough.
Why Accuracy Alone Is Not the Answer
The instinct in much of the AI industry has been to solve this problem by making models more accurate. If the model is right 98% of the time instead of 95%, the thinking goes, people will trust it.
But this misunderstands how trust works in practice. Decades of research into human-automation interaction support this point. In their foundational review, Lee and See (2004) found that trust in automated systems is shaped not by raw performance alone but by how well people can understand what the system is doing, predict how it will behave, and assess whether its goals are aligned with their own. Their framework identifies three bases of trust: the system's performance, the process by which it operates, and the purpose it serves. When any of these are opaque, trust breaks down, regardless of how accurate the system might be.
This maps directly to what we see in enterprise AI today. Trust is a function of three things.
- The first is explainability: can I understand why the system reached this conclusion?
- The second is consistency: does it behave predictably, or does it give me a different answer every time I ask the same question?
- The third is alignment with business logic: does its reasoning reflect the way my business actually works, including our constraints, our policies, and our commercial relationships?
An LLM on its own struggles with all three. Its reasoning is opaque. Its outputs can vary with each prompt. And it has no inherent understanding of the business structures and rules that govern enterprise operations. You can fine-tune it, prompt-engineer it, and wrap it in guardrails, but you are still trying to retrofit trustworthiness onto a system that was not designed for it.
To be clear, LLMs represent a genuine breakthrough. Their ability to process unstructured data, interpret natural language, and learn from patterns is extraordinary. The question is not whether LLMs are powerful. It is what you pair them with.
What About RAG and Guardrails?
Some will argue that retrieval-augmented generation, prompt engineering, and output guardrails are sufficient to make LLMs trustworthy for enterprise use. These approaches help. RAG grounds model outputs in retrieved documents, which reduces hallucination. Guardrails catch outputs that fall outside acceptable bounds. Prompt engineering can steer the model toward more reliable responses.
But these are mitigation strategies, not structural solutions. RAG can retrieve the wrong document, or the right document with outdated information, and the model will still present its answer with full confidence. Guardrails catch problems after they occur; they do not prevent the model from generating flawed reasoning in the first place. And prompt engineering is inherently fragile: small changes in phrasing can produce meaningfully different outputs, which is precisely the consistency problem that erodes planner trust.
The core issue remains. These approaches do not give the system an understanding of how the business works. They do not encode causal relationships between demand signals and supply constraints. They do not embed the rules, policies, and commercial structures that a planner uses to evaluate whether a recommendation makes sense. They improve the output without changing the underlying architecture, and that means the trust gap persists.
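To make that asymmetry concrete, here is a deliberately minimal sketch in Python. The function name, bounds, and values are invented for illustration; this is not any vendor's API. A guardrail of this kind filters a finished answer, and nothing more.

```python
# A minimal, hypothetical guardrail sketch. The function name and
# bounds are invented for illustration; this is not a real vendor API.

def guardrail(recommended_order_qty: float, max_qty: float = 10_000) -> float:
    """Reject an out-of-bounds output after the fact.

    Note what this cannot do: it cannot say why the model produced
    the value, and it cannot stop flawed reasoning upstream. It only
    filters the finished answer.
    """
    if not (0 <= recommended_order_qty <= max_qty):
        raise ValueError(f"Rejected: {recommended_order_qty} outside [0, {max_qty}]")
    return recommended_order_qty

# An in-bounds value passes even if it was produced by a hallucinated
# chain of reasoning. Catching an outlier is not explaining a decision.
print(guardrail(4_500))  # accepted, no questions asked
```

An in-bounds but wrong recommendation sails straight through. Contrast this with the graph-grounded sketch in the next section, where the reasoning itself is what gets inspected.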
The Enterprise Knowledge Graph Changes the Equation
This is where a different approach is needed. Several paths are emerging to close the trust gap. The one we have invested in at o9, and the one I believe is most structurally sound, is the combination of symbolic AI and LLMs in what we call a neurosymbolic architecture.
We combine the strengths of LLMs with the strengths of symbolic AI: structured knowledge models, causal relationships, business rules, and constraint logic. The Enterprise Knowledge Graph is the backbone of this symbolic layer. It encodes the actual structure of an enterprise: how products relate to suppliers, how demand signals flow through channels, how capacity constraints interact with lead times, how financial targets connect to operational plans.
When an AI agent built on this architecture provides a recommendation, it is not generating a probabilistic guess. It is traversing a structured, auditable graph of real business relationships. And that means every output can be explained. Not in vague, after-the-fact rationalisation, but in precise, step-by-step reasoning that a planner can inspect, challenge, and validate.
This solves the explainability problem in a way that no amount of LLM fine-tuning can. When our system tells a planner that a stockout risk has increased, it can show exactly which demand signals shifted, which supply constraints tightened, and which inventory buffers are insufficient, all traced through the knowledge graph. The planner does not have to take it on faith. They can see the logic and verify it against their own expertise.
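To make this concrete, here is a minimal, hypothetical sketch in Python. The entities, relation names, numbers, and rule are invented for illustration; this is not o9's platform or data model. The pattern is the point: explicit relationships plus an explicit rule, with every traversal step recorded as an audit trail a planner can inspect.

```python
# A minimal, hypothetical sketch of graph-grounded, explainable
# reasoning. Entities, relations, and thresholds are invented for
# illustration; this is not o9's actual data model or API.

from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        self.edges = defaultdict(list)  # subject -> [(relation, object)]

    def add(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def neighbors(self, subject, relation):
        return [o for r, o in self.edges[subject] if r == relation]

def explain_stockout_risk(kg, product, demand, on_hand):
    """Apply a deterministic rule over explicit relationships,
    recording every step as an auditable trail."""
    trail = [f"{product}: projected demand {demand}, on-hand {on_hand}"]
    inbound = 0
    for supplier in kg.neighbors(product, "supplied_by"):
        for qty in kg.neighbors(supplier, "confirmed_inbound_qty"):
            inbound += qty
            trail.append(f"{supplier} -> confirmed inbound {qty}")
    coverage = on_hand + inbound
    trail.append(f"coverage = {on_hand} on-hand + {inbound} inbound = {coverage}")
    at_risk = coverage < demand
    trail.append(("rule fired: " if at_risk else "rule not fired: ")
                 + f"coverage {coverage} {'<' if at_risk else '>='} demand {demand}")
    return at_risk, trail

kg = KnowledgeGraph()
kg.add("SKU-42", "supplied_by", "SupplierA")
kg.add("SupplierA", "confirmed_inbound_qty", 300)

at_risk, trail = explain_stockout_risk(kg, "SKU-42", demand=1200, on_hand=700)
print("STOCKOUT RISK" if at_risk else "OK")
for step in trail:
    print(" -", step)
```

Because the relationships and the rule are explicit, rerunning with the same inputs produces the same conclusion and the same trail. That determinism is exactly the consistency property discussed next.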
It solves the consistency problem too. Because the symbolic layer operates on defined structures and rules rather than probabilistic generation, the same inputs produce the same outputs. Planners can rely on the system to behave predictably, which is a prerequisite for integrating it into operational workflows.
And it solves the alignment problem. The Enterprise Knowledge Graph is built in partnership with domain experts who encode the specific logic of each business: its policies, its constraints, its commercial relationships. The AI does not have to infer how the business works from patterns in data. It knows, because the knowledge has been explicitly modelled.
I should be honest about the trade-off. Building an Enterprise Knowledge Graph requires real investment in domain modelling, and it is not a trivial undertaking. It demands time from subject matter experts, careful design, and ongoing maintenance as the business evolves. But that investment is precisely what creates the structural trust that general-purpose models cannot provide. The difficulty of building it is, in a sense, the point: trust is earned through rigour, not convenience.
Trust Drives Adoption, Adoption Drives Value
The value of enterprise AI is not realised in a lab. It is realised when a planner trusts a recommendation enough to act on it. When a logistics team trusts an automated replenishment order enough to let it execute without manual review. When a leadership team trusts an integrated business plan enough to make investment decisions based on it.
When that trust is absent, even accurate systems fail. I have seen organisations deploy AI-powered planning tools with impressive models and polished dashboards, only to watch the planning team quietly rebuild the process in spreadsheets because they could not verify the AI's logic. Within a year, the tool is shelfware. The accuracy was never the problem. The trust was.
Now consider what happens when trust is built deliberately. One of the world's largest multinational brewers set out to increase touchless planning across its global operations. Rather than simply deploying a tool and hoping for adoption, they built a systematic feedback loop. Each planning cycle, the team examined variances from the previous cycle, identified bottlenecks still prone to manual intervention, and pinpointed root causes. Each cycle, targeted fixes were implemented so the platform absorbed more routine work and the plans it produced got better. They measured success not just by whether the system was live, but by whether planners were actually using it, whether plans held up in execution, and whether users would recommend it to colleagues.
The results followed from the trust, not the other way around. Touchless planning rates increased by 20%. Forecast accuracy improved by 11% at the monthly level. Planners freed up 30% of their time. The transformation generated over $100 million in value, while achieving a four-year low in inventory and a four-year high in service levels. The technology mattered, but it was the disciplined loop of transparency, validation, and continuous improvement that made adoption stick.
Three Recommendations for Leaders
If trust is the true bottleneck to AI adoption, and I believe it is, then the path forward requires more than investing in bigger models. Here are three things I would encourage every enterprise leader to do.
- First, treat explainability as a buying criterion. Before signing a contract or greenlighting a build, ask: can this system show a planner why it reached a recommendation, trace it back to specific data and business rules, and produce an audit trail that satisfies governance? If the answer is no, adoption will stall no matter how good the demo looks.
- Second, invest in modelling your business logic upfront. General-purpose AI tools do not understand your constraints, policies, or commercial relationships. Budget for domain modelling as a first-class workstream with dedicated experts, not a configuration exercise after go-live. The organisations that get this right build a compounding advantage: each planning cycle earns more trust and unlocks more autonomous decision-making.
- Third, join us at the aim10x Summits. Dr. Ashwin Rao, former Wall Street quant, Adjunct Professor of Applied Mathematics at Stanford University, and now EVP of AI Strategy and R&D at o9, will share his perspective on the future of neurosymbolic AI and its real-world impact on enterprise planning. We'll be at aim10x Europe in Amsterdam on June 4th and aim10x Americas in Chicago on September 23rd. Attendance is free, but space is limited, so register early and bring your team.
I look forward to seeing you there.

About the author

Igor Rikalo
President & COO at o9 Solutions
Igor Rikalo is the President and Chief Operations Officer of o9 Solutions. He oversees the global operations of the organization and plays an integral role in ensuring the business continues to scale at a global level. At o9, he has developed a successful track record of building high-performing teams, managing global strategic initiatives, and delivering strong business results.