How to Safely Operationalize AI: From Hype to Enterprise Value

Aiimi CTO Paul Maker shares his AI fundamentals for leaders who need to guide safe, impactful AI adoption in their enterprises, based on a session held at Navigate25 – a full-day event of AI and data governance thought leadership, hosted at Bletchley Park (UK) by Aiimi and Iron Mountain.

July 17, 2025 – 5 mins

Cutting Through the AI Hype

AI promises automation, insight, and innovation at scale. But amidst all the noise, what does it truly take to deploy artificial intelligence safely and effectively in enterprise environments?

Paul Maker, CTO at Aiimi, offers a clear-eyed perspective on operationalizing AI. Drawing on hands-on experience with real-world deployments, Paul advocates for a measured, structured approach grounded in transparency, explainability, and above all, safety.

The Twin Pillars: Explainability and Safety

At the heart of Paul’s message are two foundational principles: explainability and safety.

“In business, we can explain why we make decisions. So, if we’re going to divest some of that decision-making to AI, explainability must underpin it,” Paul states.

AI systems must not only deliver answers; they must do so in ways that can be clearly understood, verified, and trusted. This means organisations must build architectures that allow users to trace outputs back to source data, scrutinise model reasoning, and confirm accuracy through human oversight.

Equally, safety is paramount. This includes technical measures like data access controls but also involves organisational readiness like making sure employees are informed about how and when to use AI. Legal, security, and compliance teams – the “protectors of the realm”, as Paul calls them – must be brought into the design and deployment process from the outset.

RAG Over Model Memory: A Pragmatic Approach

One of the most practical insights shared by Paul is the preference for Retrieval-Augmented Generation (RAG) over model memory when deploying AI with enterprise data.

Early hype suggested that organisations could simply pour all of their data into large language models and receive brilliant results. But in reality, relying on model memory can be problematic, with a high risk of hallucination (when a model delivers a false result) because of the inevitable gaps in knowledge. In environments where information insights lead to critical business decisions, this is not just unhelpful – it's potentially catastrophic.

RAG, on the other hand, decouples the knowledge base from the model. It conducts a search for relevant information and then uses the model to generate outputs based on that specific, curated context. This allows organisations to:

  • Control and verify the source of truth
  • Reduce hallucination risk
  • Respect data permissions
  • Maintain explainability

“It’s like handing someone a book and asking them to summarise chapter two – you control the scope of knowledge that the model sees,” Paul explains.
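
To make the pattern concrete, here's a minimal sketch of the RAG loop in Python. It's illustrative rather than definitive: it assumes the openai v1.x SDK with an API key in the environment, invented documents, and example model names, and it uses a tiny in-memory index where a real deployment would use a proper search platform.

```python
# Minimal RAG sketch: retrieve relevant passages first, then ask the model to
# answer only from that curated context. Assumes the openai>=1.x SDK and an
# API key in the environment; model names and documents are illustrative.
from openai import OpenAI
import numpy as np

client = OpenAI()

documents = [
    "Chapter 1: Records must be retained for seven years.",
    "Chapter 2: Incident reports must be filed within 24 hours of detection.",
    "Chapter 3: Access to customer data is restricted to the support team.",
]

def embed(texts):
    """Turn text into vectors so relevance can be scored numerically."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(documents)

def retrieve(query, k=2):
    """'Hand the model the right book': keep only the k most similar passages."""
    q = embed([query])[0]
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def answer(query):
    context = "\n".join(retrieve(query))
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer only from the context provided. "
                                          "If the answer is not there, say so."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return resp.choices[0].message.content

print(answer("How quickly must incident reports be filed?"))
```

The system prompt does the "chapter two" work here: the model is told to stay inside the supplied context and to say so when the answer isn't there.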

Shrink the World: Why Less is More in Context

Language models have grown in capacity, with context sizes reaching into the millions of tokens (units of text, like a word, part of a word, or even a punctuation mark, that the model uses to process and understand language). However, more context, or information, isn’t always better for an AI model. In fact, as Paul notes, providing excessive context increases the likelihood of hallucinations, where models confidently generate incorrect or misleading outputs.

Instead, AI solutions should focus on shrinking the world to the smallest, most relevant set of data needed to answer a query. This not only improves accuracy but also helps control costs, improve performance, and make it easier to apply access rules and citations.
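
As a small illustration of what "shrinking the world" can look like in code, the sketch below keeps only the best-scoring chunks that fit within a deliberate context budget. The relevance scores are assumed to come from the retrieval step; the budget figure and the crude word-count estimate are stand-ins.

```python
# Sketch: "shrink the world" by keeping only the best-scoring chunks that fit
# inside a deliberate context budget. Scores would come from the retrieval
# step; the budget figure and the word-count estimate are assumptions.

def shrink_context(ranked_chunks, budget_tokens=20):
    """ranked_chunks: list of (score, text) pairs, sorted best-first."""
    selected, used = [], 0
    for score, text in ranked_chunks:
        cost = len(text.split())  # rough token estimate; use a real tokenizer in practice
        if used + cost > budget_tokens:
            break
        selected.append(text)
        used += cost
    return selected

ranked = [
    (0.91, "Incident reports must be filed within 24 hours of detection."),
    (0.74, "The on-call rota is published every Friday."),
    (0.32, "The canteen opens at 8am."),  # lowest relevance: first to be cut when the budget runs out
]
print(shrink_context(ranked))  # only the two most relevant chunks survive
```

The crude estimate isn't the point – the point is that the context budget is a deliberate design decision, not simply "whatever fits in the window".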

Information Retrieval and Classification: Foundational Steps

Safe AI begins with high-quality, secure, and relevant information. Two key enabling disciplines are:

  • Information Retrieval: Modern semantic and vector search methods overcome limitations of keyword search by accounting for vocabulary mismatch and intent. These are vital in surfacing accurate data from large, unstructured repositories – a huge part of the enterprise data landscape. This information can then be used for RAG-based use cases, like AI assistants.
  • Classification: Labelling information by topic, department, or sensitivity is a powerful way to apply security rules, streamline secure discovery, and enable AI data governance. Automated techniques, such as clustering, can help scale classification without relying solely on manual effort – an approach that's no longer feasible at enterprise scale if you want to get data ready for AI.
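
To illustrate the clustering idea, here's a small sketch using scikit-learn. TF-IDF and k-means are simple stand-ins for whatever embedding and clustering approach an organisation actually uses, and the document snippets are invented.

```python
# Sketch: scale classification with clustering rather than purely manual labelling.
# TF-IDF and k-means are simple stand-ins for whatever embedding and clustering
# approach is actually used; the document snippets are invented.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Employee holiday entitlement and sick leave policy",
    "Annual leave carry-over rules for new starters",
    "Payroll run for March: salary and pension deductions",
    "Q3 supplier invoices and the payment approval schedule",
]

vectors = TfidfVectorizer().fit_transform(docs)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

# A reviewer then names each cluster (e.g. "HR", "Finance") and the label is
# applied to every member in bulk, instead of document by document.
for doc, cluster in zip(docs, clusters):
    print(cluster, "-", doc)
```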

Paul underscores that this dual focus – organising and surfacing the right data – forms the foundation for trustworthy AI.

Security and Access Control: Non-Negotiables

In the AI context, traditional access controls become more critical than ever. Many organisations use access control lists and sensitivity labels to govern who can see what. However, once data is fed into a model, particularly during training or fine-tuning, those controls can become ineffective.

This is where RAG excels. Because the model doesn't retain the data – it only processes it during the query – security controls remain enforceable.

“When you use RAG, you only feed the model what the user is allowed to see. That’s the key to safe AI,” Paul says.

This approach allows AI to be deployed across sensitive, dynamic environments without compromising confidentiality or security.
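
The sketch below shows the principle with invented documents, groups, and a stand-in search function: the permission check sits between retrieval and the model, so restricted content never reaches the prompt.

```python
# Sketch: enforce access controls at retrieval time so the model only ever
# sees what the asking user may see. Documents, groups, and the stand-in
# search function are all invented for illustration.

DOCUMENTS = [
    {"text": "Board minutes: acquisition under discussion.", "allowed": {"exec"}},
    {"text": "IT runbook: how to reset a password.", "allowed": {"exec", "it", "support"}},
]

USER_GROUPS = {"alice": {"it"}, "bob": {"exec", "it"}}

def retrieve_for_user(user, query, search_fn):
    """Run the search, then drop anything the user's groups cannot access."""
    groups = USER_GROUPS.get(user, set())
    return [d for d in search_fn(query) if d["allowed"] & groups]

# Trivial stand-in for semantic search: return every document as a candidate.
hits = retrieve_for_user("alice", "acquisition plans", lambda q: DOCUMENTS)
print([d["text"] for d in hits])  # the board minutes never reach the prompt for alice
```

In practice the filter is usually pushed down into the search index itself, so restricted items never even appear as candidates – but the principle is the same: filter before the model sees anything, not after.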

Explainability, Citations, and Human Oversight

For AI to be truly trusted, it must be auditable. That means maintaining detailed logs of:

  • Input prompts
  • The data used to generate the answer
  • The output itself
  • Decisions made or actions taken

Citations are particularly valuable, enabling users to verify where information originated. This creates a safety net around hallucinations, which are an unavoidable artefact of how language models work.
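
As an illustration, a single audit record covering those four items might look something like the sketch below – the field names are examples, not a standard schema.

```python
# Sketch of an audit record covering the four items listed above. Field names
# are illustrative, not a standard schema.
import datetime
import json
import uuid

def audit_record(user, prompt, sources, output, action_taken=None):
    return {
        "id": str(uuid.uuid4()),
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user,
        "prompt": prompt,              # the input prompt
        "sources": sources,            # the data used to generate the answer (citations)
        "output": output,              # the output itself
        "action_taken": action_taken,  # any decision made or action taken
    }

record = audit_record(
    user="alice",
    prompt="How quickly must incident reports be filed?",
    sources=["policies/incident-response.docx#section-2"],
    output="Within 24 hours of detection.",
)
print(json.dumps(record, indent=2))
```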

Paul recommends human-in-the-loop systems where AI results are always subject to review, especially when they impact critical decisions or backend systems. This balance of automation and oversight is central to safe AI.

Agentic AI and Function Calling: Automation with Guardrails

Agentic AI – where models perform multi-step tasks and even control workflows – is rapidly gaining attention. But Paul cautions that with power comes complexity.

Through function calling, models can request actions like retrieving data from systems, updating records, or executing backend queries. While this expands AI’s utility, it also requires robust guardrails, including:

  • Access controls on individual tools
  • System prompts to limit scope
  • Real-time user validation
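
A small sketch of these guardrails, with invented tools and group names: each tool declares who may call it and whether a human must explicitly confirm before it executes.

```python
# Sketch of function-calling guardrails: each tool declares who may call it and
# whether a human must confirm before it runs. Tools and groups are invented.

TOOLS = {
    "lookup_customer": {"allowed": {"support", "it"}, "confirm": False,
                        "fn": lambda cid: f"Customer {cid}: active"},
    "update_record":   {"allowed": {"it"},            "confirm": True,
                        "fn": lambda cid: f"Record {cid} updated"},
}

def call_tool(name, arg, user_groups, confirmed=False):
    tool = TOOLS[name]
    if not (tool["allowed"] & user_groups):
        return "Denied: user is not permitted to use this tool."
    if tool["confirm"] and not confirmed:
        return "Pending: this action changes data and needs explicit user confirmation."
    return tool["fn"](arg)

print(call_tool("lookup_customer", "42", {"support"}))           # read-only, allowed
print(call_tool("update_record", "42", {"support"}))             # denied by access control
print(call_tool("update_record", "42", {"it"}))                  # held for confirmation
print(call_tool("update_record", "42", {"it"}, confirmed=True))  # runs after sign-off
```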

Reducing UI Complexity: A Shift in Interaction Design

Generative AI has the potential to replace rigid UIs with natural language interaction. Instead of navigating menus and forms, users can simply ask questions, issue commands, or request summaries.

This flexibility can:

  • Shorten development cycles
  • Improve accessibility
  • Adapt interfaces to user needs dynamically

However, Paul stresses that designing these interactions requires careful thought. User experience now encompasses data transparency, clear feedback loops, and error-handling when AI is uncertain.

A Use Case Focus

Successful early AI adoption means focusing on use cases that are repeatable, high-value, and verifiable. Paul advises organisations to avoid “mega chatbots” and instead focus on narrow, solvable problems where AI adds immediate value to productivity, efficiency, or insight – with a measurable ROI.

This includes using AI to support advanced tasks like:

  • Planning incident responses by sourcing relevant data from multiple systems to collate insights that inform critical decision making
  • Automating document classification and compliance labelling at scale, such as to prepare for a data migration
  • Handling legal disclosures and subject access request (SAR) workflows

From Public to Private Models: Data Privacy Considerations

One common concern is the use of public models such as ChatGPT, which are popular with individual users. These platforms may log or reuse inputs for their own training, creating compliance risks when used with business data.

In an enterprise setting, Paul recommends using private versions of large models hosted in controlled cloud environments, such as Azure OpenAI, and deploying open-source models like Llama or Mistral within virtual machines for maximum security. Coupled with sensible use-case selection and appropriate training for teams on when and how to use AI tools, these measures are non-negotiable for most legal and IT security teams.
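
For illustration only, this is roughly what pointing the same chat pattern at a private Azure OpenAI deployment looks like with the openai v1.x SDK – the endpoint, deployment name, and API version are placeholders for your own tenancy.

```python
# Sketch: the same chat pattern pointed at a private Azure OpenAI deployment
# using the openai>=1.x SDK. The endpoint, deployment name, and API version
# below are placeholders for your own tenancy, not real values.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://your-tenant.openai.azure.com",  # private endpoint in your cloud
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

resp = client.chat.completions.create(
    model="your-gpt-4o-deployment",  # the name of your deployment, not a public model
    messages=[{"role": "user", "content": "Summarise our incident response policy."}],
)
print(resp.choices[0].message.content)
```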

Education and Confidence-Building

Adoption of AI often stalls – not because of technical limitations, but because of cultural and organisational resistance. According to Paul, this often stems from hype-driven confusion and a lack of understanding.

“The ‘protectors of the realm’ in legal and IT are there to keep us safe. They need education, not just approval,” Paul says.

Organisations should invest in internal literacy programmes that explain AI clearly – demystifying terms like hallucination, training, and information retrieval, and showing how the technology aligns with the organisation's core governance principles.

Conclusion: Build Responsibly, Act Confidently

AI has extraordinary potential. But to realise it safely, organisations must start with:

  • A foundation of clean, classified, and permissioned data
  • A design approach that prioritises explainability and traceability
  • Carefully scoped use cases that deliver value and build trust
  • Ongoing human oversight and interactivity

As Paul Maker concludes, “It’s not about removing people from the loop. It’s about empowering them to work smarter, safer, and faster with AI.”