The Questions SOC 2 Doesn't Answer

The security frameworks most enterprises rely on to evaluate AI vendors were built for a different era. SOC 2, ISO 27001, NIST – they are all rigorous, important, and increasingly insufficient. Not because they're wrong, but because the questions boards and regulators are now asking simply weren't on the table when those standards were written.

How does the platform ensure an agent only sees what a specific user is permitted to see? Can you reconstruct why a particular answer was given, to that user, at that moment? Where does data actually sit while the model is reasoning on it? Who controls the audit trail? And what does "audit trail" even mean when the output can trigger a real business decision?

These aren't edge cases. For regulated enterprises deploying generative AI in 2026, they're the questions that determine whether a deployment is defensible to a regulator, to a board, to an auditor showing up six months after the fact.

To work through what those questions actually demand, we sat down with Nebo Cvijetic, Squirro’s CISO in Operations & SaaS – someone who spends his days on exactly this terrain, fielding the concerns that certifications don't quite reach. What follows is a practical conversation about where the standard frameworks end, what responsible AI governance actually looks like in a regulated environment, and how to think about building something that holds up when it counts.

Squirro: The standard security certifications – SOC 2, ISO 27001, NIST – were mostly written before generative AI became a board-level concern. From where you sit, serving regulated industries, what's the question you keep hearing that those frameworks weren't built to answer?

Nebo Cvijetic: The question has shifted. It used to be: "Are you certified?" That's still important. But now the question underneath it is: "Can you prove the AI behaves responsibly in our environment?"

And that's a harder question to answer with a certificate.

For regulated customers, responsible behavior means specific things. Does the AI only surface data the user is actually permitted to see? Can you trace an answer back to a trusted source? Where does processing happen, and who controls the audit trail? SOC 2 and ISO 27001 weren't designed to answer those questions – not because they're poorly designed, but because those questions didn't exist at scale when the frameworks were written.

This is why frameworks like the EU AI Act and ISO/IEC 42001 are becoming relevant additions, not replacements. Traditional certifications prove that a company has strong controls. For AI, customers increasingly want proof that the platform itself behaves in a controlled, explainable, and auditable way. Those are different assurances. They require different evidence.

Squirro: Squirro holds ISO 27001 certification and recently completed SOC 2 Type I attestation. What do those actually tell a CISO – and what should they understand those certifications don't cover?

Nebo Cvijetic: ISO 27001 tells you that we operate a structured, audited information security management system. It means we have defined governance, risk management, access controls, incident management, supplier oversight, and continuous improvement processes – and that an independent body has reviewed them. That's meaningful.

SOC 2 Type I adds a layer of assurance specifically for the US market. It means our controls for the Security category of the Trust Services Criteria were independently assessed and found to be suitably designed at a specific point in time.

What neither certification tells you is that security risk is zero – nothing does. And what SOC 2 Type I doesn't yet show is how those controls perform over time. That's what Type II is designed to demonstrate.

The honest framing is this: these reports give you a mature and independently reviewed security foundation. What remains open is the operating history, the specific scope of each report, and any customer-specific requirements around data residency, architecture, or integration that need to be worked through separately. The certificate gets you to the conversation. It doesn't replace it.

Squirro: Data sovereignty is often treated as a hosting question – where are the servers? But you've described it as something more layered than that. Can you unpack what it actually looks like across an AI deployment?

Nebo Cvijetic: "Where is the data center located?" is the right question for a storage audit. But it's only about a third of the question for a full AI deployment.

The more complete question is: where does each part of the AI process happen, who can access it, and under which rules? Because an AI workflow has a lot of moving parts that don't always share the same jurisdictional profile.

Consider what's actually in play: where customer data is stored, where the retrieval layer runs, which model is doing the reasoning, whether prompts or responses are logged and where, where audit logs are kept, and who is operating or supporting the environment. In practice, data may remain cleanly in one region while model processing, support access, or monitoring touches another jurisdiction entirely. That gap is where organizations get caught.

The real concept is operational sovereignty. It's not just about residency; it's about control across the full AI lifecycle. Where does accountability sit at each layer? Who can access what, and when? The organizations that get this right have clear architecture, well-defined responsibilities between themselves and their platform provider, and logging that covers the entire workflow – not just storage.
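One way to make the layer-by-layer view concrete is to write the deployment down as an inventory and check each layer against the jurisdictions the organization has approved. The Python sketch below is illustrative only: the `Layer` type, the layer names, the region codes, and the operators are assumptions made up for the example, not a description of any specific platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One part of the AI workflow (hypothetical inventory entry)."""
    name: str
    region: str    # where this layer runs or is stored
    operator: str  # who operates or can access it


def out_of_bounds(layers, allowed_regions):
    """Return the layers whose region falls outside the approved set."""
    return [layer for layer in layers if layer.region not in allowed_regions]


# Made-up example: data stays in one region, two other layers do not.
deployment = [
    Layer("customer data storage", "CH", "customer"),
    Layer("retrieval layer", "CH", "platform provider"),
    Layer("model inference", "US", "model provider"),
    Layer("prompt/response logs", "CH", "platform provider"),
    Layer("audit logs", "CH", "customer"),
    Layer("support access", "EU", "platform provider"),
]

for layer in out_of_bounds(deployment, allowed_regions={"CH"}):
    print(f"{layer.name}: runs in {layer.region}, operated by {layer.operator}")
```

In this made-up inventory, storage passes a residency check while model inference and support access are flagged, which is the kind of gap described above.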

Squirro: If a CISO at a regulated enterprise asked you what to actually check – beyond the certifications they're already collecting – what would you tell them?

Nebo Cvijetic: I'd tell them to look past the certificate and pay attention to how the platform behaves in practice.

The questions worth asking: Does the AI respect user permissions at retrieval, not just at login? Can answers be traced to specific, trusted sources? Are prompts and outputs logged in a way that would satisfy a regulator six months from now? Where exactly is data processed, and who has access to it during that process? How is human oversight handled when the model is wrong?
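The first of those questions, permissions enforced at retrieval rather than only at login, can be illustrated with a small Python sketch. It assumes a toy in-memory index and group-based access lists; `Document`, `retrieve`, and the group names are hypothetical, and a real platform would evaluate permissions against the source system's access model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def retrieve(query, index, user_groups):
    """Return matching documents the user may read, checked at query time."""
    terms = query.lower().split()
    hits = []
    for doc in index:
        matches = any(term in doc.text.lower() for term in terms)
        permitted = bool(doc.allowed_groups & user_groups)
        # A document only reaches the model if both conditions hold.
        if matches and permitted:
            hits.append(doc)
    return hits


index = [
    Document("d1", "Q3 credit risk report", frozenset({"risk"})),
    Document("d2", "Public FAQ on credit cards", frozenset({"risk", "support"})),
]

# Same query, different users, different result sets.
print([d.doc_id for d in retrieve("credit", index, user_groups={"support"})])
print([d.doc_id for d in retrieve("credit", index, user_groups={"risk"})])
```

The point of the design is that the filter runs on every query with the user's current groups, so a document the user cannot read never becomes part of the context the model reasons over.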

Beyond that, I'd push back on the instinct to default to maximum control for every deployment. It's understandable – security teams are paid to be cautious. But the right model isn't always the most restrictive one. It's the one that matches the actual risk of the use case.

For highly sensitive or regulated data, a private cloud or dedicated environment is often the right answer. For lower-risk internal use cases, a well-controlled SaaS model can be both secure and significantly more efficient. The goal is to calibrate control to sensitivity, business impact, and regulatory obligation – not to make every AI deployment more complex than it needs to be. Complexity doesn't equal security. It usually just equals friction.

Squirro: Auditability has become a central demand – from regulators, from compliance teams, from boards. But a lot of what gets labeled an "AI audit trail" is really just a log of the prompt and the response. What does real auditability actually look like?

Nebo Cvijetic: A prompt-and-response log is a screenshot. It tells you what was said. It doesn't tell you whether it was allowed, where it came from, or why the model said it.

Real auditability means you can reconstruct what happened, why it happened, and whether it was permitted to happen. That's a much higher bar.

In practice, it means capturing the full context of a transaction: who asked the question, what data permissions they held at that moment, which sources were searched, which documents were actually used in the response, which model configuration was applied, and whether any human review or escalation took place. For regulated organizations, this matters because an AI answer isn't just text. It can drive a customer communication, inform a risk assessment, or trigger an operational action. The downstream consequences are real.
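As a rough illustration, the context listed above could be captured as one structured record per answer. The Python sketch below is a minimal example under stated assumptions: the `AuditRecord` type, its field names, and the sample values are hypothetical, chosen to mirror the list above rather than any particular product's schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now():
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditRecord:
    """Full context of one AI answer (hypothetical schema)."""
    user_id: str             # who asked the question
    permissions: list        # data permissions held at that moment
    question: str
    sources_searched: list   # which sources were searched
    documents_used: list     # which documents actually fed the answer
    model_settings: dict     # which model configuration was applied
    answer: str
    human_review: Optional[str] = None  # reviewer or escalation, if any
    timestamp: str = field(default_factory=_utc_now)


record = AuditRecord(
    user_id="u-17",
    permissions=["support"],
    question="What is the card replacement fee?",
    sources_searched=["faq-index", "policy-index"],
    documents_used=["d2"],
    model_settings={"model": "example-model", "temperature": 0.0},
    answer="See the public FAQ, section 4.",
)

# One JSON line per answer keeps the record easy to store and to query later.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
```

A prompt-and-response log would hold only the `question` and `answer` fields; the remaining fields are what allow the "was it permitted" and "where did it come from" questions to be answered afterwards.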

The test is simple. If an auditor or regulator asks six months from now – "Why did the AI give this answer to this user?" – can you show a clear, reliable, tamper-resistant record? If yes, you have an audit trail. If you can only produce the prompt and the response, you have something that looks like accountability and stops well short of providing it.
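"Tamper-resistant" can be approached in several ways. One common technique, shown here purely as a sketch and not as a description of how any specific platform does it, is a hash chain: each log entry stores the hash of the previous one, so editing an old entry breaks every hash after it. The function names and the sample payloads are assumptions for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(prev_hash, payload):
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, payload),
    })


def verify(chain):
    """Recompute every hash; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in chain:
        expected = _entry_hash(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"user": "u-17", "documents_used": ["d2"]})
append_entry(log, {"user": "u-42", "documents_used": ["d1", "d2"]})
print(verify(log))  # chain is intact

log[0]["payload"]["documents_used"] = ["d1"]  # simulate a later edit
print(verify(log))  # the edit is detected
```

A chain like this makes tampering detectable rather than impossible. Someone able to rewrite the whole log could recompute every hash, so in practice the latest hash would also need to be kept somewhere the log's operator cannot change.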

That gap is exactly where most enterprise AI deployments sit right now. Closing it isn't a technical afterthought – it's a design decision that has to be made early.
