Senior Director – Technology
Today’s enterprises are under constant pressure to balance profitability with rapid innovation. Artificial Intelligence sits at the intersection of both—unlocking new business possibilities while simultaneously driving operational efficiency at scale. AI is evolving rapidly from generative to agentic and increasingly autonomous systems, with research continuing toward artificial general intelligence (AGI).
According to Gartner, by 2028, 33% of enterprise software applications will include agentic AI. Forrester has coined the term “TuringBots” to describe AI, genAI, and agentic-powered software testing tools. In its best-practice guidance, Forrester also highlights key roadblocks to adoption, including underperformance, hallucinations, misalignment with organizational standards, and governance challenges.
Together, these trends underscore a critical reality. While the scale of enterprise genAI deployment is accelerating, so are the risks, unknowns, and operational pitfalls. At Virtusa, we address this through Virtusa Helio assure, our AI-powered assurance capability within the broader Virtusa Helio platform, designed to provide continuous, intelligent validation across models, workflows, and agentic systems. Combined with our responsible AI framework, Virtusa Helio assure, supports a “first-time-right” approach—bringing together robust model assurance with scalable agentic orchestration and validation.
With digital assurance powered by the Virtusa Helio platform, we combine best-in-class practices from real-world customer implementations with our proprietary AI agents and orchestration capabilities. This starts with a structured model assurance and validation framework to enable:
To enable reliable, large-scale evaluation of model outputs, we leverage Judge Large Language Models (LLMs), human-in-the-loop validation, and automated evaluation platforms through partnerships with Nova and LangSmith.
The figure below illustrates the AI and model assurance accelerators we have built and leveraged to accelerate deployments across customer environments and our internal AI systems.
Together, these capabilities help build business confidence by measuring the proportion of correct judgments and enabling a deterministic, data-driven view for informed decision-making.
The second pillar of the two-pronged approach is utilizing agents to test agents—one of the fastest and most powerful ways to assure AI outcomes at scale. The key validation touchpoints for agentic systems include:
As organizations deploy multiple agents across the SDLC and STLC, agentic validation becomes mission-critical. Ensuring seamless agent hand-offs, tool usage, response prioritization, and continuous learning requires an automated, scalable assurance framework.
Our agent testing framework is purpose-built to address these complexities and enable enterprises to confidently scale agentic deployments.
The figure below illustrates our structured approach to building and scaling agent-testing-agent validations.
To see how the two-pronged approach works in practice, here are a few success stories from enterprise deployments.
A leading global non-profit professional association with a community of over 2.9 million professionals collaborated with us to deploy AI assistants for project management professionals. Our model assurance approach achieved 99.4% accuracy using:
Originally launched as a genAI pilot in the early days of adoption, the platform has now evolved into a SWARM (scalable, workflow-driven, autonomous, role-based multi-agent) agentic architecture, with an expanded target audience spanning enterprises and individual users. This scale demanded rigorous assurance across agent hand-offs, multi-tool access, and top-k response validation.
Model assurance spanned the entire lifecycle—from data preparation and input validation to process and output validation. We leveraged synthetic Q&A data generation pipelines, LangChain-based evaluation metrics, ground-truth accelerators, and Judge LLMs to assess custom metrics such as answer relevance, contextual grounding, and content prioritization.
Responsible AI dimensions, including guardrails, bias, prompt injection, and negative scenario testing, were implemented using the retrieval augmented generation assessment system (RAGAS) 2.0 framework and custom bidirectional encoder representations from transformers (BERT) models. With high levels of automated coverage, the program progressed from 70% accuracy in pilot to 99.4% at general availability, delivering an 80% reduction in manual testing effort.
For a global healthcare insurer, we helped build and deploy a genAI-powered benefit inquiry system supporting human agents. The solution was validated end-to-end using our internal model assurance framework and accelerators.
The business impact included a 30% reduction in manual effort and successful scaling to 10,000+ clients.
The Bank of America Global Research predicts that the total addressable market for agentic AI will reach $155 billion by 2030. At the same time, enterprises are facing increasing POC fatigue, with many struggling to scale genAI pilots into production with confidence. As McKinsey aptly notes, “AI agents offer a way out of the genAI paradox.”
Scaling enterprise genAI successfully requires more than innovation—it demands an assembly-line approach to assurance, with built-in testing, evaluation, responsible AI guardrails, and continuous validation. A trusted partner like Virtusa enables organizations to industrialize AI adoption with confidence, ensuring that AI systems are not only powerful but also reliable, safe, and enterprise-ready.
Senior Director – Technology
Sundar brings over 25 years of experience in the IT industry, spanning AI-led delivery, pre-sales, and sales-driven Centers of Excellence (CoE). He has conceptualized and built multiple testing accelerators and has helped Fortune 500 clients define strategy and drive AI-led innovation across the SDLC. Passionate about technology, Sundar focuses on transforming quality assurance through genAI and agentic AI to enable end-to-end lifecycle transformation. In his current role, he leads the digital assurance practice in North America across industry verticals.
Note:
Subscribe to keep up-to-date with recent industry developments including industry insights and innovative solution capabilities
Learn more about our generative AI services