success story

Enterprise data platform to serve as central repository

Our client, an independent company that owns and operates a collection of subscription-based businesses that focus on scientific and academic research, patent analytics, and regulatory standards, faced several challenges that led to a complex data ecosystem with huge operational costs.

 

 

The Challenge
  • Unscalable legacy system with inconsistencies across products
  • Multiple systems leading to multiple points of failure
  • Large technology spread with high licensing cost and high maintenance costs
  • Data replication problems across multiple databases with no single source of truth
  • Highly manual process operations with no automation
  • The need to provide self-service access to professionals, licensees, and profession boards
The Solution

Virtusa started this engagement with a consulting phase to understand the client's business challenges, objectives, and desired expectations of business outcomes. This phase emphasized articulation of the problem statement, synthesis of the results, and rapid prototyping of potential solutions to ensure alignment between stakeholders on outcomes. We then proactively conceptualized, architected, and implemented a highly scalable data platform in line with the customer's growth vision.

  • Delivered an open-source big data (Hadoop)-based scalable architecture serving as a one-stop shop for content across all products
  • Built a self-serve model empowering functional experts to perform data discovery and product building
  • Embraced a cloud-native design enabling autoscaling and flexibility
  • Built in automated QA and data trends available out of the box
  • Used source and sink for the normalization processes
The Solution
The Benefit

By leading this multi-year, multi-partner engagement, Virtusa successfully implemented a highly scalable data platform:

  • Single source of truth via a centralized content repository
  • Automated manual processes, reducing the processing time from one week to three hours
  • Eliminated data volume limitations (a problem for 10 years) by moving from the legacy estate to a cloud-based model
Related content