The Indiana Biosciences Research Institute (IBRI), a leading bioscience research institute based in the Midwest and a catalyst for innovation, is working toward its mission of becoming the first industry-inspired institute to develop solutions that improve the health of people suffering from diabetes, metabolic disease, and poor nutrition. Amazon Web Services (AWS) and Virtusa were brought in to assist the IBRI and Fuse by Cardinal Health by providing the technical skills and infrastructure needed to assess and optimize a simulated electronic health record (EHR) dataset from Fuse so that it matches the characteristics of the IBRI’s type 2 diabetes (T2D)-related EHR datasets.
The IBRI uses these EHR datasets with its research collaborators to drive insights into patient subgroups, disease characteristics, and complications of patients with T2D, and to develop and validate T2D-related prediction models. An improved data analytics platform, together with a “realistic” simulated dataset from Fuse that matches the complexities of a real EHR dataset, has the potential to accelerate some of the IBRI’s research activities.
- Healthcare and life science data exist in silos and are expensive to obtain. Sourcing and maintaining complete, high-fidelity patient data is slow, costly, and error-prone. Cleansing data from multiple sources is time-intensive, and PHI compliance limits access and usage.
- Access to real EHR data is hindered by legal, privacy, security, and intellectual property restrictions. De-identifying patient data is also complex and costly.
- Sharing actual, detailed clinical data across the industry carries significant risk.
- Enhanced the synthetic data generation process so that Proxi™ uses a state transition machine, which draws on publicly available census information, clinical pathway information derived through research, and inferred knowledge models based on real-world data, to simulate EHR data for populations of specified demographic regions.
- Built the Model Discovery Agent (MDA), which analyzes real-world medical datasets and learns distribution-based inferences that can serve as a model for generating new data reflecting the real-world data.
- Built a deep statistical compare tool that statistically compares generated synthetic data against the real-world data, so that the simulation can be improved iteratively.
- Built and optimized the methodology to simulate the health records for the entire US population, leveraging the vLife™ platform.
- In the engagement with the Indiana Biosciences Research Institute (IBRI), demonstrated the efficacy of simulated data as a viable alternative to sourced EHR data for type 2 diabetes mellitus.
- Engaged the University of Texas Health Science Center in a pilot program that leverages Virtusa’s vLife™ platform and Cardinal Health’s simulated data (Proxi™) to accelerate its research efforts in type 2 diabetes using machine learning and deep learning technologies.
- Enabled rapid iteration on simulated data through the development of the deep statistical compare tool.
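
The idea behind comparing synthetic data against real-world data, as the deep statistical compare tool above is described as doing, can be illustrated with a minimal sketch. The source does not say which statistical methods the tool uses, so the two-sample Kolmogorov-Smirnov statistic here is an assumption chosen for illustration, and the function names, column names, and threshold are all hypothetical.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(real), sorted(synthetic)
    max_gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def compare_columns(real_rows, synthetic_rows, columns, threshold=0.1):
    """Flag numeric columns whose synthetic distribution drifts from the real one.

    `threshold` is an illustrative cut-off, not a value from the source.
    """
    report = {}
    for col in columns:
        stat = ks_statistic(
            [row[col] for row in real_rows],
            [row[col] for row in synthetic_rows],
        )
        report[col] = {"ks": stat, "needs_tuning": stat > threshold}
    return report


if __name__ == "__main__":
    # Toy records with hypothetical column names; not real patient data.
    real = [{"age": 50, "hba1c": 7.1}, {"age": 62, "hba1c": 8.0}]
    synthetic = [{"age": 50, "hba1c": 5.2}, {"age": 62, "hba1c": 5.9}]
    print(compare_columns(real, synthetic, ["age", "hba1c"]))
```

In an iterative loop of the kind the bullets describe, columns flagged as needing tuning would point to the parts of the simulation to adjust before the next generation run.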