
It Pays to Modernize Your Data Architecture

Arvind Purushothaman
Practice Head and Senior Director – Information Management & Analytics

Data is now collected at every interaction, whether over the phone, on mobile devices and PCs, or through sensors, and with or without our knowing. That makes it important to have a strategy for the data. Traditionally, data has been seen as something used to “run the business,” but in today’s context, it can actually “be the business” if monetized well. An Internet of Things (IoT) example in a customer context is the wristband worn at amusement parks, which provides real-time data about customer interactions throughout the visit. This data can be processed in near real time to push out relevant offers and alerts that enhance the customer experience. The question is how organizations can prepare themselves to take advantage of their data.

The key lies in building a modern data architecture that is open, flexible, and scalable, something that can accommodate your existing data assets as well as potential new ones. Before we talk about specific steps to modernize data architecture, let’s look at typical challenges:

  • Many applications within the organization have been around for 20 or more years. While the usage of some of them is known, it is often unclear who is using the data in each application and for what purpose. How do we find out?
  • To meet their reporting needs, organizations have built multiple data assets, including data warehouses and data marts. Additionally, power users collate data from multiple sources and create reports in Excel. As a result, numbers are inconsistent and vary depending on who prepares them and for what purpose.
  • Organizations have multiple applications and coexisting data assets, from mainframe-based to client-server, Web, and newer cloud-based applications. They struggle to find the right people to support the applications, especially the older ones.
  • Organizations are aware of the new developments in the big data space, including NoSQL databases and the Hadoop ecosystem, and have typically embarked on some initiatives to get started. The main challenge involves integrating them with the traditional data warehouse technologies.
  • People, and by extension, their skills, are the biggest assets of any organization. CIOs are concerned about having to find an army of programmers for populating Hadoop-based data repositories. The other big concern is how to leverage existing SQL skills that people have acquired over the years.

These are valid concerns, and some apply more than others depending on the context. Nonetheless, given the need to monetize data better and to modernize technology platforms, it is important to have a strategy. I recommend the following approach:

  • Data asset inventory: Create a complete list of data assets, including legacy systems, data warehouses, data marts, and data islands. Identify the data flows between these assets and their usage patterns. This can be particularly hard for some legacy systems, but the inventory serves as the starting point for any consolidation and modernization.
  • Data asset rationalization: Use the inventory and the usage patterns to rationalize the data assets. This means identifying whether the same data is coming from multiple applications and, if so, which application is the authoritative source and which can be retired. This exercise can help consolidate the data assets down to a manageable few. In this context, master data management is critical to ensure you have high-quality data.
  • Data lineage: A data lineage exercise that identifies data flows and produces detailed documentation, especially for the legacy applications, is a must. This process greatly reduces the risk of dependency on key personnel and also makes it easier to migrate to a future-state architecture.
  • Data infrastructure: Have a big data and cloud strategy in place to bring in newer technologies in a pilot mode. Start with a non-legacy application to understand the technology, and move applications over in conjunction with data asset rationalization. “Data on cloud” is going to be an important component of modern architecture, especially when dealing with IoT data.
  • Data technology: It pays to understand the different options available in a crowded and rapidly evolving marketplace and to select technologies that fit your architecture from a technology standpoint as well as a people standpoint. For example, using a data integration tool with big data connectors can greatly reduce the need for people who can write MapReduce code.
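
To make the first three steps concrete, here is a minimal Python sketch of how an inventory, a rationalization check, and a lineage lookup might fit together. It is an illustration under stated assumptions, not a prescribed implementation: the asset names, the entity labels, and the helper functions `overlapping_entities` and `upstream_sources` are all hypothetical, and a real inventory would typically live in a metadata catalog rather than in code.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    name: str
    kind: str  # e.g. "legacy", "warehouse", "mart", "island"
    entities: set[str] = field(default_factory=set)  # business entities it holds


# Hypothetical inventory: every name and entity here is made up for illustration.
INVENTORY = [
    DataAsset("mainframe_billing", "legacy", {"customer", "invoice"}),
    DataAsset("crm_cloud", "application", {"customer", "contact"}),
    DataAsset("sales_warehouse", "warehouse", {"customer", "invoice", "order"}),
    DataAsset("finance_mart", "mart", {"invoice"}),
]

# Hypothetical data flows as (source, target) pairs.
FLOWS = [
    ("mainframe_billing", "sales_warehouse"),
    ("crm_cloud", "sales_warehouse"),
    ("sales_warehouse", "finance_mart"),
]


def overlapping_entities(inventory):
    """Rationalization aid: list entities that are held by more than one asset."""
    holders = defaultdict(list)
    for asset in inventory:
        for entity in sorted(asset.entities):
            holders[entity].append(asset.name)
    return {entity: names for entity, names in holders.items() if len(names) > 1}


def upstream_sources(target, flows):
    """Lineage aid: every asset that feeds `target`, directly or indirectly."""
    parents = defaultdict(set)
    for source, destination in flows:
        parents[destination].add(source)
    seen, queue = set(), deque([target])
    while queue:  # breadth-first walk against the direction of the flows
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    for entity, names in overlapping_entities(INVENTORY).items():
        print(f"{entity}: held by {', '.join(names)}")
    print("finance_mart depends on:", sorted(upstream_sources("finance_mart", FLOWS)))
```

The overlap report only flags candidates for rationalization; deciding which asset is the authoritative source remains a business decision. The lineage lookup shows which upstream systems must be accounted for before an asset such as the hypothetical `finance_mart` can be migrated or retired.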

Creating a holistic data strategy that reflects changes in the business, and taking a structured approach to executing it, will help lay a solid foundation for monetizing data.