A response to McKinsey's article "Reducing data costs without jeopardizing growth".

By DataRoad Team

“Becoming data-driven” is today’s motto and goal for pretty much every organization. The term appears in slide decks of the leading consulting firms, in press conferences by policy-makers, and even in interviews with representatives of some of the oldest companies. And since the pandemic, when digitalization kept human activity going in spite of lockdowns, data has become more important than ever.

However, “becoming data-driven” is easier said than done, as most (if not all) organizations experience daily. Answering questions such as “Which sectors and segments will drive demand?” or “What are the favorite sales channels of Customer X?” requires a great deal of effort, money, skill, and knowledge to develop the right capabilities. Building such capabilities also comes at a price, since most organizations will have to modernize their data architecture, ingest data from novel sources, design algorithms to derive insights, and hire or train the talent to do it all. The price tag for these efforts can run from hundreds of millions of dollars for a midsize organization to billions of dollars for the largest companies, as shown in the figure below, which is based mainly on the McKinsey Global Data Transformation survey (2019) measuring data-related spending per industry.


This article quotes entire paragraphs of “Reducing data costs without jeopardizing growth” by Davide Grande, Jorge Machado, Bryan Petzold, and Marcus Roth (2020), which explains the problems and the cost reductions possible if companies take certain actions regarding their data landscapes. On top of that, it explains how DataRoad could help any organization tackle the same problems and offer similar or even superior benefits.

Data may be abundant, but managing data isn’t cheap

“Many organizations are unaware of just how much they are spending on data because costs are diffused across the enterprise. Third-party data expenditures might come out of the business unit’s budget, for example, while reporting cost resides in relevant corporate functions, and data-architecture spend is managed in IT.”

“When pulled together, the tally can be jarring. A midsize institution with $5 billion of operating costs, for example, spends more than $250 million on data across third-party data sourcing, architecture, governance, and consumption (Exhibit 2). How data cost breaks down across these four areas of spending can vary across industries. For example, industries such as consumer packaged goods that don’t directly engage customers often have a higher relative spend on data sourcing. The result, however, remains the same: managing data is a large source of cost at most organizations.”


“Addressing this fragmentation can deliver quick wins. Targeted improvements in data sourcing, architecture, governance, and consumption can help companies tamp down waste and manual effort and put high-quality data within easier reach. These efforts can cut annual data spend by 5 to 15 percent in the short term (Exhibit 3). Longer term, companies can nearly double that savings rate by redesigning and automating core processes, integrating advanced technologies, and embedding new ways of working. To get these benefits, here are the four things that leaders need to do.”


Optimize third-party data procurement

“After crunching the numbers, a regional bank in the United States discovered it was spending about $100 million annually to procure credit-risk data and market data, among other external data. To fund wider data transformation, it had to bring that figure down. The bank began by taking inventory of all the different data feeds it licensed and how frequently they were used. It found that a handful of third-party data sources accounted for the majority of all use, and a significant percentage of data was being used by individuals whose roles did not require real-time updates. By eliminating unused and underused feeds, defining clearer permissions around data access, and allowing credit-risk scores and other proprietary data to be reused for longer periods, the bank would be able to cut data costs up to 20 percent.”
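The inventory exercise the bank performed can be sketched in a few lines of code. The feed names and usage log below are purely hypothetical; the point is how counting accesses per licensed feed surfaces unused and underused subscriptions.

```python
from collections import Counter

# Hypothetical access log: (feed_name, user_role) pairs from usage records.
access_log = [
    ("credit_risk_feed", "trader"), ("market_data_feed", "trader"),
    ("market_data_feed", "analyst"), ("credit_risk_feed", "back_office"),
    ("fx_rates_feed", "back_office"), ("market_data_feed", "trader"),
]
licensed_feeds = {"credit_risk_feed", "market_data_feed",
                  "fx_rates_feed", "esg_feed"}

usage = Counter(feed for feed, _ in access_log)

# Feeds that are licensed but never accessed are candidates for elimination.
unused = licensed_feeds - set(usage)

# Feeds with only a small share of total accesses are candidates for review.
total = sum(usage.values())
underused = {feed for feed, count in usage.items() if count / total < 0.2}

print(sorted(unused))     # → ['esg_feed']
print(sorted(underused))  # → ['fx_rates_feed']
```

In practice the same tallies would come from access logs or license-management tooling rather than an in-memory list, but the elimination logic is the same.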

“Thoughtful measures like these can reduce unnecessary third-party spend. Amending existing vendor contracts and instituting usage caps for the most commonly used feeds can provide additional gains. Later, as contracts come up for renewal, companies can compare the value and pricing they’re getting against alternative data sources (the number of which is growing rapidly) to find the best match and negotiate the most favorable terms.”

“We also recommend setting up a central vendor-management team with business-unit- and function-level gatekeepers to oversee data subscriptions, usage terms, and renewal dates. With appropriate procurement and business sponsorship, this team can help manage demand for third-party data and optimize vendor agreements.”

DataRoad approach

The benefits offered by DataRoad in this regard are listed below:

  1. With DataRoad, the bank would have been able to achieve the same result at an even lower cost and with less effort, since DataRoad would have spared it from reinventing the wheel.

  2. With DataRoad, price negotiation would have been much more straightforward, since DataRoad will have a powerful engine that handles this kind of negotiation between data consumers and producers automatically.

  3. Setting up a central vendor-management team to manage all aspects of data procurement becomes very straightforward with DataRoad. More specifically, DataRoad supports this use case by default since it offers an overview of data sources, providers, agreements, and pricing per third-party dataset.
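To make the per-dataset overview in point 3 concrete, here is a minimal sketch of what such a registry could look like. The record fields, dataset names, and figures are illustrative assumptions, not an actual DataRoad data model.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical shape of a per-dataset agreement record; field names
# are illustrative, not an actual DataRoad schema.
@dataclass
class DataAgreement:
    dataset: str
    provider: str
    annual_cost: float
    renewal: date

registry = [
    DataAgreement("credit_risk_scores", "VendorA", 1_200_000, date(2025, 3, 1)),
    DataAgreement("market_prices", "VendorB", 800_000, date(2024, 11, 15)),
]

def renewals_before(cutoff: date) -> list:
    """Agreements a vendor-management team should renegotiate before `cutoff`."""
    return [a for a in registry if a.renewal < cutoff]

upcoming = renewals_before(date(2025, 1, 1))
print([a.dataset for a in upcoming])  # → ['market_prices']
```

A central team working from such a registry can spot upcoming renewals early enough to compare alternative vendors and negotiate terms, as the quoted article recommends.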

Simplify data architecture

“A leading global bank had more than 600 data repositories in different silos across the business. Managing these repositories cost the bank $2 billion annually. Recognizing that this was unsustainable, the bank created a joint enterprise data-architecture team consisting of the CIO and relevant business leaders. Together, they agreed to simplify the data environment into 40 unique domains and standardize “golden source” repositories, allowing them to downsize and, in some cases, fully decommission data repositories. The streamlining shaved more than $400 million in annual data costs while also improving data quality, making it easier for the bank to update systems and integrate insights into its processes.”

“Like this bank, many mature organizations suffer from fragmented data repositories. Storing and maintaining those troves can eat up between 15 and 20 percent of the average IT budget. The lack of standardization around data-management protocols can also create a validation headache, resulting in lost time as teams chase down needed information and increased error when they use the wrong data. To get the performance they need, organizations must revisit their core data architecture.”

“In the short term, organizations can generate savings by optimizing infrastructure—for example, by offloading historical data to lower-cost storage, increasing server utilization, or halting renewals of server contracts. Additionally, firms can take a hard look at the entire architecture-development portfolio and slow down or stop low-priority projects while also reducing deployment of high-cost vendor resources. Likewise, companies don’t have to wait for the target architecture to begin extracting value from their data. More widespread use of application programming interfaces (APIs) can allow businesses to put the data buried within their legacy systems to work without having to design costly custom workflows.”
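The API idea in the quoted paragraph can be illustrated with a small facade: consumers query a stable interface instead of building one-off extraction workflows against the legacy system. The dictionary standing in for the legacy store, and the field names, are invented for illustration.

```python
# A dict stands in for a legacy silo (e.g., a mainframe table); a real
# facade would query that system instead.
legacy_store = {
    "CUST0001": {"name": "Acme Corp", "segment": "SME", "region": "EU"},
    "CUST0002": {"name": "Globex", "segment": "Enterprise", "region": "US"},
}

def get_customer(customer_id: str) -> dict:
    """API-endpoint logic: normalize a legacy record into a stable schema
    so consumers never depend on the legacy layout directly."""
    raw = legacy_store.get(customer_id)
    if raw is None:
        raise KeyError(f"unknown customer {customer_id}")
    return {"id": customer_id, "name": raw["name"], "segment": raw["segment"]}

print(get_customer("CUST0001"))
# → {'id': 'CUST0001', 'name': 'Acme Corp', 'segment': 'SME'}
```

Exposing such a function behind an HTTP endpoint turns the buried legacy data into a reusable service, which is the cost saving the article describes: one API replaces many bespoke workflows.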

“Over the longer term, bolder, transformational shifts can generate significantly higher savings. For example, migrating data repositories to a common, modern data platform (for example, a data lake) and evolving the infrastructure to a cloud-centric design allow a company to rationalize legacy environments and reduce average capacity required to handle spikes in computation and storage. In addition, organizations can initiate changes to boost productivity more broadly—for example, employing metrics and scorecards to improve performance, automating manual activities, and nearshoring or offshoring some resources.”

DataRoad approach

DataRoad would have made simplifying this bank’s data infrastructure much easier, since DataRoad gives a central overview of the data being used in the enterprise, which can reveal which datasets are used seldom or not at all. Additionally, DataRoad minimizes unnecessary data duplication by keeping data at the source. By avoiding hard-coded mappings through so-called implicit mappings, the relationships between data become less complex and more stable, making it easier to dig into the data and its dependencies and resulting in more resilient queries.

Organizations could benefit from DataRoad since DataRoad’s main purpose is to serve as a highly optimized general-purpose data infrastructure. Instead of having to reinvent the wheel, organizations could simply adopt DataRoad.

DataRoad can allow businesses to put the data buried within their legacy systems to work without the need for expensive custom workflow or API design.

DataRoad will also enable all kinds of data migrations since DataRoad’s core purpose is to transport data from point A to B in the most efficient way.

Streamline data consumption

“In our experience, between 30 and 40 percent of the reports that businesses generate daily add little to no value. Some are duplicative, and others go unused, with the result that considerable resources are wasted.”

“To manage consumption more effectively, best-in-class companies map reports by topic, such as commercial reports and board reports. They then redesign data-gathering processes, automate pipelines, and explore new ways to model and visualize data and deploy the results in a paperless fashion. Rapid prototyping and testing cycles refine the report-generation process. This holistic approach helps to synthesize production across the organization, ensuring that the reports and metrics generated are of high quality and take relatively little effort to curate. Using methods like these, a European bank trimmed the number of reports it produced by 80 percent and reporting-related costs by 60 percent.”
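The first step the article describes, mapping reports by topic to find waste, can be sketched as follows. The report inventory below is hypothetical; the logic flags unread reports for retirement and topics with several overlapping reports for consolidation.

```python
from collections import defaultdict

# Hypothetical report inventory: (report_name, topic, monthly_views).
reports = [
    ("daily_sales_eu", "commercial", 120),
    ("daily_sales_eu_copy", "commercial", 3),
    ("board_kpis", "board", 40),
    ("legacy_risk_summary", "risk", 0),
]

by_topic = defaultdict(list)
for name, topic, views in reports:
    by_topic[topic].append((name, views))

# Reports with no readers are candidates for retirement; topics with several
# reports are candidates for consolidation into one redesigned pipeline.
unread = [name for name, _, views in reports if views == 0]
consolidation_candidates = {t for t, rs in by_topic.items() if len(rs) > 1}

print(unread)                    # → ['legacy_risk_summary']
print(consolidation_candidates)  # → {'commercial'}
```

Given the article’s estimate that 30 to 40 percent of daily reports add little value, even this simple mapping tends to surface a sizable retirement list.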

“Organizations can gain additional benefits by making their business-intelligence capabilities available to employees on a self-serve basis. Remaining business-intelligence resources could then focus on more complex reporting needs and issue remediation.”

DataRoad approach

Because of DataRoad’s general-purpose nature, it will have a studio for easily developing and implementing low-code use cases that support not only reporting but also process automation. DataRoad would do for … what Adobe’s Photoshop did for digital artwork: allow all kinds of individuals and organizations to unleash their creativity on a versatile platform that enables instant idea exploration and experimentation, fostering a dynamic playground for innovation.

Adopt data-driven approaches to optimize costs in other functions

“Organizations can extract cost savings not only by improving efficiency and performance within the data function but also by applying data to identify potential cost savings in other parts of the business. Procurement is an especially promising area. Using artificial intelligence, for example, businesses could detect higher-than-average rates of energy consumption in different locations or atypical travel-cost patterns, and then use those insights to provide recommendations on how to derive greater efficiency. Likewise, specialized algorithms can scan invoices, vendor data, contract data, and service consumption to spot anomalies in the underlying spend. Such practices can help lower total procurement costs by as much as 10 percent in some organizations. For example, a European home-appliance manufacturer used advanced analytics to scan more than 12 million invoices across 5,000 suppliers, identifying opportunities to reduce total costs by 7.8 percent.”
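One simple way to spot the spend anomalies the article mentions is a per-supplier outlier check on invoice amounts. The suppliers, figures, and the two-standard-deviation threshold below are all illustrative assumptions; real systems use richer models, but the principle is the same.

```python
from statistics import mean, stdev

# Hypothetical invoice amounts per supplier.
invoices = {
    "SupplierA": [1000, 1050, 980, 1020, 990, 1010, 5000],  # one outlier
    "SupplierB": [200, 210, 195, 205],
}

def flag_anomalies(amounts, threshold=2.0):
    """Flag amounts more than `threshold` standard deviations from the
    supplier's mean; too few data points means no reliable baseline."""
    if len(amounts) < 3:
        return []
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

anomalies = {s: flag_anomalies(v) for s, v in invoices.items()}
print(anomalies)  # → {'SupplierA': [5000], 'SupplierB': []}
```

Scaled up to millions of invoices, as in the home-appliance example, flagged outliers become the starting points for renegotiation or error correction.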

DataRoad approach

Cost-cutting efforts could greatly benefit from DataRoad, since it would offer a detailed overview of the costs associated with a given data point. In turn, DataRoad would feed technologies such as AI with high-quality data to spot such anomalies.