
Why Data Governance is Necessary in the Beverage Industry

David Palmer

JULY 30, 2020

Back in 2017, The Economist published an article titled “The world’s most valuable resource is no longer oil, but data.” Since then, we have seen an increase in the number of machine learning platforms and a large increase in the demand for data scientists who can exploit this data for optimized decision making. To stay competitive, companies have been grabbing all sorts of data, wrangling it together and running it through algorithms or putting it into visualization tools to gain valuable insights.

In the three-tier system of distribution (supplier, distributor, retailer) that we at Aperity see in the beverage industry and other fast-moving CPG industries, companies are looking to move beyond depletion data and integrate retail, survey and e-commerce data to gain a complete picture of their supply chain and their customers’ buying behavior. While this data will answer questions, will it answer them correctly? If bad, incorrect, out-of-date or poorly harmonized data is used as an input, one can only expect that bad, incorrect or out-of-date answers will come out the back end.

So, what is the solution? Data science and machine learning are helpful, but you can’t get to consistent data accuracy without data governance.

Data governance can be defined as the people, processes and technologies needed to manage and protect a company’s data assets in order to guarantee generally understandable, correct, complete, trustworthy, secure and discoverable corporate data.

Based on the definition above, the key objective of data governance is to ensure the data is available, usable, standardized, accurate, trusted and secure. To accomplish this objective, an organization must deploy the appropriate people, processes and technologies. Data owners/stewards must be identified to define business rules/definitions and validate their correct application within the organization. Processes must be put in place to ensure the right data is available to the right people at the right time. Technologies must be deployed to execute integration and transformation procedures on the data while providing visibility into data lineage, data transformations and data processes to retain end users’ trust in the data.
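To make this concrete, here is a minimal, illustrative sketch (in Python, and not a description of Aperity’s actual platform) of how a transformation step could carry its own lineage metadata, so the rule that was applied and the time it ran are visible to the end user. The class and field names are assumptions for illustration only.

```python
# Illustrative sketch only: a record that logs every transformation applied to it,
# so the process is never a black box. Names and fields are assumed, not a real API.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageRecord:
    """One entry in a record's transformation history."""
    step: str             # name of the integration/transformation step
    rule: str             # business rule or mapping that was applied
    applied_at: datetime  # when the step ran


@dataclass
class GovernedRecord:
    """A data record that carries its own lineage."""
    source: str                   # originating system or constituent
    payload: dict                 # the business data itself
    lineage: list = field(default_factory=list)

    def transform(self, step: str, rule: str, fn):
        """Apply a transformation and log it alongside the data."""
        self.payload = fn(self.payload)
        self.lineage.append(LineageRecord(step, rule, datetime.now(timezone.utc)))
        return self


# Example: harmonize a distributor's product code to the supplier's master ID.
record = GovernedRecord(
    source="distributor_042",
    payload={"product_code": "BRND-IPA-12OZ", "cases_depleted": 118},
)
record.transform(
    step="product_harmonization",
    rule="map distributor SKU to supplier master product ID",
    fn=lambda p: {**p, "product_id": "SUP-000731"},
)
for entry in record.lineage:
    print(entry.step, entry.rule, entry.applied_at)
```

Whatever the actual implementation, the point is the same: the lineage travels with the data, so anyone consuming it can see where it came from and which rules touched it.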

Within a three-tier system of distribution, this objective can be a struggle. Data must be integrated, cleansed and harmonized between multiple parties and made available to the appropriate users across multiple organizations, in the appropriate format at the appropriate time. To gain trust in the data, users from multiple organizations are demanding visibility into the process: where did the data originate, how was it transformed, how current is it, is it accurate and, if it has to be restated, what is the impact? In short, users are expecting the data to be certified as up to date and accurate. Processes or systems that take a black-box approach (data comes in one side, is transformed and goes out the other with no visibility into the process) no longer work. If something looks off in the data, users will assume it is incorrect, opt not to use the data for decisions and revert to gut-based decisions.

With that as a backdrop, what platform features should companies in a three-tier system of distribution look for to help them obtain and leverage data as a competitive asset? As expected, they must look for systems that are scalable and extendable. In addition, due to the increasing value of data and the prevalence of poor-quality data, they must look for systems that are also open, provide visibility into their data processes and can certify that the data is accurate, so users can trust and leverage it. To provide certification and confidence in the data, users should have visibility into the following (a simple sketch of such checks follows the list):

  • Status of data submission by all constituents
  • Quality of the data being submitted by all constituents (current and historical submissions)
  • Validation checks on the data as it moves through the process
  • Lineage of all the data being submitted from ingestion to end user consumption
  • Definitions of all standardization/harmonization rules (product/store maps) applied to the data
  • Status and impact of data restatements, when needed, on end-user data
  • Timeliness of getting standardized/harmonized data to end users
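As a simple illustration of the kind of validation checks that could sit behind a certification status, here is a hedged Python sketch. The required fields, the freshness threshold and the function name are assumptions for the example, not a real platform API; the idea is only that each submission is checked and the results are kept, so users can see why data was or was not certified.

```python
# Illustrative sketch only: basic quality checks on a data submission that roll up
# into a certification summary. Field names and thresholds are assumed for the example.
from datetime import date, timedelta

REQUIRED_FIELDS = {"distributor_id", "product_id", "period", "cases_depleted"}
MAX_AGE_DAYS = 7  # assumed freshness threshold for a "timely" submission


def validate_submission(submission: dict, today: date) -> dict:
    """Run basic quality checks and return a certification summary."""
    checks = {}

    # Completeness: every required field must be present and non-empty.
    present = {k for k, v in submission.items() if v not in (None, "")}
    missing = REQUIRED_FIELDS - present
    checks["complete"] = not missing

    # Timeliness: the reporting period must be recent enough to trust.
    age = today - submission.get("period", date.min)
    checks["timely"] = age <= timedelta(days=MAX_AGE_DAYS)

    # Validity: depletion counts cannot be negative.
    checks["valid_values"] = submission.get("cases_depleted", 0) >= 0

    return {
        "certified": all(checks.values()),
        "checks": checks,
        "missing_fields": sorted(missing),
    }


result = validate_submission(
    {
        "distributor_id": "D-042",
        "product_id": "SUP-000731",
        "period": date(2020, 7, 27),
        "cases_depleted": 118,
    },
    today=date(2020, 7, 30),
)
print(result)
```

Keeping the per-check results, rather than a single pass/fail flag, is what lets end users see not just whether the data was certified but why.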

While this visibility may seem like an obvious requirement of any data system, many (both purchased and custom coded) do not provide this type of functionality. The primary focus of these systems was to ingest, process and present business data to end users; little thought was given to visibility into the process. Companies have tried to bolt on this type of visibility, but only systems designed to be open from the start can provide the transparency users require to trust the data.

As stated at the beginning of this article, data has become an organization’s most valuable asset. But like oil, data must be refined and processed, with the right controls, processes and monitoring in place, to ensure the machines that use the resource can trust the quality of the product and leverage it as a competitive advantage. Data quality is entirely reliant on good data governance.


David Palmer is Chief Product Officer at Aperity, Inc., an innovative data management and analytic solutions provider transforming how data is shared. With more than 20 years of strategic and hands-on experience transforming companies into data-driven organizations, David is a visionary business intelligence and analytics executive who has designed, developed and launched new analytical products for a variety of industries including beverage, health care, telecom and finance.