September 14, 2022

Why data governance and privacy should work together

Data governance is a set of systems, rules, and responsibilities that control how your organization uses information. Data governance helps your organization protect sensitive data, such as personal information, financial records, and trade secrets, in order to meet regulatory compliance goals and other obligations. It also helps you use data more effectively, providing better analytics and decision making, increasing efficiency, and reducing waste and redundancy.

Here’s how to create a data governance program that will enhance your data privacy program, improve your security, and make your company more effective.

Why does data governance matter?

Everything your organization does depends on data, from order fulfillment and market research to product strategy, accounting, and partner management. Data governance can provide a major competitive advantage by enabling you to organize, track, and utilize that data more effectively than your competition.

At the same time, a lot of your data comes with risks and obligations attached. You need to protect and regulate the use of personal data to fulfill your obligations under laws like CPRA and GDPR. You must carefully monitor financial data to detect errors and fraud. And of course, you have to document a wide range of activities, from product design to cybersecurity logs.

The data governance process enhances both sets of goals, helping you use data more effectively, while protecting you from the risks that data entails.

On the surface, improving data utilization might seem like a very different task than regulating personal data use or documenting a product. But in reality, many data governance strategies and policies help meet both positive and negative goals. For example, tagging data helps your analysts access the information they need more easily, but it also enables you to track and mitigate the risks associated with that data. Similarly, documenting and mapping your data flows helps identify and address privacy risks, while also ensuring the data your business depends on stays accurate and available.

Data governance enables you to more effectively steward your data, accomplishing a wide range of goals, including:

Analyzing data flows and the data footprint
Coordinating an overall data and privacy strategy
Protecting data from attacks, leaks, and other risks
Protecting privacy by upholding customer choices about how their data is used
Regulating internal audit and documentation practices
Supporting strong decision making and high information quality
Eliminating data duplication and repeat work
Ensuring that stakeholders get the data they need
Minimizing access to sensitive data

Data governance, security, and privacy

But wait a minute! You already have a privacy program to regulate personal data usage, and a cybersecurity team to prevent hacks and leaks. So why is the purview of data governance so broad? Heck, why do you need data governance at all?

The most important reason is that cybersecurity and privacy can’t adequately address all your data needs. Cybersecurity teams are typically focused on endpoint and network security. Their goal is to fortify your infrastructure against attacks. But there are risks inherent to data that can’t be addressed with this strategy.

To understand the benefits of data governance, consider what happens when you run a routine analytics report. The app pulls data from multiple sources and processes it, to highlight key aspects or trends.

Even that simple action can pose risks to privacy that traditional cybersecurity can’t address, such as:

Disclosing confidential information through inadequate access control
Combining data from multiple sources in a way that could de-anonymize information
Allowing malicious actors to infer trade secrets
Exposing sensitive data through a compromised partner app
Providing misleading information through the use of inaccurate or outdated data
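
One of the risks above, combining sources in a way that de-anonymizes information, can be roughly checked before a report is released. The sketch below counts how many rows share each combination of identifying columns, an idea borrowed from k-anonymity. That technique is offered here as an illustration rather than as part of any framework discussed in this article, and the function name, column names, and sample rows are all hypothetical.

```python
from collections import Counter


def smallest_group(rows: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group of rows sharing the same quasi-identifier values.

    A result of 1 means at least one person is uniquely identifiable
    from those columns alone. Returns 0 for an empty dataset.
    """
    if not rows:
        return 0
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Hypothetical report rows after joining two sources.
rows = [
    {"zip": "94107", "birth_year": 1985},
    {"zip": "94107", "birth_year": 1990},
    {"zip": "94110", "birth_year": 1985},
    {"zip": "94110", "birth_year": 1985},
]

print(smallest_group(rows, ["zip"]))                # 2
print(smallest_group(rows, ["zip", "birth_year"]))  # 1
```

In this example, ZIP code alone leaves every person in a group of at least two, but adding birth year from a second source singles out individuals. A governance policy could set a minimum group size that a report must meet before it is shared.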

Similarly, privacy policies can’t be effectively implemented without a data governance approach. For example, under the CPRA, a consumer’s request to delete their personal data doesn’t apply only to your organization; you also need to make sure that any partners you’ve shared the data with delete it too.

That means you need to understand everywhere your data flows, both inside and outside your organization. You also need a system to label and track data, so you can locate a particular consumer’s data when you receive a request. Finally, you need a way to track and confirm that you’ve taken the required compliance actions, such as deleting a consumer’s data, providing a copy of that data to the consumer, or limiting third-party access based on a consumer’s opt-out. You can’t do all those things by focusing solely on privacy.
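
The tracking requirement above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a compliance tool: the `DeletionRequest` class and the system and partner names are hypothetical, and a real program would also need to record dates, evidence, and the legal basis for any data that is retained.

```python
from dataclasses import dataclass, field


@dataclass
class DeletionRequest:
    """Tracks one consumer deletion request across every holder of the data."""

    consumer_id: str
    # Every internal system or partner known to hold this consumer's data.
    holders: set[str]
    confirmed: set[str] = field(default_factory=set)

    def confirm(self, holder: str) -> None:
        """Record that a holder has confirmed deletion."""
        if holder not in self.holders:
            raise ValueError(f"{holder} is not a known holder for this request")
        self.confirmed.add(holder)

    def outstanding(self) -> set[str]:
        """Holders that have not yet confirmed deletion."""
        return self.holders - self.confirmed

    def is_complete(self) -> bool:
        """True once every known holder has confirmed deletion."""
        return not self.outstanding()


# Hypothetical holders: two internal systems and one partner.
request = DeletionRequest(
    consumer_id="consumer-123",
    holders={"crm", "analytics_warehouse", "email_partner"},
)
request.confirm("crm")
request.confirm("analytics_warehouse")

print(sorted(request.outstanding()))  # ['email_partner']
print(request.is_complete())          # False
```

The list of holders is only as good as your data map. If a partner is missing from it, the request will look complete when it isn’t, which is why mapping data flows comes first.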

With security, privacy, compliance, efficiency, and other data concerns bleeding into each other, you need to build a holistic approach to data. A data governance framework enables you to build a coherent strategy that addresses all your data priorities.

Utilizing the DGPC framework

Your data governance framework needs to mesh well with other frameworks, such as compliance regulations and security and privacy best practices. Data governance for privacy, confidentiality, and compliance (DGPC) is an excellent framework for this purpose. Designed by Microsoft, DGPC focuses on three core areas: people, processes, and technology.

People

The first step is to organize a data governance team. Typically, this team will be led by a head of data or chief data officer, overseeing a group of data stewards.

The role of the chief data officer is to oversee the governance program. They’re in charge of designing the overall data governance strategy, defining roles and responsibilities within the team, setting priorities, and enforcing standards.

Under the chief data officer are the data stewards, who implement the program. They’ll handle tasks like building access control policies, breach notifications, and escalation paths; enforcing compliance; mapping data flows; and other areas of governance.

Process

The DGPC process typically begins with document reviews. Your team will need to consult a range of materials, including:

Regulatory documents, such as statutes and rulings from regulatory bodies
Best practices and standards
Internal privacy, security, and technology use policies
Internal governance strategies and timelines
Audits and compliance review documents

Your team will also need to consult with other stakeholders, including legal counsel, security teams, and department leads.

The next step is to create guiding principles and policies for data governance. These principles should take into account the company’s current data use practices and challenges, along with legal obligations, risk appetite, privacy priorities, and other relevant factors.

Finally, your team will need to set objectives. Drill down to identify security, compliance, privacy, and other risks in your data flows. For example, looking at your customer database, you might find:

Insufficient access controls that give certain apps or stakeholders too much information, compromising security and compliance
Inadequate tagging, making it hard to comply with customer requests, or find relevant data
Lack of data verification, impacting customer service, strategic decision making, or other areas
Gaps in your technology policy, resulting in stakeholders using untrusted analytics apps

Alongside risks, you should look for opportunities. For example, your organization may have duplicate data sets or a vast store of unnecessary data. Deleting unneeded data, creating automatic deletion processes, or sanitizing your data could improve performance, cut hosting costs, and improve decision making.

Technology

The DGPC breaks down technology into a number of different areas, enabling your data stewards to systematically analyze your infrastructure for risks and gaps. The process begins with the information lifecycle, which focuses on data flows, then proceeds through four different technological domains. Some of this may overlap with your existing IT security controls, so it’s a good idea to collaborate with relevant stakeholders and review documentation to avoid duplicating work.

Information lifecycle

The information lifecycle models how data flows through your organization. Typically, this is divided into five stages: creation, storage, usage, archival, and disposal. However, DGPC adds another stage: the transfer stage, which occurs repeatedly throughout the lifecycle.

Creation covers all the data your organization gathers, produces, or captures. This includes data provided by outside vendors, entered by customers, created internally, or captured by cookies, apps, or other means.

Storage is the phase where data is placed in a database or other storage medium. It also includes secondary storage, such as backups and high-availability systems.

Usage encompasses all the ways data is employed in your organization. It may be processed, viewed, pulled into a report, tagged, modified, or shared outside your organization. It’s important to track how sensitive data is used, recording all user access along with any changes made to the data.

Archiving is the process of storing data that is not currently in use in a secondary environment outside of production. This could be to satisfy retention requirements, save resources, prepare data for periodic deletion, or provide a backup for a later date. Because archives may sit untouched for a long time, there could be unanticipated risks associated with archived data. For example, they may contain sensitive data stored without modern data protection.

Deletion should be a regular part of your data lifecycle. For sensitive personal information and other sensitive data, it enables you to protect confidentiality and decrease the risk of a data breach. It also controls your data footprint, saving resources and removing redundant information. Regulating and automating data deletion is a key task for data governance. It’s important to ensure data is deleted regularly and properly, removing every copy of the information.
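
As a rough sketch of what automated deletion could look like, the Python below selects records that have outlived a retention period, including copies held in backups. The retention periods, category names, and record layout are hypothetical. Your actual periods should come from your legal and compliance requirements.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
RETENTION_DAYS = {
    "marketing_leads": 365,
    "support_tickets": 730,
}


def is_due_for_deletion(category: str, created: date, today: date) -> bool:
    """Return True when a record has outlived its category's retention period."""
    retention = timedelta(days=RETENTION_DAYS[category])
    return today - created > retention


def records_to_delete(records: list[dict], today: date) -> list[dict]:
    """Select every expired record, wherever the copy is stored."""
    return [
        r for r in records
        if is_due_for_deletion(r["category"], r["created"], today)
    ]


# The same lead appears twice because a copy also lives in a backup.
records = [
    {"id": 1, "category": "marketing_leads", "created": date(2021, 1, 10), "location": "production"},
    {"id": 1, "category": "marketing_leads", "created": date(2021, 1, 10), "location": "backup"},
    {"id": 2, "category": "support_tickets", "created": date(2022, 3, 1), "location": "production"},
]

expired = records_to_delete(records, today=date(2022, 9, 14))
print([(r["id"], r["location"]) for r in expired])
# [(1, 'production'), (1, 'backup')]
```

Listing each copy as its own entry is a deliberate choice here: it keeps backups and archives in scope, so deletion removes every copy rather than only the production record.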

Transfer occurs at the beginning of every stage but deletion. When you acquire, store, use, or archive data, you’re actually creating a copy of the data, and transferring it.

This creates an entire data lifecycle within the larger data lifecycle. For example, let’s say you’re reviewing and tagging a cloud database. Your cloud software is reading the data, copying the relevant parts, and sending it to your terminal, where it’s stored temporarily. When you add a tag, the data is modified and sent back over the internet, updating the database. Your software will also locally store or archive some of the data you’ve worked with, and delete redundant information after you’re finished.

These data copies are every bit as sensitive as the original data, and must be protected in the same way. You need to analyze the transfer vehicle that moves the data, and ensure it’s adequately protected. For example, if the data is traveling across the open Internet, it should be encrypted across its journey to protect it from malicious third parties.

You also need to look at any tools that read or access the data. If the recipient is using a desktop data-mining tool to report on a dataset, what sensitive data are they generating? Will they be creating spreadsheets with sensitive data? Could the data-mining tool itself be vulnerable?

The way stakeholders use technology could also pose hazards. They could email sensitive reports, for example, or store them locally or in a third-party cloud platform. They could even lose control of sensitive data by having a laptop stolen or falling victim to a hacker.

Finally, you should consider the technology tools, policies, and practices of partners and other third parties. If your company shares sensitive data, it’s important to ensure that their systems provide at least as much protection as yours do.

Technology domains

In addition to the data lifecycle, you need to make sure the tech you use to access, process, and control your data doesn’t pose excessive risks. There are four technology domains in DGPC: secure infrastructure, identity and access controls, information protection, and auditing and reporting.

Secure infrastructure is infrastructure that protects against hackers, insider attacks, and other risks. It includes your operating system, network, applications, and storage devices.

Identity and access controls restrict who can use assets. They authenticate users, manage accounts, and control access to resources and data.

Information protection is persistent protection that safeguards confidential data. Your company needs a system to correctly classify confidential and sensitive data, and protections to keep it secure wherever it resides.

Auditing and reporting ensure your data protections and compliance systems are working as designed. This includes compliance automation, regular auditing, and continuous monitoring.

Data governance for business

Protecting data is only half the story. Your data governance program should also make your data more useful, accurate, and effective.

Regulating data architecture

Your tech stack has serious implications for productivity, data analysis, cost, scalability, and other factors. But on top of that, it also plays a major role in determining the risks, priorities, and benefits of your data governance program. By factoring governance into your data architecture design, you can make more informed and strategic decisions about technology.

Even a simple choice like where to host can have major implications for data governance. Cloud hosting provides a secure solution and eliminates a lot of work and overhead, but it also means you’re dependent on a provider to protect your data, and must use the open Internet to access it. On-premises hosting can give you more control over information, but it requires you to handle security and maintenance in-house. Considering your data governance priorities, resources, and timeline can help you choose the most effective hosting option.

Search and discovery: making data locatable

One of the most important data governance goals is making sure stakeholders can find the data they need quickly and easily. If your data governance framework neglects search and discovery, it’s going to lead to slowdowns and unnecessary support calls, while taking away valuable time from your engineering team.

This will also hinder adoption. If people can’t find the information they need when they need it, they’ll find ways to work around your system, leading to a bloated footprint, broken workflow, and unnecessary data risks.

Metadata and documentation to make data understandable

Your data needs to be tagged and documented in a way that makes it easy to understand. Datasets from different tools (or from different versions of the same tool) may be organized differently, or include different categories of information.

You need to give metadata and documentation as much attention as the data itself. This will ensure that users can ingest data quickly, and interpret it correctly.

Keeping data quality high

Data flows are complicated. Data can travel from a diverse range of sources and go through many extractions and transformations in its lifecycle. A small misconfiguration can lead to missing or inaccurate data. Focusing on data quality will reduce breaks in your data flow, and ensure they’re easy to fix when they do happen.

There are five factors that go into data quality:

Lineage: Ensure that you can track the flow of your data, charting upstream sources to downstream ingestors. Carefully collect and organize metadata so you can monitor your data flows. Building good visibility into data lineages will ensure you can quickly detect and fix breaks when they do occur.
Freshness: One key indicator of data breaks is lack of freshness. If your data cadence suddenly slows, it’s a good sign that something is broken.
Distribution: Another way to detect data breaks is to look for changes in distribution — i.e. unexpected values. If you start getting very different numbers, there may be a break in your system interfering with data accuracy.
Volume: Changes in the size of a table or the number of records can indicate that something in your data flow has broken. For example, if a weekly report drops from 200,000 records to 100,000 records, it may mean that one of your data sources isn’t correctly sending data.
Schema: Your data schema — the way data is broken up and organized in a table — shouldn’t change spontaneously. If the data is formatted or ordered in a different way than you expect, it may indicate a break in your data flow.
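
Three of the factors above (freshness, volume, and schema) lend themselves to simple automated checks. The Python below is a minimal sketch, assuming you can already obtain a table’s last load time, row count, and column list. The function names, the 30% drop threshold, and the seven-day freshness window are illustrative assumptions, not recommended values.

```python
from datetime import datetime, timedelta


def check_freshness(last_loaded: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness: data should have arrived within the expected window."""
    return now - last_loaded <= max_age


def check_volume(row_count: int, previous_count: int, max_drop: float = 0.3) -> bool:
    """Volume: fail when rows drop by more than max_drop (a fraction) since the last run."""
    if previous_count == 0:
        return True
    return (previous_count - row_count) / previous_count <= max_drop


def check_schema(columns: list[str], expected: list[str]) -> bool:
    """Schema: column names and order should match what downstream users expect."""
    return columns == expected


# A weekly report that fell from 200,000 records to 100,000.
results = {
    "freshness": check_freshness(
        datetime(2022, 9, 13, 6), datetime(2022, 9, 14, 6), timedelta(days=7)
    ),
    "volume": check_volume(row_count=100_000, previous_count=200_000),
    "schema": check_schema(["id", "email", "region"], ["id", "email", "region"]),
}

failed = [name for name, ok in results.items() if not ok]
print(failed)  # ['volume']
```

Checks like these only tell you that something broke. Lineage metadata is what tells you where to look, which is why the two work best together.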

Quick security and access provisioning

The more restrictive your security becomes, the more efficient it needs to be. Your data governance program needs to be able to change access quickly in a variety of circumstances, including:

Onboarding and offboarding: New users should automatically receive needed data access as part of the onboarding process. Just as importantly, your system needs to automatically revoke access when a user leaves.
Role change: If a user changes positions in the company, you need a way to quickly update their access, closing off sensitive information they no longer need and provisioning whatever resources their new position requires. Users shouldn’t have to wade through an entire offboarding and re-onboarding process every time they move to a new role.
Special projects: Users may need access to data or other resources for a limited time to complete a special project. This is particularly important in the development cycle, where long waits on provisioning can quickly add up. Make sure your system can grant temporary access without compromising security or confidentiality.
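
To make the special-project and offboarding cases concrete, here is a small Python sketch of access grants that can expire on their own. The `Grant` class, user name, and resource names are hypothetical, and a real system would sit on top of your identity provider rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Grant:
    user: str
    resource: str
    # None means the grant does not expire on its own.
    expires_at: Optional[datetime] = None


def has_access(grants: list[Grant], user: str, resource: str, now: datetime) -> bool:
    """True if the user holds an unexpired grant for the resource."""
    return any(
        g.user == user
        and g.resource == resource
        and (g.expires_at is None or now < g.expires_at)
        for g in grants
    )


def offboard(grants: list[Grant], user: str) -> list[Grant]:
    """Revoke every grant held by a departing user."""
    return [g for g in grants if g.user != user]


grants = [
    Grant("dana", "billing_db"),
    # Temporary access for a special project.
    Grant("dana", "launch_metrics", expires_at=datetime(2022, 10, 1)),
]

print(has_access(grants, "dana", "launch_metrics", datetime(2022, 9, 14)))  # True
print(has_access(grants, "dana", "launch_metrics", datetime(2022, 10, 2)))  # False

grants = offboard(grants, "dana")
print(has_access(grants, "dana", "billing_db", datetime(2022, 9, 14)))  # False
```

Putting the expiry on the grant itself means temporary access ends without anyone having to remember to remove it. A role change could be handled the same way, by replacing one set of grants with another in a single step.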

Preparing for compliance

Every step you take in data governance can bring you closer to your compliance goals, given the right automation and workflow. Wherever you are in the governance process, TerraTrue can help align your data governance goals with your compliance obligations.

If you’re just getting started, our collaborative workflow and customizable templates can help you get moving, saving you the headache of building a workflow from scratch. TerraTrue also serves as a repository for your documentation, making it easy to review and share documents between internal stakeholders or with auditors.

As you progress, our data mapping capability will help you quickly build insight into your data flows, while our extensive compliance resources will accelerate the task of reviewing documentation.

TerraTrue will continue to evolve with you as your data governance goals and strategy solidify. The software can be configured to automatically flag issues based on your compliance priorities and other goals, making it easy to identify objectives, achieve progress, and demonstrate value.

And as your data governance program matures, TerraTrue’s machine learning will continue to increase your efficiency. Each review teaches the software about your needs, speeding up everything from routine documentation to in-depth compliance reviews.

Data governance will never be easy, but we can make it a lot easier. Contact us for a free demo today.
