What is referential integrity?

Data Integrity: Importance, Types & Risks

Imagine the following: A pharmaceutical company extols the safety of its latest wonder drug. The supervisory authority then inspects the manufacturer's offshore production site and orders the immediate cessation of production because important quality control data is missing. Unfortunately, such examples of poor data integrity are far from rare. Problems with the correctness and consistency of data occur in all industries and can have serious consequences.

In the age of big data, where more data is processed and stored than ever before, it is extremely important to implement measures to protect the integrity of the data collected. The first step in keeping your data secure is to understand the basics of data integrity and how it works. Read on to find out what data integrity is, why it's so important, and how to keep your data in tip-top shape.

What is data integrity?

The term data integrity refers to the correctness, completeness and consistency of data. It also covers the safety of data with regard to regulatory requirements, such as the GDPR and CCPA, and the protection of that data. To ensure data integrity, various processes, rules and standards are implemented in the design phase. Once the integrity of the data has been ensured, the information stored in a database remains correct, complete and trustworthy, no matter how often it is accessed. In addition, data integrity ensures that your data is protected from external influences.

Types of data integrity

A distinction is made between physical and logical data integrity. Both use different processes and methods that ensure data integrity in hierarchical and relational databases.

Physical integrity

Physical data integrity is about ensuring the correctness and completeness of the data while it is being stored and retrieved. Natural disasters, power outages and hacker attacks that disrupt database functions all compromise physical integrity. User errors, storage erosion and numerous other problems can also prevent data processors, system and application programmers, and internal auditors from accessing correct data.

Logical Integrity

Logical integrity ensures that data remains unchanged while it is in use in a relational database. It also protects data from hackers and user errors - but in a completely different way than is the case with physical integrity. There are four types of logical integrity.

Entity Integrity

Entity integrity relies on primary keys, unique values that identify individual records, to ensure that no record exists more than once in a table and that no primary key field is null. It is a feature of relational systems, which store data in tables that can be linked and used in various ways.
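As a minimal sketch of how a primary key enforces this, the following example uses Python's built-in sqlite3 module with an in-memory database. The `customers` table, its columns and the `try_insert` helper are hypothetical names chosen for illustration; other database systems express the same constraint with slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: customer_id is the primary key, so every value
# must be unique. NOT NULL is spelled out because SQLite, unlike most
# databases, allows NULL in non-integer primary key columns by default.
conn.execute(
    "CREATE TABLE customers ("
    " customer_id TEXT NOT NULL PRIMARY KEY,"
    " name TEXT NOT NULL)"
)


def try_insert(customer_id, name):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO customers VALUES (?, ?)", (customer_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = try_insert("C-001", "Ada")      # new key: accepted
duplicate_ok = try_insert("C-001", "Bob")  # same key again: rejected
null_key_ok = try_insert(None, "Eve")      # missing key: rejected
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(first_ok, duplicate_ok, null_key_ok, row_count)
```

Only the first insert succeeds, so the table ends up with exactly one row for the key `C-001`.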

Referential Integrity

Referential integrity refers to the set of processes that ensure that data is stored and used in a consistent manner. Rules for the use of foreign keys, embedded in the database structure, ensure that only permitted changes, additions and deletions can be made. Such rules can include restrictions (so-called constraints) that prevent duplicate entries, guarantee the correctness of the data and/or block the entry of irrelevant data.
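A foreign key rule of this kind can be sketched with Python's built-in sqlite3 module. The `customers` and `orders` tables and the `rejected` helper are assumptions made for this example. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# Hypothetical schema: every order must point at an existing customer.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(customer_id))"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1)")


def rejected(sql):
    """Return True if the database refused the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# An order for a customer that does not exist would be an orphan row.
orphan_insert_rejected = rejected("INSERT INTO orders VALUES (101, 999)")
# Deleting a customer that still has orders would orphan those orders.
parent_delete_rejected = rejected("DELETE FROM customers WHERE customer_id = 1")
print(orphan_insert_rejected, parent_delete_rejected)
```

Both statements are refused, so the two tables stay consistent with each other.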

Domain Integrity

Domain integrity describes the entirety of all processes that ensure the correctness of each individual data element in a domain. In this context, a domain is understood to be the set of acceptable values that a column can contain. Here, too, constraints and other measures can be used to limit the formats, types and scope of the data entered.
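One common way to restrict a column to its domain is a CHECK constraint. The sketch below uses Python's built-in sqlite3 module. The `products` table, its columns and the allowed status values are hypothetical and serve only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: each CHECK clause describes the domain of one column.
conn.execute(
    "CREATE TABLE products ("
    " sku TEXT NOT NULL PRIMARY KEY,"
    " price_cents INTEGER NOT NULL"
    "   CHECK (typeof(price_cents) = 'integer' AND price_cents >= 0),"
    " status TEXT NOT NULL"
    "   CHECK (status IN ('active', 'discontinued')))"
)


def accepted(row):
    """Return True if the row satisfies every column's domain."""
    try:
        conn.execute("INSERT INTO products VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


valid_row = accepted(("A-1", 499, "active"))
negative_price = accepted(("A-2", -5, "active"))      # outside the value range
unknown_status = accepted(("A-3", 100, "archived"))   # not in the allowed set
wrong_type = accepted(("A-4", "cheap", "active"))     # not an integer
print(valid_row, negative_price, unknown_status, wrong_type)
```

The `typeof` check is included because SQLite is loosely typed by default; databases with strict column types reject the last row without it.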

User-defined integrity

This term refers to the rules and constraints that users define to meet their own requirements. Sometimes entity, domain, and referential integrity are not enough to protect data. In that case, specific business rules must also be defined and integrated into the integrity measures.
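A business rule of this kind often spans more than one table, which is where a trigger can help. The following sketch, again using Python's built-in sqlite3 module, enforces an invented rule: no new orders for suspended customers. The rule, the table layout and the trigger name are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " customer_id INTEGER PRIMARY KEY,"
    " suspended INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL)"
)

# Hypothetical business rule: abort any order insert for a suspended customer.
conn.execute(
    """
    CREATE TRIGGER block_orders_for_suspended
    BEFORE INSERT ON orders
    WHEN (SELECT suspended FROM customers
          WHERE customer_id = NEW.customer_id) = 1
    BEGIN
        SELECT RAISE(ABORT, 'customer account is suspended');
    END
    """
)
conn.execute("INSERT INTO customers VALUES (1, 0), (2, 1)")


def place_order(order_id, customer_id):
    """Return True if the order was stored, False if the rule blocked it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer_id))
        return True
    except sqlite3.IntegrityError:
        return False


active_customer_order = place_order(100, 1)
suspended_customer_order = place_order(101, 2)
print(active_customer_order, suspended_customer_order)
```

Keeping the rule in the database rather than in application code means that every client writing to the table is subject to it.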

What data integrity is not

With all the aspects of data integrity, it is easy to lose sight of the true meaning of the term. It is often mistakenly equated with data security and data quality, but these both have their own meaning.

Data integrity is different from data security.

Data security refers to all the measures that protect data from corruption and from unauthorized access. This includes the use of systems, processes and procedures that prevent unauthorized or potentially harmful data access by third parties. Data security breaches can be minor and therefore easy to contain, but they can also be very extensive and cause considerable damage.

While data integrity is about keeping data correct and usable over its entire lifespan, data security aims to protect data against external attacks. Data security is one of the many facets of data integrity. However, data security does not include the many processes that are required to protect data against manipulation in the long term.

Data integrity is different from data quality.

Does the data stored in your database meet your company's standards and requirements? Data quality answers this question using various processes that measure the age, relevance, correctness, completeness and reliability of your data.

Like data security, data quality is just one building block of data integrity, albeit a very crucial one. Data integrity encompasses all aspects of data quality, but goes one step further: With the help of various rules and processes, it controls, among other things, how data is entered, stored and passed on.

Data integrity and GDPR compliance

Data integrity is key to complying with data protection regulations like the GDPR. Companies face substantial fines if they do not comply with these laws and regulations. In some cases, violations can carry penalties that go beyond fines. Repeated non-compliance can also endanger the very existence of a company.

Fortunately, there are ways and means to ensure the data integrity required by GDPR and other data protection regulations. You can learn more about this in our Practical Steps to GDPR Compliance series.

Data integrity risks

Data is exposed to various integrity risks. Here are some examples:

  • User error: Incorrect or duplicate data entry, accidental deletion, failure to observe relevant protocols or errors in the implementation of data protection mechanisms can all impair data integrity.
  • Transfer error: If data cannot be transferred successfully from one location in the database to another, a transfer error has occurred. In relational databases, such an error shows up when a data element is present in the target table but not in the source table.
  • Bugs and Viruses: Spyware, malware and viruses are software components that can penetrate a computer in order to change, delete or steal data there.
  • Compromised hardware: Sudden crashes of computers or servers and problems with the functioning of computers and other devices indicate massive errors and could be an indication that your hardware has been compromised. Compromised hardware can display data incorrectly or incompletely, restrict or prevent data access or make it more difficult to use data.
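The transfer error described in the list above can be detected by comparing the keys of the two tables. The sketch below uses Python's built-in sqlite3 module. The table names `source_rows` and `target_rows`, the sample data and the `ids_only_in` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source_rows (id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE target_rows (id INTEGER PRIMARY KEY, payload TEXT);
    INSERT INTO source_rows VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO target_rows VALUES (1, 'a'), (3, 'c'), (4, 'd');
    """
)


def ids_only_in(left, right):
    """Return the ids found in table `left` but missing from table `right`.

    The table names are fixed strings in this sketch; never build SQL
    from untrusted input this way.
    """
    query = f"SELECT id FROM {left} EXCEPT SELECT id FROM {right} ORDER BY id"
    return [row[0] for row in conn.execute(query)]


# Present in the target but not in the source: the case described above.
unexpected_in_target = ids_only_in("target_rows", "source_rows")
# Present in the source but never arrived in the target.
missing_from_target = ids_only_in("source_rows", "target_rows")
print(unexpected_in_target, missing_from_target)
```

With the sample data, row 4 appears only in the target and row 2 only in the source, so both lists flag one id each.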

Data integrity risks can be easily minimized or even eliminated using the following steps:

  • Restrict data access and define rules that prevent data from being changed by unauthorized parties.
  • Use validation to ensure that your data is accurate as it is collected and used.
  • Back up your data regularly.
  • Use logs to check when data was added, modified or deleted.
  • Perform internal audits on a regular basis.
  • Use error detection software.
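One simple form of error detection that also helps verify backups is a checksum comparison. The following is a minimal sketch using Python's built-in hashlib module. The record contents and the `sha256_hex` and `is_intact` helpers are invented for this example, and a production setup would typically rely on dedicated tooling.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_checksum: str) -> bool:
    """Return True if the data still matches the checksum recorded earlier."""
    return sha256_hex(data) == expected_checksum


# Hypothetical record: the checksum is computed when the data is written
# and stored separately from the data itself.
original = b"batch=42;result=pass\n"
stored_checksum = sha256_hex(original)

# Later, for example after restoring a backup, the data is checked again.
restored_ok = is_intact(original, stored_checksum)
corrupted = b"batch=42;result=fail\n"
corrupted_ok = is_intact(corrupted, stored_checksum)
print(restored_ok, corrupted_ok)
```

A checksum only reveals that something changed, not what changed or why, so it complements logs and audits rather than replacing them.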

Data integrity - the first steps

Protecting the integrity of your company data using traditional methods can be a daunting challenge. Secure, cloud-based data integration platforms offer a modern alternative and give you a real-time view of all your data. With the help of the most modern cloud integration tools, numerous source data applications can be connected so that you can access all of your company data from a central location.

You can find out how you can create optimal framework conditions for data integrity in our Definitive Guide to Data Governance.