Creating a Data Redaction Capability to Meet GDPR Requirements Using EDB Postgres (2023)


The GDPR (General Data Protection Regulation) goes into effect on May 25, 2018, throughout the European Union. The regulation focuses on the secure management of personal information. For more detailed information, consult the official GDPR website or listen to our recent GDPR Webinar.

A successful GDPR-compliant implementation requires many technical capabilities, such as authentication, authorization, access control, virtual database, and encryption. One technique that is often considered is data redaction, which limits exposure of sensitive data by dynamically changing the data as it is displayed to specific users.

For example, a social security number (SSN) is stored as ‘021-23-9567’. Privileged users can see the full SSN, while other users only see the last four digits ‘xxx-xx-9567’.
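As a language-neutral illustration of that behavior, here is a minimal Python sketch. The helper name `mask_ssn` and the `privileged` flag are hypothetical and are not part of the SQL walkthrough; the sketch assumes the SSN is always formatted as NNN-NN-NNNN.

```python
def mask_ssn(ssn: str, privileged: bool = False) -> str:
    """Return the SSN unchanged for privileged callers; otherwise mask
    everything except the last four digits.

    Assumes the fixed NNN-NN-NNNN layout used in the example above.
    """
    if privileged:
        return ssn
    # Replace the first six characters ("NNN-NN") and keep "-NNNN".
    return "xxx-xx" + ssn[6:]


print(mask_ssn("021-23-9567", privileged=True))
print(mask_ssn("021-23-9567"))
```

The stored value never changes; only the representation returned to the caller differs, which is the defining property of dynamic redaction.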

In this article I describe how we can use standard EDB Postgres capabilities to create user-specific data redaction mechanisms. Keep in mind that every field shown in the example is personal data, not only the fields that are redacted here.

N.B.: EDB Postgres Advanced Server 11, targeted for release in late 2018, will have native data redaction capabilities. Those will be more robust and performant than the techniques described in this blog.


The approach leverages the Postgres search_path feature to direct privileged users to the raw unredacted data when they run a query, and to direct non-privileged users to a view that implements redaction logic.
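The name-resolution idea can be modeled outside the database. The following Python sketch is only a simplified model, not how Postgres is implemented: the `CATALOG` dictionary and the `resolve` helper are hypothetical, and the schema and role names are taken from the walkthrough that follows. It assumes an unqualified name resolves to the first schema on the search path that contains it, and that path entries naming a non-existent schema (such as "$user" when no schema matches the role name) are skipped.

```python
from typing import Dict, List, Optional, Set

# Hypothetical catalog: schema name -> relations it contains.
CATALOG: Dict[str, Set[str]] = {
    "public": set(),
    "employeedata": {"employees"},   # base table with the raw data
    "redacteddata": {"employees"},   # view that applies the redaction functions
}


def resolve(relation: str, search_path: List[str], user: str) -> Optional[str]:
    """Return the schema-qualified name of the first match on the search path."""
    for entry in search_path:
        schema = user if entry == "$user" else entry
        # Schemas that do not exist are skipped rather than raising an error.
        if relation in CATALOG.get(schema, set()):
            return schema + "." + relation
    return None


PRIVILEGED_PATH = ["$user", "public", "employeedata"]
REDACTED_PATH = ["$user", "public", "redacteddata"]

print(resolve("employees", PRIVILEGED_PATH, "privilegeduser"))
print(resolve("employees", REDACTED_PATH, "redacteduser"))
```

Both users issue the same query text; only the per-role search path decides whether `employees` means the table or the redacting view.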

Step-by-step walkthrough

  1. Two schemas are defined in the database ‘mycompany’:
    1. Schema ‘employeedata’ has the detailed data in a table called ‘employees’
    2. Schema ‘redacteddata’ has a view called ‘employees’ that refers to the table ‘employeedata.employees’ and applies user-defined functions and standard SQL to redact data in select columns
  2. A sample data set with employee IDs, names, social security numbers, birth dates etc. is created in the table ‘employees’ in the schema ‘employeedata’.
  3. A library of redaction functions for SSNs, email addresses, dates, salaries and phone numbers applies data-type-specific redaction techniques.
  4. The ALTER ROLE command is used to set the search_path in Postgres to direct non-privileged users to the views, and privileged users to the underlying unredacted data.
  5. A function is used to show how non-privileged users could query based on redacted fields

All the code described in this example is released under the PostgreSQL open source license, and is intended for use with EDB Postgres Advanced Server 10 in Oracle-compatible mode. This code is intended for illustration purposes only.


-- connect to the database as user enterprisedb

DROP DATABASE IF EXISTS mycompany;
CREATE DATABASE mycompany WITH OWNER = enterprisedb;

-- connect to the new database

\c mycompany;

-- the schema employeedata will hold the personally identifiable information (PII)

CREATE SCHEMA employeedata;

-- the schema redacteddata will hold the view that does the redaction

CREATE SCHEMA redacteddata;

-- create table with employee information

CREATE TABLE employeedata.employees (
    id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    name varchar(40) NOT NULL,
    SSN varchar(11) NOT NULL,
    phone varchar(10),
    birthday date,
    salary money,
    email varchar(100)
);

-- add sample data

INSERT INTO employeedata.employees (name, ssn, phone, birthday, salary, email) VALUES
    ('Sally Sample', '020-78-9345', '5081234567', '1961-02-02', 51234.34, ''),
    ('Jane Doe', '123-33-9345', '6171234567', '1963-02-14', 62500.00, ''),
    ('Bill Foo', '123-89-9345', '9781234567', '1963-02-14', 45350, '');

-- define redaction functions. SECURITY DEFINER specifies that each function is executed with the privileges of the user that owns it.

CREATE OR REPLACE FUNCTION redact_ssn (ssn varchar(11))
RETURNS varchar(11)
/* replaces 020-12-9876 with xxx-xx-9876 */
AS $$
    SELECT overlay (ssn placing 'xxx-xx' from 1);
$$ LANGUAGE SQL SECURITY DEFINER;

CREATE OR REPLACE FUNCTION redact_date (input_date date)
RETURNS date
/* sets the year to 0 */
AS $$
    SELECT input_date - ((extract (year from input_date) + 1) * interval '1 year');
$$ LANGUAGE SQL SECURITY DEFINER;

CREATE OR REPLACE FUNCTION redact_email (
    email varchar(100),
    visible integer DEFAULT 1,
    red_char char DEFAULT 'x')
RETURNS varchar(100)
/* Redacts the part of an email address before the '@', keeping only the
   first 'visible' characters, and does the same for the domain part after
   the '@'. The redaction character can be set. Checks that the address has
   exactly one '@' and at least one '.'; otherwise returns
   'illegal@email.address'. */
AS $$
DECLARE
    pos1 integer;
    pos2 integer;
    part1 varchar; -- this is the name before the @
    part2 varchar; -- this is the domain
    part3 varchar; -- this is the suffix
BEGIN
    -- check if this is an email address
    IF (SELECT REGEXP_COUNT (email, '@', 1) = 1)
       AND (SELECT REGEXP_COUNT (email, '\.', 1) >= 1)
    THEN
        SELECT POSITION('@' IN email) INTO pos1;
        SELECT LENGTH(email) - POSITION('.' IN reverse(email)) INTO pos2;
        part1 := RPAD(SUBSTRING(email, 1, visible), pos1 - 1, red_char);
        part2 := RPAD(SUBSTRING(email, pos1 + 1, visible), pos2 - pos1, red_char);
        part3 := SUBSTRING(email, pos2 + 1, LENGTH(email));
        RETURN part1 || '@' || part2 || part3;
    ELSE
        RETURN 'illegal@email.address';
    END IF;
END
$$ LANGUAGE PLPGSQL SECURITY DEFINER;

CREATE OR REPLACE FUNCTION redact_salary (salary money)
RETURNS money
/* always returns 0 */
AS $$
    SELECT 0::money;
$$ LANGUAGE SQL SECURITY DEFINER;

CREATE OR REPLACE FUNCTION redact_phonenbr (
    phone_nbr varchar(10),
    visible integer DEFAULT 4,
    red_char char DEFAULT 'x')
RETURNS varchar(10)
/* Replaces all digits, except for the last 'visible' digits, with the redaction char */
AS $$
    SELECT overlay (phone_nbr placing rpad(red_char, length(phone_nbr) - visible, red_char) from 1);
$$ LANGUAGE SQL SECURITY DEFINER;
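The email function is the least obvious of the set, so it may help to trace its logic outside the database. The Python sketch below is an approximate model of the same steps (validate, split at the '@' and at the last '.', pad each part with the redaction character); it is not the code that runs in Postgres. The sample addresses are hypothetical, since the sample rows above carry empty email values, and `red_char` is assumed to be a single character.

```python
def redact_email(email: str, visible: int = 1, red_char: str = "x") -> str:
    """Approximate Python model of the PL/pgSQL redact_email function."""
    # Same validity check: exactly one '@' and at least one '.'.
    if email.count("@") != 1 or "." not in email:
        return "illegal@email.address"

    at = email.index("@")          # position of the '@'
    last_dot = email.rindex(".")   # position of the last '.'

    local = email[:at]              # name before the '@'
    domain = email[at + 1:last_dot] # domain without the suffix
    suffix = email[last_dot:]       # e.g. ".com", left unredacted

    def mask(part: str) -> str:
        # Keep the first `visible` characters and pad to the original length.
        return part[:visible].ljust(len(part), red_char)

    return mask(local) + "@" + mask(domain) + suffix


print(redact_email("sally@example.com"))
print(redact_email("not-an-email"))
```

Note that the redacted value preserves the length of each part and the top-level suffix, so some information about the original address still leaks; whether that is acceptable depends on the use case.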

-- define redaction view in the schema redacteddata. It calls the redaction functions to redact data in certain columns.

CREATE OR REPLACE VIEW redacteddata.employees AS
SELECT
    id,
    name,
    redact_ssn(ssn) ssn,
    redact_phonenbr(phone) phone,
    redact_date(birthday) birthday,
    redact_salary(salary) salary,
    redact_email(email) email
FROM employeedata.employees;

-- create privileged user

CREATE ROLE privilegeduser LOGIN PASSWORD 'password';

-- grant access to schema and tables

GRANT USAGE ON SCHEMA employeedata TO privilegeduser;
GRANT ALL ON ALL TABLES IN SCHEMA employeedata TO privilegeduser;

-- set search path

ALTER ROLE privilegeduser IN DATABASE mycompany SET search_path = "$user", public, employeedata;

-- create redacted user

CREATE ROLE redacteduser LOGIN PASSWORD 'password';

-- grant access to the redacteddata schema and its views

GRANT USAGE ON SCHEMA redacteddata TO redacteduser;
GRANT ALL ON ALL TABLES IN SCHEMA redacteddata TO redacteduser;

-- set search path

ALTER ROLE redacteduser IN DATABASE mycompany SET search_path TO "$user", public, redacteddata;

-- define a function that allows searching by SSN, but redacts other data

CREATE OR REPLACE FUNCTION employee_ssn (ssn varchar)
RETURNS TABLE (
    ID integer,
    name varchar(40),
    ssn varchar(11),
    phone varchar(10),
    birthday date,
    salary money,
    email varchar(100))
/* Allows a non-privileged user to search by SSN (a redacted data element).
   The function returns a table that implements the redaction functions.
   Obviously this function could be called in an exhaustive search loop to
   guess the SSN. */
AS $$
    SELECT
        id,
        name,
        redact_ssn(ssn) ssn,
        redact_phonenbr(phone) phone,
        redact_date(birthday) birthday,
        redact_salary(salary) salary,
        redact_email(email) email
    FROM employeedata.employees
    WHERE ssn = $1;
$$ LANGUAGE SQL SECURITY DEFINER;

-- connect to database as privilegeduser

Server [localhost]:
Database [edb]: mycompany
Port [5444]:
Username [enterprisedb]: privilegeduser
Password for user privilegeduser:
edb-psql (10.1.5)

mycompany=> select * from employees;
 id |     name     |     ssn     |   phone    |      birthday
----+--------------+-------------+------------+--------------------
  1 | Sally Sample | 020-78-9345 | 5081234567 | 02-FEB-61 00:00:00
  1 | Jane Doe     | 123-33-9345 | 6171234567 | 14-FEB-63 00:00:00
  1 | Bill Foo     | 123-89-9345 | 9781234567 | 14-FEB-63 00:00:00
(3 rows)

-- connect to database as redacteduser

Server [localhost]:
Database [edb]: mycompany
Port [5444]:
Username [enterprisedb]: redacteduser
Password for user redacteduser:
edb-psql (10.1.5)

mycompany=> select * from employees;
 id |     name     |     ssn     |   phone    |      birthday
----+--------------+-------------+------------+--------------------
  1 | Sally Sample | xxx-xx-9345 | 5081234567 | 02-FEB-02 00:00:00
  1 | Jane Doe     | xxx-xx-9345 | 6171234567 | 14-FEB-02 00:00:00
  1 | Bill Foo     | xxx-xx-9345 | 9781234567 | 14-FEB-02 00:00:00
(3 rows)

-- redacteduser tries to access unredacted dataset

mycompany=> select * from employeedata.employees;
ERROR: permission denied for schema employeedata
LINE 1: select * from employeedata.employees;
                      ^

-- redacted user uses table function to query dataset and queries based on redacted column

mycompany=> select name, ssn, phone, salary from employee_ssn ('123-89-9345');
   name   |     ssn     |   phone    | salary
----------+-------------+------------+--------
 Bill Foo | xxx-xx-9345 | xxxxxx4567 |  $0.00
(1 row)
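The comment inside employee_ssn warns that the function could be called in an exhaustive search loop. A rough sense of how small that search is: the redacted value still exposes the last four digits, so only the five leading digits are unknown, which leaves 10^5 = 100,000 candidates. The Python sketch below enumerates them against a stand-in lookup; `candidate_ssns`, `lookup` and the `KNOWN` set are hypothetical, and no database is involved.

```python
from itertools import product


def candidate_ssns(last_four: str):
    """Yield every NNN-NN-NNNN value that ends with the visible last four digits."""
    for digits in product("0123456789", repeat=5):
        area = "".join(digits[:3])
        group = "".join(digits[3:])
        yield area + "-" + group + "-" + last_four


# Stand-in for the database: in the walkthrough, a non-empty result from
# employee_ssn() would play the role of this lookup.
KNOWN = {"123-89-9345"}


def lookup(ssn: str) -> bool:
    return ssn in KNOWN


total = sum(1 for _ in candidate_ssns("9345"))

attempts = 0
found = None
for ssn in candidate_ssns("9345"):
    attempts += 1
    if lookup(ssn):
        found = ssn
        break

print(total, attempts, found)
```

A search of this size is trivial to run, so a function like this is only appropriate together with other controls, for example restricting who may execute it or auditing its calls.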


Data redaction is one of many techniques being brought to bear on the challenges of GDPR and on other challenges of dealing with confidential data. This article shows how we can use the Postgres search_path, user-defined functions and views to improve data protection.

This post benefited from helpful feedback from EnterpriseDB colleagues Phil Allsopp, Robert Haas and Vibhor Kumar.

Marc Linster, Ph.D., is Senior Vice President, Product Development, at EnterpriseDB.

