Monday, 27 June 2016

Data scrambling in SQL

When developing an application, it can often be quite useful to utilise the production data for testing. However, the danger of this is that production data is then being stored on the relatively less-secure development devices. The answer to this solution is to scramble the data in such a way that the original information is no longer recoverable, but it still takes a similar form.

For simplicity, random 0-9a-zA-Z strings are the core of this, using the function RAND().

Note: RAND() is not cryptographically-secure, but that's ok in this scenario as we're generating these values independently of the original data. If you need cryptographically-secure random values, use CRYPT_GEN_RANDOM()).

Script Architecture

Yes - unfortunately due to SQL constrains there needs to be mild consideration of architecture. Namely, we want to use scalar functions as the generators of random values as these will fit in easily to an UPDATE script. However, SQL does not permit the use of RAND() inside a scalar function, as running RAND() has side-effects on the DB (due to the deterministic nature I presume). We can bypass this by creating a minimal view which returns a RAND() value:

CREATE VIEW [dbo].[vw_RandomNumber]
AS
SELECT RAND() AS RandomNumber

GO

This forms the basis of our random data functions.

Functions

RandomString

CREATE function [dbo].[udf_RandomString](@length INTEGER)   
    RETURNS varchar(MAX) AS BEGIN

DECLARE @charPool VARCHAR(MAX) =
                 'abcdefghijkmnopqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ23456789'
DECLARE @poolLength INTEGER = Len(@charPool)

DECLARE @loopCount INTEGER = 0
DECLARE @randomString VARCHAR(MAX) = ''

WHILE (@loopCount < @Length) BEGIN
SELECT @randomString = @randomString +
            SUBSTRING(@charPool, CONVERT(int, 
            (SELECT TOP 1 RandomNumber FROM vw_RandomNumber) * @poolLength), 1)
SELECT @loopCount = @loopCount + 1
END

    RETURN (@randomString)  
END 

This script does not generate the lowercase letter "l" and number "1", and capital letter "O" and number "0" due to their similar appearance. Maybe slightly overkill, but hey!

RandomEmail

CREATE function [dbo].[udf_RandomEmail](@length INTEGER)   
    RETURNS varchar(MAX) AS BEGIN

    RETURN LOWER((dbo.udf_RandomString(@length) + '@email.com')  )
END  

Full script example

UPDATE WEB_Booking
SET JobName = dbo.udf_RandomString(12),
DeliveryAddress = '123 ' + dbo.udf_RandomString(6) + ' Rd',
DeliverySuburb = dbo.udf_RandomString(8),
DeliveryState = 'NSW',
DeliveryPostcode = '2000',
PickupAddress = '123 ' + dbo.udf_RandomString(6) + ' Rd',
PickupSuburb = dbo.udf_RandomString(8),
PickupState = 'NSW',
PickupPostcode = '2000',
SiteContactName = dbo.udf_RandomString(4) + ' ' + dbo.udf_RandomString(6),
SiteContactNumber =  '04 1234 5678'

Numbers have not yet been scrambled, this is a potential future improvement though it was considered low value for this project, as there was no need for addresses or contact numbers to be unique and diverse.

6 comments:

  1. IEEE Final Year projects Project Centers in Chennai are consistently sought after. Final Year Students Projects take a shot at them to improve their aptitudes, while specialists like the enjoyment in interfering with innovation. For experts, it's an alternate ball game through and through. Smaller than expected IEEE Final Year project centers ground for all fragments of CSE & IT engineers hoping to assemble. Final Year Project Domains for IT It gives you tips and rules that is progressively critical to consider while choosing any final year project point.

    JavaScript Training in Chennai

    JavaScript Training in Chennai

    ReplyDelete
  2. A few gatherings even let you promote your administrations, yet some likewise charge you an expense for this, so by and by, do your exploration. joomla data entry

    ReplyDelete
  3. An IP address is basically the Internet's variant of a home or work locale, important for accepting information, similar to a location is important to get physical mail or bundles. These addresses were a result of IPv4, the Internet Protocol's fourth cycle, gotten from 32-piece double number blends. 192.168.l0.1

    ReplyDelete
  4. I really like this post as you have shared here. I would like to thank you for posting it. Keep Posting. OEE Software

    ReplyDelete
  5. Safespy is a free spy app to catch cheating spouses without touching the target phone. The app’s services are available for the subscriber on both Android and iOS phones.Click here please.Cell phone spy apps are among the top solutions.


    ReplyDelete
  6. Interesting and interesting information can be found on this topic here profile worth to see it. email spam words

    ReplyDelete