We recently got a question from someone about how to encrypt large SQL Server database for compliance reasons with minimal impact on software applications.
There are number of ways to approach SQL Server data encryption. For compliance or whatever the reason.
Conclusion: Most enterprise SQL Servers use a combination of several encryption methods and techniques. They are listed below.
Two major encryption approaches
- Only encrypt the items needed to satisfy compliance requirements. OR
- Implement encryption that is the hardest to break (not just to satisfy the auditors)
Often #1 gets done now. And #2 is taken care of over time.
The solution which requires zero changes on the application is often preferred.
Things to consider when encrypting SQL Server data
- Data & log drives.
- Where backups are stored.
- Data in memory.
- When DB is restored, sensitive data should not be visible.
- Communication protocols with SQL Server (they are not encrypted by default).
6 ways to encrypt data inside SQL Server
Let’s look at some possible options for encrypting SQL data.
- Only drives where SQL data is stored
- Transparent Data Encryption (TDE)
- Dynamic data masking
- Column level encryption
- Always Encrypted
- A combination of encryption methods
1. Only drives where SQL data is stored
The pros
- If the drive is stolen the files should be useless.
- The performance hit is only a few percentage points on CPU. Should be close to unnoticeable.
- Often SQL Server storage is RAID’ed, so stealing a drive doesn’t get you the full data (but it depends on RAID level).
- Easy to implement.
- Zero downtime to turn on Windows.
- This satisfies a lot of different compliance methods.
The cons
- The only use case where drive encryption helps is when the hard drive is stolen. And you have to ask yourself how likely that is. Often it’s a ‘No’. If you’re in the cloud, your changes are even smaller.
- Doesn’t prevent other types of theft.
- If the hacker is on the server already (or it’s internal theft) – this doesn’t help much.
- If the hacker is on the server already, files can be copied to a non-encrypted location and drive encryption will be useless at that point.
- This doesn’t solve an issue of restoring the DB onto another server — say lower environments – such as development, quality assurance or others — sensitive data will still be visible.
The bottom line
Implement. Just know that this doesn’t fix all of the scenarios.
2. Transparent Data Encryption (TDE)
This encrypts data at rest. If the DB backup or files are stolen, it’s impossible to restore the backup without certificates and keys.
The pros
- Much more secure.
- Satisfies many compliance regulations.
- Easy to setup.
- The data is encrypted at rest.
The cons
- The data in RAM is unencrypted!
- If you lose the certificate, kiss your DB goodbye. This is a big one. Even Microsoft support won’t be able to help. So unless you have a solid way to manage keys & certs, DO NOT implement this. Often we see this feature get turned on by someone that doesn’t fully understand how this works (say sysadmin). And maybe sysadmin does a good job managing certificates and keys. But you have to take this further, such as having multiple copies of those certificates and keys in case sysadmin goes whacko or gets hit by the bus.
- Performance has a huge hit here. Because all data is now encrypted, SQL won’t be able to perform lot of operations as fast as before.
- Compression won’t be able to be used. Compression is one of the really useful features that speeds a lot of things up. Since everything will now be encrypted this means another performance hit.
When using TDE, if you lose the certificate, kiss your DB goodbye. This is a big one. Even Microsoft support won’t be able to help.
The bottom line
Don’t do it (if you can avoid it).
3. Dynamic data masking
This works at the presentation layer. Say there is a query: SELECT SSN FROM employees
, it could show SSN
as XXX-XX-XXXX
instead of the actual number.
The pros
- Easy to implement.
The cons
- On the surface, it looks like a nice feature. However, it has a lot of holes. I’m not going to get into them here, but if you want to use dynamic data masking, I would use a third-party product. GreenSQL had a good one. They’ve been purchased since, and are called HexaTier. I believe GreenSQL code worked at the SQL communication protocol layer, where there is a proxy in between your client and the actual server. This requires changing your app to hit a new host (or some DNS config changes).
If you want to use dynamic data masking, I would use a third party product.
The bottom line
Don’t. This is not a popular feature. There are a few more shortcomings that make it close to useless.
4. Column-level encryption
This is where sensitive columns are encrypted one by one.
You still need to guard the DB files, and the backups, etc. But if your DB is stolen and restored, the data inside should be close to useless since every column that matters is encrypted. Another cool thing is when data is restored to lower environments (DEV, QA, UAT, TEST, etc) developers won’t have access to sensitive data. I would still recommend cleansing the DB for lower environments. But it’s already good as it is.
The pros
- You can chose what to encrypt.
The cons
- It requires app changes, which may take time.
- You may miss some columns.
The bottom line
Implement.
Column level encryption will take time, but this is how things are done, when encryption is done right.
You can thank me later!
5. Always Encrypted
The pros
- Data is Always Encrypted – in motion and at rest. SQL can’t even see it, because it’s done at the SQL driver level, SQL Engine doesn’t even know what data it has.
The cons
- Only available on SQL 2016 or higher.
- Performance. if a column is Always Encrypted, SQL will not know what data is in the column. So, for example, if you do SELECT * FROM credit_card_holder WHERE street_name LIKE ‘%parkway%’ SQL will not be able to filter and will only get you 1% of the data.
That means SQL will return the full data set to the DB driver, which decrypts every name and then does the string comparison.
Using Always Encrypted is the same as if you were looking for a single person in the Yellow Pages: instead of getting 1000 rows, you’d get 5 million!
The bottom line
Don’t if you can avoid it. I guess that this feature is built for the public sector, such as the FBI, CIA, etc. Businesses don’t really use this.
6. A combination of encryption methods
How do the best enterprises encrypt data in SQL Server?
This is what we see the most: the best enterprises use the following combination of encryption methods.
- Encrypt the disk using the Win feature. Hits CPU at about 1%, so there’s no reason not to.
- Encrypt SQL backups using SQL native feature.
- Encrypt at the column level.
- To add another safety layer on top of column-level encryption, store salt and other security details in a separate DB. That way, if your product DB is stolen, it is useless without that DB2. HA!
- Perhaps use TDE on top of all this? Most likely not.
And BAM! You’re at the top of the SQL data encryption game!
Agree? Disagree? Comment below!