
Why SQL Server Backups Fail When You Need Them Most: A 3-Month Bank Outage Case Study

Written by
Mark Varnas

Imagine your bank couldn’t process transactions for 3 months. 

This isn’t hypothetical; it happened to my wife’s sister. Her bank suffered a ransomware attack so severe that customers couldn’t access their money for over THREE MONTHS.

So, salary deposits trapped, bills unpaid, no access to YOUR money. 

The bank doesn’t know how much you have, what orders are outstanding, who paid what. Everything lives in the database, and it’s all gone.

Why? Their backup strategy failed when they needed it most.

This bank thought they had everything covered. They had backups. They had fancy enterprise storage. They had MSPs managing their systems. 

But when crypto hit, all those “bulletproof” systems crumbled.

After 15+ years optimizing SQL Server environments and seeing multiple clients get hit with ransomware, we’ve learned that most companies are gambling with their data using backup strategies that work great … until they don’t.

The difference between a 3-day recovery and a 3-month nightmare often comes down to understanding why SQL Server backups are fundamentally different from everything else in your IT environment.

The Hidden Dangers of “Fancy” Backup Solutions

Here’s what companies misunderstand about SQL backups: 

“Fancy” backup technologies often fail when applied to SQL Server databases. 

By “fancy,” I mean storage snapshots and other tools from cloud or SAN vendors that aren’t built specifically for SQL Server.

I’ve seen companies spend $1M+ on storage solutions only to get USB 2.0 speeds when disaster strikes.

These enterprise-grade storage arrays come with impressive marketing materials about snapshot technology and instant recovery capabilities. The sales pitch is compelling, too. 

But here’s the problem most CTOs don’t realize: these tools are built for 95% of use cases – operating system files, application data, documents. SQL Server is the other 5%, where fancy solutions fall apart spectacularly.

We had one client running a 22TB database who relied entirely on SAN snapshots. 

Their MSP configured it, tested it, and assured them it was bulletproof. When ransomware hit, they discovered their snapshots had been capturing seven different database files at slightly different moments, microseconds apart.

SQL Server took one look at this “backup” and said: “Screw you. That’s not a valid backup.”

The database files didn’t match. The transaction log was inconsistent. Months of “perfect” backups were completely useless.

The cheaper, faster backup option often becomes the most expensive mistake you’ll ever make. 

Companies love these snapshot solutions because they’re fast, easy to implement, and don’t impact system performance during backup windows. 

But speed and convenience mean nothing when you can’t restore your data.

Case Study: When Ransomware Meets Poor Backup Strategy

Let me tell you about another client who thought they had immutable backups that were protected from ransomware.

Unfortunately, their environment got hit by attackers who encrypted everything: all servers, all computers, everything.

The outcome? 

Thinking you have immutable backups doesn’t mean you actually do. Their backups got encrypted too, because a misconfiguration had left them vulnerable.

Here’s how modern ransomware works: it’s not some script kiddie randomly attacking systems. 

These are sophisticated operations that spread like cancer through your network. The malware gets in and silently spreads everywhere.

Once it’s sure it has infected everything – including your backup systems – then it locks everything up.

You log in one morning and get that note: “Pay us X Bitcoin, and if you don’t pay us in 24 hours, you’re gonna pay us double. In another 24 hours, the price doubles again.” 

Often, these attacks run on automatic because there might not even be a human behind them managing the process.

We’ve heard other stories, too. A company that manages 3,000 restaurants got hit. They paid $50,000 and thought they got off easy.

Later, they realized they would have paid a million had they known what was actually locked up.

This is why the bank in our opening story couldn’t recover for three months. 

Their “enterprise backup solution” had been quietly compromised. Every backup they thought was safe had been encrypted along with their production systems.

The most insidious part? 

They didn’t know their backups were worthless until they tried to restore them. By then, it was too late to negotiate. The attackers had moved on, and the bank was left with nothing but encrypted files and angry customers.


Why SQL Server is Different: The Technical Reality

SQL Server is different. 

It’s always live, with data split between memory and disk, plus transaction logs. Additionally, there could be multiple data files that might live on different drives.

Think of SQL Server like a busy restaurant kitchen during dinner rush. The chef (SQL Server) is simultaneously:

  • Taking orders (new transactions)
  • Cooking food (processing queries)
  • Keeping track of what’s been served (transaction log)
  • Managing ingredients across multiple stations (data files on different drives)

Now imagine trying to take a “snapshot” of this kitchen. 

You’d need to capture the exact state of every station, every order, every ingredient at the exact same microsecond. 

Miss the timing by even a fraction, and your snapshot shows half-cooked meals and confused orders.

This is exactly what happens with SAN snapshots and SQL Server.

When a backup runs, every database component has to be captured as of the same instant, or you risk the pieces being inconsistent with each other.
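To make the timing problem concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: the database applies one change per "tick", each file copy reflects every change up to the tick at which it was captured, and the file names and tick values are made up for illustration. Real SQL Server internals are far more involved, but the failure looks the same.

```python
from dataclasses import dataclass


@dataclass
class FileCopy:
    """A copy of one database file, tagged with the last change it contains."""
    name: str
    last_change: int


def snapshot_files(file_names, capture_ticks):
    """Copy each file at its own tick (toy model: one change per tick).

    capture_ticks[i] is the moment file i is copied, so each copy reflects
    all changes made up to that tick.
    """
    return [FileCopy(name, tick) for name, tick in zip(file_names, capture_ticks)]


def is_consistent(copies):
    """The set is usable only if every copy reflects the same last change."""
    return len({c.last_change for c in copies}) == 1


# Hypothetical file names, for illustration only.
files = ["data1.mdf", "data2.ndf", "log.ldf"]

# Storage-level snapshot that reaches the files at slightly different moments.
staggered = snapshot_files(files, capture_ticks=[100, 101, 103])

# Coordinated copy: every file is captured as of the same point.
coordinated = snapshot_files(files, capture_ticks=[100, 100, 100])

print(is_consistent(staggered))    # False
print(is_consistent(coordinated))  # True
```

The staggered set is the "seven files at slightly different moments" situation from the 22TB client story: every individual copy is fine, but together they don't describe any state the database was ever in.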

Most snapshot tools are designed for static files: think Word documents or application binaries. 

These files don’t change while you’re backing them up. But SQL Server files are constantly changing. Half the database might be in memory, transactions are in flight, and the transaction log is recording every modification.

A traditional SQL Server backup uses the database engine’s built-in mechanisms to create a transactionally consistent, point-in-time copy. 

It knows how to coordinate all these moving pieces. 

To make things worse, these files can be thousands of times bigger than your Word doc. A corrupted Word document is annoying.

A corrupted 5TB database is much worse.

Proper monitoring could have prevented the issues. 

The Performance vs. Security Trade-off: Why Companies Make Fatal Mistakes

Many companies abandon old and time-tested backup methods after experiencing minor performance blips. 

I’ve seen this pattern repeatedly: we implement proper SQL Server backups with the full recovery model and transaction log backups.

Everything works perfectly for a month or two.

Then someone in IT notices the backup process is using system resources during business hours. Or the CFO sees the monthly cloud storage bill and questions why backup costs increased.

The conversation usually goes like this: “Our old snapshot backup took 30 seconds. This new process takes 2 hours and costs $300 more per month. Can’t we go back to the fast method?”

One client actually asked us to roll back our backup implementation because it was “too slow and too expensive.” 

They wanted to save a couple of hundred dollars monthly. When we explained they were risking their entire business to save what amounts to a rounding error in their IT budget, they insisted anyway.

Six months later, they got hit with ransomware. The “fast” snapshots they reverted to were useless. The data was gone.

This is the irony of backup strategies: the pain of proper backups is immediate and visible (slower performance, higher costs), while the pain of backup failure is delayed and catastrophic (complete data loss, business shutdown).


Building a Bulletproof SQL Server Backup Strategy: Immutable Backups and Testing

SQL Server requires specialized backup approaches that 95% of general backup solutions miss. 

Here’s how we build systems that actually work when disaster strikes:

The Three-Location Rule

In IT, if you have two backups, you have one. If you have three, you have maybe two. This isn’t pessimism, it’s reality based on decades of failures.

We implement what’s called immutable backups – one-way writes that can’t be modified even if attackers gain admin access to your systems. 

In AWS, we write to S3 buckets with write-once policies. In Azure, we use immutable blob storage. The key is that once the backup is written, nothing can change or encrypt it.

But here’s the complexity most people miss: depending on your SQL Server version and setup, you often can’t write backups straight into immutable cloud storage like S3 with Object Lock or Azure immutable blobs.

You have to write the backup somewhere locally first, then copy it to the secure location. If ransomware hits during this process, it can encrypt the local backup before it reaches safety.

The solution is a multi-stage process:

  1. Create a SQL Server backup to local storage
  2. Immediately copy to network storage (separate from production network)
  3. Push to immutable cloud storage
  4. Verify each backup at every stage
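The copy-and-verify part of the steps above can be sketched in a few lines of Python. Treat this as a minimal illustration, not our production tooling: the folder names are hypothetical, temporary directories stand in for the network share and the upload staging area, and the actual push to immutable cloud storage is left out. A matching hash only proves the file arrived unchanged; it says nothing about whether the backup restores.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backup files don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, stage_dirs: list[Path]) -> list[Path]:
    """Copy a backup file through each stage, checking the hash after every hop."""
    expected = sha256_of(source)
    copies = []
    current = source
    for stage in stage_dirs:
        stage.mkdir(parents=True, exist_ok=True)
        target = stage / source.name
        shutil.copy2(current, target)
        if sha256_of(target) != expected:
            raise IOError(f"Hash mismatch after copying to {target}")
        copies.append(target)
        current = target
    return copies


# Demo: temporary folders stand in for the real locations (names are assumptions).
with tempfile.TemporaryDirectory() as root:
    root_path = Path(root)
    local_backup = root_path / "local" / "sales_full.bak"
    local_backup.parent.mkdir()
    local_backup.write_bytes(b"pretend this is a SQL Server backup file")

    stages = [root_path / "network_share", root_path / "immutable_upload_staging"]
    verified = copy_and_verify(local_backup, stages)
    stage_names = [p.parent.name for p in verified]
    print(stage_names)
```

Each hop copies from the previous stage rather than from the original, so a file that gets damaged (or encrypted) anywhere along the way fails the hash check at the next stage instead of quietly landing in long-term storage.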

Testing: The Step Everyone Skips

On top of that, you don’t just want backups, you want those backups tested regularly.

It might sound like overkill, but you’ll look like a genius when disaster strikes and you can restore to last Tuesday morning at 9:37 am and recover your data perfectly.

We automate backup testing using isolated environments where we regularly restore backups and verify data integrity. 

Every month, we pick random backup files and perform complete restore tests. If a backup can’t be restored in our test environment, it gets flagged immediately.

The testing process includes:

  • Verify backup file integrity
  • Restore to the isolated test server
  • Run database consistency checks
  • Validate critical business data
  • Document restore time and any issues
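The first three checks in that list map onto standard T-SQL commands. The Python sketch below only builds the statements as text; it doesn't connect to anything. The database name, paths, and logical file names are placeholders I made up, and `WITH CHECKSUM` on the verify step assumes the backup was originally taken with checksums. Validating business data and recording restore times are specific to each environment, so they are not shown.

```python
def quote_literal(value: str) -> str:
    """Return a T-SQL Unicode string literal with single quotes escaped."""
    return "N'" + value.replace("'", "''") + "'"


def quote_name(name: str) -> str:
    """Return a bracket-quoted T-SQL identifier."""
    return "[" + name.replace("]", "]]") + "]"


def restore_test_script(backup_file, test_db, file_moves):
    """Build the T-SQL statements for one restore test.

    file_moves maps each logical file name inside the backup to the physical
    path it should be restored to on the test server.
    """
    moves = ",\n    ".join(
        f"MOVE {quote_literal(logical)} TO {quote_literal(physical)}"
        for logical, physical in file_moves.items()
    )
    return [
        # 1. Verify backup file integrity (reads the file, restores nothing).
        f"RESTORE VERIFYONLY FROM DISK = {quote_literal(backup_file)} WITH CHECKSUM;",
        # 2. Restore to the isolated test server under a separate name.
        f"RESTORE DATABASE {quote_name(test_db)}\n"
        f"FROM DISK = {quote_literal(backup_file)}\n"
        f"WITH {moves},\n    REPLACE, RECOVERY, STATS = 10;",
        # 3. Run database consistency checks on the restored copy.
        f"DBCC CHECKDB ({quote_name(test_db)}) WITH NO_INFOMSGS, ALL_ERRORMSGS;",
    ]


# Placeholder names and paths, for illustration only.
script = restore_test_script(
    backup_file=r"D:\RestoreTest\Sales_full.bak",
    test_db="Sales_RestoreTest",
    file_moves={
        "Sales": r"D:\RestoreTest\Sales_RestoreTest.mdf",
        "Sales_log": r"D:\RestoreTest\Sales_RestoreTest_log.ldf",
    },
)
print("\n\n".join(script))
```

Restoring under a separate database name on a separate server keeps the test from ever touching production, and running `DBCC CHECKDB` on the restored copy catches corruption that a successful restore alone would not reveal.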

The Full Recovery Model Advantage

For critical systems, you want the full recovery model, not the simple recovery model.

This means SQL Server logs every transaction, allowing point-in-time recovery. 

Yes, it uses more storage and requires transaction log backups every 15-30 minutes. But when disaster strikes, you can recover to a point just before the attack started, as long as your log backups reach that far.

I’ve seen companies recover from ransomware with only 3 minutes of data loss because they had proper transaction log backups running every 15 minutes. 

Compare that to the 3-month outage from our opening story.
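Where does a number like "3 minutes of data loss" come from when log backups run every 15 minutes? Here is a small worked example in Python. It is a simplified sketch: it ignores differential backups, assumes the production server is encrypted so no final (tail) log backup can be taken, and uses made-up dates and times.

```python
from datetime import datetime, timedelta


def restore_plan(full_backup_time, log_backup_times, target_time):
    """Pick the log backups needed to restore as close as possible to target_time.

    Restore the full backup, then each later log backup in order. If a log
    backup taken at or after the target exists, stop inside it (STOPAT) and
    land exactly on the target. Otherwise the last log backup is the limit.
    """
    if target_time < full_backup_time:
        raise ValueError("Target is earlier than the full backup")
    needed = []
    for taken_at in sorted(t for t in log_backup_times if t > full_backup_time):
        needed.append(taken_at)
        if taken_at >= target_time:
            return needed, target_time  # the target is covered by this log backup
    # No log backup covers the target: recover to the newest one we have.
    recoverable_to = needed[-1] if needed else full_backup_time
    return needed, recoverable_to


# Made-up timeline: full backup at midnight, log backups every 15 minutes.
full = datetime(2024, 1, 2, 0, 0)
logs = [full + timedelta(minutes=15 * i) for i in range(1, 39)]  # 00:15 .. 09:30
attack = datetime(2024, 1, 2, 9, 33)

needed, recoverable_to = restore_plan(full, logs, attack)
print(len(needed), recoverable_to.time(), attack - recoverable_to)
```

In this timeline the attack lands three minutes after the 09:30 log backup, so three minutes of work is lost. The worst case with a 15-minute schedule is just under 15 minutes, which is why the log backup interval is really a business decision about how much data you can afford to lose.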

Red Flags: Signs Your Backup Strategy Will Fail

After seeing hundreds of backup failures, certain warning signs always appear before disaster strikes:

Your MSP Pushes “Easy” Solutions: MSPs love snapshot tools because they’re additional revenue streams. They use enterprise backup software like Veeam across all clients. It works for 95% of use cases, but SQL Server is that 5% exception.

No Regular Restore Testing: If you can’t verify the last successful restore of your database backups, you don’t have a recovery strategy – you have data files that may or may not work. 

Backup Chain Confusion: Complex backup strategies with full, differential, and transaction log backups sound sophisticated but often fail because one missing piece breaks the entire chain.

Performance Complaints: If your team regularly complains about backup performance impact, someone will eventually disable or downgrade the backup process to “fix” the problem.

Cost Optimization Focus: When backup storage costs become a regular budget discussion, the temptation to cut corners grows dangerous.

Single Point of Failure: All backups stored in one location, managed by one system, or dependent on one network connection will fail when you need them most.
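The "Backup Chain Confusion" red flag above is one you can check mechanically. The sketch below assumes you have already pulled each log backup's first and last log sequence number (LSN) into plain Python dictionaries; as far as I know, SQL Server records these in the `first_lsn` and `last_lsn` columns of `msdb.dbo.backupset`. The file names and LSN values here are invented and far shorter than real ones.

```python
def find_chain_breaks(log_backups):
    """Return (previous, current) name pairs where the log chain is broken.

    Each backup is a dict with 'name', 'first_lsn' and 'last_lsn'. In an
    unbroken chain, every log backup starts at the LSN where the previous
    one ended.
    """
    ordered = sorted(log_backups, key=lambda b: b["first_lsn"])
    breaks = []
    for previous, current in zip(ordered, ordered[1:]):
        if current["first_lsn"] != previous["last_lsn"]:
            breaks.append((previous["name"], current["name"]))
    return breaks


# Invented example: log_03.trn (LSN 300 to 400) went missing.
backups = [
    {"name": "log_01.trn", "first_lsn": 100, "last_lsn": 200},
    {"name": "log_02.trn", "first_lsn": 200, "last_lsn": 300},
    {"name": "log_04.trn", "first_lsn": 400, "last_lsn": 500},
]

print(find_chain_breaks(backups))  # [('log_02.trn', 'log_04.trn')]
```

One missing file in the middle means nothing after it can be applied, so a check like this belongs in regular monitoring rather than in the middle of an emergency restore.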

Frequently Asked Questions

How often should SQL Server backups be tested?

Monthly restore testing is the minimum for critical systems. Based on our experience managing hundreds of SQL Servers, monthly testing catches backup issues before they become disasters. We’ve seen too many companies discover backup failures only when they need to restore from ransomware attacks.

What’s the difference between SQL Server backups and regular file backups?

SQL Server databases operate continuously, with data actively changing in memory and being written to disk along with ongoing transaction log updates. Regular backup tools don’t understand this complexity and often create inconsistent backups. We’ve seen SAN snapshots that appeared successful but were completely useless because they captured database files at slightly different times.

Can cloud snapshots replace traditional SQL Server backups?

Cloud snapshots are useful for disaster recovery but shouldn’t be your only backup method. After 15+ years optimizing SQL Server environments, we’ve seen cloud snapshots fail during ransomware attacks because they don’t provide the granular point-in-time recovery that proper SQL backups offer.

How much does proper SQL Server backup cost compared to snapshot solutions?

Proper SQL backups typically cost 20-30% more than snapshot solutions, but backup failure can cost millions. We’ve worked with companies that saved a few hundred dollars monthly on “cheap” backups only to lose their entire business when ransomware struck.

What is immutable backup storage and why is it important?

Immutable storage uses write-once technology that prevents any modifications, even by administrators. This is critical for ransomware protection because modern attacks specifically target backup systems. We implement immutable backups for all critical clients after seeing multiple ransomware incidents where traditional backups were encrypted along with production data.

How long should SQL Server backup retention be?

For compliance and recovery flexibility, we recommend at least 90 days of backup retention with longer retention for monthly and yearly archives. Based on real ransomware cases we’ve handled, attackers often remain dormant in systems for weeks before activating, so longer retention helps ensure clean restore points.
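As a rough illustration of that retention policy, here is a minimal Python sketch. The rule it implements is my own simplification: keep every backup from the last 90 days, and for anything older keep only the first backup of each month. Yearly archives are left out, and the dates are made up.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, keep_all_days=90):
    """Keep every backup from the last keep_all_days, plus the first backup
    of each month for anything older (a simple monthly archive)."""
    cutoff = today - timedelta(days=keep_all_days)
    keep = set()
    seen_months = set()
    for d in sorted(backup_dates):
        if d >= cutoff:
            keep.add(d)  # inside the 90-day window: keep everything
        elif (d.year, d.month) not in seen_months:
            keep.add(d)  # older: keep only the first backup of the month
        seen_months.add((d.year, d.month))
    return sorted(keep)


# Made-up example: one backup per day from 1 January to 30 June 2024.
today = date(2024, 6, 30)
daily = [date(2024, 1, 1) + timedelta(days=i) for i in range(182)]

kept = backups_to_keep(daily, today)
print(len(kept), kept[:4])
```

In this example the window starts on 1 April, so the three older survivors are the 1 January, 1 February, and 1 March backups. Those are the restore points you fall back on if the attackers turn out to have been inside for months.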

What’s the most common SQL Server backup mistake?

Trusting backup success without regular restore testing. In our experience auditing hundreds of SQL environments, over 60% of companies have never successfully restored their backups. Having untested backups is worse than having no backups because it creates false confidence.

Can MSPs handle SQL Server backups properly?

Many MSPs use general backup tools that work for most applications but fail for SQL Server. We’ve seen MSPs implement Veeam or similar solutions that appear to work but create inconsistent SQL backups. Always verify your MSP understands SQL Server-specific backup requirements.

How do you protect backups from insider threats?

Implement backup segregation where backup systems use different credentials and network access than production systems. We’ve seen cases where disgruntled employees or compromised admin accounts deleted backups along with production data. Proper segregation prevents single points of failure.

What should you do if you discover backup failures?

Immediately stop making changes to production systems and assess the scope of backup failure. Based on our emergency response experience, the first 24 hours are critical for data recovery. Document everything and consider engaging SQL Server specialists who have experience with backup recovery scenarios.

Conclusion: Your Data’s Life Depends on Getting This Right

The bank customers who couldn’t access their money for three months learned a harsh lesson.

Snapshots are great for large volumes of small files. But they often fail in the SQL Server world. All too often, we have seen the “fancier” options fail when disaster strikes.

The companies that survive ransomware attacks all have one thing in common: they invested in proper SQL Server backup strategies before they needed them. 

They understood that SQL Server is different. 

They tested their backups regularly.
They implemented immutable storage.
They treated backup strategy as business insurance, not an IT task.

Your choice is simple: invest in proper SQL Server backups now, or explain to your customers why their data is gone forever. 

Because when crypto hits your network at 2 AM on a Saturday, those fancy snapshot tools won’t save you.


