Service King reduces SQL Server outages from 90 per quarter down to zero

Challenges

Unstable SQL Server
Slowness
Lack of experience

Solution

Benefits

Stability
SQL Speed
Improved SLAs
Faster processing
Increased SQL Server capacity
Savings in hardware upgrade

"Red9 are the experts in large scale MS SQL Database environments."

They have mastered the understanding of complex, heavy load TSQL transactions. If you would like to sleep better at night and know your database will be running well in the morning, then I would recommend finding a DBA company with the kind of skill set Red9 has proven time and time again.

Tim JanssenApplication Manager, Service King Collision

Background

Service King Collision is one of the largest and most respected automotive repair shops in the US. They have been in business for over 40 years starting with a single location, and have now grown to 345 locations across 24 states.

When Service King approached Red9, they had opened 68 locations and were looking to expand further.

A couple of major technical issues were holding them back – daily SQL server outages and severe performance issues. Their SQL Server would fail on a nearly daily basis causing 30-45 minute outages (sometimes multiple times per day). During these down periods, agents across all of their locations would be locked out from checking in new customers or executing any of their other administrative tasks.

The lost productivity was not only hurting their bottom line, it was inconveniencing their customers and preventing them from expanding further. On top of that, performance issues were causing every action taken in their systems to move at a snail’s pace, further adding to employee and customer frustration.

Service King’s leadership team was worried that these outages and performance issues would hold them back from continuing to acquire new locations and expand their business if left unresolved. At the time, they believed their SQL Server had hit capacity and were considering making a $1m+ hardware purchase.

Red9 was able to identify and resolve the core issues causing the downtime and slowness without an additional hardware purchase and ultimately saved Service King Collision hundreds of thousands of dollars in the process.

Problem

There were many.

Unscheduled Windows Cluster failovers in the middle of business hours. Cluster failovers were taking 30-45 minutes. Only 25% of Windows Cluster nodes were functional. Out of four node Windows Cluster, only one was capable of running production workload. SAN storage array was misconfigured and working at 40MB/s (or USB2 speeds). SQL Backups could not be taken because it was taking longer than 24h to run a full backup. SQL Filestreaming feature was being used but outgrown manageability. There was SQL corruption in multiple databases. SQL Server was not configured properly, leaving significant gaps in maintenance, monitoring and alerting, and feature usage. Massive security problems, which accidentally gave admin-level access to everyone in the company. VMWare was not configured properly causing further stability and performance issues. Severe SQL Server performance problems

At Red9 we always start with our proprietary 145 point SQL Server Health Check which we used to uncover a variety of technical issues.

Solution

Unlike clients initial instinct to buy more hardware, we fixed many problems: Windows Failover Cluster misconfiguration, networking configuration changes, SAN storage speed problems, backups and restores, SQL data corruption in one of the key databases (which Microsoft Support could not even fix), third-party software corrupting databases, tuned SQL Server with best practices, fixed misconfiguration, started utilizing all SQL Server features available, fixed numerous SQL performance problems, configured VMWare host and VMWare guess for best practices, addressed lack of monitoring, alerting, and proactive care.

Results

Unscheduled Windows Cluster Failovers from about 90 per quarter down to zero.
Full backups were taking 30h+. We were able to cut backup duration by 2x and set up a proper backup schedule for fulls, differentials, and transactional backups.
SAN storage speeds were increased 11-15x. Storage was further improved. We created multiple data file groups. We then handpicked the most used tables and indexes and placed those on the new filegroups, and we placed those files groups on three SAN speed layers: fast, medium, and slowest. The most frequently accessed data was layered on the fastest layer of the SAN (SSD array). Data accessed with medium frequency was placed on mid-layer (10,000 RPM drives). Data rarely accessed, was put on the slowest SAN layer (7,500 RPM drives). We then created an automated process to migrate data that was no longer needed out of production databases to keep db files as small and as fast as possible.
We fixed data corruption – one corrupted db page at a time. Microsoft Support’s recommended solution would have caused too much downtime and was not practical. Luckily corruption was all in filestream data.
We reconfigured third-party antivirus software which was the culprit of corruption.
Multiple SQL Server settings changed, making SQL more stable, and faster.
Massively improved RPO (recovery point objective) and RTO (recovery time objective) times
Fixed massive security issue, where one of them was causing 1,000+ employees to have sysadmin rights on the server.
Implemented multiple new SQL Server features, such as Compression, partial availability, AlwaysOn, filestream optimizations, filegroup backups, resource governor, column store indexing, partitioning
Old data was migrated out of production database into data warehouse.
Deployed proper monitoring and alerting.
We used SQL performance tuning and improved the speed of about 200 most frequently run and most resources intensive TSQL queries. SQL performance improvements ranged from 10x-5,000x.

Testimonials

"Red9 has incredible expertise both in SQL migration and performance tuning."

The deep level of knowledge of MSSQL is incredible and the combined experience and expertise has been a huge asset. Additionally, the flexibility and professionalism throughout a difficult migration was highly appreciated.

Richard StaatsCloud Engineer

"The customer realized a 60% decrease in CPU utilization as a direct result of Red9's work."

With little notice Red9 was able to find and fix a number of different SQL misconfigurations and wasteful practices as our customer moved their SQL environment from a rack solution with Fiber SAN to the Azure cloud.

Jason MilgramAzure Practice Director, Champions Solutions Group, AllAware.com

"Rare… Penetrating Knowledge of SQL."

Red9’s cool confidence and eagerness to deliver results are surpassed only by their deep penetrating knowledge of SQL. Whether it's SQL design, development, optimization, or recovery, Red9 is a SQL Gem.

David LernerPresident, LernerSolutions.com

Tim JanssenApplication Manager, Service King Collision

"Significant Performance Improvement in minutes."

"In just a few minutes, Red9 was able to find significant performance improvement opportunities in our database. These improvements were in areas we hadn’t even considered. We plan to continue to work with Red9 on a regular basis and continue to enjoy the benefits of their knowledge."

Kess EburuFounder and Director at GoodsExpress.com

"Our site is flying now!"

"After SQL Database query tuning – Our site is flying now. I have already have hired Red9 three times. I am very happy I found them!"

Emil KadlecOwner, Fantasy Sports Publications, Inc.

"Great deal of experience with performance monitoring"

"Beyond deep technical expertise, communication is very important to us. Red9 was timely in responses, met set expectations and timelines, and kept us in the loop when an unexpected issue came up. I would absolutely recommend Red9."

Rob McDonaldCOO/CISO & Principal, E8 Versa-Tek (IT support, management, infrastructure provider)

Service King reduces SQL Server outages from 90 per quarter down to zero

Background

Problem

Solution

Results

Testimonials

Contact Us

Schedule a call to discuss your SQL Server needs.

Services

Company

Resources

Contact