SQL Server: diagnosing and fixing performance issues
Many SMEs have applications that run on SQL Server. The more recent versions, SQL Server 2005 and 2008, are largely self-tuning, so a server may well run for months unattended. However, performance issues can still rear their ugly heads.
New reports run like a slow loris, or queries that used to take one second now give you plenty of time to brew coffee. With these kinds of problems it can be difficult to know where to start, so here’s how to diagnose, categorise and fix them.
Where to start
When a customer reports a problem with a database application the pressure is immediately on because the customer will want the problem sorted yesterday. The good news is that databases are logical applications so if you approach the problem systematically, it shouldn’t be difficult to isolate the problem and fix it.
Database performance problems typically fall into six main types:
Disk – databases tend to work disks harder than other applications and so the disks often become bottlenecks
Poor Database Design – for example, too few indices
Poor Database Administration – such as putting the data and indices on the same disk
Application – the application may be sending bad SQL to the database
Network – a slow network can kill a fast database
Memory – databases are memory hogs: does the server have enough?
Of course, these can interact; the network may be overloading because the application is sending bad SQL or the disks can become bottlenecks because of bad administration. However, assigning the problem to one of these types is the first step in diagnosing database performance issues.
Problem solving must be carried out systematically but that doesn’t mean all of the tests have to be complex. My first step is usually to go and stand next to the server; I may even put a hand upon it. Despite appearances this isn’t the ‘laying on of hands’; I’m not trying to ‘empathise with the server and understand its pain’. I’m simply listening to it.
There are usually two possibilities. Either it is humming quietly to itself and the disks are making the occasional ‘chirp’, or the gentle humming of the fans is obscured by the frantic thrashing of the disk heads – you can even feel the movement in the server’s case. If it’s the latter then we already have a good indication that the disks are the bottleneck.
Next, even if the heads are thrashing, I’d still take a look at the CPU workloads: perhaps you have more than one problem. This can be done very simply using the Task Manager which can also give you an idea of the network loading.
So, within five minutes you’ve already gained a feeling for the disk activity and the CPU and network loadings. It’s time to find out a little more. There are several useful tools here: Microsoft Management Console (MMC), SQL Server Management Studio, SQL Server Profiler, Query Analyser and the Database Engine Tuning Advisor. If you aren’t familiar with these, practise on a non-mission-critical database before trying this for real at a client site. These steps cover SQL Server 2005 but most of them are also relevant to SQL Server 2008.
Select Control Panel > Administrative Tools > Performance to fire up a copy of Microsoft Management Console (MMC) with the System Monitor snap-in already added. Clicking the + button allows you to select from a host of objects: many are general system objects (such as Processor and Print Queue) but many are also highly relevant to database work. For example, there are 36 SQL Server-related objects (SQLServer:Backup Device, SQLServer:Locks) as well as others that can be invaluable (PhysicalDisk, Network Interface).
Each of these objects has multiple ‘counters’ that can be monitored. For instance, if you select ‘SQL Server Databases’ you’ll find it has 22 counters as diverse as Bulk Copy Rows/sec and Transactions/sec. Drill down and examine in detail how and when these parameters change when the server is stressed (see planning for problems).
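If you prefer to sample these counters from a query window rather than the System Monitor GUI, SQL Server 2005 exposes many of the same values through the sys.dm_os_performance_counters view. A minimal sketch (the object names shown assume a default instance; a named instance prefixes them as MSSQL$InstanceName instead of SQLServer):

```sql
-- Sample a few database counters directly from the engine.
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name = 'SQLServer:Databases'
  AND counter_name IN ('Transactions/sec', 'Log Flushes/sec')
  AND instance_name = 'AdventureWorksDW';
```

Note that the per-second counters in this view are cumulative totals since the server started: to get a true rate, sample twice and divide the difference by the elapsed seconds.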
Now, suppose your investigations suggest that the problem doesn’t lie with the disks or the network but with some of the queries. That could be because the database is designed poorly or because the queries themselves are badly written. Either way you need more information about how they are running. Start SQL Server Management Studio and select Tools > SQL Server Profiler. We can use Profiler to capture the queries that are running against the database as a Trace. Once that’s done we can examine the queries in detail. To see this in action we’ll use AdventureWorksDW, one of the sample databases that comes with SQL Server 2005 and 2008.
Within Profiler select File > New Trace, choose the correct database server, enter a name for the trace and choose the standard (default) trace. Traces can either be saved to a file or to a table. It is often very useful to save the trace to a table for later analysis but for now we’ll simply write it to a file so select ‘Save to file:’ and supply a location. In the Events Selection tab deselect everything except SQL:BatchCompleted and click the Run button.
The trace is now running and will collect all queries, so now we need to run some against the AdventureWorksDW database. If you were doing this for real, you’d use the trace to capture real queries against your client’s databases; here we can simply open some of the views supplied with AdventureWorksDW. Flip back to Management Studio and drill into AdventureWorksDW until you can see the views. Right-click on dbo.vAssocSeqOrders and select ‘Open View’. Flip back to Profiler and you’ll see that a number of queries have run: this particular view, while it appears simple, calls other views when it runs. dbo.vAssocSeqOrders itself shows as the last one in the list because it is, of course, the last to complete. The trace instantly shows which queries are taking a long time to run (Duration) and using up resources in terms of CPU, Reads and Writes.
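If you saved the trace to a file, you don’t have to eyeball it in Profiler: the fn_trace_gettable function loads a trace file into a result set you can sort and filter like any other table. A sketch (the file path is hypothetical; point it at wherever you saved your trace):

```sql
-- Load a saved Profiler trace file and rank the most expensive batches.
-- EventClass 12 is SQL:BatchCompleted; Duration is in microseconds in SQL Server 2005.
SELECT TextData, Duration / 1000 AS DurationMs, CPU, Reads, Writes, StartTime
FROM fn_trace_gettable('C:\Traces\MyTrace.trc', DEFAULT)
WHERE EventClass = 12
ORDER BY Duration DESC;
```

Using SELECT ... INTO instead of a plain SELECT turns the same query into a permanent table, which is handy for the later analysis mentioned above.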
If you were doing this with a live server, you would already have identified the ‘expensive’ queries, but you can do more. We can get SQL Server to look at any of the queries, show us in detail how it is actually being executed and even suggest ways of speeding it up.
Stop the trace (File > Stop Trace) and then highlight the last row in the trace. You’ll see the SQL for the view dbo.vAssocSeqOrders:
SELECT OrderNumber, CustomerKey, Region, IncomeGroup FROM vAssocSeqOrders
Select the SQL, copy it to the clipboard, return to Management Studio and press the ‘New Query’ button in the button bar. Paste the SQL into the window that opens. Make sure that AdventureWorksDW is selected in the button bar and run the query.
As well as running it, we can ask SQL Server to tell us how it is executing the query. From the View menu, select ‘Include Actual Execution Plan’ and rerun the query. A new tab appears which shows the plan.
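Alongside the graphical plan, it is worth switching on the I/O and timing statistics, which report exactly how many reads each table in the query generated. Wrap the query from the trace like this:

```sql
-- Report logical/physical reads per table and elapsed times
-- in the Messages tab alongside the query results.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT OrderNumber, CustomerKey, Region, IncomeGroup
FROM vAssocSeqOrders;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical-read counts give you a hard number to compare before and after any tuning change, which is less subjective than eyeballing two execution plans.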
This gives us a great deal of information about where time is spent during query execution. For example, one of the table scans (of FactInternetSales) is taking up 33% of the entire query time (hovering the mouse cursor over the appropriate part of the plan pops open a box with much more information). This level of detail is often very useful but may also be a case of ‘too much information!’ so ask SQL Server for suggestions as well. Select ‘Analyse Query in Database Engine Tuning Advisor’ from the Query menu. This will open another window (which often takes a while to open fully). Select the AdventureWorksDW database and then press Start Analysis.
This runs the query and looks at possible changes in the design of the database that would make the query run faster. In this case it has identified that the query would run 50% faster if we added eight indices.
This advice may well be sage but we shouldn’t blindly follow the recommendations of the advisor (see Yin and Yang below). In practice you might well prefer to run Profiler on a client machine (and save the trace to a file on that client) rather than on the server itself, in order to avoid adding to the workload of an already overstressed server.
Common problems and solutions
There are solutions for the common performance issues that you will see in databases in SMEs. Disks are often bottlenecks: the average disk is built for average use, but databases are not average applications – they work disks very hard. The usual problem is that data cannot be read from and written to the disk fast enough, and the obvious solution is often overlooked. Suppose you have 200GB of data on one disk: split it between two disks and you get twice the I/O capability for the same data volume.
Then there are the log files. Databases typically log the transactions that are run against them in a log file. Suppose that a user adds ten new orders to the database. The first new order will alter a table in the database and then an entry will be made in the log file. The next transaction will again change the table, another log entry will be made and so on. Now if the table and the log are on the same disk then all of those 20 writes will require the head of the disk to move between the table and the log. This may not sound like much, but disk head movements are painfully slow in database terms. The solution is to put the table and the logs on separate disks. Right-click on the database in Management Studio, select Properties, then Files, and SQL Server will tell you where the logs and data are located. This is also the place where you can create new filegroups and start changing the defaults.
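The move itself can be scripted. A sketch for AdventureWorksDW, assuming a second physical disk mounted as E: (the paths are illustrative, and the logical file name – shown under Properties > Files – may differ on your install):

```sql
-- Take the database offline so the log file can be moved at the OS level.
ALTER DATABASE AdventureWorksDW SET OFFLINE;

-- (Copy the .ldf file to E:\SQLLogs\ in Explorer or with xcopy, then:)
ALTER DATABASE AdventureWorksDW
MODIFY FILE (NAME = AdventureWorksDW_Log,
             FILENAME = 'E:\SQLLogs\AdventureWorksDW_Log.ldf');

ALTER DATABASE AdventureWorksDW SET ONLINE;
```

MODIFY FILE only updates SQL Server’s record of where the file lives; the physical copy is your job, and the database won’t come back online until the file is really there.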
Exactly the same logic applies to indices and data: if disk access is limiting, think seriously about separating them onto different disks as well.
Indices are often the most cost-effective way to speed up a database. All database engines will index primary keys by default but, oddly, not foreign keys, many of which need to be indexed for optimum query performance. However, one of the best solutions is often simply to throw hardware at the problem. As outlined in Yin and Yang, virtually every change to the database itself will have good and bad effects. Better (or just more) hardware will often speed the system up with no ill effects except cost, and hardware prices have plummeted in the past few years.
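Returning to those unindexed foreign keys for a moment: adding one is a one-line job. A sketch against AdventureWorksDW – ProductKey really is one of the fact table’s foreign keys, though the index name here is our own invention:

```sql
-- Index a foreign key column so joins to DimProduct can seek
-- rather than scan the whole fact table.
CREATE NONCLUSTERED INDEX IX_FactInternetSales_ProductKey
ON dbo.FactInternetSales (ProductKey);
```

Rerun the query with the execution plan switched on afterwards and you should see the table scan replaced by an index operation – and remember the Yin and Yang caveat: the same index slows every insert into the fact table slightly.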
Suppose you find a badly organised database. You can spend days working out the best strategy for tuning it. Every day the customer is paying for your time. It may be more cost effective simply to add more hardware. Database purists tend to hate this solution (“the database design is inelegant, it offends the purity of the relational model, it must be fixed!”) but we are, after all, trying to provide value for our customers.
Your initial analysis may have revealed that the CPUs are flatlining at 100% or that all of the memory is in use. You could tune the database to remove the underlying problems but the most cost-effective solution may simply be to move to a box with more CPUs or to slot in more memory.
Many of the performance issues that we see in databases appear complex, which is why you need to approach them systematically. When you do that the apparent complexity often falls away leaving the problem clear and the solution obvious. And remember: common sense is the most important tool we have – don’t leave home without it.
Planning for problems
As you start to look at the values you get when the server is stressed, the obvious question is “How does this compare with normal performance?” For example, suppose you find a value of 20 transactions per second. Is that good or is it bad? Is it an indication of a problem?
I can’t tell you because normal behaviour varies so much between servers and database applications.
What we need to know at this point is the normal number of transactions per second for this server when there isn’t a problem; that at least gives us a datum point from which to work. So, if one of your SME customers acquires a new server it makes sense to benchmark it before there is a problem. In other words, on a day when the server is under normal load and is behaving well, go and measure all of the parameters that you feel might be useful. That way, when there is a problem, you have a benchmark for comparison.
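Those benchmark figures don’t have to live in a notebook. A sketch of a baseline table fed from the counters view mentioned earlier (the table name and counter list are just examples – record whichever counters matter to you):

```sql
-- A simple store for 'known good' counter readings.
CREATE TABLE CounterBaseline (
    sampled_at   DATETIME DEFAULT GETDATE(),
    object_name  NCHAR(128),
    counter_name NCHAR(128),
    cntr_value   BIGINT
);

-- Run this on a quiet, well-behaved day to capture the baseline.
INSERT INTO CounterBaseline (object_name, counter_name, cntr_value)
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Transactions/sec', 'Batch Requests/sec',
                       'Page life expectancy');
```

When trouble strikes, run the same INSERT again and compare the two samples – remembering that the per-second counters are cumulative, so it’s the difference over time, not the raw value, that matters.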
Yin and Yang
Every change that you make to the structure of a database is likely to have several effects. In the example given in the article, we can speed up a query by adding indices; the problem is that, while indices speed up queries, they also slow down the entry of data, because every insert or update must maintain the indices too. The relationship is usually very asymmetrical (the speed-up on queries is much greater than the slowdown on writes), so adding the indices may well be the best idea, but you do have to bear in mind that most changes will have multiple effects, both desirable and undesirable. This is partly what makes good database design as much an art as a science.
MSSQLCity.Com is “dedicated to providing useful information for IT professionals using SQL Server”. What could be better? Check the Tips section for optimization details:
An excellent white paper from Microsoft on Troubleshooting Performance Problems in SQL Server 2005: