Wednesday, September 25, 2013
I did a presentation on the SQL File system Tuesday for Training on the Ts
The answers to questions asked during the presentation follow:
The answers to questions asked during the presentation follow:
Questions Asked by Attendee
Q: Are there any changes in the file internals for Sql 2014? Does Sql Azure have the same
file architecture? Does it expose the same data and log files as on premise Sql Server versions?
A: I'm not aware of any changes to the filesystem in 2014. There are always performance improvement
in each release but the basic file structure that we talked about hasn't changed for several releases.
SQL Azure uses a very similar physical structure but a Azure database is a logical database and the
physical structure of the database isn't exposed to the user.
Q: Back in the day, we used to try and always allocate our DBs in allocations & growth sizes that
correlated (fit) onto 8k pages (size of physical disk page). What is the physical size these days, 64bit etc?
A: Good point. SQL Server space is allocated in 64k chunks so if the file size isn't evenly divisible by
64k there will be some unused space at the end of the file.
Q: can u please share the sqlserver capacity Planning templates or URL's
A: Unfortunately, I haven't ever run into a good SQL Server capacity planning system. I generally
come up with a solution tailored to the particular application. Here are some of the links I use for
Q: how allocation is different, when allocation unit set to 64k
A: Not sure what you're asking. Allocation is always 64k. Are you talking about NTFS file
system block sizes? If so, an NTFS block size of 64k is recommended for SQL Server
Q: I heard from another DBA that he creates 8 files automaticaly for TempDB, no matter the
server's setup. What are you comments from this?
A: Not a bad thing to do. I don't see any reason why that would cause a problem on a 4
core box. The only down side is a little extra complexity.
Q: If temp space is requested by an application, does SQL allow it to use only one of the
8 files allocated to tempdb or can it use multiple?
A: SQL Server allocates space from the available files in a round robbin fashion if the
files are the same size. If the files are different sizes SQL uses something called proportional
fill to allocate more space from the larger files so they stay approximately the same percent full.
This is why all files for a filegroup should be the same size. If not the big file is a hot spot.
Q: is anyone else having audio issues?
A: Sorry about that - maybe there was a network issue
Q: is it recommended to run disk defrag where SQL mdfs and ndfs are located?
A: Yes but it's only necessary if the disk is fragmented. If your files don't autogrow the disk
volume shouldn't get fragmented.
Q: Is it worthwhile to update an existing database minimum file size after the database is
created and in use?
A: Yes, if the data size is increasing you can prevent further fragmentation.
Q: To Defrag the Windows filegroup, don't we have to thake the DB files offline, defrag
windows file, then bring SQL file online and Defrag it?
Q: Best Practice we employ - use create table for tempDB's otherwise it locks all the
Allocation Maps until the entire insert into is completed
A: True. Good point
Q: Very good presentation. Thank you so very much.
Q: How can I defrag the SQL file system after doing a windows file system defrag.
A: The SQL file system doesn't get fragmented but SQL indexes do. Rebuilding
the indexes willl remove that fragmentation
Q: When does truncating a table reallocate space to be reused?
A: First, I want to be sure you're not talking about the Windows file size decreasing
when tables are truncated. The windows file doesn't ever shrink unless you shrink
the database which I definitely don't recommend. If you are asking when the pages
allocated to the table are freed up after a trancate table, they are freed after the next log
file backup. There' a misconception that truncate is not logged. It is actually minimally
logged - the list of pages allocated to the table is written to the log and in the next log backup,
those pages are written to the backup.
Q: with today's SANs, how can you control fragmentation?
A: SAN volumes can be defragmented with the defrage tool of your choice.
Q: Would multiple files on user databases help reduce deadlocks?
A: Unfortunately probably not. Allocation contention doesn't generally cause deadlocks. It's
worth doing anyway but I haven't ever run into a deadlock caused by allocation contention.
Monday, September 9, 2013
For those of you who don’t know me, I was one of the program managers for the SQL Server 2005 Service Broker. One of the harder parts of being the Service Broker (SSB) PM was positioning it. The first reaction I got when I did SSB talks was usually “Is this really different than MSMQ”? To demonstrate that Service Broker was really much more than yet another reliable messaging product, I worked with several of our early adopters to come up with creative uses for the reliable, transactional, asynchronous queuing aspects of SSB. I also wrote up a few uses that may not have ever been implemented. Here are the more significant SSB Rube Goldberg ideas:
Data Warehouse Load Balancing
The first ever SSB customer was an internal Microsoft reporting application. They had 4 or 5 copies of a database used for doing large ad-hoc queries. When a query came in it was routed to one of the copies for processing. The first SSB application was the obvious one – queries came in to a front-end server and were put on a queue on one of the query servers for processing. The query engine would receive a message from the queue with the query, execute the query, and return the results in a response message. This worked well and the transactional messaging ensured that queries weren’t lost but the customer wanted to be more intelligent about where queries should be routed. To do this we set up a dialog timer on each of the query servers. When the timer expired (every 10 seconds or so) an activation procedure would gather some statistics about running queries, processor, memory, and disk usage, etc. and send the info to the front end server. The front end server could then use this information to decide which server was the least loaded and route the queries accordingly.
This one started as a SOAP toolkit question. A research project at a college had a web service that people could use to search for and download x-rays. The problem they were running into was the x-ray files were very large and some of the SOAP clients couldn’t handle the large messages. We wrote a service that broke an x-ray down into a bunch of Service Broker messages that were sent to the SOAP server. The client would request a message at a time and assemble the x-ray. The client could request a message as many times as necessary. SSB would send the same message until the client acknowledged it. This was a pretty minor change to the web service that made it usable to significantly more clients.
One of the biggest pieces of negative feedback we got for SQL Express was that since it didn’t include SQL Agent there wasn’t an easy way to set up periodic database backups. The Service Broker dialog timer looked like a good way to do this so I came up with a way to backup SQL Express at any chosen interval: http://blogs.msdn.com/b/rogerwolterblog/archive/2006/04/13/575974.aspxOne of the cool things about this is that the backup routine is implemented in the database being backed up so if you move the database to another server, backups continue to work. I like to think of this as the first step in the contained database effort.
It was a pretty logical step from using SSB for scheduled database backups to using SSB for scheduling any database command. You just need an activated stored procedure that reads SSB messages and executes them. Service Broker is distributed so distributed tasks just work. I had a lot of interest when I published this article but I didn’t hear any implementation stories: http://technet.microsoft.com/en-us/magazine/2005.05.servicebroker.aspx
Service Broker was designed to support reliable, transactional, asynchronous applications. The typical Service Broker application queues a message to request an action and then goes away until the action is complete and a response is received. I always thought that pattern looked a lot like a workflow application so one day I built a Windows Workflow framework that used Service Broker as the eventing layer. This resulted in a reliable, transactional infrastructure for building workflow applications. This was pretty cool because as long as the actions the workflow invoked were transactional, the workflow was totally reliable. The workflow application could die at any point and Service Broker would recover and pick up where it left off as soon as the database started again. http://blogs.msdn.com/b/rogerwolterblog/archive/2007/05/21/windows-wf-on-sql-service-broker.aspx
Well, those are my Service Broker Rube Goldberg stories. I would be very interested in hearing about other unique SSB applications.
Friday, August 23, 2013
I ran into an interesting problem while installing MDS on a customer machine this week. I installed SQL Server 2012 and Visual Studio 2012 on a Windows 7 box. When I tried to configure MDS, I got a warning that said some Web Server software wasn’t installed. After verifying that the list of things that MDS said it required was indeed installed, I decided I wasn’t going to be intimidated by a mere warning so I configured MDS anyway. This worked fine but when I tried to bring up the client I got an error that said the URL was invalid. I dug into the IIS configuration and found (with a little help) that the World Wide Web service was disabled. I’ve seen this before – a security policy on the network disables IIS periodically to keep people from starting their own web sites on the corporate network.When I enabled and started the World Wide Web service, I god an error that said a page handler wasn’t available. This appeared to be an ASP.Net problem. When I looked more into the IIS configuration I noticed the .Net 4 Application Pools weren’t there. After a bunch of messing around with settings, I finally did a “Repair” install of Visual Studio 2012. After this, the ASP.Net application pools were there and MDS ran fine. My assumption is the problem was caused by the Web Server being disabled when the 4.5 .Net framework was installed the first time. Not an earth shattering discovery but running into corporate security policies that disable web servers is pretty common so I hope this might save you an afternoon of tearing your hair out someday (yes, I’m officially blaming my shortage of hair on ASP.Net).
Sunday, August 11, 2013
I ran into an interesting requirement on a recent MDS engagement. The customer wanted to give a large number of users read-only access to the MDS data. While they didn’t want most of the users to be able to update the Master Data, they realized that the users would frequently find issues with the Master Data. They wanted the users to be able to suggest changes and corrections to the Master Data so the data could be fixed in the source system.
We decided to use permissions to make the attributes of the key master entities read-only while creating an updatable text attribute that users could use to report issues. This worked well but we needed to inform the source system owners when an issue was reported so they could fix it quickly. Our first attempt was to just create a business rule that fired when the change-request attribute was updated and used the normal notification mechanism to send an email to the data steward when a change-request field was populated. This worked OK but the data steward was forced to click on the link to go to the MDS database to find out what the change-request text said. In this customer’s environment data stewards didn’t necessarily have access to MDS so this was inconvenient. To simplify the data steward’s life, we wrote a customer workflow class that was called when the change-request attribute was updated. The custom workflow made a web service call into MDS to retrieve the contents of the change-request attribute, composed an email with the change-request and other pertinent information and made a DBMail call to send the email to the data steward. We still have some work to do to configure distribution and make the mail more attractive but it looks like this is a viable solution.
Friday, August 2, 2013
File versioning is another thing that I assume most people understand but I still run into a lot of confusion about. This came up in one of the 24 Hours of PASS sessions I watched, The question was something like “does the format of the data in a SQL Server file change when you do an upgrade or does SQL Server understand both the new and old format”?
While there might be exceptions I’m not aware of, there is only one file format that a particular version of SQL Server understands. This means that when you upgrade a SQL Server file – upgrade in place by installing a new version, restore to a new version, attach to a newer version – the format of the data in the file is changed to match the new SQL Server version. This is enforced to avoid implementing code that behaves differently depending on the version that a page of data was created on. If the data in the file wasn’t upgraded when the database engine is upgraded then SQL Server 2012 would have to be able to handle at least 5 different page formats – SQL 7, SQL 2000, SQL 2005, SQL 2008, and SQL 2012. This would make the code very fragile and difficult to test. You would also see bugs that would depend on which version of the database a page was originally written by. You could also run into issues where performance was different in SQL 2012 for a database created in SQL 2000 and a database created in SQL 2012. So for all these reasons, not changing the format of the data files when SQL Server is upgrade would be a very bad thing.
This leads to several behaviors we’re all familiar with:
· You can restore a backup made on an older version to a newer version of SQL Server but you can’t restore a newer backup to an older version. SQL Server knows how to convert a page from the older format to the newer format but not the opposite. Note that the compatibility level of a database does NOT affect the format of the database pages. All database pages in a SQL Server 2012 instance use the 2012 format.
· Log shipping and Database Mirroring can transfer data from an older database version to a newer version but not the other way round. Again, the page upgrade is a one way transformation.
· You can attach a SQL 2008 database to a SQL 2012 instance but once you do, you can’t go back.
· You can only upgrade a database by two versions – 2005 to 2012 works but 2000 to 2012 doesn’t. This was a choice Microsoft made to limit the number of upgrade routines they have to develop, maintain, and test. Isn’t this pretty arbitrary? Sure, but think about whether you would want the SQL Dev and Test teams working on the upgrade from SQL 7 to SQL 2014 or working on new features for SQL 2014.
The one-way upgrade is the reason rollback planning is such an important part of upgrade planning. If you attach your database files to an instance of a newer version and something happens, you can’t go back just by attaching the files to the old level or backing up the new level database and restoring it to the old level. You also can’t use log shipping or mirroring from the new instance to keep the databases in the older instance current.
Wednesday, July 24, 2013
In my last post I talked about how SQL Server uses Windows to store data in files and how you have to manage those files to maintain database performance. In this post I’ll cover how SQL Server manages the space within the Windows files that it uses to store data.
SQL Server divides its storage space into 8192 byte (8KB) pages. Having everything stored in pages of the same size makes space management much more efficient. When SQL Server needs to store some new data it looks for a page that isn’t in use. Because all the free spaces are the same size, any free page will do.
If a table went to the SQL OS every time it needed a page, the SQL OS would soon be overwhelmed with requests. For this reason, the SQL OS allocates storage in 64KB chunks called extents. The table logic gets 8 pages at a time from the SQL OS file system so it doesn’t have to go back for more space until it has filled up 8 pages.
Keeping in mind that SQL Server was first developed when a gigabyte was a lot of disk; the SQL OS has a way to avoid allocating a whole 64KB of space to a table that may have only a few rows. The first 8 pages in a table are allocated one at a time from Shared Extents. The 8 pages in a Shared Extent are allocated one at a time instead of all being allocated to the same SQL Server object. This means that to populate the first 8 pages of a table SQL Server has to make 8 times as many calls to allocate space as it does for the next 8 pages. Keep this in mind when we discuss allocation maps next.
SQL Server keeps track of which extents and pages have been allocated using allocation maps. The first allocation map is the Global Allocation Map (GAM). Each bit in the GAM maps to an extent. A 0 in the GAM indicates that an extent is available and a 1 indicates that it has been allocated. Allocating an extent is just a matter of finding a bit in the map that is set to 0 and setting it to 1. Obviously, the GAM must be locked when this is happening so that two threads don’t allocate the same extent. If enough threads are trying to allocate space at the same time, the lock on the GAM can be a bottleneck. There are about 64k bits in a GAM with each bit representing 64KB of disk space so a GAM can handle allocations for 4GB of storage. The first GAM in a SQL Server data file is on page 2 and the subsequent GAMs occur every 4GB after that.
Page 3 in a SQL Server file and every 4GB after that is an SGAM page – Shared GAM. A 1 bit in the SGAM means the corresponding page is a shared extent with pages available to be allocated. A 0 bit means either the page is a uniform extent or it is a shared extent with no pages available.
The final part of the allocation picture is the Page Free Space (PFS) page. A PFS page has one Byte for each page in its map. Since each byte represents a page, there must be a PFS page every 8088 pages in the file. The first PFS page is page 1 of the file. The PFS byte corresponding to a page indicates whether the page is allocated and if the page is used in a blob or heap, the rest of the byte indicates how full the page is. The fullness information is used to find pages with available space. Page fullness information isn’t required for clustered or non-clustered indexes because an insert has to happen at a particular page whether there is room in the page or not (if there isn’t room, the data in the page is spread over the current page and a newly allocated page – a page split).
Allocating a page in a shared extent can be pretty involved. First the SGAM is searched for a shared extent with pages free. If there isn’t one, a page is allocated from the GAM (bit set to 1) and turned into a shared extent (1 bit set in the SGAM). The corresponding PFS must then be updated to indicate that the page has been allocated and then the page number is returned to the task that needed the page. The SGAM has to be locked during this process to keep another thread from allocating the same shared page.
Allocation Contention Issues
The locks on GAM, SGAM and PFS pages can be a bottleneck if many SQL Server threads are trying to allocate space at the same time. This contention for the allocation locks is especially acute in tempdb because a large number of temporary tables are created and deleted and most temp tables are small enough that they mainly use pages from shared extents. It also doesn’t help that all the databases in a SQL Server instance share the same tempdb. There are a few ways to address this:
· Starting in SQL Server 2005, temp tables aren’t dropped when they go out of scope. Instead they are trimmed down to the first 2 pages and recycled so the next request for the temp table can use the storage is already allocated. This reduces the number of allocations required for temp tables so the contention is reduced.
· Trace flag 1118 will change the allocation algorithm so that all allocations are for uniform extents. Because shared extents aren’t used, the SGAM is significantly reduced as a source of contention (there are still things that use mixed extents so the SGAM contention isn’t totally eliminated)
· The earliest fix for this issue was to create a tempdb data file for every processor that the server has. For example if you have an 8 core server you would create 8 tempdb data files. Each file has its own set of GAM and SGAM pages so the probability of contention on a GAM or SGAM is greatly reduced. Two things have happened to modify this recommendation – first the fixes we already mentioned were implemented so the amount of contention is significantly reduced and second modern servers can have 80 processors or more so managing that many data files incurs more overhead than the allocation contention did. The current recommendation is to create 1/3 to ½ of a data file for each processor – for example for a 12 core server create 4 to 6 files. The total number of files created shouldn’t be more than 8 because managing more files than that incurs a lot of overhead.
· I usually use all three approaches in a heavily loaded machine. Even though the problem has been addressed in the code, each new version of SQL Server seems to put more load on tempdb so reducing contention is still necessary.
While tempdb usually experiences more allocation contention than any other database, normal databases aren’t immune to contention. I generally create at least three identical files for each file group to reduce contention. Make sure that the files are the same size because SQL Server uses an algorithm called proportional fill for it files so if one of the files is twice as big as the others, it will get two allocations for every one allocation in the other files.
Friday, April 12, 2013
SQL Server Storage
Since SQL Server storage uses Windows to manage the actual storage, it’s necessary to talk a little bit about how Windows stores data. Warning – this is an over-simplified high level view of Windows. Windows data is stored on volumes. A volume may be a disk drive, a RAID array a LUN on a SAN but for our purposes a volume is a bunch of bytes of storage Windows uses to keep files on. There’s a directory that keeps track of what files have been written to the volume and where they are stored. Ideally, each file occupies one contiguous chunk of bytes but it’s common for a file to have its data spread across many isolated chunks of storage. The chunks of storage that aren’t being used by files are entered on a list of unallocated chunks of storage called a free list.
When Windows needs space to write something it goes through the free list until it finds a good place to write it. For example, if it needs to write 1MB it might find an entry on the free list for a 1.5 MB chunk. It will then write the 1MB, put a pointer to what it wrote in the directory and then put a .5MB entry back on the free list. When a file is deleted, the space not needed for the file is put back on the free list. It’s not hard to see that over time, the free list becomes very large with pointers to many small chunks of storage. This has two impacts – Windows takes longer to allocate new space because it has to search through a longer list to find it making space allocation a bottleneck and Windows often can’t find a big enough chunk of free space to write a file so the file gets written as small chunks spread across the disk resulting is slow IO performance when reading or updating the file. This can lead to major performance problems so Windows has a defragmentation utility that moves data around so files are in a single contiguous space as much as possible and free space in consolidated into a few large chunks.
So what does all this mean for SQL Server? SQL Server tries to avoid fragmentation by allocating all the space specified for a SQL Server file when the file is created. This works as long as the storage isn’t fragmented when the file is created so tip one is to defragment the volume before creating the files for a database. This fragmentation avoidance strategy also assumes that you create the files big enough to satisfy the databases’ need for space for an extended period of time – maybe six months or a year. If you grow the files in small increments, each increment is a separate space allocation for Windows so the disk volume can become fragmented. I often see SQL Server installations where the database files have been created with the default size specifications. That means the file starts at 2MB and grows in 1MB increments. A 1TB database grown with these settings would have almost a million space allocations. In addition to the significant impact that growing this many times has on SQL Server which we will discuss later, this many small file growths will badly fragment the Windows storage. This leads to the frequently asked question “should I defragment the Windows volumes where my databases are stored”? My answer is that if you’re doing a good job of managing storage you probably don’t need to but you should check for fragmentation of Windows storage periodically and defragment if the fragmentation goes above 10 or 15%. Notice that I am talking about fragmentation of the Windows storage that Windows uses to store SQL Server files. We’ll talk about SQL Server index fragmentation later and that’s a very different thing.
SQL Server Filesystem
One of the least understood parts of SQL Server storage is how the Windows file system relates to the SQL Server file system. When you are setting up Windows storage you first partition a disk drive (or RAID array) to create disk volumes. You then format these volumes which creates a Windows file system on the volume. Once the Windows file system is created you can create and delete files to your heart’s content. When you create a file the amount of free space in the volume decreases and when you delete a file the amount of free space increases. The size of the volume doesn’t change. This makes perfect sense to all storage administrators but when you try to apply the same logic to the SQL Server filesystem, they often refuse to believe it.
When you use SQL Server to create a SQL Server file, SQL Server will ask Windows for a file of the size specified and build a SQL Server filesystem in the file. When you create SQL Server objects – tables, indexes, procedures, etc. – the amount of free space in the file decreases. As you add rows to the table, the amount of free space decreases more. When you delete rows from a table or drop a table or index, the amount of free space increases. This works just like files in a Windows file system. Just as the disk volume doesn’t grow and shrink when you add and remove files, the SQL Server files don’t grow and shrink when you add and delete SQL data. Expecting a SQL Server file to shrink when you delete rows is like expecting a Windows volume to shrink when you delete files. No storage admin would think a LUN that shrinks every time you delete a file was a good idea but I very seldom run into one that thinks the same should apply to SQL Server files.
At this point I’m sure some of you are shouting at your screen shouting “but wait, SQL Server files do grow!” While this is true, it wasn’t always true and it isn’t true for many other databases. In the early days of SQL Server, data files worked like Windows volumes. When you used up all the space, you couldn’t write anymore. One of the design principles of SQL Server 7.0 was to reduce the amount of routine work you need to do to keep SQL Server operating. One of those routine operations was expanding the SQL Server filesystem when it runs low on space. People often fail to notice it’s running low on space so it eventually runs out completely and SQL Server just stops until the DBA gets out of bed at 3:00 Sunday morning and expands the SQL Server files. To avoid this, the SQL Server team invented “autogrow”. If autogrow is turned on, when SQL Server runs out of space in a file it automatically grows the file so processing can continue. While this is cool because it lets the DBA sleep at night, it should not be the normal way for SQL Server files to grow. While autogrow is growing the file, all updates that require space to execute have to stop until the growth is complete. That means you don’t want the growth increment too large so the transactions don’t have to wait too long. On the other hand if the growth increment is pretty small, growth will happen pretty often. I worked with a customer a couple months ago where we calculated that one of their files had been growing 15-20 times a day for two years. Autogrowth often looks like a hang to a user so a database that hangs a couple times an hour is not a good thing. The right thing to do is to stay ahead of the growth requirements and manually grow the file at a convenient time every few months. I look at any autogrowth as a failure of the DBA to manage space correctly. Failing every couple hours for months at a time doesn’t look good on the performance review. Of course some SQL Server installations don’t have a DBA so if this case making sure the file sizes are set to reasonable values and the increment is big enough to so growth events are fairly rare is probably the best option.
I guess for completeness I have to say that it’s possible to shrink a SQL Server file but there are very few situations where shrinking is appropriate. Shrinking is pretty expensive because space can only be freed from the end of a Windows file so the shrink function must move data from the end of the file into other parts of the file where there is free space. This not only takes a lot of IO operations to do, but can badly fragment the SQL data in the file. Because of the bad side-effects, I avoid shrinking files unless I’m sure that data has gone away permanently and the space won’t be needed again. For example, if a company moves to a new ERP system and you just deleted the entire old inventory, it might be acceptable to shrink the inventory files. I often compare shrinking files to Amazon building a new warehouse to handle the Christmas rush and then tearing it down in January because the rush is over. This makes sense only if Christmas isn’t coming next year – or you plan on shutting down the company before next Christmas. My favorite was the DBA who did a shrink on all the files before he did the backups because he was short of backup space – SQL Server doesn’t backup unallocated space so this really didn’t save anything.
Well, this is probably a good stopping place for now. Next time we’ll talk about how the SQL Server file system stores data and how knowing that will help you improve your database designs.
After being absent from the blogosphere for a few years, I decided it was time I started again. I lost the ability to access my previous blog (http://blogs.msdn.com/b/rogerwolterblog/ ) when I left Microsoft so it has taken me a while to get back into blogging again.
First a little background – I was a developer for many years primarily doing database related projects. I then spent 12 years at Microsoft. Some of my projects there were COM+, SQLXML, SOAP Toolkit, Service Broker, XML Datatype, SQL Express, and MDM. Since leaving Microsoft I have been a SQL Server consultant. I’m currently a Database Architect at Pragmatic Works – www.pragmaticworks.com
I like to write about SQL Server. Here are a few of the things I have written: