Space Layout
In my last post I talked about how SQL Server uses Windows
to store data in files and how you have to manage those files to maintain
database performance. In this post I’ll
cover how SQL Server manages the space within the Windows files that it uses to
store data.
SQL Server divides its storage space into 8192 byte (8KB)
pages. Having everything stored in pages
of the same size makes space management much more efficient. When SQL Server needs to store some new data
it looks for a page that isn’t in use.
Because all the free spaces are the same size, any free page will do.
If a table went to the SQL OS every time it needed a page,
the SQL OS would soon be overwhelmed with requests. For this reason, the SQL OS allocates storage
in 64KB chunks called extents. The table
logic gets 8 pages at a time from the SQL OS file system so it doesn’t have to
go back for more space until it has filled up 8 pages.
Keeping in mind that SQL Server was first developed when a gigabyte
was a lot of disk; the SQL OS has a way to avoid allocating a whole 64KB of space
to a table that may have only a few rows.
The first 8 pages in a table are allocated one at a time from Shared
Extents. The 8 pages in a Shared Extent are
allocated one at a time instead of all being allocated to the same SQL Server object. This means that to populate the first 8 pages
of a table SQL Server has to make 8 times as many calls to allocate space as it
does for the next 8 pages. Keep this in
mind when we discuss allocation maps next.
Allocation Maps
SQL Server keeps track of which extents and pages have been
allocated using allocation maps. The first
allocation map is the Global Allocation Map (GAM). Each bit in the GAM maps to an extent. A 0 in the GAM indicates that an extent is
available and a 1 indicates that it has been allocated. Allocating an extent is just a matter of
finding a bit in the map that is set to 0 and setting it to 1. Obviously, the GAM must be locked when this
is happening so that two threads don’t allocate the same extent. If enough threads are trying to allocate
space at the same time, the lock on the GAM can be a bottleneck. There are about 64k bits in a GAM with each
bit representing 64KB of disk space so a GAM can handle allocations for 4GB of
storage. The first GAM in a SQL Server
data file is on page 2 and the subsequent GAMs occur every 4GB after that.
Page 3 in a SQL Server file and every 4GB after that is an
SGAM page – Shared GAM. A 1 bit in the
SGAM means the corresponding page is a shared extent with pages available to be
allocated. A 0 bit means either the page
is a uniform extent or it is a shared extent with no pages available.
The final part of the allocation picture is the Page Free
Space (PFS) page. A PFS page has one
Byte for each page in its map. Since
each byte represents a page, there must be a PFS page every 8088 pages in the
file. The first PFS page is page 1 of
the file. The PFS byte corresponding to
a page indicates whether the page is allocated and if the page is used in a
blob or heap, the rest of the byte indicates how full the page is. The fullness information is used to find
pages with available space. Page
fullness information isn’t required for clustered or non-clustered indexes
because an insert has to happen at a particular page whether there is room in
the page or not (if there isn’t room, the data in the page is spread over the
current page and a newly allocated page – a page split).
Allocating a page in a shared extent can be pretty
involved. First the SGAM is searched for
a shared extent with pages free. If
there isn’t one, a page is allocated from the GAM (bit set to 1) and turned
into a shared extent (1 bit set in the SGAM).
The corresponding PFS must then be updated to indicate that the page has
been allocated and then the page number is returned to the task that needed the
page. The SGAM has to be locked during this
process to keep another thread from allocating the same shared page.
Allocation Contention Issues
The locks on GAM, SGAM and PFS pages can be a bottleneck if
many SQL Server threads are trying to allocate space at the same time. This contention for the allocation locks is especially
acute in tempdb because a large number of temporary tables are created and
deleted and most temp tables are small enough that they mainly use pages from
shared extents. It also doesn’t help
that all the databases in a SQL Server instance share the same tempdb. There are a few ways to address this:
·
Starting in SQL Server 2005, temp tables aren’t
dropped when they go out of scope.
Instead they are trimmed down to the first 2 pages and recycled so the
next request for the temp table can use the storage is already allocated. This reduces the number of allocations
required for temp tables so the contention is reduced.
·
Trace flag 1118 will change the allocation
algorithm so that all allocations are for uniform extents. Because shared extents aren’t used, the SGAM
is significantly reduced as a source of contention (there are still things that
use mixed extents so the SGAM contention isn’t totally eliminated)
.
·
The earliest fix for this issue was to create a
tempdb data file for every processor that the server has. For example if you have an 8 core server you
would create 8 tempdb data files. Each
file has its own set of GAM and SGAM pages so the probability of contention on
a GAM or SGAM is greatly reduced. Two
things have happened to modify this recommendation – first the fixes we already
mentioned were implemented so the amount of contention is significantly reduced
and second modern servers can have 80 processors or more so managing that many
data files incurs more overhead than the allocation contention did. The current recommendation is to create 1/3
to ½ of a data file for each processor – for example for a 12 core server
create 4 to 6 files. The total number of
files created shouldn’t be more than 8 because managing more files than that incurs
a lot of overhead.
·
I usually use all three approaches in a heavily
loaded machine. Even though the problem
has been addressed in the code, each new version of SQL Server seems to put
more load on tempdb so reducing contention is still necessary.
While tempdb usually experiences more allocation contention
than any other database, normal databases aren’t immune to contention. I generally create at least three identical
files for each file group to reduce contention.
Make sure that the files are the same size because SQL Server uses an
algorithm called proportional fill for it files so if one of the files is twice
as big as the others, it will get two allocations for every one allocation in
the other files.