Segment tables in SQL Server

Table partitioning technique (Table partitioning) to effectively manage the database with large capacity.

Table partitioning technique (Table partitioning) to effectively manage the database with large capacity.

>>> Setting up SQL Server is always available

This is a new feature introduced into SQL Server 2005 and continues to be enhanced in version 2008. For external applications, the table is still a single table, only the physical structure. Its is different from non-segmented tables.

Segment tables in SQL Server Picture 1Segment tables in SQL Server Picture 1

The partitioned table is based on its field value (the field is called the partition key ). For example, if you have data on sales transactions contained in the BanHang table, you can segment by year of the Right Transaction field (trading day): transactions occurring in 2009 are in a separate, similar segment. with transactions in 2010 . This technique greatly increases SQL Server's scalability and makes managing large databases easier. Imagine a data table containing several hundred million records that are regularly updated, tasks like backup / restore , or create / rebuild index are all very time-consuming. Querying or modifying data is also very difficult. Table partitioning aims to solve these obstacles, it has the following main advantages:

1. Convenience of administration:

- You can backup / restore a segment without affecting the rest. For example, at the time of 2010 the sections containing data for 2009 and the previous years no longer receive new data, you do not need to regularly backup these segments and only need to backup the 2010 segment.

- You can also REBUILD the index back on each segment (the sections need REBUILD due to multiple deletion and correction) instead of the whole table.

- It also allows to quickly remove raw data from a paragraph from the table instead of using the DELETE command (this is called SWITCH-OUT). It also allows 'loading' data from another table into a new paragraph (SWITCH-IN). This feature is very valuable for ETL applications (Extract, Transform & Load) and Datawarehouse .

For example you need to import 2008 data, you can import into a separate table and then switch-in immediately this table into the main table. Before having partitioning , you must use the INSERT command to transfer data from the private table to the main table. This process takes longer and during the process the table is locked and inaccessible.

2. Performance improvements:

- When a command only needs to retrieve data at a certain stage, the system only needs to access that section and ignore the remaining segments (this feature is called the elimination partition )

- When data segments are stored on different hard drives, it will reduce the I / O disputes between commands. For example, two SELECT and UPDATE statements work on the same table but in two different sections can be done completely parallel to each other.

Table segmentation is based on the following two new concepts:

- Partition function: Specifies the boundary value for segments. The system relies on this function to determine which segment each record belongs to.

- Partition scheme: Map the declarations in the function partition to the filegroup (each segment is stored in a filegroup).

Below I will go through each step of setting up the segment through a specific example.

You have BanHang table including the columns BangHang_ID, NgayGiaoDich, MaSP, SoLuong, ThanhTien. You want to segment the table by year of RightGiaoDich : For simplicity, suppose you want to save 2009 transactions backwards into a paragraph, in 2010 on a paragraph, and from 2011 or more into a paragraph ( later you can always modify to reserve a paragraph for 2011 and add new passages for 2012, 2013 .). So with the above configuration, the table will have 3 sections: 2009 back before, 2010, and 2011 and later. Therefore you also need 3 filegroup.

Step 1: Create database and filegroup

CREATE DATABASE PartTest
GO
USE PartTest
GO

- create filegroup

ALTER DATABASE PartTest ADD FILEGROUP FG2009AndBefore
ALTER DATABASE PartTest ADD FILEGROUP FG2010
ALTER DATABASE PartTest ADD FILEGROUP FG2011AndAfter

- add data file to each filegroup

ALTER DATABASE PartTest ADD FILE (NAME = N'FY2009AndBefore ', FILENAME = N'D: DATAPartTestFY2009AndBefore.ndf') TO FILEGROUP FG2009AndBefore

ALTER DATABASE PartTest ADD FILE (NAME = N'FY2010 ', FILENAME = N'D: DATAPartTestFY2010.ndf') TO FILEGROUP FG2010

ALTER DATABASE PartTest ADD FILE (NAME = N'FY2011AndAfter ', FILENAME = N'D: DATAPartTestFY201AndAfter.ndf') TO FILEGROUP FG2011AndAfter

Step 2: Create the partition function and partition scheme

USE PartTest
GO
CREATE PARTITION FUNCTION PFunc_NGD (DATETIME) AS RANGE RIGHT FOR VALUES ('2010-01-01', '2011-01-01')
GO
CREATE PARTITION SCHEME PScheme_NGD AS PARTITION PFunc_NGD TO (FG2009AndBefore, FG2010, FG2011AndAfter)

The partition function called PFunc_NGD defines the boundary value for the segments, which is the first day of 2010 and the first day of 2011. Like when you cut a string, only 2 cuts are needed to divide the string into 3 segments. Here, there are only two boundary values. Therefore the range of values ​​of the segments will be as follows:

Paragraph 1: Before to 2009-12-31 23:59:59
Paragraph 2 : 2010-01-01 00:00:00 to 2010-12-31 23:59:59
Paragraph 3: 2011-01-01 00:00:00 later

Then the scheme PScheme_NGD uses PFunc_NGD function to 'attach' the segments to each filegroup . Thus paragraph 1 will go to FG2009AndBefore , paragraph 2 to FG2010 and paragraph 3 to FG2011AndAfter .

Note, the partition function is not the same as the user-defined function . In Management Studio, you see the partition function and partition scheme in the Database / Storage section .

Another note is that a partition function can be used for multiple partition schemes , both are generic objects in the database, not attached to a specific table. When defining a table (see step 4) you need to specify which partition scheme to use.

Step 4: Create a table using the partition scheme

USE PartTest
GO
CREATE TABLE dbo.BanHang (
BangHang_ID INT IDENTITY,
Right Now, DATETIME,
MaSP INT,
INTLuong INT,
INT
) ON PScheme_NGD (Right Delivery)
GO
CREATE CLUSTERED INDEX CI_BanHang_NGD ON dbo.BanHang (RightGiaoDich) ON PScheme_NGD (Right Delivery)

The ' ON PScheme_NGD (RightGiaoDich) ' clause in the two table creation and index creation commands above specifies the BanHang table and the index CI_BanHang_NGD created on the scheme PScheme_NGD , which means that it manages the data allocation. So BanHang table has been segmented. You can check which data is recorded:

SELECT $ PARTITION.PFunc_NGD ('2008-07-24')
SELECT $ PARTITION.PFunc_NGD ('2009-12-31')
SELECT $ PARTITION.PFunc_NGD ('2010-01-01')
SELECT $ PARTITION.PFunc_NGD ('2010-11-25')
SELECT $ PARTITION.PFunc_NGD ('2011-03-16')

3.9 ★ | 72 Vote