Good habits when writing T-SQL statements (Part 2)

This second part focuses on how to optimize multipurpose (catch-all) stored procedures.

TipsMake.com - To improve SQL Server performance and minimize potential errors in your applications, you need to practice writing optimal T-SQL statements. The first part of this article introduced some useful techniques that help the server avoid redundant work. This second part focuses on how to optimize multipurpose stored procedures.

[Image link: Good habits when writing T-SQL statements (Part 1)]

Multipurpose stored procedures

Before discussing how to optimize a 'Jack of All Trades' stored procedure (SP), we need a working definition of this type of procedure. A multipurpose stored procedure is one that accepts many different optional parameters. Based on which parameters are passed, the procedure determines which records to return. The following is an example of such a procedure:

CREATE PROCEDURE JackOfAllTrades (@SalesOrderID int = NULL
, @SalesOrderDetailID int = NULL
, @CarrierTrackingNumber nvarchar(25) = NULL)
AS
SELECT * FROM AdventureWorks.Sales.SalesOrderDetail
WHERE (SalesOrderID = @SalesOrderID OR @SalesOrderID IS NULL)
AND (SalesOrderDetailID = @SalesOrderDetailID OR @SalesOrderDetailID IS NULL)
AND (CarrierTrackingNumber = @CarrierTrackingNumber OR @CarrierTrackingNumber IS NULL)
GO

Here the JackOfAllTrades SP accepts three different parameters, all of which default to NULL. When a value is passed for a parameter, it is used in the WHERE clause to filter the records returned. For each parameter, the SP builds the WHERE clause using the following pattern:

(Column = @PARM OR @PARM IS NULL)

This logic says: if @PARM is passed a non-NULL value, restrict the result set to rows where the column equals @PARM's value. The second half of the condition, '@PARM IS NULL', means that if no value was supplied for @PARM (it is NULL), that parameter does not filter the data at all.
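As a concrete illustration (a sketch assuming the AdventureWorks sample database and the procedure above), the same procedure can answer several different questions depending on which parameters are supplied:

```sql
-- Filter on one column: only the first predicate does any real filtering,
-- because the other two collapse to '@PARM IS NULL', which is true.
EXEC JackOfAllTrades @SalesOrderID = 43659;

-- Filter on two columns at once.
EXEC JackOfAllTrades @SalesOrderID = 43659, @SalesOrderDetailID = 1;

-- No parameters: every predicate is satisfied, so all rows come back.
EXEC JackOfAllTrades;
```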

Let's look at a typical execution of the JackOfAllTrades SP. Suppose we execute it with the following command:

EXEC JackOfAllTrades @SalesOrderID = 43659

When this command runs, the execution plan looks like this:

[Figure: execution plan for JackOfAllTrades showing a Clustered Index Scan]

Here you can see that even though only a single parameter was passed, SQL Server decided to use an index scan operation. The SP's SELECT statement filters on the @SalesOrderID column, which is part of the clustered index key. You might expect SQL Server to be smart enough to realize that handling this stored procedure with an index seek would be faster than plowing through the entire clustered index. But as the execution plan shows, SQL Server is not that smart here. Why?

When SQL Server compiles the procedure, the plan it builds must work for any combination of parameter values, including the case where '@PARM IS NULL' is true. Because of this, it considers no index useful for handling the condition '(Column = @PARM1 OR @PARM1 IS NULL)', and it falls back to an index scan. The more parameters a multipurpose stored procedure has, the worse the performance, because of the scan work required to evaluate each parameter passed.
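If you want to confirm which operator SQL Server chose without opening the graphical plan, one option (a sketch; the exact output format varies by SQL Server version) is to request a text plan:

```sql
SET SHOWPLAN_TEXT ON;
GO
EXEC JackOfAllTrades @SalesOrderID = 43659;
GO
SET SHOWPLAN_TEXT OFF;
GO
-- The plan output should include a line similar to:
--   |--Clustered Index Scan(OBJECT:([Sales].[SalesOrderDetail].[PK_...]), ...)
```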

Optimizing multipurpose stored procedures

You do not have to settle for an inefficient multipurpose stored procedure like the one written above. Let's explore what that SP does and rewrite it so that SQL Server's query optimizer can create a more efficient execution plan.

As mentioned above, the real problem with the multipurpose stored procedure is that each parameter needs an 'OR' condition to check whether the passed parameter is NULL. If we can remove this requirement, SQL Server will be able to build a plan that uses an index seek operation. So how do we remove the '@PARM IS NULL' condition? The answer is to use parameterized dynamic SQL.

You may think I am about to open the door to SQL injection with this solution. However, we will only build dynamic SQL text that passes the SP's parameters through to the system stored procedure 'sp_executesql'. That system SP binds the parameter values to the parameter references inside the dynamic SQL we build.
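Before rewriting the whole procedure, here is a minimal sketch of how sp_executesql binds a parameter (assuming the AdventureWorks sample database):

```sql
-- The statement text and the parameter-definition list are both nvarchar.
-- The parameter value is bound by name, never concatenated into the string,
-- which is what keeps this pattern safe from SQL injection.
DECLARE @sql nvarchar(max) =
    N'SELECT COUNT(*) FROM AdventureWorks.Sales.SalesOrderDetail
      WHERE SalesOrderID = @OrderID';
EXEC sp_executesql @sql, N'@OrderID int', @OrderID = 43659;
```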

The 'sp_executesql' system stored procedure lets you build a T-SQL statement containing parameter references, then define those parameters and pass values for them when the statement is executed. A T-SQL statement executed this way is called parameterized SQL. There are many reasons to use parameterized SQL, but in this article we focus only on using it to improve the performance of multipurpose stored procedures. The following code creates the multipurpose stored procedure rewritten with parameterized dynamic SQL:

CREATE PROCEDURE JackOfAllTrades_V2 (@SalesOrderID int = NULL
, @SalesOrderDetailID int = NULL
, @CarrierTrackingNumber nvarchar(25) = NULL)
AS
DECLARE @CMD NVARCHAR(max)
DECLARE @WHERE NVARCHAR(max)
SET @CMD = 'SELECT * FROM AdventureWorks.Sales.SalesOrderDetail'
SET @WHERE = ''
IF @SalesOrderID IS NOT NULL
SET @WHERE = @WHERE + ' AND SalesOrderID = @SalesOrderID'
IF @SalesOrderDetailID IS NOT NULL
SET @WHERE = @WHERE + ' AND SalesOrderDetailID = @SalesOrderDetailID'
IF @CarrierTrackingNumber IS NOT NULL
SET @WHERE = @WHERE + ' AND CarrierTrackingNumber = @CarrierTrackingNumber'
IF LEN(@WHERE) > 0
-- Strip the leading ' AND ' (5 characters) before appending the WHERE clause
SET @CMD = @CMD + ' WHERE ' + RIGHT(@WHERE, LEN(@WHERE) - 5)
EXEC sp_executesql @CMD
, N'@SalesOrderID int
, @SalesOrderDetailID int
, @CarrierTrackingNumber nvarchar(25)'
, @SalesOrderID = @SalesOrderID
, @SalesOrderDetailID = @SalesOrderDetailID
, @CarrierTrackingNumber = @CarrierTrackingNumber
GO

Now let's walk through the dynamic, parameterized parts of the code above. The SP starts by assigning the @CMD variable the SELECT statement without a WHERE clause, and the @WHERE variable an empty string. Next come four IF statements. The first three each check whether the corresponding parameter is NOT NULL; if it is, the condition for that parameter is appended to the @WHERE variable. Since we have already verified that the parameter is NOT NULL, we no longer need the IS NULL test that appeared in the original SP; all we need to append is the 'Column = @PARM' condition. The last IF statement checks whether @WHERE contains at least one condition, and if so, joins @WHERE (minus its leading 'AND') onto @CMD.
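To make the string building concrete, here is roughly what the procedure assembles for the example call (you can verify this yourself by temporarily adding a PRINT @CMD line just before the sp_executesql call; the exact spacing depends on how the concatenation handles whitespace):

```sql
EXEC JackOfAllTrades_V2 @SalesOrderID = 43659;
-- With only @SalesOrderID supplied, @CMD contains a statement equivalent to:
--   SELECT * FROM AdventureWorks.Sales.SalesOrderDetail
--   WHERE SalesOrderID = @SalesOrderID
-- Note that @SalesOrderID appears as a parameter reference, not as the
-- literal value 43659; sp_executesql supplies the value at execution time.
```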

Note that the @WHERE variable holds the dynamic part of the statement. But I never put the actual text of a parameter value into @WHERE; instead I put only a reference to the parameter into each condition. So the final dynamic T-SQL statement consists of the original SELECT statement plus a WHERE clause that no longer needs any IS NULL condition to filter the data.

Finally, I use 'sp_executesql' to execute the parameterized dynamic T-SQL statement. To do this, I pass five arguments to the system SP. The first is the dynamic T-SQL variable @CMD. The second declares every parameter that may appear in the parameterized statement, along with its data type. The last three simply pass the SP's parameters through, just as they were passed into the original multipurpose SP in the first section. As you can see, this dynamic SQL is no more susceptible to SQL injection attacks than the original SP, because I never concatenate the actual parameter values into the @WHERE variable. I only pass the parameters, as variables, to the dynamic SQL via the 'sp_executesql' system SP.

Now create the new stored procedure, then execute it with the following command:

EXEC JackOfAllTrades_V2 @SalesOrderID = 43659

When I ran the test with the AdventureWorks database on the server, I received the following execution plan:

[Figure: execution plan for JackOfAllTrades_V2 showing a Clustered Index Seek]

Comparing this plan with the one at the beginning, you can see that it is simpler and uses a clustered index seek operation to process the SP. The reason SQL Server can use a seek is that the dynamic SQL no longer contains the '@PARM IS NULL' condition. Since the T-SQL statement has been simplified by using dynamic SQL and eliminating the IS NULL tests, SQL Server can now produce a more efficient execution plan for the V2 version of the multipurpose stored procedure.

So what have we actually gained? Remember that we asked only for records from the SalesOrderDetail table with SalesOrderID equal to 43659. The original multipurpose SP used an index scan to process the query, which means it had to read through the entire index before it could complete the request and return the records with that SalesOrderID value. In contrast, the V2 version of the SP can use an index seek against the clustered index key on the SalesOrderDetail table to quickly retrieve exactly the records with SalesOrderID 43659. An index seek is much cheaper than an index scan, but how much cheaper, specifically?

There are several ways to measure the I/O saved by the V2 version of the SP. Here we will run the following T-SQL:

SET STATISTICS IO ON
GO
EXEC JackOfAllTrades @SalesOrderID = 43659
GO
EXEC JackOfAllTrades_V2 @SalesOrderID = 43659
GO

Here I use the 'SET STATISTICS IO ON' statement so that the output of the two SP executions includes the number of I/O operations each statement needed to process the query. Here are the results:

(12 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 264, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

(12 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

Looking at these results, we can see that the first multipurpose SP needed 1 scan and 264 logical reads. In contrast, the V2 version has the same scan count but required only 3 logical reads to process the query, a saving of 261 logical reads. That number may not seem dramatic, but if you have to call the SP repeatedly, in a loop for example, the performance difference between the two versions becomes significant.

Improving I/O performance with parameterized dynamic SQL

After reading this section, you should understand why SQL Server produced an inefficient execution plan for the first version of the multipurpose stored procedure: the '(Column = @PARM OR @PARM IS NULL)' logic forced it to choose an index scan, and as we know, a scan is almost always slower than a seek. By rewriting the V2 version of the SP with dynamic T-SQL, I removed the IS NULL expressions from the WHERE clause, so SQL Server could choose the better method, a clustered index seek. If your system uses a multipurpose stored procedure, try rewriting it with parameterized dynamic SQL and see how much the performance improves.
