What is the Value of the MERGE Statement?
Prior to the introduction of SQL Server 2008 if you needed to write
logic that inserted rows into a target table if they didn’t exist, or
updated them if they did exist you needed a series of “if then else”
logic and needed to perform both the UPDATE and INSERT statement. With
the introduction of the MERGE statement with SQL Server 2008 you can
perform either an INSERT or UPDATE statement using a single MERGE
statement. Being able to use the MERGE statement to perform inserts or
updates to a table makes it easier to code your UPSERT logic in TSQL.
Performing UPSERT Using MERGE Statement
The MERGE statement supports inserting and updating rows in a table
with a single operation. In order to accomplish this the MERGE
statement requires both a Source and Target table. The Source table is
used to identify the rows that needed be inserted or update, and the
Target table is the table that rows will be inserted or updated.
In order to demo the MERGE statement I will need a Source and Target
table, which will be used in my MERGE example. I will use the
following code to create my Source and Target table:
USE tempdb;
GO
SET NOCOUNT ON;
CREATE TABLE dbo.Product (
Id int identity,
ProductName varchar(100),
Qty int);
INSERT INTO dbo.Product (ProductName, Qty) VALUES('Magic Stripper', 5);
CREATE TABLE dbo.NewInventory(
Id int identity,
ProductName varchar(100),
Qty int);
INSERT INTO dbo.NewInventory (ProductName, Qty) VALUES
('Sandpaper',20),
('Paint Brush 1 inch',15),
('Magic Stripper',5);
With this code I created two tables. The first table I created was
the Product table, which will be my Target table. This is the table
that I will be updating or inserting rows using the MERGE statement.
The other table, NewInventory, is the Source table. My Source table
identifies the records that will be used to determine if a new record
needs to be inserted into my Product table. The Source table is also
used to determine if an UPDATE needs to be performed against any
existing records in my Product table. These two tables will be used in
the following MERGE statement, which performs an UPSERT operation:
USE tempdb;
GO
MERGE dbo.Product AS T
USING dbo.NewInventory AS S
ON T.ProductName = S.ProductName
WHEN MATCHED THEN
UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN
INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;
When I run this code I get the following output:
Id ProductName Qty
----------- --------------------------------------------------------------- -----------
1 Magic Stripper 10
2 Sandpaper 20
3 Paint Brush 1 inch 15
Let me review the above code and explain how it worked. The table
identified right after the MERGE statement, the one with the table alias
of T is known as the Target. The Target table is the table in which I
will be performing an UPDATE, or an INSERT. The table following the
USING clause, with the table alias of S is known as the Source table.
The Source table identifies the potential records that will be inserted
or updated in the Target table. The ON clause identifies how to join
the Target and Source tables. In my example above I’m joining the
Source and Target table based on the ProductName column. Following the
ON clause are two WHEN clauses. These clauses identify the conditions
when a UPDATE or an INSERT will be performed based on the results of the
join between the Target and Source tables.
The first WHEN clause says “WHEN MATCHED”. The “WHEN MATCHED”
conditions means when the Target and Source tables are joined based on
the “ON” clause if there is a match then the UPDATE statement will be
performed. In the UPDATE statement I take the existing Qty value in the
Target table and add to it the Qty value in the Source table to
increase the amount of inventory I have on hand for a given
ProductName.
The second WHEN clause has “WHEN NOT MATCHED”. The “WHEN NOT MATCH”
condition means when joining the Target and Source if there is a
ProductName in the Source table that is not found in the Target table
then this condition will be met. When this condition is met based on
the Source and Target table join operation the Source row will be
inserted into the Product table. This condition allows me to insert a
new row into the Product table when new ProductName’s are found in the
Source table.
If you review the original rows I had in my Product table you will
see that I only had the “Magic Stripper” product in my Product table and
that product had a Qty of 5. After the MERGE statement ran, the “Magic
Stripper” product now has a Qty of 10. This happened because the MERGE
statement’s WHEN MATCH condition was met and therefore the Qty value
from the Product table, which was 5 and the Qty value from the New
Inventory, which was also 5 were summed together to UPDATE the matched
row in the Target table. The other two Products “Sandpaper” and “Paint
Brush 1 inch” where inserted into the Product table because these two
products didn’t already exist in the Product table so they met the “WHEN
NOT MATCHED” condition. When the “WHEN NOT MATCHED” condition is met
the unmatched rows in the Source table were inserted into the Target
table.
What about Deleting Rows Using the Merge Statement?
Can the MERGE statement be used to delete records from a table? Yes
it can! To see how this works let’s suppose we have a business
requirement to delete rows from the Target table of the MERGE statement
when they don’t exist in the Source table. In order to show how this
works I’m going to using my exiting Product table, which now after
running the prior examples has records for the following three products:
- Magic Stripper
- Sandpaper
- Paint Brush 1 Inch
But this time I’m going to use the NewInventory2 table that is
created with the following TSQL code. This new table will be the Source
table for the MERGE statement:
USE tempdb;
GO
CREATE TABLE dbo.NewInventory2(
Id int identity,
ProductName varchar(100),
Qty int);
INSERT INTO dbo.NewInventory2 (ProductName, Qty) VALUES
('Sandpaper',5),
('Paint Brush 1 inch',10);
To use the MERGE statement to perform a delete from my Product table I will use the following code:
USE tempdb;
GO
MERGE dbo.Product AS T
USING dbo.NewInventory2 AS S
ON T.ProductName = S.ProductName
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
SELECT * FROM dbo.Product;
When I run this code I get the following output:
Id ProductName Qty
----------- ---------------------------------------------------------------- -----------
2 Sandpaper 20
3 Paint Brush 1 inch 15
By reviewing the result set above you can see that the “Magic
Stripper” product got deleted from the Product table. This was
accomplished by using the “WHEN NOT MATCHED BY SOURCE” condition of the
above MERGE statement. This condition is met when there is no row in
the Source table that matches a Target table row. When this occurs SQL
Server deletes rows in the Target table that don’t have a corresponding
matching row in the Source table. Note you can include the WHEN NOT
MATCHED BY SOURCE, WHEN MATCHED and WHEN NOT MATCHED is the same MERGE
statement.
Beware of Filtering Rows Using ON Clause
If you read Books Online (BOL) there is a cautionary note on
Filtering Rows by associating additional constraints with the ON clause
of the MERGE statement. Here is the actual text regarding trying to
perform additional filtering using the ON clause that was found in the
BOL:
It is important to specify only the columns from the target table
that are used for matching purposes. That is, specify columns from the
target table that are compared to the corresponding column of the source
table. Do not attempt to improve query performance by filtering out
rows in the target table in the ON clause, such as by specifying AND NOT
target_table.column_x = value. Doing so may return unexpected and
incorrect results.
To show you an example of how adding additional filtering using the
ON clause of a MERGE statement can cause unexpected results. I’m going
to first run this code to recreate my Target and Source tables:
USE tempdb;
GO
DROP table Product, NewInventory
SET NOCOUNT ON;
CREATE TABLE dbo.Product (
Id int identity,
ProductName varchar(100),
Qty int);
INSERT INTO dbo.Product (ProductName, Qty) VALUES('Magic Stripper', 5);
CREATE TABLE dbo.NewInventory(
Id int identity,
ProductName varchar(100),
Qty int);
INSERT INTO dbo.NewInventory (ProductName, Qty) VALUES
('Sandpaper',20),
('Paint Brush 1 inch',15),
('Magic Stripper',5);
This code just sets my Target and Source back to what it was
originally, where there is only 1 row in the Product table and 3 rows in
the NewInventory table.
Suppose I only want to add a subset of the rows in my NewInventory
table to my Product table. For this example, suppose I only want to add
rows to my Product table if the Qty value in my NewInventory table is
greater than 5.
To demonstrate how a MERGE statement can produce unexpected results
if you add additional filtering criteria using the ON clause, let me run
the following code:
USE tempdb;
GO
MERGE dbo.Product AS T
USING dbo.NewInventory AS S
ON T.ProductName = S.ProductName
AND S.QTY > 5
WHEN MATCHED THEN
UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN
INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;
When I run this code I get this output:
Id ProductName Qty
----------- ------------------------------------------------------------- -----------
1 Magic Stripper 5
2 Sandpaper 20
3 Paint Brush 1 inch 15
4 Magic Stripper 5
If you look at this output you can see it doesn’t meet my requirement
by only performing an UPSERT operation of the records in my Source
table that have a Qty value greater than 5. Why did it do this? It
did this because I added an additional filter rule that wasn’t an
equality comparison between a column in the Target table and a column in
the Source tables. This additional filter criteria is the “S.Qty >
5” condition. By adding this additional condition that only identified
a source column I told SQL Server this was part of the matching
criteria. This additional condition was used to determine which records
match between the Source and Target tables. In this particular
example no row in the Source table matched any rows in the Target table
therefore all three records in my Source table where inserted into the
Target table. This is why I now have two product records with a
ProductName of “Magic Stripper”.
In order to meet my requirements I can write my MERGE statement like below:
USE tempdb;
GO
MERGE dbo.Product AS T
USING dbo.NewInventory AS S
ON T.ProductName = S.ProductName
WHEN MATCHED AND S.Qty > 5 THEN
UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED AND S.Qty > 5 THEN
INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;
In this code I place the additional filter rule on the WHEN clause.
Now SQL Server first matches the rows between the Source and Target
tables based on the ON condition and then when SQL Server executes the
WHEN conditions it excludes the Source rows that have a Qty less than 6
from being inserted or updated. To test this for yourself you will
first need to run the first code block in this section that creates the
original Source and Target rows, and then run this code.
Duplicates in Source Table
Another condition that causes problems with the MERGE statement is
when the Source table contains multiple records with the same column
values associated with the ON clause. When this occurs SQL Server will
throw a duplicate error message. To demonstrate let me create a new
Source table “NewInventory3” and then run it through my MERGE
statement. Here is the code that will throw the MERGE error:
USE tempdb;
GO
CREATE TABLE NewInventory3 (
Id int identity,
ProductName varchar(100),
Qty int);
INSERT INTO NewInventory3 values
('Magic Stripper',15),
('Magic Stripper',5);
USE tempdb;
GO
MERGE dbo.Product AS T
USING dbo.NewInventory3 AS S
ON T.ProductName = S.ProductName
WHEN MATCHED THEN
UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN
INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;
When I run this code I get this error:
Msg 8672, Level 16, State 1, Line 12
The MERGE statement attempted to UPDATE or DELETE the same row more
than once. This happens when a target row matches more than one source
row. A MERGE statement cannot UPDATE/DELETE the same row of the target
table multiple times. Refine the ON clause to ensure a target row
matches at most one source row, or use the GROUP BY clause to group the
source rows.
This error was caused because I had two duplicate rows in my Source
table that matched to a single Target row. To resolve this problem I
need to eliminate the duplicate rows based on the matching criteria.
Suppose I wanted both of those rows to actually add to my inventory of
“Magic Stripper”. In this case, I could run the following MERGE
statement to accomplish this:
MERGE dbo.Product AS T
USING (SELECT ProductName, SUM(Qty) as Qty
FROM dbo.NewInventory3
GROUP by ProductName) AS S
ON T.ProductName = S.ProductName
WHEN MATCHED THEN
UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED THEN
INSERT (ProductName,Qty) VALUES (S.ProductName,S.Qty);
SELECT * FROM dbo.Product;
Here I removed the duplicate row in my Source table by first
aggregating all the similar ProductName rows in my Source table by using
a sub-query when I identified the Source rows for my MERGE statement.
Summary
The MERGE statement allows you to write a single TSQL statement that
allows you to INSERT, UPDATE, and/or DELETE records from a Target table.
The MERGE statement controls whether an INSERT, UPDATE, or DELETE
clause is executed by matching a set of rows in the Source table against
the Target table. When a Source row matches a Target row the Target
row is updated, with information from the Source row. If the Source row
does not match a Target row then the Source row is inserted into the
Target table. The MERGE statement can also DELETE rows from the Target
table when Target table rows are not found within the Source table.
Using the MERGE statement greatly simplifies the amount of code you
would need to write using “if then else” logic to perform INSERT,
UPDATE, and/or DELETE operations against a Target table. Next time you
need to perform an UPSERT operation look into using the MERGE statement
if you are on