在更新数据仓库时,经常需要根据源表对Target表进行数据同步,Merge 命令具有数据更新,删除,插入的功能,专门用于数据同步,并将数据的更新输出到表中。在使用Merge命令时,需要注意when not matche子句:
- when not matched by target :当Target Table不匹配时,数据行不存在于Target Table中,存在于Source Table;
- when not matched by source:当Source Table不匹配时,数据行不存在于Source Table中,存在于Target Table;
- 当不指定by子句时,默认值是by target;
1,创建示例数据
2,同步数据,将源表的数据同步到靶表中
merge into dbo.tar as t using dbo.src as s on t.id=s.id --matched表示On子句匹配成功,加上额外的and条件,如果when子句成功,那么更新Targe表中的数据 when matched and t.name<>s.name then update set t.name=s.name --not matched表示On子句不匹配 --ID不存在于Targe表,存在于Source表,则将行插入到Targe表 when not matched --default by target then insert (id,name) values(s.id,s.name) --ID存在于Targe表,而不存在于Source表,那么将行从Target表删除 when not matched by source then delete;
2,使用output子句,将靶表中更新的数据输出
merge into dbo.tar as t using dbo.src as s on t.id=s.id --matched表示On子句匹配成功,加上额外的and条件,如果when子句成功,那么更新Targe表中的数据 when matched and t.name<>s.name then update set t.name=s.name --not matched表示On子句不匹配 --ID不存在于Targe表,存在于Source表,则将行插入到Targe表 when not matched --default by target then insert (id,name) values(s.id,s.name) --ID存在于Targe表,而不存在于Source表,那么将行从Target表删除 when not matched by source then delete output $action,deleted.id as Deleted_ID,deleted.name as Deleted_Name,inserted.id as Instered_ID,inserted.name as Instered_Name ;
3,将靶表中更新的数据插入到一个表中有两种方式,一种是output into,一种是使用insert into
第一种方式,使用ouput into方式,将数据插入到staging table中
;merge into dbo.tar as t using dbo.src as s on t.id=s.id --matched表示On子句匹配成功,加上额外的and条件,如果when子句成功,那么更新Targe表中的数据 when matched and t.name<>s.name then update set t.name=s.name --not matched表示On子句不匹配 --ID不存在于Targe表,存在于Source表,则将行插入到Targe表 when not matched --default by target then insert (id,name) values(s.id,s.name) --ID存在于Targe表,而不存在于Source表,那么将行从Target表删除 when not matched by source then delete output $action,deleted.id as Deleted_ID,deleted.name as Deleted_Name,inserted.id as Instered_ID,inserted.name as Instered_Name into dbo.dt_merge_output ; select * from dbo.dt_merge_output
第二种方式,将output子句的输出作为派生表,使用Insert Into子句将数据插入到staging 表中
insert into dbo.dt_merge_output select * from ( merge into dbo.tar as t using dbo.src as s on t.id=s.id --matched表示On子句匹配成功,加上额外的and条件,如果when子句成功,那么更新Targe表中的数据 when matched and t.name<>s.name then update set t.name=s.name --not matched表示On子句不匹配 --ID不存在于Targe表,存在于Source表,则将行插入到Targe表 when not matched --default by target then insert (id,name) values(s.id,s.name) --ID存在于Targe表,而不存在于Source表,那么将行从Target表删除 when not matched by source then delete output $action,deleted.id as Deleted_ID,deleted.name as Deleted_Name,inserted.id as Instered_ID,inserted.name as Instered_Name ) as p(Action,Deleted_ID,Deleted_Name,Instered_ID,Instered_Name)
4,Output子句
Output子句,用于输出在Target Table中更新的数据,在每个数据行中,有一个特殊的字段,$Action,数据类型是nvarchar(10),能够标识出Merge操作的类型:Insert Delete,Update。
表Deleted和Inserted 是特殊的两个系统表,由系统创建,并且用户语句只能读取,不能修改,作用域是语句级别,当语句结束时,系统自动回收。DELETED 用于标识被Merge命令删除的数据行,INSERTED 用于标识被Merge命令插入的数据行,如果执行的是Update操作,那么inserted 用于标识更新之后的数据,deleted 用于标识数据行更新之前的数据。
5,在使用Merge命令更新Target表时,同一行数据只能被更新一次
If UPDATE is specified in the <merge_matched> clause, and more than one row of <table_source>matches a row in target_table based on <merge_search_condition>, SQL Server returns an error. The MERGE statement cannot update the same row more than once, or update and delete the same row.
Target表中一个数据行只能被更新一次,SQL Server会报错,错误原因是Source Table的中的多行数据和Target Table中一行数据匹配。
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.