近期有一个需求,向一张数据库表插入数据,如果是新数据则执行插入动作,如果插入的字段和已有字段重复,则更新该行对应的部分字段
1. 创建测试表
create table meta_data ( id serial, user_id varchar(128) DEFAULT NULL, file_name varchar(1024) DEFAULT NULL, file_path varchar(1024) DEFAULT NULL, update_time TIMESTAMP DEFAULT NULL, UNIQUE (user_id,file_name) ); postgres=# d meta_data Table "public.meta_data" Column | Type | Modifiers -------------+-----------------------------+-------------------------------------------------------- id | integer | not null default nextval('meta_data_id_seq'::regclass) user_id | character varying(128) | default NULL::character varying file_name | character varying(1024) | default NULL::character varying file_path | character varying(1024) | default NULL::character varying update_time | timestamp without time zone | Indexes: "meta_data_user_id_file_name_key" UNIQUE CONSTRAINT, btree (user_id, file_name)
2. 插入两条测试数据
INSERT INTO meta_data ( user_id, file_name, file_path, UPDATE_TIME ) VALUES ( 'user_id01', 'file_name01', '/usr/local/file_name01', now()) ON CONFLICT (user_id, file_name) DO UPDATE SET file_path = EXCLUDED.file_path, UPDATE_TIME = EXCLUDED.UPDATE_TIME; INSERT INTO meta_data ( user_id, file_name, file_path, UPDATE_TIME ) VALUES ( 'user_id02', 'file_name02', '/usr/local/file_name02', now()) ON CONFLICT (user_id, file_name) DO UPDATE SET file_path = EXCLUDED.file_path, UPDATE_TIME = EXCLUDED.UPDATE_TIME;
postgres=# select * from meta_data; id | user_id | file_name | file_path | update_time ----+-----------+-------------+------------------------+---------------------------- 1 | user_id01 | file_name01 | /usr/local/file_name01 | 2019-09-23 17:14:52.39878 2 | user_id02 | file_name02 | /usr/local/file_name02 | 2019-09-23 17:14:53.118192 (2 rows)
3. 插入第三条测试数据,注意插入的字段user_id和file_name和第二条语句对应的字段是重复的
INSERT INTO meta_data ( user_id, file_name, file_path, UPDATE_TIME ) VALUES ( 'user_id02', 'file_name02', '/usr/local/file_name03', now()) ON CONFLICT (user_id, file_name) DO UPDATE SET file_path = EXCLUDED.file_path, UPDATE_TIME = EXCLUDED.UPDATE_TIME;
postgres=# select * from meta_data; id | user_id | file_name | file_path | update_time ----+-----------+-------------+------------------------+---------------------------- 1 | user_id01 | file_name01 | /usr/local/file_name01 | 2019-09-23 17:14:52.39878 2 | user_id02 | file_name02 | /usr/local/file_name03 | 2019-09-23 17:16:52.457696 (2 rows)
可以看到新插入的第三条语句其实是更新了已存在的第二条记录
4.如何区分该条语句到底是执行了insert和update操作。
通过xmax字段的值是否为0,可以判断,如果是UPDATE,XMAX里面会填充更新事务号。注意直接用UPDATE语句更新的话,XMAX会写入0,因为是新版本,而老版本上XMAX会填入更新事务号。
我们重建表结构重新插入前面两条数据测试。
postgres=# select ctid,xmin,xmax,* from meta_data; ctid | xmin | xmax | id | user_id | file_name | file_path | update_time -------+------+------+----+-----------+-------------+------------------------+---------------------------- (0,1) | 3241 | 0 | 1 | user_id01 | file_name01 | /usr/local/file_name01 | 2019-09-23 17:31:27.360539 (0,2) | 3242 | 0 | 2 | user_id02 | file_name02 | /usr/local/file_name02 | 2019-09-23 17:31:28.10752 (2 rows)
再次插入第三条重复数据
INSERT INTO meta_data ( user_id, file_name, file_path, UPDATE_TIME ) VALUES ( 'user_id02', 'file_name02', '/usr/local/file_name03', now()) ON CONFLICT (user_id, file_name) DO UPDATE SET file_path = EXCLUDED.file_path, UPDATE_TIME = EXCLUDED.UPDATE_TIME; postgres=# select ctid,xmin,xmax,* from meta_data; ctid | xmin | xmax | id | user_id | file_name | file_path | update_time -------+------+------+----+-----------+-------------+------------------------+---------------------------- (0,1) | 3241 | 0 | 1 | user_id01 | file_name01 | /usr/local/file_name01 | 2019-09-23 17:31:27.360539 (0,3) | 3243 | 3243 | 2 | user_id02 | file_name02 | /usr/local/file_name03 | 2019-09-23 17:33:53.459403 (2 rows)
ctid表示行号, xmin表示INSERT该记录的事务号,xmax表示删除该记录(update实际上是删除老版本新增新版本,所以老版本上xmax有值)的事务号。
手动执行update
postgres=# update meta_data set file_path='/usr/local/file_name02' where user_id='user_id02'; UPDATE 1 postgres=# select ctid,xmin,xmax,* from meta_data; ctid | xmin | xmax | id | user_id | file_name | file_path | update_time -------+------+------+----+-----------+-------------+------------------------+---------------------------- (0,1) | 3241 | 0 | 1 | user_id01 | file_name01 | /usr/local/file_name01 | 2019-09-23 17:31:27.360539 (0,4) | 3244 | 0 | 2 | user_id02 | file_name02 | /usr/local/file_name02 | 2019-09-23 17:33:53.459403 (2 rows)
结论
1、insert into on conflict do update,返回xmax等于0表示insert,不等于0表示update,
2、直接update,并提交,提交的记录上xmax为0。