postgresql - Is there any difference between CTE INSERT+UPDATE in one go versus two different queries - Stack Overflow


Imagine a table like this

create table public.my_table (
    id          uuid primary key,
    category_id uuid not null,
    name        text not null,
    is_active   bool not null,
    created_at timestamp with time zone not null default now(),
    updated_at timestamp with time zone not null default now()
);

There is a rule: a category_id must have only one record with is_active=true

I can come up with 2 ways of adding records:

#1 using CTE insert to add a new active record + update to deactivate previous:

WITH new_record AS (
    INSERT INTO my_table (id, category_id, name, is_active) VALUES (?, ?, ?, true)
    RETURNING id, category_id
)
UPDATE my_table
SET is_active  = false,
    updated_at = now()
FROM new_record
WHERE my_table.category_id = new_record.category_id
  AND my_table.id <> new_record.id
  AND my_table.is_active = true

#2 making separate insert & update queries in a transaction:

BEGIN TRANSACTION;

INSERT INTO my_table (id, category_id, name, is_active) VALUES (?, ?, ?, true);

UPDATE my_table
   SET is_active  = false,
       updated_at = now()
   WHERE my_table.category_id = ?
         AND my_table.id <> ?
         AND my_table.is_active = true;

COMMIT;

I like the second way for its simplicity.

Does the first way have any benefits compared to the second?

Asked Nov 18, 2024 at 18:20 by Peter Fence; edited Nov 19, 2024 at 10:49 by Mark Rotteveel

I suggest reading WITH data modifying statements. – Adrian Klaver, Nov 18, 2024 at 18:43
Are you aware that this structure will become slower with every INSERT? The first time there are 0 records to update. The second time there will be 1 update, the third time 2 updates, etc. – Frank Heikens, Nov 18, 2024 at 19:25
@FrankHeikens I don't think I follow. The first time there's nothing to update. The second and all following inserts require just a single update, of the row that was previously is_active for that category_id. The search would be faster if they moved is_active to the category table (if they have one and my_table.category_id points at its pk) but the insert would result in the exact same amount of updates: just the one that was active before the new insert arrived. – Zegarek, Nov 18, 2024 at 19:54
"There is a rule" - but there is no constraint enforcing this? – Bergi, Nov 18, 2024 at 22:26
In your first solution ("with ... as ( insert ... ) update ..."), the update runs on the same snapshot as the insert (see the answer from @Zegarek), and does not see the row which was newly inserted. You don't have to exclude the new row from the update, the part "AND my_table.id <> new_record.id" is superfluous. – hobgoblin, Nov 25, 2024 at 17:05


1 Answer

does the first way have any pros?

It does: it's one atomic operation.

By default you operate in the read committed transaction isolation mode, which means that in your second example the following can happen (demo at db<>fiddle):

1. One of your workers opens the transaction and inserts the new row, marked as active.
2. Meanwhile, some other worker inserts an even newer active row and commits it.
3. The first worker proceeds to run the update to make sure all other rows of the category are deactivated. At that point it notices the newer row and deactivates it. As a result, once it commits, your active row might not be the latest one, because the slower worker overwrote the work of the faster, more recent one.
4. Alternatively, if the other worker didn't commit before the first one issued the update, they both deactivate the same older row, and after that they both commit their own active row. As a result, you have two active rows in a category. The count can go as high as the number of concurrent clients.
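The last interleaving above can be written out as a two-session timeline. This is a sketch, not a script for a single connection: the statements are meant to be run alternately in two separate sessions (labelled A and B here), both at the default read committed level, on the table from the question with no constraint on is_active. The UUID literals are made up for illustration, and row ...0001 is assumed to be the only active row of category ...00c1 at the start.

```sql
-- Session A: open a transaction and add a new active row
BEGIN;
INSERT INTO my_table (id, category_id, name, is_active)
VALUES ('00000000-0000-0000-0000-00000000000a',
        '00000000-0000-0000-0000-0000000000c1', 'from A', true);

-- Session B: does the same before A has committed
BEGIN;
INSERT INTO my_table (id, category_id, name, is_active)
VALUES ('00000000-0000-0000-0000-00000000000b',
        '00000000-0000-0000-0000-0000000000c1', 'from B', true);

-- Session A: B's row is uncommitted, so only the old row ...0001 matches
UPDATE my_table
SET is_active  = false,
    updated_at = now()
WHERE category_id = '00000000-0000-0000-0000-0000000000c1'
  AND id <> '00000000-0000-0000-0000-00000000000a'
  AND is_active;

-- Session B: also targets ...0001 and waits for A's row lock
UPDATE my_table
SET is_active  = false,
    updated_at = now()
WHERE category_id = '00000000-0000-0000-0000-0000000000c1'
  AND id <> '00000000-0000-0000-0000-00000000000b'
  AND is_active;

-- Session A
COMMIT;

-- Session B: the update resumes, finds ...0001 already inactive and skips it;
-- A's new row was not visible when this statement started
COMMIT;

-- Either session: both 'from A' and 'from B' are expected to show up as active
SELECT id, name
FROM my_table
WHERE category_id = '00000000-0000-0000-0000-0000000000c1'
  AND is_active;
```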

Your first example does the whole insert-one-update-another operation in one go, on one snapshot that doesn't allow a window for anyone to slip in any changes. The doc immediately recommended by @Adrian Klaver puts it well:

when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot “see” one another's effects on the target tables. This alleviates the effects of the unpredictability of the actual order of row updates
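A small way to observe the shared snapshot, assuming the my_table definition from the question and made-up UUID literals: the outer query looks for the row that the CTE has just inserted. Going by the documentation quoted above, the row should be available through RETURNING but not yet visible in the table itself.

```sql
WITH new_record AS (
    INSERT INTO my_table (id, category_id, name, is_active)
    VALUES ('00000000-0000-0000-0000-00000000000d',
            '00000000-0000-0000-0000-0000000000c1', 'snapshot test', true)
    RETURNING id
)
SELECT
    -- rows handed over by RETURNING
    (SELECT count(*) FROM new_record) AS returned_by_cte,
    -- the same row looked up in the table, on the statement's snapshot
    (SELECT count(*)
     FROM my_table
     WHERE id IN (SELECT id FROM new_record)) AS visible_in_table;
```

The expected result is returned_by_cte = 1 and visible_in_table = 0. This is also why, as noted in the comments, the id <> new_record.id condition in the first query is not strictly needed: the update cannot see the freshly inserted row anyway.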

If my_table.category_id is really a foreign key pointing at the primary key category.id, it's better to move is_active to the category table. A primary key is unique, so that table is smaller and faster to search, and activating a row only means updating a pointer to the currently active my_table.id:

create table category(
  id uuid primary key
 ,active_my_table_id uuid references my_table(id)
);

To populate it, rather than taking only the rows where is_active, use distinct on, so that categories with no active row are included too:

insert into category
select distinct on(category_id)category_id
     , id
from my_table
order by category_id
        ,is_active desc
        ,created_at desc
        ,updated_at desc;

alter table public.my_table 
  drop column is_active cascade--drops the exclusion constraint too
 ,add foreign key(category_id)references category(id);

Then, inserting a new active row for a category:

with new_record as (
    insert into my_table (id, category_id, name) 
    values ( get_row_uuid(row_)
            ,get_category_uuid(category_)
            ,row_)
    returning id, category_id
)
insert into category
select category_id, id
from new_record
on conflict(id)
do update set active_my_table_id = excluded.active_my_table_id

From a technical standpoint, expressing your business rule as an actual SQL-level constraint is optional. But if you define it, it automatically rejects any attempted violation with a clear error message and logs the event, which keeps the table from entering an invalid state with multiple active records for you to track down and fix. Plus, in the setup above, it is what lets you run insert..on conflict..do update.

That being said, even if you have the separate category table, adding, maintaining and manipulating a 128-bit uuid fk in there takes more space than the 1-byte boolean in my_table. If you wish to keep it that way, you can follow @Bergi's suggestion (demo at db<>fiddle):

alter table my_table 
  add constraint one_active_per_category
  exclude (category_id with =)
  where (is_active)
  deferrable initially deferred;

It leaves is_active in place and guards the table against having more than a single active row per category.

Making the constraint deferrable initially deferred means it'll only be checked at the end of the transaction, so you can deactivate then activate or the other way around, as long as in the end you commit a state that respects the rule. Without deferral, you need to deactivate first and then activate, so that the old and the new entry for the category never both have is_active=true.
If you do it that way, a regular unique index is simpler and works faster:

create unique index on my_table(category_id)
  where (is_active);
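With a non-deferred partial unique index like the one above, the order of the two steps matters. A sketch of the deactivate-first variant, using made-up UUID literals in place of the question's ? placeholders:

```sql
BEGIN;

-- 1. Deactivate the currently active row of the category, if there is one.
UPDATE my_table
SET is_active  = false,
    updated_at = now()
WHERE category_id = '00000000-0000-0000-0000-0000000000c1'
  AND is_active;

-- 2. Only now insert the new active row. Inserting first would violate
--    the partial unique index while the old row is still active.
INSERT INTO my_table (id, category_id, name, is_active)
VALUES ('00000000-0000-0000-0000-00000000000e',
        '00000000-0000-0000-0000-0000000000c1', 'new active row', true);

COMMIT;
```

If two workers run this concurrently for the same category, the one that inserts second should fail with a unique violation instead of leaving two active rows behind, so the caller needs to be prepared to retry.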

Unique and exclusion constraints are backed by indexes, so setting up just an index instead of a constraint will have the same effect. You could normally tie the index to the table so that it becomes a proper constraint, but that's not supported for partial indices.

If multiple concurrent workers try the operation, all but one will hit a lock and get rejected, the exact point and error message depending on whether they try that atomically or in transactions with multiple steps, and if it's deferred or not. The advantage of marking what's active in the separate category table and doing that in one step is they'll instead get queued up and wait - first come, first served.
