重庆思庄Oracle、Redhat认证学习论坛

 找回密码
 注册

QQ登录

只需一步,快速开始

搜索
查看: 707|回复: 0
打印 上一主题 下一主题

postgresql分区表DDL操作时导致pg_dump等待

[复制链接]
跳转到指定楼层
楼主
发表于 2024-8-25 18:45:07 | 只看该作者 回帖奖励 |倒序浏览 |阅读模式
问题概述
分区表在执行DDL的时候,pg_dump执行导出任意表都会等待

问题复现
会话1开启一个事务,查询分区表test_list

postgres=# begin;
BEGIN
postgres=*# select * from test_list limit 1;
id | name
----+------
  1 | 1
(1 row)

postgres=*#
会话2执行truncate分区表test_list

postgres=# truncate test_list;
会话3执行pg_dump导出a表定义,执行后并未输出任何信息

[postgres@mydb1a src]$ /app/pg12_7/bin/pg_dump -p 5431 -t a -s
会话4查看活动会话和锁信息,可以看到pid:1496559(pg_dump) -> 等待pid:1496329(truncate) -> 等待pid:1494903(select)

postgres=# select pid,application_name,xact_start,state,wait_event,wait_event_type,query from pg_stat_activity where state<>'idle' order by xact_start;
   pid   | application_name |          xact_start           |        state        | wait_event | wait_event_type |query
--------------------------
1494903 | psql             | 2024-08-14 14:26:49.79773+08  | idle in transaction | ClientRead | Client          | select * from test_list limit 1;
1496329 | psql             | 2024-08-14 14:29:13.82338+08  | active              | relation   | Lock            | truncate test_list;
1496559 | pg_dump          | 2024-08-14 14:29:21.555196+08 | active              | relation   | Lock            | SELECT c.tableoid, c.oid, c.relname, (SELECT
pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(c.relacl,pg_catalog.acldefault(CASE WHEN c.relkind = 'S' THEN
's' ELSE 'r' END::"char",c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog
.acldefault(CASE WHEN c.relkind = 'S' THEN 's' ELSE 'r' END::"char",c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) AS relacl, (SELECT pg_catalo
g.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(CASE WHEN c.relkind = 'S' THEN 's'
ELSE 'r' END::"char",c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(c.relacl,pg_catalog.acldefau
lt(CASE WHEN c.relkind = 'S' THEN 's' ELSE 'r' END::"char",c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) as rrelacl, NULL AS initrelacl, NULL
as initrrelacl, c.relkin

postgres=# select relation::regclass,pid,pg_blocking_pids(pid),mode from pg_locks where granted='f';
relation  |   pid   | pg_blocking_pids |        mode
-----------+---------+------------------+---------------------
test_list | 1496559 | {1496329}        | AccessShareLock
test_list | 1496329 | {1494903}        | AccessExclusiveLock
(2 rows)
问题分析
此时test_list表是有锁的,这毫无疑问,但是pg_dump导出a表时为什么也要等待?

首先要拿到pg_dump执行的这个完整SQL是什么,设置参数log_statement=all,再次重复测试,在日志中找到完整SQL

只展示部分SQL

SELECT c.tableoid
......
pg_get_partkeydef(c.oid) AS partkeydef, c.relispartition AS ispartition, pg_get_expr(c.relpartbound, c.oid) AS partbound
FROM pg_class c
           LEFT JOIN pg_depend d ON (c.relkind = 'S' AND d.classid = c.tableoid AND d.objid = c.oid AND d.objsubid = 0 AND d.refclassid = c.tableoid AND d.deptype IN ('a', 'i'))
           LEFT JOIN pg_class tc ON (c.reltoastrelid = tc.oid AND c.relkind <> 'p')
           LEFT JOIN pg_am am ON (c.relam = am.oid)
           LEFT JOIN pg_init_privs pip ON (c.oid = pip.objoid AND pip.classoid = 'pg_class'::regclass AND pip.objsubid = 0)
           WHERE c.relkind in ('r', 'S', 'v', 'c', 'm', 'f', 'p') ORDER BY c.oid
可以看到SQL中的两个函数pg_get_partkeydef、pg_get_expr,问题就在这,表test_list正在执行DDL,表上会加最高级别锁AccessExclusiveLock,此时表test_list是不可访问,分区函数也不可以执行,pg_dump就会在此处等待。

测试pg15及之后的版本已不存在该问题

pg15源码"bin/pg_dump/pg_dump.c"注释中可以看到相关说明

        /*
         * Find all the tables and table-like objects.
         *
         * We must fetch all tables in this phase because otherwise we cannot
         * correctly identify inherited columns, owned sequences, etc.
         *
         * We include system catalogs, so that we can work if a user table is
         * defined to inherit from a system catalog (pretty weird, but...)
         *
         * Note: in this phase we should collect only a minimal amount of
         * information about each table, basically just enough to decide if it is
         * interesting.  In particular, since we do not yet have lock on any user
         * table, we MUST NOT invoke any server-side data collection functions
         * (for instance, pg_get_partkeydef()).  Those are likely to fail or give
         * wrong answers if any concurrent DDL is happening.
         */
复制
问题总结
1.分区表正在执行DDL时,执行pg_dump导出任意表都会卡住,pg15及之后版本已优化该问题可以正常执行

2.普通表a正在执行DDL时,执行pg_dump导出其他表没问题

导出a表卡住,可以在pg_dump时使用选项–lock-wait-timeout=TIMEOUT ,设置超时时间,超时后pg_dump退出

分享到:  QQ好友和群QQ好友和群 QQ空间QQ空间 腾讯微博腾讯微博 腾讯朋友腾讯朋友
收藏收藏 支持支持 反对反对
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 注册

本版积分规则

QQ|手机版|小黑屋|重庆思庄Oracle、Redhat认证学习论坛 ( 渝ICP备12004239号-4 )

GMT+8, 2024-11-28 16:09 , Processed in 0.111198 second(s), 20 queries .

重庆思庄学习中心论坛-重庆思庄科技有限公司论坛

© 2001-2020

快速回复 返回顶部 返回列表