最关键字段是pid、locktype、mode、granted、virtualxid/transactionid;通过自连接pg_locks可定位阻塞者与被阻塞者,但需注意事务残留、咨询锁、2PC等导致“有等待无持有”的情况。

pg_locks 里哪些字段最关键
定位阻塞的核心是识别谁在等锁、谁持有锁、等的是哪把锁。pg_locks 本身不直接存“阻塞关系”,得靠关联 pid 和锁模式交叉比对。最关键的字段是:
-
pid:对应后端进程 ID,和pg_stat_activity.pid对齐 -
locktype:常见如relation(表级)、tuple(行级)、transactionid(事务 ID 锁) -
mode:当前请求的锁模式,比如RowExclusiveLock、ShareLock -
granted:true表示已获得锁,false表示正在等待 -
virtualxid或transactionid:用于识别事务级阻塞(尤其长事务卡住其他事务提交)
注意:pg_locks 是实时快照,查完就可能变化;且它只显示本节点(不跨主从)。
怎么快速找出正在阻塞别人的进程
本质是找那些 granted = true,同时又有别的进程在等它同一把锁(相同 locktype+database+relation+page+tuple 等组合)的记录。常用写法是自连接 pg_locks:
SELECT blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user,
blocked_activity.query AS blocked_query,
blocking_activity.query AS current_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
ON blocking_activity.pid = blocking_locks.pid
AND blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
这个查询会返回所有“被阻塞的进程”及其对应的“阻塞者”。实际使用时建议加 ORDER BY blocked_activity.backend_start 看谁等得最久。
为什么有时查不到阻塞者,但查询就是卡住
常见原因有三类:
- 阻塞者已经退出,但事务没结束(比如客户端断连但后端没 clean up),此时它的
pid在pg_stat_activity里可能已消失,但在pg_locks中仍残留(尤其transactionid锁) - 锁类型是
advisory(咨询锁),它不参与自动冲突检测,也不会出现在上面的关联逻辑里,需单独查pg_advisory_locks - 阻塞来自两阶段提交(2PC)的
prepared transaction,这类事务在pg_prepared_xacts里,持有的锁不会被普通pg_locks关联捕获
遇到“有等待无持有”情况,优先查:
-
SELECT * FROM pg_prepared_xacts; -
SELECT * FROM pg_stat_activity WHERE state = 'idle in transaction (aborted)';(事务异常但未退出) -
SELECT * FROM pg_locks WHERE NOT granted;看等待的locktype和virtualxid,再反查这些virtualxid是否还在活跃事务中
kill 掉阻塞进程前要注意什么
pg_terminate_backend(pid) 能强制结束进程,但不是所有场景都安全:
- 如果阻塞者正在执行 DDL(如
ALTER TABLE),中断可能导致表处于不一致状态(虽 PostgreSQL 多数 DDL 是原子的,但某些带锁升级的操作例外) - 若阻塞者是复制槽(replication slot)关联的进程,
pg_terminate_backend可能导致 WAL 积压甚至磁盘打满 -
pg_cancel_backend(pid)更温和,只取消当前查询;而pg_terminate_backend会断开连接并回滚整个事务——确认它没在做关键更新(比如财务扣款)再动手
建议先看 pg_stat_activity 的 state 和 backend_start、xact_start,判断是否真卡死,而不是正常长查询。另外,别在生产高峰直接 kill,优先联系对应业务方确认影响。
真正难的不是查出谁在阻塞,而是判断那个“阻塞者”到底能不能动、动了会不会引发连锁反应。锁背后连着事务语义、应用逻辑、甚至上游服务状态,光看 pg_locks 只是开了个门。










