从一个实例谈谈postgresql索引锁

时间:2023-03-08 15:42:12
从一个实例谈谈postgresql索引锁

最近客户在使用我司开发的数据库时,报告了如下问题(也不能算是问题,就是疑惑吧),环境如下:

OS : Red Hat Enterprise Linux Server release 6.7 (Santiago)
Kernel : 2.6.32-573.el6.x86_64
PostgreSQL : PostgreSQL 9.6.2

执行准备工作:

postgres=# create table test (id int, data text);
CREATE TABLE
postgres=# insert into test select a, 'Test data No ' || a from
generate_series(1,10000000) as a;
INSERT 0 10000000
postgres=# create index on test (id);
CREATE INDEX
postgres=# vacuum analyze verbose test;
......

然后,开启一个事务,在该事务中查询test表:

postgres=# begin ;
BEGIN
postgres=# select data from test where data = 'Test data No 1';
data
----------------
Test data No 1
(1 row)

在开另一个psql console,查询锁的情况:

postgres=# select c.oid, c.relname, l.locktype, l.pid, l.mode ,a.query
from (pg_class as c inner join pg_locks as l on (c.oid = l.relation))
inner join pg_stat_activity as a on (l.pid = a.pid) where relname like 'test%'
order by pid;
oid | relname | locktype | pid | mode |
query
-------+-------------+----------+-------+-----------------+----------
---
-------+-------------+----------+-------+-----------------+----------
---
-------+-------------+----------+-------+-----------------+----------
---
-------+-------------+----------+-------+-----------------+----------
---
-------+-------------+----------+-------+-----------------+--
19412 | test_id_idx | relation | 27612 | AccessShareLock | select data
from test where data = 'Test data No 1';
19405 | test | relation | 27612 | AccessShareLock | select
data from test where data = 'Test data No 1';
(2 rows)

发现在执行select期间对表上的所有的索引都加上了AccessShareLock锁,但是查询并没有走索引。这让客户非常奇怪。

没办法,带着这个问题调查了一番。大概有了以下的两点粗浅的认识和理解。

1.在执行一条select文时,数据库后端会对select文进行查询分析,查询重写,查询规划和查询执行四个阶段。在查询规划阶段,需要生成一个RelOptInfo结构存储优化的查询路径,该结构中存储了表上的索引信息。此处调用get_relation_info函数打开并锁定表上所有的索引,并将索引的信息写入到RelOptInfo结构体中。

具体到代码里,我们可以看到:

typedef struct RelOptInfo
{
NodeTag type; RelOptKind reloptkind; /* all relations included in this RelOptInfo */
Relids relids; /* set of base relids (rangetable indexes) */
.
.
.
List *indexlist; /* list of IndexOptInfo */ <------- here
.
.
.
bool has_eclass_joins; /* T means joininfo is incomplete */
} RelOptInfo;

这里indexlist就是查询的表上的所有的index的列表;

我们再看看get_relation_info函数,它获取该表上的“catalog information”,详情可以看看这段注释:


##############################################################
src/backend/optimizer/util/plancat.c: /*
* get_relation_info -
* Retrieves catalog information for a given relation.
*
* Given the Oid of the relation, return the following info into fields
* of the RelOptInfo struct:
*
* min_attr lowest valid AttrNumber
* max_attr highest valid AttrNumber
* indexlist list of IndexOptInfos for relation's indexes
* serverid if it's a foreign table, the server OID
* fdwroutine if it's a foreign table, the FDW function pointers
* pages number of pages
* tuples number of tuples
*
* Also, initialize the attr_needed[] and attr_widths[] arrays. In most
* cases these are left as zeroes, but sometimes we need to compute attr
* widths here, and we may as well cache the results for costsize.c.
*
* If inhparent is true, all we need to do is set up the attr arrays:
* the RelOptInfo actually represents the appendrel formed by an inheritance
* tree, and so the parent rel's physical size and index information isn't
* important for it.
*/
void
get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
RelOptInfo *rel) ###################################################################

而由于该查询是放在一个事务中的,我们发现:

2.一个事务内申请的锁会在事务结束时调用函数LockReleaseAll统一释放,在事务结束之前,该事务仍然保持对锁的持有。

代码如下:

src/backend/storage/lmgr/lock.c:

/*
* LockRelease -- look up 'locktag' and release one 'lockmode' lock on it.
* Release a session lock if 'sessionLock' is true, else release a
* regular transaction lock.
*
* Side Effects: find any waiting processes that are now wakable,
* grant them their requested locks and awaken them.
* (We have to grant the lock here to avoid a race between
* the waking process and any new process to
* come along and request the lock.)
*/
bool
LockRelease(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)

最后我们打开lock log,再模拟下客户现场。贴一下执行结果:

postgres=# begin;
BEGIN
postgres=# select data from test where data = 'Test data No 1';
LOG: LockAcquire: lock [12373,16386] AccessShareLock at character 18
STATEMENT: select data from test where data = 'Test data No 1';
LOG: LockAcquire: lock [12373,16392] AccessShareLock
STATEMENT: select data from test where data = 'Test data No 1';
LOG: LockAcquire: lock [12373,16386] AccessShareLock
STATEMENT: select data from test where data = 'Test data No 1';
data
----------------
Test data No 1
(1 row) postgres=# end;
LOG: LockReleaseAll: lockmethod=1
STATEMENT: end;
LOG: LockReleaseAll done
STATEMENT: end;

果然,锁是在事务最后释放的。

综合1,2两点,我们向客户解释了这个问题是PostgreSQL的式样,不是bug。

Over~