Analysis of an m_max_cursor_index Bug with Cursors in Multi-Level MySQL Stored Procedures

Source Code Analysis | A Cursor-Related Bug in Multi-Level MySQL Stored Procedures

I. Problem Discovery

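During development, while using nested cursors in a stored procedure (SP), I wanted to know the m_max_cursor_index value at each level so that later work could build on it. I ran the experiment below and found that the m_max_cursor_index value of the first level=2 block looks wrong.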

Note: the MySQL build used here is the latest debug build.

Example SQL statements:

greatsql> CREATE TABLE t1 (a INT, b VARCHAR(10));

The comments below show the sp_pcontext values observed at each level.
DELIMITER $$
CREATE PROCEDURE processnames() -- level=0, m_max_cursor_index=1+8+1
BEGIN
    DECLARE nameCursor0 CURSOR FOR SELECT * FROM t1; -- level=1, m_cursor_offset=0, m_max_cursor_index=1+8+1
    begin
        DECLARE nameCursor1 CURSOR FOR SELECT * FROM t1; -- level=2, m_cursor_offset=1, m_max_cursor_index=1+8 <-- problem point
        begin
            DECLARE nameCursor2 CURSOR FOR SELECT * FROM t1; -- level=3, m_cursor_offset=2, m_max_cursor_index=1
            DECLARE nameCursor3 CURSOR FOR SELECT * FROM t1; -- level=3, m_cursor_offset=2, m_max_cursor_index=2
            DECLARE nameCursor4 CURSOR FOR SELECT * FROM t1; -- level=3, m_cursor_offset=2, m_max_cursor_index=3
            DECLARE nameCursor5 CURSOR FOR SELECT * FROM t1; -- level=3, m_cursor_offset=2, m_max_cursor_index=4
        end;
    end;
    begin
        DECLARE nameCursor6 CURSOR FOR SELECT * FROM t1; -- level=2, m_cursor_offset=1, m_max_cursor_index=1
    end;
END $$
DELIMITER ;

First, look at the code of the SP above. nameCursor6 and nameCursor1 belong to the same level, so they have the same offset value (both appear as @1).

greatsql>  show procedure code processnames;
+-----+---------------------------------------+
| Pos | Instruction                           |
+-----+---------------------------------------+
|   0 | cpush nameCursor0@0: SELECT * FROM t1 |
|   1 | cpush nameCursor1@1: SELECT * FROM t1 |
|   2 | cpush nameCursor2@2: SELECT * FROM t1 |
|   3 | cpush nameCursor3@3: SELECT * FROM t1 |
|   4 | cpush nameCursor4@4: SELECT * FROM t1 |
|   5 | cpush nameCursor5@5: SELECT * FROM t1 |
|   6 | cpop 4                                |
|   7 | cpop 1                                |
|   8 | cpush nameCursor6@1: SELECT * FROM t1 |
|   9 | cpop 1                                |
|  10 | cpop 1                                |
+-----+---------------------------------------+
11 rows in set (6.02 sec)
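The @N suffix in each cpush row is consistent with "the block's m_cursor_offset plus the cursor's position inside its block". Below is a minimal sketch of that arithmetic. `Block`, `slot_of` and `example_slots` are hypothetical names made up for illustration; they are not MySQL types or functions, and the slot formula is an assumption inferred from the output above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one BEGIN...END block; not the real sp_pcontext.
struct Block {
  unsigned cursor_offset;            // plays the role of m_cursor_offset
  std::vector<std::string> cursors;  // plays the role of m_cursors

  // Same formula as current_cursor_count(): the offset given to a nested block.
  unsigned current_cursor_count() const {
    return cursor_offset + static_cast<unsigned>(cursors.size());
  }

  // Assumed slot shown after '@' for the i-th cursor declared in this block.
  unsigned slot_of(std::size_t i) const {
    assert(i < cursors.size());
    return cursor_offset + static_cast<unsigned>(i);
  }
};

// Rebuilds the block layout of processnames() and returns the slots of
// nameCursor0 .. nameCursor6 in declaration order.
inline std::vector<unsigned> example_slots() {
  Block level1{0, {"nameCursor0"}};
  Block level2a{level1.current_cursor_count(), {"nameCursor1"}};
  Block level3{level2a.current_cursor_count(),
               {"nameCursor2", "nameCursor3", "nameCursor4", "nameCursor5"}};
  // The second level=2 block starts from the same parent count as the first.
  Block level2b{level1.current_cursor_count(), {"nameCursor6"}};

  std::vector<unsigned> slots;
  for (const Block *b : {&level1, &level2a, &level3, &level2b})
    for (std::size_t i = 0; i < b->cursors.size(); ++i)
      slots.push_back(b->slot_of(i));
  return slots;
}
```

Calling `example_slots()` from your own `main` gives the slots in declaration order; the last entry (nameCursor6) reuses slot 1 because the first level=2 block has already been closed by then.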

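Next, a debugging session shows the sp_pcontext values at each level (already annotated in the procedure above). The m_max_cursor_index value of the first level=2 sp_pcontext is much larger than expected: it should be 4+1, but it is actually 8+1. The +1 contributed by each outer level is correct, which means the value coming from the innermost level of the code is wrong.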

II. Investigation

1. With the symptom identified, look at how the code assigns m_max_cursor_index at each level.

(1) When the root sp_pcontext is initialized, all of its counters are 0.
sp_pcontext::sp_pcontext(THD *thd) 
    : m_level(0),
      m_max_var_index(0),
      m_max_cursor_index(0)...{init(0, 0, 0, 0);}

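(2) Each nested sp_pcontext sets its m_cursor_offset to the parent's current cursor count, that is, the parent's own offset plus the number of cursors declared in the parent.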
sp_pcontext::sp_pcontext(THD *thd, sp_pcontext *prev,  
                         sp_pcontext::enum_scope scope)
    : m_level(prev->m_level + 1),
      m_max_var_index(0),
      m_max_cursor_index(0)... {init(prev->current_cursor_count());}
void sp_pcontext::init(uint cursor_offset) {m_cursor_offset = cursor_offset;}
uint current_cursor_count() const {
    return m_cursor_offset + static_cast<uint>(m_cursors.size());
}

(3) On leaving an sp_pcontext, its max_cursor_index() value is passed up to the parent: the parent's m_max_cursor_index is raised to that value if it is larger than the current one.
sp_pcontext *sp_pcontext::pop_context() {
    uint submax = max_cursor_index();
    if (submax > m_parent->m_max_cursor_index)
      m_parent->m_max_cursor_index = submax;
}
uint max_cursor_index() const {
    return m_max_cursor_index + static_cast<uint>(m_cursors.size());
  }

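(4) Each time a cursor is added, m_max_cursor_index is incremented as long as it equals the current number of cursors, so in a block without nested blocks it acts as a cursor counter.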
bool sp_pcontext::add_cursor(LEX_STRING name) {
  if (m_cursors.size() == m_max_cursor_index) ++m_max_cursor_index;

  return m_cursors.push_back(name);
}

2. From the analysis in step 1, only the value coming up from the innermost level is wrong. That value is produced by max_cursor_index(), so look at how this function is implemented:

uint max_cursor_index() const {
    return m_max_cursor_index + static_cast<uint>(m_cursors.size());
  }

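This adds m_cursors.size() to the current level's m_max_cursor_index. However, add_cursor already increments m_max_cursor_index each time a cursor is appended to m_cursors. In the innermost sp_pcontext the cursors are therefore counted twice: m_cursors.size() is included two times, so the value passed up to the level=2 block becomes 2*4=8. That is the root cause.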
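The double counting can be replayed with a small toy model that keeps only the three rules quoted above (add_cursor, max_cursor_index, pop_context). This is a sketch under stated assumptions: `ToyContext`, `Trace` and `trace_processnames` are illustrative names, and everything else in the real sp_pcontext is left out.

```cpp
#include <cassert>

// Toy stand-in for sp_pcontext: cursor bookkeeping only (illustrative names).
struct ToyContext {
  ToyContext *parent = nullptr;
  unsigned max_cursor_index_field = 0;  // plays the role of m_max_cursor_index
  unsigned cursor_count = 0;            // plays the role of m_cursors.size()

  explicit ToyContext(ToyContext *p = nullptr) : parent(p) {}

  // Same rule as add_cursor(): bump the field only when it equals the count.
  void add_cursor() {
    if (cursor_count == max_cursor_index_field) ++max_cursor_index_field;
    ++cursor_count;
  }

  // Same formula as the original max_cursor_index().
  unsigned max_cursor_index() const {
    return max_cursor_index_field + cursor_count;
  }

  // Same rule as pop_context(): raise the parent's field if ours is larger.
  ToyContext *pop() {
    assert(parent != nullptr);
    unsigned submax = max_cursor_index();
    if (submax > parent->max_cursor_index_field)
      parent->max_cursor_index_field = submax;
    return parent;
  }
};

// max_cursor_index() observed at each level while replaying processnames().
struct Trace {
  unsigned level3, level2, level1, root;
};

inline Trace trace_processnames() {
  Trace t{};
  ToyContext root;                      // level 0
  ToyContext l1(&root);                 // level 1
  l1.add_cursor();                      // nameCursor0
  ToyContext l2a(&l1);                  // first level 2 block
  l2a.add_cursor();                     // nameCursor1
  ToyContext l3(&l2a);                  // level 3
  for (int i = 0; i < 4; ++i) l3.add_cursor();  // nameCursor2..5

  t.level3 = l3.max_cursor_index();     // field 4 + size 4: counted twice
  l3.pop();
  t.level2 = l2a.max_cursor_index();    // inherited 8 + own 1
  l2a.pop();

  ToyContext l2b(&l1);                  // second level 2 block
  l2b.add_cursor();                     // nameCursor6
  l2b.pop();                            // 2 is not larger than 9: no change

  t.level1 = l1.max_cursor_index();     // inherited 9 + own 1
  l1.pop();
  t.root = root.max_cursor_index();     // inherited 10 + own 0
  return t;
}
```

In this model the trace comes out as 8, 9, 10, 10 for level 3, level 2, level 1 and the root, which matches the 8+1 and 1+8+1 annotations in the procedure, while at most 6 cursors are ever open at the same time.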

III. Solution

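Based on the analysis above, one option is to change the value returned by max_cursor_index() only for the innermost sp_pcontext. The innermost sp_pcontext has no m_children, so the size of that array can serve as the test. The code is changed as follows: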

uint max_cursor_index() const {
    if (m_children.size() == 0)  // Innermost sp_pcontext: return m_max_cursor_index directly.
        return m_max_cursor_index;  // Could also be static_cast<uint>(m_cursors.size()); the two are equal here.
    else  // Outer sp_pcontext: the value passed up from the nested sp_pcontext levels plus this level's m_cursors.size().
        return m_max_cursor_index + static_cast<uint>(m_cursors.size());
}
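To check the effect of this change without rebuilding the server, the same bookkeeping can be replayed in a toy model. This is only a sketch: `PatchedContext`, `child_count`, `patched_root_max` and `patched_parent_heavy_max` are hypothetical names, and `child_count` merely stands in for m_children.size().

```cpp
#include <cassert>

// Toy stand-in for sp_pcontext with the patched max_cursor_index().
struct PatchedContext {
  PatchedContext *parent = nullptr;
  unsigned max_cursor_index_field = 0;  // plays the role of m_max_cursor_index
  unsigned cursor_count = 0;            // plays the role of m_cursors.size()
  unsigned child_count = 0;             // plays the role of m_children.size()

  explicit PatchedContext(PatchedContext *p = nullptr) : parent(p) {
    if (parent != nullptr) ++parent->child_count;
  }

  // Unchanged add_cursor() rule.
  void add_cursor() {
    if (cursor_count == max_cursor_index_field) ++max_cursor_index_field;
    ++cursor_count;
  }

  // Patched rule: a block without nested blocks returns the field as is.
  unsigned max_cursor_index() const {
    if (child_count == 0) return max_cursor_index_field;
    return max_cursor_index_field + cursor_count;
  }

  // Unchanged pop_context() rule.
  PatchedContext *pop() {
    assert(parent != nullptr);
    unsigned submax = max_cursor_index();
    if (submax > parent->max_cursor_index_field)
      parent->max_cursor_index_field = submax;
    return parent;
  }
};

// Replays processnames() and returns the root's max_cursor_index().
inline unsigned patched_root_max() {
  PatchedContext root;
  PatchedContext l1(&root);
  l1.add_cursor();                               // nameCursor0
  PatchedContext l2a(&l1);
  l2a.add_cursor();                              // nameCursor1
  PatchedContext l3(&l2a);
  for (int i = 0; i < 4; ++i) l3.add_cursor();   // nameCursor2..5
  l3.pop();
  l2a.pop();
  PatchedContext l2b(&l1);
  l2b.add_cursor();                              // nameCursor6
  l2b.pop();
  l1.pop();
  return root.max_cursor_index();
}

// A block with two cursors of its own and a nested block with one cursor.
inline unsigned patched_parent_heavy_max() {
  PatchedContext root;
  PatchedContext outer(&root);
  outer.add_cursor();
  outer.add_cursor();
  PatchedContext inner(&outer);
  inner.add_cursor();
  inner.pop();
  outer.pop();
  return root.max_cursor_index();
}
```

In this model `patched_root_max()` yields 6 for processnames(), the number of cursors that can be open at once. The model also suggests a case worth checking against the real code: when a block declares more cursors of its own than its nested blocks pass up, `patched_parent_heavy_max()` yields 4 although only 3 cursors exist, because the outer block's own increments are still added on top of its m_cursors.size().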

IV. Summary

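When cursors are used in a MySQL SP, m_max_cursor_index serves only as a statistic and takes no part in actual assignment or computation, so normal use is not affected. If you want to rely on this value for secondary development, however, you need to be aware of this problem. The change shown above is only one possible fix; depending on your needs, you can also change how add_cursor assigns m_max_cursor_index.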
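As one way to picture the add_cursor alternative, the sketch below stops add_cursor from touching the field, so that the field only records what nested blocks pass up. This is a hypothetical variant in a toy model, not a tested MySQL patch: `AltContext`, `nested_max`, `alt_root_max` and `alt_parent_heavy_max` are made-up names, and whether the real code depends on the increment elsewhere is not checked here.

```cpp
#include <cassert>

// Toy model of a hypothetical variant: add_cursor() no longer bumps the field.
struct AltContext {
  AltContext *parent = nullptr;
  unsigned nested_max = 0;    // m_max_cursor_index, fed only by nested blocks
  unsigned cursor_count = 0;  // plays the role of m_cursors.size()

  explicit AltContext(AltContext *p = nullptr) : parent(p) {}

  // Only the cursor list grows; the field is left alone.
  void add_cursor() { ++cursor_count; }

  // Original formula, now free of double counting in this model.
  unsigned max_cursor_index() const { return nested_max + cursor_count; }

  // Same rule as pop_context(): raise the parent's field if ours is larger.
  AltContext *pop() {
    assert(parent != nullptr);
    unsigned submax = max_cursor_index();
    if (submax > parent->nested_max) parent->nested_max = submax;
    return parent;
  }
};

// Replays processnames() and returns the root's max_cursor_index().
inline unsigned alt_root_max() {
  AltContext root;
  AltContext l1(&root);
  l1.add_cursor();                               // nameCursor0
  AltContext l2a(&l1);
  l2a.add_cursor();                              // nameCursor1
  AltContext l3(&l2a);
  for (int i = 0; i < 4; ++i) l3.add_cursor();   // nameCursor2..5
  l3.pop();
  l2a.pop();
  AltContext l2b(&l1);
  l2b.add_cursor();                              // nameCursor6
  l2b.pop();
  l1.pop();
  return root.max_cursor_index();
}

// A block with two cursors of its own and a nested block with one cursor.
inline unsigned alt_parent_heavy_max() {
  AltContext root;
  AltContext outer(&root);
  outer.add_cursor();
  outer.add_cursor();
  AltContext inner(&outer);
  inner.add_cursor();
  inner.pop();
  outer.pop();
  return root.max_cursor_index();
}
```

In this model `alt_root_max()` yields 6 for processnames(), and `alt_parent_heavy_max()` yields 3 for a block with two cursors of its own plus a nested block with one cursor, which in both cases equals the number of cursors that can be open at the same time.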

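The problem found here is a bug in a value that does not take part in any computation, yet it affects further development on top of the open-source code. Similar issues deserve attention in real development work, because they are easy to trip over.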

Enjoy GreatSQL

From: https://www.cnblogs.com/greatsql/p/18112051