案例说明:
数据库在日常的维护过程中,执行表结构查询语句(\d t1),如下图所示,出现“missing chunk number 0 for toast value 16259 in pg_toast_2619”,从报错信息看和toast表有关。本案例复现了以上故障,并提供了解决方案。
适用版本:
KingbaseES V8R6
一、案例复现
TOAST存储的表不能单独创建,只有当普通表包含了main,extended或external存储格式的字段时,系统会自动创建一个和普通表关联的TOAST表。当一行记录(tuple)存储的(包括压缩后的大小)大小超过TOAST_TUPLE_THRESHOLD(默认2K)时,会存储到TOAST表。
1、查询toast表对应的源表
Tips:
toast表pg_toast_2619的‘2619’,对应着源表在pg_class中的OID,可以通过此OID查询表名称。
test=# select oid,reltoastrelid ,relname from pg_class where oid=2619;
oid | reltoastrelid | relname
------+---------------+--------------
2619 | 2840 | pg_statistic
(1 row)
Tips:
pg_statistic表存储有关该数据库内容的统计数据。 记录是由ANALYZE创建的,并且随后被查询规划器使用。
2、查看toast表名称
test=# select relname from pg_class where oid=2840;
relname
---------------
pg_toast_2619
(1 row)
二、复现toast表故障
1、查看toast表文件存储路径
prod=# select pg_relation_filepath('pg_toast.pg_toast_2619');
pg_relation_filepath
----------------------
base/16385/2840
(1 row)
2、模拟toast文件数据破坏
# 查看toast表文件信息
[kingbase@node102 16385]$ ls -lh /data/kingbase/v8r6_c6/data/base/16385/2840
-rw------- 1 kingbase kingbase 32K Oct 27 14:18 /data/kingbase/v8r6_c6/data/base/16385/2840
# 模拟数据文件故障
[kingbase@node102 v8r6_c6]$ dd if=/dev/zero of=/data/kingbase/v8r6_c6/data/base/16385/2840 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000186372 s, 2.7 MB/s
# 查看数据文件信息
[kingbase@node102 v8r6_c6]$ ls -lh /data/kingbase/v8r6_c6/data/base/16385/2840
-rw------- 1 kingbase kingbase 512 Nov 2 14:19 /data/kingbase/v8r6_c6/data/base/16385/2840
3、执行数据库访问(出现toast表访问错误)
prod=# \d+ test_t1;
ERROR: could not read block 0 in file "base/16385/2840": read only 512 of 8192 bytes
常见 pg_statistic toast 访问问题
prod=# select count(*) from pg_statistic;
count
-------
508
(1 row)
prod=# select * from pg_statistic;
ERROR: missing chunk number 0 for toast value 16259 in pg_toast_2619
prod=# \d pg_toast.pg_toast_2619
ERROR: could not read block 0 in file "base/16385/2840": read only 512 of 8192 bytes
三、toast表故障解决步骤
# 清理pg_statistic中的数据
prod=# delete from pg_statistic;
ERROR: permission denied: "pg_statistic" is a system catalog
# 默认对系统表用户没有权限执行系统表修改操作
prod=# show allow_system_table_mods;
allow_system_table_mods
-------------------------
off
(1 row)
# 配置系统参数允许管理员修改系统表
prod=# alter system set allow_system_table_mods=on;
ALTER SYSTEM
# 重启数据库服务
[kingbase@node102 bin]$ ./sys_ctl restart -D /data/kingbase/v8r6_c6/data/
waiting for server to shut down.... done
......
# 连接数据库清理pg_statistic表数据
[kingbase@node102 bin]$ ./ksql -U system test -p 54325
ksql (V8.0)
Type "help" for help.
test=# show allow_system_table_mods;
allow_system_table_mods
-------------------------
on
(1 row)
# 仍然没有对系统表的DML权限
prod=# delete from pg_statistic;
ERROR: permission denied: "pg_statistic" is a system catalog
# 执行truncate清理系统表
prod=# truncate table pg_statistic;
TRUNCATE TABLE
# pg_statistic表清理后查询(可以正常查询)
prod=# select * from pg_statistic;
starelid | staattnum | stainherit | stanullfrac | stawidth | stadistinct | stakind1 | stakind2 | stakind3 | stakind4 | stakind5 | staop1 | staop2 | staop3 | staop4 | staop5 | stacoll1 | stacoll2 | stacoll3 | stacoll4 | stacoll5 | stanumbers1 | stanumbers2 | stanumbers3 | stanumbers4 | stanumbers5 | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5
----------+-----------+------------+-------------+----------+-------------+----------+----------+----------+----------+----------+--------+--------+--------+--------+--------+----------+----------+----------+----------+----------+-------------+-------------+-------------+-------------+-------------+------------+------------+------------+------------+------------
(0 rows)
prod=# select count(*) from pg_toast.pg_toast_2619;
count
-------
0
(1 row)
# 查询表结构
prod=# \d test_t1;
Table "public.test_t1"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
name | varchar | | |
# 执行表分析
prod=# vacuum analyze t1;
VACUUM
# 执行表分析后pg_statistic有数据插入
prod=# select * from pg_statistic;
starelid | staattnum | stainherit | stanullfrac | stawidth | stadistinct | stakind1 | stakind2 | stakind3 | stakind4 | stakind5 | staop1 | staop2 | staop3 | staop4 | staop5 | stacoll1 | stacoll2 | stacoll3 | stacoll4 | stacoll5 | stanumbers1 | stanumbers2 | stanumbers3 | stanumbers4 | stanumbers5 | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5
----------+-----------+------------+-------------+----------+-------------+----------+----------+----------+----------+----------+--------+--------+--------+--------+--------+----------+----------+----------+----------+----------+-------------+-------------+-------------+-------------+-------------+------------+------------+------------+------------+------------
16386 | 1 | f | 0 | 4 | -1 | 0 | 0 | 0 | 0 |0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | | | | | | | | |
(1 row)
# 如上所示,pg_statistic及toast表都可以正常访问。
四、总结
以上toast表故障发生在pg_statistic表,对于pg_statistic表数据可以通过truncate方式清理,但是如果是业务普通表,将不能直接通过truncate方式清理(除非有备份),对于业务普通表的toast故障,将在后面的案例中分析。