KingbaseES恢复被删除数据
生产环境操作请先备份整个data目录或者cp 个新的data目录在进行此操作恢复被删除数据,通过备份的新路径恢复数据之后,确认数据完整性。在把数据从备份环境导出重新导入到生产环境。
使用到的系统工具:
sys_resetwal
sys_waldump
准备测试环境:
[kingbase@postgres ~]$ sys_ctl -D data1 start
waiting for server to start....2022-11-11 10:40:58.095 CST [30813] LOG: sepapower extension initialized
2022-11-11 10:40:58.103 CST [30813] LOG: starting KingbaseES V008R006C006B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2022-11-11 10:40:58.103 CST [30813] LOG: listening on IPv4 address "0.0.0.0", port 54321
2022-11-11 10:40:58.103 CST [30813] LOG: listening on IPv6 address "::", port 54321
2022-11-11 10:40:58.106 CST [30813] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54321"
2022-11-11 10:40:58.219 CST [30813] LOG: redirecting log output to logging collector process
2022-11-11 10:40:58.219 CST [30813] HINT: Future log output will appear in directory "sys_log".
done
server started
[kingbase@postgres ~]$ ksql -Usystem -dtest
ksql (V8.0)
Type "help" for help.
WARNING:License file will expire in 14 days.
test=# create table t2 (id int);
CREATE TABLE
test=# insert into t2 values (generate_series(1,5));
INSERT 0 5
test=# insert into t2 values (generate_series(1,1000));
INSERT 0 1000
test=# checkpoint;
CHECKPOINT
test=#
test=# select sys_current_wal_lsn(),sys_walfile_name(sys_current_wal_lsn()),sys_walfile_name_offset(sys_current_wal_lsn());
sys_current_wal_lsn | sys_walfile_name | sys_walfile_name_offset
--------------------+--------------------------+------------------------------------
0/730F790 | 000000010000000000000007 | (000000010000000000000007,3209104)
(1 row)
test=#
test=# checkpoint;
CHECKPOINT
test=# select count(*) from t2;
count
-------
1005
(1 row)
test=# delete from t2 where id>995;
DELETE 5
test=# select count(*) from t2;
count
-------
1000
(1 row)
test=# select sys_current_wal_lsn(),sys_walfile_name(sys_current_wal_lsn()),sys_walfile_name_offset(sys_current_wal_lsn());
sys_current_wal_lsn | sys_walfile_name | sys_walfile_name_offset
--------------------+--------------------------+------------------------------------
0/7311F80 | 000000010000000000000007 | (000000010000000000000007,3219328)
(1 row)
确认当前日志文件 + 事物(LSN)号
select sys_current_wal_lsn(),sys_walfile_name(sys_current_wal_lsn()),sys_walfile_name_offset(sys_current_wal_lsn());
sys_current_wal_lsn | sys_walfile_name | sys_walfile_name_offset
--------------------+--------------------------+------------------------------------
0/730F790 | 000000010000000000000007 | (000000010000000000000007,3209104)
(1 row)
根据查询的事务号以及日志文件,查找WAL日志确定恢复数据的范围
lsn数据类型可以被用来存储 LSN(日志序列号)数据,LSN 是一个指向WAL中的位置的指针。
这个类型是XLogRecPtr的一种表达并且是数据库内部系统类型。
在数据库内部,一个 LSN 是一个 64 位整数,表示在预写式日志流中的一个字节位置。
它由两个长度为8位的十六进制数组成,中间用斜线分隔,如6/7311F80。
lsn类型支持标准的比较操作符 = 和 >
两个 LSN 可以做相减操作, 结果将是分隔两个预写式日志位置的字节数
通过wal日志事物号找到delete的大概位置,这里位置是 1034
sys_waldump: fatal: could not find a valid record after 0/7311F80
[kingbase@postgres sys_wal]$ sys_waldump 000000010000000000000007 00000001000000000000000C -s 0/730F790
rmgr: Standby len (rec/tot): 42/ 42, tx: 0, lsn: 0/0730F790, prev 0/0730F760, desc: RUNNING_XACTS nextXid 1034 latestCompletedXid 120649616 oldestRunningXid 1034
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/0730F7C0, prev 0/0730F790, desc: CHECKPOINT_ONLINE redo 0/730F790; tli 1; prev tli 1; fpw true; xid 0:1034; oid 24669; multi 1; offset 0; oldest xid 912 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 1034; online
rmgr: Standby len (rec/tot): 42/ 42, tx: 0, lsn: 0/0730F838, prev 0/0730F7C0, desc: RUNNING_XACTS nextXid 1034 latestCompletedXid 120649664 oldestRunningXid 1034
rmgr: XLOG len (rec/tot): 49/ 5917, tx: 0, lsn: 0/0730F868, prev 0/0730F838, desc: FPI_FOR_HINT , blkref #0: rel 1663/16269/2619 blk 23 FPW
rmgr: Heap len (rec/tot): 59/ 3751, tx: 1034, lsn: 0/07310FA0, prev 0/0730F868, desc: DELETE off 97 flags 0x00 KEYS_UPDATED , blkref #0: rel 1663/16269/16410 blk 4 FPW
rmgr: Heap len (rec/tot): 54/ 54, tx: 1034, lsn: 0/07311E48, prev 0/07310FA0, desc: DELETE off 98 flags 0x00 KEYS_UPDATED , blkref #0: rel 1663/16269/16410 blk 4
rmgr: Heap len (rec/tot): 54/ 54, tx: 1034, lsn: 0/07311E80, prev 0/07311E48, desc: DELETE off 99 flags 0x00 KEYS_UPDATED , blkref #0: rel 1663/16269/16410 blk 4
rmgr: Heap len (rec/tot): 54/ 54, tx: 1034, lsn: 0/07311EB8, prev 0/07311E80, desc: DELETE off 100 flags 0x00 KEYS_UPDATED , blkref #0: rel 1663/16269/16410 blk 4
rmgr: Heap len (rec/tot): 54/ 54, tx: 1034, lsn: 0/07311EF0, prev 0/07311EB8, desc: DELETE off 101 flags 0x00 KEYS_UPDATED , blkref #0: rel 1663/16269/16410 blk 4
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1034, lsn: 0/07311F28, prev 0/07311EF0, desc: COMMIT 2022-11-11 12:15:09.240177 CST
rmgr: Standby len (rec/tot): 42/ 42, tx: 0, lsn: 0/07311F50, prev 0/07311F28, desc: RUNNING_XACTS nextXid 1035 latestCompletedXid 120659752 oldestRunningXid 1035
sys_waldump: fatal: error in WAL record at 0/7311F50: invalid record length at 0/7311F80: wanted 24, got 0
通过以下命令进行在wal日志精确定位事务号
[kingbase@postgres sys_wal]$ sys_waldump 000000010000000000000007 00000001000000000000000C | grep DELETE
停止数据库
[kingbase@postgres ~]$ sys_ctl -D /home/kingbase/data1/ stop
waiting for server to shut down.... done
server stopped
通过指定事务号1034,进行恢复数据
[kingbase@postgres sys_wal]$ sys_resetwal -x 1034 -D /home/kingbase/data1/
Write-ahead log reset
[kingbase@postgres sys_wal]$
启动数据库并且查看数据是否恢复
[kingbase@postgres ~]$ sys_ctl -D /home/kingbase/data1/ start
waiting for server to start....2022-11-11 13:12:39.468 CST [7239] LOG: sepapower extension initialized
2022-11-11 13:12:39.481 CST [7239] LOG: starting KingbaseES V008R006C006B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2022-11-11 13:12:39.481 CST [7239] LOG: listening on IPv4 address "0.0.0.0", port 54321
2022-11-11 13:12:39.481 CST [7239] LOG: listening on IPv6 address "::", port 54321
2022-11-11 13:12:39.485 CST [7239] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54321"
2022-11-11 13:12:39.615 CST [7239] LOG: redirecting log output to logging collector process
2022-11-11 13:12:39.615 CST [7239] HINT: Future log output will appear in directory "sys_log".
done
server started
[kingbase@postgres ~]$ ksql -Usystem -dtest
ksql (V8.0)
Type "help" for help.
WARNING:License file will expire in 14 days.
test=# select count(*) from t2;
count
-------
1005
(1 row)
test=#
经确认数据已经恢复.
在进行了sys_resetwal操作后,设置的LSN 号后的数据都取消了,相当于回滚到指定的事务LSN 号的位置.
WAL 日志整体被清理,原有的日志都会被清除
恢复之前的wal日志
[kingbase@postgres sys_wal]$ ll
total 98304
-rw------- 1 kingbase kingbase 16777216 Nov 11 12:15 000000010000000000000007
-rw------- 1 kingbase kingbase 16777216 Nov 11 12:11 000000010000000000000008
-rw------- 1 kingbase kingbase 16777216 Nov 11 12:11 000000010000000000000009
-rw------- 1 kingbase kingbase 16777216 Nov 11 12:11 00000001000000000000000A
-rw------- 1 kingbase kingbase 16777216 Nov 11 12:11 00000001000000000000000B
-rw------- 1 kingbase kingbase 16777216 Nov 11 12:11 00000001000000000000000C
drwx------ 2 kingbase kingbase 6 Nov 11 12:08 archive_status
恢复之后的日志:
[kingbase@postgres sys_wal]$ ll
total 16384
-rw------- 1 kingbase kingbase 16777216 Nov 11 13:17 00000001000000000000000D
drwx------ 2 kingbase kingbase 6 Nov 11 12:08 archive_status
[kingbase@postgres sys_wal]$
WAL日志切换
从上次 WAL 文件切换以来没有生成 WAL 的活动,则不会创建新的 WAL 文件,也两次手动切换之间需要有新数据生成,如果没有新数据生成,手动切换无效。
使 WAL 文件活动,会发生 WAL 文件切换:
SELECT sys_walfile_name(sys_switch_wal()), now(), sys_walfile_name(sys_switch_wal());
标签:11,wal,lsn,kingbase,sys,12,KingbaseES,resetwal
From: https://www.cnblogs.com/nwwhile/p/16926028.html