瀚高数据库
目录
文档用途
详细信息
相关文档
记录lsn三种记录形式,以及redo对lsn的操作。
详细信息
一、lsn的三种形式
1. pg_controldata中可以看到这样的lsn表示:
Latest checkpoint location: 0/1548018Latest
checkpoint’s REDO location: 0/1547FC8
Latest checkpoint’s REDO WAL file: 000000010000000000000001
2.pg中的一些控制函数也可以看到类似的表示:
psql (16.1)
Type "help" for help.
postgres=# select pg_current_wal_lsn();
pg_current_wal_lsn
--------------------
0/15480C8
(1 row)
[postgres@node1 pg_wal]$ pg_controldata | grep REDO
Latest checkpoint's REDO location: 0/15482D8
Latest checkpoint's REDO WAL file: 000000010000000000000001
postgres=# select pg_walfile_name('0/15480C8');
pg_walfile_name
--------------------------
000000010000000000000001
(1 row)
3. pg中的xlog文件命名看起来是另一种形式:
-rw-------. 1 postgres postgres 16777216 Jan 5 10:11 000000010000000000000001
二、解析文件名含义
一、文件名含义
000000010000000000000001
00000001 00000000 00000001
timeline logid segid
二、控制函数查出来的’A/B’形式的lsn含义
0/15480C8
0 1 5480C8
logid segid offset
三、pg内部变量含义:就是lsn不带’/’
(gdb) p RedoRecPtr
$6 = 22315736
(gdb) p/x RedoRecPtr
$7 = 0x15482d8
0 1 5482d8
logid segid offset
后两种是等价的。
xlog文件名是把lsn最后6个16进制省略了,也就是一个16MB的文件中可以保存省略的偏移量offset。
offset范围:0x000000 - 0xFFFFFF = 16MB 所以16MB的文件的每个字节都可以通过offset索引到。
如要调试,可参考pg16恢复过程的调用栈:
(gdb) bt
#0 ReadCheckpointRecord (xlogprefetcher=0xf7dde8, RecPtr=22315736, replayTLI=1) at xlogrecovery.c:3971
#1 0x0000000000599c6f in InitWalRecovery (ControlFile=0x7fffe41d4980, wasShutdown_ptr=0x7fffffffde2f, haveBackupLabel_ptr=0x7fffffffde2d,
haveTblspcMap_ptr=0x7fffffffde2e) at xlogrecovery.c:769
#2 0x0000000000589710 in StartupXLOG () at xlog.c:5171
#3 0x00000000009073af in StartupProcessMain () at startup.c:282
#4 0x00000000008fbd49 in AuxiliaryProcessMain (auxtype=StartupProcess) at auxprocess.c:141
#5 0x00000000009061d7 in StartChildProcess (type=StartupProcess) at postmaster.c:5372
#6 0x0000000000901959 in PostmasterMain (argc=1, argv=0xf7bf20) at postmaster.c:1458
#7 0x00000000007cb6d1 in main (argc=1, argv=0xf7bf20) at main.c:198
三、Redo point变更
使用pg_controldata查看当前ControlFile内容:
[postgres@node1 pg_wal]$ pg_controldata | grep REDO
Latest checkpoint's REDO location: 0/15482D8
Latest checkpoint's REDO WAL file: 000000010000000000000001
插入几条测试数据:
psql (16.1)
Type "help" for help.
postgres=# insert into test values(1);
INSERT 0 1
postgres=# insert into test values(2);
INSERT 0 1
postgres=# insert into test values(3);
INSERT 0 1
postgres=#
使用pg_waldump查看wal文件:
......
......
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0156AC60, prev 0/0156A9D8, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: Heap len (rec/tot): 59/ 59, tx: 735, lsn: 0/0156AC98, prev 0/0156AC60, desc: INSERT+INIT off: 1, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0
rmgr: Btree len (rec/tot): 90/ 90, tx: 735, lsn: 0/0156ACD8, prev 0/0156AC98, desc: NEWROOT level: 0, blkref #0: rel 1663/5/16390 blk 1, blkref #2: rel 1663/5/16390 blk 0
rmgr: Btree len (rec/tot): 64/ 64, tx: 735, lsn: 0/0156AD38, prev 0/0156ACD8, desc: INSERT_LEAF off: 1, blkref #0: rel 1663/5/16390 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 735, lsn: 0/0156AD78, prev 0/0156AD38, desc: COMMIT 2024-01-05 17:31:07.213249 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0156ADA0, prev 0/0156AD78, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
rmgr: Heap len (rec/tot): 59/ 59, tx: 736, lsn: 0/0156ADD8, prev 0/0156ADA0, desc: INSERT off: 2, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0
rmgr: Btree len (rec/tot): 64/ 64, tx: 736, lsn: 0/0156AE18, prev 0/0156ADD8, desc: INSERT_LEAF off: 2, blkref #0: rel 1663/5/16390 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 736, lsn: 0/0156AE58, prev 0/0156AE18, desc: COMMIT 2024-01-05 17:31:16.074250 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 737, lsn: 0/0156AE80, prev 0/0156AE58, desc: INSERT off: 3, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0
rmgr: Btree len (rec/tot): 64/ 64, tx: 737, lsn: 0/0156AEC0, prev 0/0156AE80, desc: INSERT_LEAF off: 3, blkref #0: rel 1663/5/16390 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 737, lsn: 0/0156AF00, prev 0/0156AEC0, desc: COMMIT 2024-01-05 17:31:22.578713 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0156AF28, prev 0/0156AF00, desc: RUNNING_XACTS nextXid 738 latestCompletedXid 737 oldestRunningXid 738
pg_waldump: error: error in WAL record at 0/156AF28: invalid record length at 0/156AF60: expected at least 24, got 0
最后一个XLOG Record记录的位置为0/0156AF28,加上记录大小50(十六进制为0x32),位置为0/0156AF5A,按理论上来说,如果现在执行checkpoint,该位置为REDO point.
执行checkpoint:
postgres=# checkpoint ;
2024-01-05 17:41:28.687 CST [9080] LOG: checkpoint starting: immediate force wait
2024-01-05 17:41:28.710 CST [9080] LOG: checkpoint complete: wrote 40 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.007 s, sync=0.004 s, total=0.024 s; sync files=33, longest=0.003 s, average=0.001 s; distance=139 kB, estimate=139 kB; lsn=0/156AF98, redo lsn=0/156AF60
CHECKPOINT
postgres=#
pg_controldata查看:
[postgres@node1 pg_wal]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location: 0/156AF60
再次查看XLOG Record记录:
......
rmgr: Btree len (rec/tot): 64/ 64, tx: 735, lsn: 0/0156AD38, prev 0/0156ACD8, desc: INSERT_LEAF off: 1, blkref #0: rel 1663/5/16390 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 735, lsn: 0/0156AD78, prev 0/0156AD38, desc: COMMIT 2024-01-05 17:31:07.213249 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0156ADA0, prev 0/0156AD78, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
rmgr: Heap len (rec/tot): 59/ 59, tx: 736, lsn: 0/0156ADD8, prev 0/0156ADA0, desc: INSERT off: 2, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0
rmgr: Btree len (rec/tot): 64/ 64, tx: 736, lsn: 0/0156AE18, prev 0/0156ADD8, desc: INSERT_LEAF off: 2, blkref #0: rel 1663/5/16390 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 736, lsn: 0/0156AE58, prev 0/0156AE18, desc: COMMIT 2024-01-05 17:31:16.074250 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 737, lsn: 0/0156AE80, prev 0/0156AE58, desc: INSERT off: 3, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0
rmgr: Btree len (rec/tot): 64/ 64, tx: 737, lsn: 0/0156AEC0, prev 0/0156AE80, desc: INSERT_LEAF off: 3, blkref #0: rel 1663/5/16390 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 737, lsn: 0/0156AF00, prev 0/0156AEC0, desc: COMMIT 2024-01-05 17:31:22.578713 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0156AF28, prev 0/0156AF00, desc: RUNNING_XACTS nextXid 738 latestCompletedXid 737 oldestRunningXid 738
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0156AF60, prev 0/0156AF28, desc: RUNNING_XACTS nextXid 738 latestCompletedXid 737 oldestRunningXid 738
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/0156AF98, prev 0/0156AF60, desc: CHECKPOINT_ONLINE redo 0/156AF60; tli 1; prev tli 1; fpw true; xid 0:738; oid 24579; multi 1; offset 0; oldest xid 722 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 738; online
可以看到0/0156AF60是最后一条checkpoint记录的起始写入位置,REDO point为0/0156AF60,比预想的要多出6个字节(0/0156AF60-0/0156AF5A = 6)。
[postgres@node1 pg_wal]$ echo "obase=10;ibase=16;56AF28"|bc
5680936
[postgres@node1 pg_wal]$ hexdump -C 000000010000000000000001 -s 5680936 -n 50
0056af28 32 00 00 00 00 00 00 00 00 af 56 01 00 00 00 00 |2.........V.....|
0056af38 10 08 00 00 09 be d0 ba ff 18 00 00 00 00 00 00 |................|
0056af48 00 00 00 e3 bd 2d e2 02 00 00 e2 02 00 00 e1 02 |.....-..........|
0056af58 00 00 |..|
0056af5a
[postgres@node1 pg_wal]$
XLOG Record的头部首先是XLogRecord结构体,第一个域是uint32的record的大小,即0x00000032,十进制为50,跟pg_waldump打印出的一样。
[postgres@node1 pg_wal]$ hexdump -C 000000010000000000000001 -s 5680986 -n 6
0056af5a 00 00 00 00 00 00 |......|
0056af60
[postgres@node1 pg_wal]$
后续的6个字节均为0x00
[postgres@node1 pg_wal]$ hexdump -C 000000010000000000000001 -s 5681048 -n 114
0056af98 72 00 00 00 00 00 00 00 60 af 56 01 00 00 00 00 |r.......`.V.....|
0056afa8 10 00 00 00 e0 5a e5 07 ff 58 60 af 56 01 00 00 |.....Z...X`.V...|
0056afb8 00 00 01 00 00 00 01 00 00 00 01 00 00 00 00 00 |................|
0056afc8 00 00 e2 02 00 00 00 00 00 00 03 60 00 00 01 00 |...........`....|
0056afd8 00 00 00 00 00 00 d2 02 00 00 01 00 00 00 01 00 |................|
0056afe8 00 00 01 00 00 00 00 00 00 00 c8 ce 97 65 00 00 |.............e..|
0056aff8 00 00 00 00 00 00 00 00 00 00 e2 02 00 00 00 00 |................|
0056b008 00 00 |..|
0056b00
上述图有点老了,pg16CheckPoint是88B,图片来源:https://www.interdb.jp/pg/pgsql09/04.html
大小为0x00000072,即114B(头部XLogRecord 24B + XLogRecordDataHeaderShort 2B + Checkpoint 88B )。从FF 58为XLogRecordDataHeaderShort结构体的内容,0xFF为标志位,0x58为Data的大小(即88B).
checkpoint记录的内容详见Checkpoint结构体,该结构体第一个域字段为8个字节的LSN–>0x00000000 0156AF60,即REDO point:0/0156AF60.