ITPub博客

首页 > 数据库 > 国内数据库 > 瀚高数据库日志挖掘方法

瀚高数据库日志挖掘方法

原创 国内数据库 作者:瀚高PG实验室 时间:2021-11-02 10:45:55 0 删除 编辑
环境
系统平台: Linux x86-64 Red Hat Enterprise Linux 7
版本: 4.3.4.7
详细信息
一、walminer介绍


1、简单介绍

walminer是Highgo开源的一款软件,WalMiner是从PostgreSQL的WAL(write ahead logs)日志中解析出执行的SQL语句的工具,并能生成出对应的undo SQL语句。


2、版本控制

支持范围:支持所有版本HGDB
PG9.5.0之前的版本没有做过测试
PG9.5.0~PG10.X版本使用WalMiner_10_X
PG11之后的版本使用WalMiner_11_X
10.0.1版本
1.WalMiner支持解析minimal级别以上的任何wal日志级别。
2.无需将表设置为IDENTITY FULL模式。
3.增加对系统表修改的wal记录的解析。
4.他库解析时relfilenode不匹配的bug修改
11.0.1版本
1.WalMiner支持解析minimal级别以上的任何wal日志级别。
2.无需将表设置为IDENTITY FULL模式。
3.增加对系统表修改的wal记录的解析。
4.他库解析时relfilenode不匹配的bug修改
5.pg11支持性修改


3、下载地址:
[WalMiner](
)


4、使用前提

数据据库日志级别配置需要大于minimal 若想做最完整的日志挖掘,建议设置为logical。

 


二、walminer的安装


1、确定数据库版本

highgo=# select version();
                                                 version                                                 
----------------------------------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (NeoKylin 4.8.5-16), 64-bit
(1 row)
pg10.5,使用walminer_10_x版本
            


2、数据库运行参数设置

①创建归档目录
[root@mode2 pg_wal]# mkdir /hgdbbak/archive/ -p
②修改数据库参数
                

必须设置如下三个参数,据库日志级别配置需要大于minimal                

wal_level分为 minimal, replica, or logical,若想做最完整的日志挖掘,建议设置为logical。                

highgo=# alter system set wal_level = 'replica';
highgo=# alter system set archive_mode = on;
highgo=# alter system set archive_directory = '/hgdbbak/archive/';
                

③重启数据库生效                

[root@mode2 pg_wal]# pg_ctl restart                


3、安装walminer

①将下载的walniner文件夹上传至数据库代码的contrib目录下,执行make;make install
[root@mode2 walminer]# pwd
/opt/HighGo4.3.4.7-see/share/postgresql/contrib/walminer
[root@mode2 walminer]# ls
datadictionary.c logminer.c Makefile pg_logminer.c walminer--1.0.sql xlogminer_contents.c xlogreader_logmine
datadictionary.h logminer.h organizsql.c pg_logminer.h walminer.control xlogminer_contents.h
[root@mode2 walminer]# make
Makefile:16: ../../src/Makefile.global: No such file or directory
Makefile:17: /contrib/contrib-global.mk: No such file or directory
make: *** No rule to make target `/contrib/contrib-global.mk'. Stop.
               

②根据编译时的报错信息修改文件信息
[root@mode2 walminer]# vi Makefile
                
top_builddir = /opt/HighGo4.3.4.7-see/lib/postgresql/pgxs    ----修改此目录
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
               
[root@mode2 walminer]# find / -name Makefile.global
/opt/HighGo4.3.4.7-see/lib/postgresql/pgxs/src/Makefile.global
                
[root@mode2 walminer]# make
Makefile:17: /opt/HighGo4.3.4.7-see/lib/postgresql/pgxs//contrib/contrib-global.mk: No such file or directory
make: *** No rule to make target `/opt/HighGo4.3.4.7-see/lib/postgresql/pgxs//contrib/contrib-global.mk'.  Stop.
                

③配置编译参数
[root@mode2 walminer]# make USE_PGXS=1
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o pg_logminer.o pg_logm
pg_logminer.c: In function ‘logminer_elog’:
pg_logminer.c:151:2: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attrib
  vfprintf(tempFileOpen, _(fmt), args);
  ^
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o logminer.o logminer.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o organizsql.o organizs
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o xlogreader_logminer.ominer.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o datadictionary.o data
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o xlogminer_contents.o nts.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o walminer.so pg_logminer.o logminer.ologreader_logminer.o datadictionary.o xlogminer_contents.o -L/opt/HighGo4.3.4.7-see/lib    -Wl,--as-needed -Wl,-rpath,.4.7-see/lib',--enable-new-dtags 
④安装walminer
                   

[root@mode2 walminer]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/opt/HighGo4.3.4.7-see/lib/postgresql'
/usr/bin/mkdir -p '/opt/HighGo4.3.4.7-see/share/postgresql/extension'
/usr/bin/mkdir -p '/opt/HighGo4.3.4.7-see/share/postgresql/extension'
/usr/bin/install -c -m 755  walminer.so '/opt/HighGo4.3.4.7-see/lib/postgresql/walminer.so'
/usr/bin/install -c -m 644 .//walminer.control '/opt/HighGo4.3.4.7-see/share/postgresql/extension/'
/usr/bin/install -c -m 644 .//walminer--1.0.sql  '/opt/HighGo4.3.4.7-see/share/postgresql/extension/'
                    

⑤数据库内创建walminer扩展                    

[root@mode2 walminer]# psql -U sysdba -d highgo
NOTICE: 
-------------------------------------------
Login User: sysdba
Login time: 2020-05-18 10:28:15.806217+08
Login Address: [local]
Last Login Status: SUCCESS
Login Failures: 0
Valied Until: infinity
-------------------------------------------
                   

psql (4.3.4.7)
Type "help" for help.
                    

highgo=# create extension walminer;
CREATE EXTENSION

                   


三、使用限制


1.只能解析DML语句,不处理DDL语句。

未来改动:DDL语句的解析已放入todolist,可能会逐步支持各种DDL语句。


2.执行了删除表、truncate表、更改表的表空间、更改表字段的类型、vacuum full,这样的DDL语句后,发生DDL语句之前的此表相关的DML语句不会再被解析。
应对措施:建议在执行表结构变更之前,先保存一份数据字典,用来保证可以解析历史wal日志。
未来改动:现在已经考虑在walminer内增加保存数据字典的功能。


3.解析结果依赖于数据字典。(举例:创建表t1,所有者为user1,但是中间将所有者改为user2。那解析结果中,所有t1相关操作所有者都将标示为user2)。
应对措施:建议在执行表结构变更之前,先保存一份数据字典,用来保证可以解析历史wal日志。
未来改动:现在已经考虑在walminer内增加保存数据字典的功能。


4.解析结果中undo字段的ctid属性是发生变更“当时”的值,如果因为vacuum等操作导致ctid发生变更,这个值将不准确。对于有可能存在重复行的数据,我们需要通过这个值确定undo对应的tuple条数,不代表可以直接执行该undo语句。


5.执行了表字段drop的DDL语句后,发生DDL语句之前的这个字段相关的值都会被解析为encode('AD976BC56F',hex)的形式,另外自定义类型也会解析为这种形式。


6.只能解析与数据字典时间线一致的wal文件。


7.WalMiner是个人出品,暂时未进行全面测试。


8.不建议使用walminer解析大宗copy语句(在同一个事务中插入大量数据行)产生的wal日志,这会导致解析过程中的效率低下和内存占用过高。


9、参数解释

--解析语法:                    

select walminer_start(’START_TIMSTAMP’,’STOP_TIMESTAMP’,’START_XID’,’STOP_XID’)                    

--如果分析全部日志:                
select   walminer_start('null','null',0,0);                
--将系统表修改结果输出到$PGDATA/walminer/temp下:                
select   walminer_start('null','null',0,0,true);                

START_TIMESTAMP:指定输出结果中最早的记录条目,即从该时间开始输出分析数据;若该参数值为空,则以分析日志列表中最早数据开始输出;若该参数值指定时间没有包含在所分析xlog列表中,即通过分析发现全部早于该参数指定时间,则返回空值。

STOP_TIMESTAMP:指定数据结果中最晚的记录条目,即输出结果如果大于该时间,则停止分析,不需要继续输出;如果该参数值为空,则从START_TIMESTAMP开始的所有日志都进行分析和输出。
START_XID:作用与START_TIMESTAMP相同,指定开始的XID值;
STOP_XID:作用与STOP_TIMESTAMP相同,指定结束的XID值
注意:两组参数只能有一组为有效输入,否则报错。


四、场景一  WAL日志生成库直接解析


1、新建表及测试数据

highgo=# create table t2(i int,j int, k varchar);               

CREATE TABLE
highgo=#
highgo=# insert into t2 values(1,1,'qqqqqq');
INSERT 0 1
highgo=# insert into t2 values(2,2,'wwwwww');
INSERT 0 1
highgo=# insert into t2 values(3,3,'eeeee');
INSERT 0 1
highgo=#  update t2 set k = '1111qqqqq' where i = 1;
UPDATE 1
highgo=# delete from t2 where j = 2;
DELETE 1
highgo=# insert into t2 values(4,4,'44444rrrrrr');
                    

INSERT 0 1                    


2、查看当前正在使用的wal日志,切换wal日志

highgo=# select pg_walfile_name(pg_current_wal_lsn());
     pg_walfile_name     
--------------------------
 000000010000000000000002
(1 row) 
                    

highgo=# select pg_switch_wal();                    

 pg_switch_wal                     

---------------
                   

 0/2034158
(1 row)
                

 

3、加载wal日志到解析器,本次添加pg_wal目录内的单个wal日志,注意是相对路径。可添加单个wal日志,也可以加载pg_wal目录或wal日志归档目录内的所有日志

highgo=# select walminer_wal_add('pg_wal/000000010000000000000002');
NOTICE:  Get data dictionary from current database.
  walminer_wal_add 
--------------------
 1 file add success
(1 row)
                    


                   

举例1:加载wal归档目录内的单个wal日志,使用绝对路径                    

highgo=# select walminer_wal_add('/hgdbbak/archive/000000010000000000000003');
NOTICE:  Get data dictionary from current database.
  walminer_wal_add 
--------------------
 1 file add success
(1 row)
                    


                   

举例2:添加整个wal归档目录内的日志,使用绝对路径                    

highgo=# select walminer_wal_add('/hgdbbak/archive');                    

  walminer_wal_add 
--------------------
 2 file add success
(1 row)
                   


4、查看解析器内加载的日志

highgo=# select walminer_wal_list();
              walminer_wal_list             
---------------------------------------------
 (/opt/HigoGO4.3.4.7-see/data/pg_wal/000000010000000000000003)
(1 row)
            


5、执行解析

highgo=# select walminer_start('NULL','NULL',0,0);
NOTICE:  Change Wal Segment To:pg_wal/000000010000000000000002
NOTICE:  Change Wal Segment To:(null)
   walminer_start   
---------------------
 walminer sucessful!
(1 row)
                


6、解析结果查看

highgo=# \x
Expanded display is on.
highgo=# select record_database,record_user,op_text,op_undo from walminer_contents;
-[ RECORD 1 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(1, 1, 'qqqqqq');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=1 AND "j"=1 AND "k"='qqqqqq' AND ctid = '(0,1)';
-[ RECORD 2 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(2, 2, 'wwwwww');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=2 AND "j"=2 AND "k"='wwwwww' AND ctid = '(0,2)';
-[ RECORD 3 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(3, 3, 'eeeee');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=3 AND "j"=3 AND "k"='eeeee' AND ctid = '(0,3)';
-[ RECORD 4 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | UPDATE "public"."t2" SET "k" = '1111qqqqq' WHERE "i"=1 AND "j"=1 AND "k"='qqqqqq';
op_undo         | UPDATE "public"."t2" SET "k" = 'qqqqqq' WHERE "i"=1 AND "j"=1 AND "k"='1111qqqqq' AND ctid = '(0,4)'
-[ RECORD 5 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | DELETE FROM "public"."t2" WHERE "i"=2 AND "j"=2 AND "k"='wwwwww';
op_undo         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(2, 2, 'wwwwww');
-[ RECORD 6 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(4, 4, '44444rrrrrr');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=4 AND "j"=4 AND "k"='44444rrrrrr' AND ctid = '(0,5)';
                
highgo=# \q                

注:通过解析内容可看到刚才执行的插入语句及对应的undo语句


7、停止解析,释放内存

test=# select walminer_stop();
   walminer_stop  
-------------------
 walminer cleaned!
(1 row)
            


五、从非wal产生的数据库中执行wal日志解析


1、在生产库中创建扩展

test=# create extension xlogminer;            


2、创建数据字典

test=# select walminer_build_dictionary('/hgdbbak/test');
 walminer_build_dictionary 
---------------------------
 Dictionary build success!
(1 row)
            


3、在测试库中创建扩展

highgo=# create extension walminer ;
CREATE EXTENSION
                

4、加载字典库,将在生产库创建的字典移动到/opt下,加载此字典库

highgo=# select walminer_load_dictionary('/opt/test')
highgo-# ;
 walminer_load_dictionary 
--------------------------
 Dictionary load success!
(1 row)
            


5、加载wal日志

highgo=# select walminer_wal_add('/hgdbbak/archive');
  walminer_wal_add 
--------------------
 3 file add success
(1 row)
                
highgo=# select walminer_wal_list();
              walminer_wal_list             
---------------------------------------------
 (/hgdbbak/archive/000000010000000000000003)
 (/hgdbbak/archive/000000010000000000000004)
 (/hgdbbak/archive/000000010000000000000005)
(3 rows)
                


6、开始解析wal日志

highgo=# select walminer_start('null','null',900,1000);
NOTICE: Change Wal Segment To:/hgdbbak/archive/000000010000000000000003
NOTICE: wal record after time 2020-05-18 11:02:36+08 or 0/300d2e8 will be analyse completely
NOTICE: Change Wal Segment To:/hgdbbak/archive/000000010000000000000004
NOTICE: Change Wal Segment To:/hgdbbak/archive/000000010000000000000005
NOTICE: Change Wal Segment To:(null)
   walminer_start   
---------------------
 walminer sucessful!
(1 row)
                


7、查看解析内容

highgo=# \x
Expanded display is on.
highgo=# select * from walminer_contents ;
-[ RECORD 1 ]-----+-------------------------------------------------------------------------------------------------
----------------------------------------------
xid | 983
virtualxid | 1
timestamptz | 2020-05-18 13:42:56.797565+08
record_database | test
record_user | sysdba
record_tablespace | pg_default
record_schema | public
op_type | INSERT
op_text | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(2, 'be236f80d7aa78bf06ca288edbdc03
3e', 3001);
op_undo | DELETE FROM "public"."gt_test1" WHERE "id"=2 AND "passwd"='be236f80d7aa78bf06ca288edbdc033e' AND
 "num"=3001 AND ctid = '(0,1)';
-[ RECORD 2 ]-----+-------------------------------
            



来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/69994931/viewspace-2840160/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
瀚高PG实验室(Highgo PG Lab)依托于瀚高DBA运维团队及瀚高数据库PG内核研发团队,旨在深入研究PostgreSQL技术、使用技巧、内核探秘、PG教学等,并进行分享。欢迎大家关注、交流。

注册时间:2021-02-18

  • 博文量
    75
  • 访问量
    27305