
PostgreSQL 10.3 source install: primary/standby setup with pgpool-II + keepalived high availability (not yet successful; experiment in progress)

Original post. Category: PostgreSQL. Author: germany006. Date: 2018-06-01 14:20:37


I. Download the PostgreSQL source package and configure the hosts

https://www.postgresql.org/ftp/source/v10.3/
postgresql-10.3.tar.gz

Virtual machine environment:
node1  192.168.159.151
node2  192.168.159.152

OS: Red Hat Enterprise Linux 6.5
Database: PostgreSQL 10.3

Configure /etc/hosts on both nodes:
vi /etc/hosts
192.168.159.151  node1
192.168.159.152  node2



II. Build and install
(1) Create the postgres user
useradd -m -r -s /bin/bash -u 5432 postgres
(2) Install the build dependencies
yum install gettext gcc make perl python perl-ExtUtils-Embed readline-devel zlib-devel openssl-devel libxml2-devel cmake gcc-c++ libxslt-devel openldap-devel pam-devel python-devel cyrus-sasl-devel libgcrypt-devel libgpg-error-devel libstdc++-devel

(3) Configure PostgreSQL
./configure --prefix=/opt/postgresql-10.3 --with-segsize=8 --with-wal-segsize=64 --with-wal-blocksize=16 --with-blocksize=16 --with-libedit-preferred --with-perl --with-python --with-openssl --with-libxml --with-libxslt --enable-thread-safety --enable-nls=zh_CN

If the last few lines of the output look like the following, configuration succeeded; otherwise install whatever dependency the error message reports and re-run configure.
configure: using CPPFLAGS= -D_GNU_SOURCE -I/usr/include/libxml2 
configure: using LDFLAGS=  -Wl,--as-needed
configure: creating ./config.status
config.status: creating GNUmakefile
config.status: creating src/Makefile.global
config.status: creating src/include/pg_config.h
config.status: creating src/include/pg_config_ext.h
config.status: creating src/interfaces/ecpg/include/ecpg_config.h
config.status: linking src/backend/port/tas/dummy.s to src/backend/port/tas.s
config.status: linking src/backend/port/dynloader/linux.c to src/backend/port/dynloader.c
config.status: linking src/backend/port/posix_sema.c to src/backend/port/pg_sema.c
config.status: linking src/backend/port/sysv_shmem.c to src/backend/port/pg_shmem.c
config.status: linking src/backend/port/dynloader/linux.h to src/include/dynloader.h
config.status: linking src/include/port/linux.h to src/include/pg_config_os.h
config.status: linking src/makefiles/Makefile.linux to src/Makefile.port

(4) Compile and install
make && make install
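On a multi-core VM the compile step can be shortened considerably by running make in parallel; a small sketch, assuming GNU make and coreutils (`nproc`) are available:

```shell
# detect the core count and build with that many parallel jobs
jobs=$(nproc)
echo "building with $jobs parallel jobs"
# make -j"$jobs" && make install   # uncomment inside the configured source tree
```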

The build succeeded if the last lines of output look like this:
make[1]: Leaving directory `/opt/postgresql-10.3/src'
make -C config install
make[1]: Entering directory `/opt/postgresql-10.3/config'
/bin/mkdir -p '/opt/postgresql-10.3/lib/pgxs/config'
/usr/bin/install -c -m 755 ./install-sh '/opt/postgresql-10.3/lib/pgxs/config/install-sh'
/usr/bin/install -c -m 755 ./missing '/opt/postgresql-10.3/lib/pgxs/config/missing'
make[1]: Leaving directory `/opt/postgresql-10.3/config'
PostgreSQL installation complete.

(5) Build and install the full distribution (contrib modules and docs)
make world && make install-world

The install succeeded if the last lines of output look like this:
make[1]: Leaving directory `/opt/postgresql-10.3/src'
make -C config install
make[1]: Entering directory `/opt/postgresql-10.3/config'
/bin/mkdir -p '/opt/postgresql-10.3/lib/pgxs/config'
/usr/bin/install -c -m 755 ./install-sh '/opt/postgresql-10.3/lib/pgxs/config/install-sh'
/usr/bin/install -c -m 755 ./missing '/opt/postgresql-10.3/lib/pgxs/config/missing'
make[1]: Leaving directory `/opt/postgresql-10.3/config'
PostgreSQL installation complete.
make: Leaving directory `/opt/postgresql-10.3'

(6) Create the directories and set environment variables
mkdir -p /data/pgdata/serverlog
mkdir /data/pg
su - postgres
vi .bash_profile   (delete the existing contents and paste in the following)
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

# postgres
# PostgreSQL port
PGPORT=5432

# PostgreSQL data directory
PGDATA=/data/pgdata
export PGPORT PGDATA 

# Locale
export LANG=zh_CN.utf8

# PostgreSQL install directory
export PGHOME=/data/pg

# PostgreSQL shared libraries
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`

# Add the PostgreSQL binaries to PATH
export PATH=$PGHOME/bin:$PATH

# PostgreSQL man pages
export MANPATH=$PGHOME/share/man:$MANPATH

# Default PostgreSQL user
export PGUSER=postgres

# Default host address
export PGHOST=127.0.0.1

# Default database name
export PGDATABASE=postgres

# Server log directory
PGLOG="$PGDATA/serverlog"


source .bash_profile

(7) Initialize the database
# Run the initdb script
As root:
chown -R postgres.postgres /data/
su - postgres
/opt/postgresql-10.3/bin/initdb --encoding=utf8 -D /data/pg/data


WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.


Success. You can now start the database server using:


Start the database:
su - postgres
/opt/postgresql-10.3/bin/pg_ctl -D /data/pg/data -l logfile start



(8) Copy the binaries
As root:
mkdir /data/pg/bin
cp /opt/postgresql-10.3/bin/*   /data/pg/bin
chown -R postgres.postgres /data/pg/bin


III. PostgreSQL primary/standby setup

1. Primary configuration


(1) Create a replication user, replica
su - postgres
psql
CREATE ROLE replica login replication encrypted password 'replica';

(2) Edit pg_hba.conf to let replica connect from the replication network (append at the end)
vi /data/pg/data/pg_hba.conf
host    replication     replica            192.168.159.0/24           md5
host    all             replica            192.168.159.0/24           trust
 
(3) Change the following settings in the primary's postgresql.conf; leave the rest unchanged
vi /data/pg/data/postgresql.conf
listen_addresses = '*'
wal_level = hot_standby   # hot-standby mode ('replica' is the preferred spelling in PG 10)
max_wal_senders = 6       # maximum streaming replication connections; roughly one per standby
wal_keep_segments = 10240 # important: WAL segments retained for standbys
wal_sender_timeout = 60s
max_connections = 512     # the standby's max_connections must be higher than the primary's
archive_mode = on         # enable WAL archiving
archive_command = 'cp %p /data/pg/data/archive/%f'     # adjust to your environment

checkpoint_timeout = 30min
max_wal_size = 3GB
min_wal_size = 64MB
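A quick sanity check on the retention setting above: this build was configured with --with-wal-segsize=64, so each WAL segment is 64 MB, and wal_keep_segments = 10240 can pin a very large amount of disk. A back-of-the-envelope sketch:

```shell
# disk space that wal_keep_segments can pin, given 64 MB WAL segments
segments=10240
segment_mb=64
reserved_gb=$((segments * segment_mb / 1024))
echo "up to ${reserved_gb} GB of pg_wal retained"   # up to 640 GB
```

On the 18 GB root filesystem used in this lab, that ceiling could never be reached, so a much smaller value would be safer.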

mkdir /data/pg/data/archive


2. Standby setup
(1) Empty the standby's data directory
rm -rf /data/pg/data/*
(2) On the standby, run
pg_basebackup -F p --progress -D /data/pg/data/ -h 192.168.159.151 -p 5432 -U replica --password

Enter the password: replica

NOTE: after the copy completes, re-own everything under the standby's data directory:
chown -R postgres.postgres /data/pg/data/
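To avoid the interactive --password prompt, the replication password can also go in a libpq password file; a sketch using this post's lab values (the PGPASSFILE override is only there so the example is self-contained):

```shell
# ~/.pgpass format: hostname:port:database:username:password
# "replication" in the database field matches replication connections
export PGPASSFILE="${PGPASSFILE:-$HOME/.pgpass}"
echo '192.168.159.151:5432:replication:replica:replica' >> "$PGPASSFILE"
chmod 600 "$PGPASSFILE"   # libpq ignores the file unless it is private
```

With this in place, pg_basebackup can be run without --password.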
 
(3) Create the recovery.conf file
cp /opt/postgresql-10.3/share/recovery.conf.sample /data/pg/data/recovery.conf

vi /data/pg/data/recovery.conf
standby_mode = on
primary_conninfo = 'host=192.168.159.151 port=5432 user=replica password=replica'
recovery_target_timeline = 'latest'
trigger_file = '/data/pg/data/trigger.kenyon'
 
(4) Configure the standby's postgresql.conf
vi /data/pg/data/postgresql.conf
listen_addresses = '*'
wal_level = hot_standby
max_connections = 1000    # generally higher on the standby than on the primary
hot_standby = on          # this server accepts read-only queries, not just WAL
max_standby_streaming_delay = 30s
wal_receiver_status_interval = 10s   # how often the standby reports its status to the primary
hot_standby_feedback = on            # standby sends feedback to the primary (helps avoid replication conflicts)



(5) Start the standby

su - postgres
/opt/postgresql-10.3/bin/pg_ctl -D /data/pg/data/ -l logfile start
If it fails to start, copy postmaster.opts from the primary to the standby:
scp /data/pg/data/postmaster.opts 192.168.159.152:/data/pg/data/
chown -R postgres.postgres /data/pg/data/
cd /data/pg/
chmod 700 data/

 
3. Verify replication
Query on the primary:
su - postgres
psql
postgres=# select client_addr,sync_state from pg_stat_replication;
   client_addr   | sync_state
-----------------+------------
 192.168.159.152 | async
(1 row)



Logging in as postgres showed a bare prompt:
-bash-4.1$ 
As root, run
cp /etc/skel/.bash* /var/lib/pgsql/
and the next login shows the normal prompt:
[postgres@node1 ~]$


4. Manual failover
(1) Create recovery.conf on the standby (run on the standby, 192.168.159.152)
  cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
  
Set the following parameters:
standby_mode = 'on'    # mark this instance as a standby
primary_conninfo = 'host=192.168.159.151 port=5432 user=replica password=replica'   # primary connection info
trigger_file = '/data/pg/data/trigger.kenyon'     # promotion trigger file

(2) Stop the primary (run on the primary, 192.168.159.151)
/opt/postgresql-10.3/bin/pg_ctl -D /data/pg/data/ -l logfile stop


(3) Promote the standby to primary (run on the standby, 192.168.159.152)
To promote the standby, simply create the file named by the trigger_file parameter
in its recovery.conf, e.g. "touch /data/pg/data/trigger.kenyon".

touch /data/pg/data/trigger.kenyon
After a moment, recovery.conf is renamed to recovery.done, which means the standby has been promoted.

Check the logfile; these messages confirm the promotion:
2018-06-04 21:11:01.137 PDT [12818] LOG:  trigger file found: /data/pg/data/trigger.kenyon
2018-06-04 21:11:01.148 PDT [12818] LOG:  redo done at 0/C02A390
2018-06-04 21:11:01.172 PDT [12818] LOG:  selected new timeline ID: 2
2018-06-04 21:11:05.442 PDT [12818] LOG:  archive recovery complete
2018-06-04 21:11:05.568 PDT [12817] LOG:  database system is ready to accept connections
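The rename from recovery.conf to recovery.done can be watched for from a script instead of by eye; a minimal sketch (the function name and polling interval are my own, not part of PostgreSQL):

```shell
# poll the data directory until the server renames recovery.conf to recovery.done
wait_for_promotion() {
    dir="$1"          # data directory to watch
    tries="${2:-30}"  # how many one-second polls before giving up
    while [ "$tries" -gt 0 ]; do
        [ -f "$dir/recovery.done" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Usage: `wait_for_promotion /data/pg/data 60 && echo promoted`.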

(4) Edit the old primary's configuration file
vi /data/pg/data/postgresql.conf
max_connections = 1500  # the standby's value must be higher than the primary's


(5) Bring the old primary back as a standby (run on the old primary, 192.168.159.151)
# Create $PGDATA/recovery.conf with the following parameters
vi /data/pg/data/recovery.conf
recovery_target_timeline = 'latest'
standby_mode = 'on'    # mark this instance as a standby
primary_conninfo = 'host=192.168.159.152 port=5432 user=replica password=replica'   # primary connection info
trigger_file = '/data/pg/data/trigger.kenyon'     # promotion trigger file


# Edit pg_hba.conf on the current primary (192.168.159.152), adding:
vi /data/pg/data/pg_hba.conf
host   replication     replica           192.168.159.151/32         md5

Start the old primary, now a standby (192.168.159.151):
/opt/postgresql-10.3/bin/pg_ctl -D /data/pg/data/ -l logfile start

The new standby's logfile now shows errors:
2018-06-05 00:08:00.326 PDT [9729] DETAIL:  End of WAL reached on timeline 1 at 0/C02A400.
2018-06-05 00:08:00.327 PDT [9725] LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/C02A630
2018-06-05 00:08:05.322 PDT [9729] LOG:  restarted WAL streaming at 0/C000000 on timeline 1
2018-06-05 00:08:05.327 PDT [9729] LOG:  replication terminated by primary server
2018-06-05 00:08:05.327 PDT [9729] DETAIL:  End of WAL reached on timeline 1 at 0/C02A400.
2018-06-05 00:08:05.329 PDT [9725] LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/C02A630
2018-06-05 00:08:10.328 PDT [9729] LOG:  restarted WAL streaming at 0/C000000 on timeline 1
2018-06-05 00:08:10.332 PDT [9729] LOG:  replication terminated by primary server
2018-06-05 00:08:10.332 PDT [9729] DETAIL:  End of WAL reached on timeline 1 at 0/C02A400.
2018-06-05 00:08:10.333 PDT [9725] LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/C02A630

Run on the current primary:
scp /data/pg/data/pg_wal/00000002.history 192.168.159.151:/data/pg/data/pg_wal/


(6) On both nodes:
vi /data/pg/data/recovery.conf
restore_command = 'cp /data/pg/data/archive/%f %p'

mkdir  /data/pg/data/archive
chown postgres.postgres  /data/pg/data/archive

vi /data/pg/data/postgresql.conf
archive_command = 'cp %p /data/pg/data/archive/%f' 
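One caveat with the plain `cp` used as archive_command above: it silently overwrites an existing archive file. A slightly safer wrapper sketch (the function name and ARCHIVE_DIR variable are mine; PostgreSQL would invoke it with %p and %f as arguments):

```shell
# archive one WAL file without ever overwriting an existing copy;
# a nonzero exit makes PostgreSQL retry the segment later
archive_wal() {
    src="$1"                                          # %p: path to the WAL file
    name="$2"                                         # %f: file name only
    dest="${ARCHIVE_DIR:-/data/pg/data/archive}/$name"
    [ -f "$dest" ] && return 1                        # refuse to overwrite
    cp "$src" "$dest"
}
```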

IV. Install pgpool-II
(1) Set up passwordless SSH login between the two machines
Node 1:
[postgres@node1]$ ssh-keygen -t rsa
(accept all defaults)
[postgres@node1]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[postgres@node1]$ chmod go-rwx ~/.ssh/*
[postgres@node1]$ cd ~/.ssh


Node 2:
[postgres@node2]$ ssh-keygen -t rsa
(accept all defaults)
[postgres@node2]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[postgres@node2]$ chmod go-rwx ~/.ssh/*
[postgres@node2]$ cd ~/.ssh


Node 1:
[postgres@node1]$ scp id_rsa.pub 192.168.159.152:/home/postgres/.ssh/id_rsa.pub1
Node 2:
[postgres@node2]$ cat id_rsa.pub1 >> authorized_keys
[postgres@node2]$ scp id_rsa.pub 192.168.159.151:/home/postgres/.ssh/id_rsa.pub2
Node 1:
[postgres@node1]$ cat id_rsa.pub2 >> authorized_keys


(2) Install pgpool-II
yum -y install libmemcached postgresql-libs.x86_64 openssl098e
(note: these dependencies must be installed first, otherwise the pgpool-II RPM refuses to install)

rpm -ivh pgpool-II-pg10-3.7.2-1pgdg.rhel6.x86_64.rpm


pg_md5 -u postgres -p
Enter the password postgres.
The encoded password it prints is:
e8a48653851e28c69d0506508fb27fc5

vi /etc/pgpool-II/pcp.conf   # append as the last line
postgres:e8a48653851e28c69d0506508fb27fc5
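The value pg_md5 prints for pcp.conf is just the unsalted MD5 of the password, so the entry can be cross-checked with coreutils alone:

```shell
# reproduce pg_md5's output for the password "postgres" with md5sum
hash=$(printf '%s' postgres | md5sum | awk '{print $1}')
echo "postgres:$hash"   # → postgres:e8a48653851e28c69d0506508fb27fc5
```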




mkdir -p /opt/pgpool/oiddir
cp /etc/pgpool-II/pgpool.conf /etc/pgpool-II/pgpool.conf.bak


Check which NIC is in use with ifconfig:
[root@node1 pgpool-II]# ifconfig
eth1      Link encap:Ethernet  HWaddr 00:0C:29:9E:E8:6D  
          inet addr:192.168.159.152  Bcast:192.168.159.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe9e:e86d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:14557 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10820 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1889055 (1.8 MiB)  TX bytes:1485329 (1.4 MiB)


lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:5029 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5029 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2786891 (2.6 MiB)  TX bytes:2786891 (2.6 MiB)

Note: my NIC here is eth1, so the interface-related settings in the two node configurations below must be adjusted to match your NIC.
If the interface is wrong you will get errors like: arping: unknown iface eth0


Node 1:
vi /etc/pgpool-II/pgpool.conf
listen_addresses = '*'
port = 9999
socket_dir = '/opt/pgpool'
pcp_port = 9898
pcp_socket_dir = '/opt/pgpool'
backend_hostname0 = '192.168.159.151'   ## data node node1
backend_port0 = 5432
backend_weight0 = 1
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '192.168.159.152'   ## data node node2
backend_port1 = 5432
backend_weight1 = 1
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
print_timestamp = on
log_connections = on
log_hostname = on
log_statement = on
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/opt/pgpool/pgpool.pid'
logdir = '/tmp'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval'
master_slave_mode = on # streaming replication mode
master_slave_sub_mode = 'stream' # streaming replication mode
sr_check_period = 5
sr_check_user = 'replica'
sr_check_password = 'replica'
delay_threshold = 16000
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = ''
system_db_hostname  = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 5
health_check_timeout = 20
health_check_user = 'replica'
health_check_password = 'replica'
health_check_max_retries = 3
health_check_retry_delay = 1
failover_command = '/opt/pgpool/failover_stream.sh %d %H /data/pg/data/trigger.kenyon'
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'nobody'
recovery_password = ''
recovery_1st_stage_command = ''
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = ''
ping_path = '/bin'
wd_hostname = '192.168.159.151'
wd_port = 9000
wd_authkey = ''
delegate_IP = '192.168.159.153'
ifconfig_path = '/sbin'  
if_up_cmd = 'ifconfig eth1:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth1:0 down'
arping_path = '/usr/sbin'           # arping command path
arping_cmd = 'arping -I eth1 -U $_IP_$ -w 1'   # -I eth1 selects the outgoing interface
clear_memqcache_on_escalation = on
wd_escalation_command = ''
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = '192.168.159.152'   # the other node's address
heartbeat_destination_port0 = 9694 
heartbeat_device0 = 'eth1' 
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
other_pgpool_hostname0 = '192.168.159.152'   ## the other node's pgpool
other_pgpool_port0 = 9999
other_wd_port0 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/opt/pgpool/oiddir'   # (create oiddir under /opt/pgpool first)
white_memqcache_table_list = ''
black_memqcache_table_list = ''


Node 2:
vi  /etc/pgpool-II/pgpool.conf

listen_addresses = '*'
port = 9999
socket_dir = '/opt/pgpool'
pcp_port = 9898
pcp_socket_dir = '/opt/pgpool'
backend_hostname0 = '192.168.159.151'
backend_port0 = 5432
backend_weight0 = 1
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '192.168.159.152'
backend_port1 = 5432
backend_weight1 = 1
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
print_timestamp = on
log_connections = on
log_hostname = on
log_statement = on
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/opt/pgpool/pgpool.pid'
logdir = '/tmp'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'replica'
sr_check_password = 'replica'
delay_threshold = 16000
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = ''
system_db_hostname  = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 0
health_check_timeout = 20
health_check_user = 'nobody'
health_check_password = ''
health_check_max_retries = 0
health_check_retry_delay = 1
failover_command = '/opt/pgpool/failover_stream.sh %d %H /data/pg/data/trigger.kenyon'
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'nobody'
recovery_password = ''
recovery_1st_stage_command = ''
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = off
trusted_servers = ''
ping_path = '/bin'
wd_hostname = ' '
wd_port = 9000
wd_authkey = ''
delegate_IP = '192.168.159.153'
ifconfig_path = '/sbin'
if_up_cmd = 'ifconfig eth1:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth1:0 down'
arping_path = '/usr/sbin'           # arping command path
arping_cmd = 'arping -I eth1 -U $_IP_$ -w 1'   # -I eth1 selects the outgoing interface
clear_memqcache_on_escalation = on
wd_escalation_command = ''
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = '192.168.159.151'
heartbeat_destination_port0 = 9694 
heartbeat_device0 = 'eth1'
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
other_pgpool_hostname0 = '192.168.159.152'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/opt/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''




vi /opt/pgpool/failover_stream.sh
#! /bin/sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.
 
failed_node=$1
new_master=$2
trigger_file=$3
# Do nothing if standby goes down.
#if [ $failed_node = 1 ]; then
#        exit 0;
#fi
/usr/bin/ssh -T $new_master /bin/touch $trigger_file
exit 0;
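For reference, pgpool expands the placeholders in failover_command before invoking the script; a dry run with example values shows how %d, %H, and the literal trigger path map onto the variables above:

```shell
# simulate the argument list pgpool would pass to failover_stream.sh:
#   %d -> failed node id, %H -> new master host, 3rd arg -> trigger file path
set -- 1 192.168.159.152 /data/pg/data/trigger.kenyon
failed_node=$1
new_master=$2
trigger_file=$3
echo "node $failed_node down; touch $trigger_file on $new_master"
```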


Make the script executable:
chmod u+x /opt/pgpool/failover_stream.sh

scp /opt/pgpool/failover_stream.sh 192.168.159.152:/opt/pgpool/


cp /etc/pgpool-II/pool_hba.conf /etc/pgpool-II/pool_hba.conf.bak
vi /etc/pgpool-II/pool_hba.conf
host    all             all           192.168.159.151/32        trust
host    replication     replica       192.168.159.151/32        trust
host    postgres        postgres      192.168.159.151/32        trust
host    all             all           192.168.159.152/32      trust
host    replication     replica       192.168.159.152/32      trust
host    postgres        postgres      192.168.159.152/32      trust
host    all             all           192.168.159.153/32      trust
host    replication     replica       192.168.159.153/32      trust
host    postgres        postgres      192.168.159.153/32      trust


Note: 192.168.159.153 is the VIP address.


scp /etc/pgpool-II/pool_hba.conf 192.168.159.152:/etc/pgpool-II/

Start pgpool:
pgpool -n &


Stop pgpool:
pgpool -m fast stop

Connect through pgpool:
/data/pg/bin/psql -h 192.168.159.151 -p 9999 -U postgres -d postgres

You can also connect via the VIP: /data/pg/bin/psql -h 192.168.159.153 -p 9999 -U postgres -d postgres

Check the pgpool nodes:
show pool_nodes;
postgres=# show pool_nodes;
 node_id |    hostname     | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+-----------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.159.151 | 5432 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.159.152 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)



V. Install keepalived
tar xvf keepalived-1.4.2.tar.gz
cd keepalived-1.4.2
./configure
make
make install
mkdir /etc/keepalived
cd /etc/keepalived/
Node 1:
vi /etc/keepalived/keepalived.conf
global_defs {  
    router_id node1 
}  
vrrp_instance VI_1 {  
    state BACKUP          # both nodes start as BACKUP; priority elects the master  
    interface eth1        # monitored network interface (a real device, not the eth1:0 alias)  
    virtual_router_id 51  # must be identical on both nodes  
    priority 100          # higher priority wins the master role  
    advert_int 1          # VRRP advertisement interval, in seconds  
    authentication {  
        auth_type PASS    # VRRP auth type; must match on both nodes  
        auth_pass 1111    # password  
    }  
    virtual_ipaddress {  
        192.168.159.153/24  # the HA virtual IP  
    }  
}

Node 2:
vi /etc/keepalived/keepalived.conf
global_defs {  
    router_id node2 
}  
vrrp_instance VI_1 {  
    state BACKUP          # both nodes start as BACKUP; priority elects the master  
    interface eth1        # monitored network interface (a real device, not the eth1:0 alias)  
    virtual_router_id 51  # must be identical on both nodes  
    priority 90           # lower than node1, so node1 is preferred  
    advert_int 1          # VRRP advertisement interval, in seconds  
    authentication {  
        auth_type PASS    # VRRP auth type; must match on both nodes  
        auth_pass 1111    # password  
    }  
    virtual_ipaddress {  
        192.168.159.153/24  # the HA virtual IP  
    }  
}


Start keepalived:
keepalived -D -f /etc/keepalived/keepalived.conf

Watch the log:
tail -f /var/log/messages

Check the process:
ps -ef | grep keepalive



!!! NOTE !!!  The pgpool HA configuration below is what I tested myself; several of the key details had to be worked out by trial and error.



1. Set permissions (run on both nodes)
# as root, allow non-root users to run ifconfig and arping (setuid)
chmod u+s /sbin/ifconfig
chmod u+s /sbin/ifdown
chmod u+s /sbin/ifup
chmod u+s /usr/sbin/arping
chmod 755 /opt/pgpool/failover_stream.sh
chown postgres.root /opt/pgpool/failover_stream.sh


2. Configure pgpool logging (run on both nodes); append as the last line
vi /etc/rsyslog.conf
local0.*    /var/log/pgpool.log

/etc/init.d/rsyslog restart


3. Configure the key script failover_stream.sh (on both nodes)
Delete or comment out the original ssh line, then:

When the primary is 192.168.159.151:
vi /opt/pgpool/failover_stream.sh
ifconfig eth1:0 down
/usr/bin/ssh 192.168.159.152 /bin/touch /data/pg/data/trigger.kenyon
/usr/bin/ssh 192.168.159.152 ifconfig eth1:0 up

When the primary is 192.168.159.152:
vi /opt/pgpool/failover_stream.sh
ifconfig eth1:0 down
/usr/bin/ssh 192.168.159.151 /bin/touch /data/pg/data/trigger.kenyon
/usr/bin/ssh 192.168.159.151 ifconfig eth1:0 up



4. Create an eth1:0 alias interface (run on both nodes)
cd /etc/sysconfig/network-scripts/
cp ifcfg-eth1 ifcfg-eth1:0
vi ifcfg-eth1:0

DEVICE="eth1:0"
BOOTPROTO="static"
HWADDR="00:0c:29:0c:7d:4f"
IPV6INIT="yes"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
#UUID="e618ec6a-8bb0-4202-8fe6-54febd0f8c76"
IPADDR=192.168.159.153
NETMASK=255.255.255.0
GATEWAY=192.168.159.1




5. Edit pgpool.conf
vi /etc/pgpool-II/pgpool.conf
failover_command = '/opt/pgpool/failover_stream.sh'

Comment out the original failover_command line and use this simpler form instead.

6. Edit pgpool.conf
vi /etc/pgpool-II/pgpool.conf
heartbeat_device0 = 'eth1:0'




Manual failover was covered in section III, step (4), above.

The VIP still does not fail over automatically; worse, with the configuration above both nodes can end up holding VIP 192.168.159.153 at the same time, which is very odd.
At present, manual failover with automatic VIP movement works only if pgpool is stopped first: for example, when the primary's PostgreSQL instance stops, stop the primary's pgpool as well; after a few minutes the standby brings up VIP 192.168.159.153 on its own.
Always verify the failover afterwards: connect and check, e.g.
/data/pg/bin/psql -h 192.168.159.153 -p 9999 -U postgres -d postgres
show pool_nodes;
select client_addr,sync_state from pg_stat_replication;
Once this output looks correct, try creating a test table to confirm writes work:
create table test123 (tt int);


Note: the /data/pg/data/gprof directory holds a large number of binary files of unknown purpose that consume a lot of disk space. Pointers from anyone who knows what they are would be appreciated.


Errors encountered after the replication + pgpool-II setup
1. PostgreSQL logins fail
Right after the initial replication setup, failover tests and logins all worked.

But while configuring pgpool-II for high availability, psql suddenly stopped working:
[postgres@node1 ~]$ psql
psql: symbol lookup error: psql: undefined symbol: PQconnectdbParams

Since PostgreSQL is started with a log file,
[postgres@node1 ~]$ /opt/postgresql-10.3/bin/pg_ctl -D /data/pg/data -l logfile start
the same problem shows up in logfile in the postgres home directory:
2018-05-31 23:00:18.703 PDT [12734] FATAL:  could not load library "/opt/postgresql-10.3/lib/libpqwalreceiver.so": /opt/postgresql-10.3/lib/libpqwalreceiver.so: undefined symbol: PQescapeIdentifier
(the same FATAL repeats every five seconds as the walreceiver retries)

The undefined-symbol errors mean psql and the server are resolving an older libpq, most likely the one pulled in by the postgresql-libs RPM installed for pgpool-II. As a temporary fix, put the source build's lib directory first on the library path and restart PostgreSQL:
export LD_LIBRARY_PATH=/opt/postgresql-10.3/lib:$LD_LIBRARY_PATH
After that, psql logins work again.

To make the fix permanent:
vi ~/.bash_profile
append as the last line
export LD_LIBRARY_PATH=/opt/postgresql-10.3/lib:$LD_LIBRARY_PATH

2. pgpool fails to start
Starting pgpool with `pgpool -n &` fails.
[root@node1 ~]# ps -ef|grep pgpool
root      3163  3081  0 19:57 pts/0    00:00:00 pgpool -n
root      3205  3163  0 19:57 pts/0    00:00:00 pgpool: health check process(0)
root      3206  3163  0 19:57 pts/0    00:00:02 pgpool: health check process(1)
root      4505  4455  0 20:37 pts/1    00:00:00 grep pgpool
The ps listing shows residual pgpool processes from a previous run; kill them:
kill 3205
kill 3206
and pgpool then starts successfully.
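Leftover processes like these usually come with a stale pid file under /opt/pgpool. A small helper sketch (the function name is mine) that removes the pid file only when the process it names is really gone:

```shell
# delete a pid file if the process it names is no longer running
clean_stale_pid() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 0          # nothing to do
    pid=$(cat "$pidfile")
    if ! kill -0 "$pid" 2>/dev/null; then  # signal 0 only tests process existence
        rm -f "$pidfile"
    fi
}
```

For example, `clean_stale_pid /opt/pgpool/pgpool.pid` before running `pgpool -n &`.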

A successfully started pgpool looks like this:
[root@node1 ~]# ps -ef|grep pool
root     12828  2231  0 19:58 pts/0    00:00:00 pgpool -n
root     12829 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12830 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12831 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12832 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12833 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12834 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12835 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12836 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12837 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12838 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12839 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12840 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12841 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12842 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12843 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12844 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12845 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12846 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12847 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12848 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12849 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12850 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12851 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12852 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12853 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12854 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12855 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12856 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12857 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12858 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12859 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12860 12828  0 19:58 pts/0    00:00:00 pgpool: wait for connection request
root     12861 12828  0 19:58 pts/0    00:00:00 pgpool: PCP: wait for connection request
root     12862 12828  0 19:58 pts/0    00:00:00 pgpool: worker process
root     12863 12828  0 19:58 pts/0    00:00:00 pgpool: health check process(0)
root     12864 12828  0 19:58 pts/0    00:00:00 pgpool: health check process(1)
root     14061 14045  0 20:37 pts/1    00:00:00 grep pool



3. PostgreSQL fails to start
[postgres@node2 data]$ /opt/postgresql-10.3/bin/pg_ctl -D /data/pg/data/ -l logfile start
Error:
waiting for server to start.... stopped waiting
pg_ctl: could not start server
Examine the log output.

Check the log as the error suggests:
tail logfile 
2018-05-30 22:40:05.208 PDT [16383] LOG:  consistent recovery state reached at 0/8000130
2018-05-30 22:40:05.208 PDT [16382] LOG:  database system is ready to accept read only connections
2018-05-30 22:40:05.242 PDT [16387] LOG:  started streaming WAL from primary at 0/C000000 on timeline 1
2018-05-30 23:19:59.272 PDT [16382] LOG:  received smart shutdown request
2018-05-30 23:19:59.325 PDT [16387] FATAL:  terminating walreceiver process due to administrator command
2018-05-30 23:19:59.332 PDT [16384] LOG:  shutting down
2018-05-30 23:19:59.426 PDT [16382] LOG:  database system is shut down
2018-06-03 23:59:31.974 PDT [15817] FATAL:  could not write lock file "postmaster.pid": No space left on device
2018-06-04 00:00:32.287 PDT [15840] FATAL:  could not write lock file "postmaster.pid": No space left on device
2018-06-04 00:01:54.556 PDT [15867] FATAL:  could not write lock file "postmaster.pid": No space left on device


df -h confirms the disk really is full:
[postgres@node2 data]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        18G   17G   18M 100% /
tmpfs           242M   72K  242M   1% /dev/shm
/dev/sda1       291M   39M  238M  14% /boot
[postgres@node2 data]$
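Since archive_command keeps copying WAL into /data/pg/data/archive, that directory needs regular pruning or the disk will fill up again, as it did here. A housekeeping sketch (the 7-day retention is an arbitrary example, not a recommendation; make sure pruned segments are no longer needed for recovery):

```shell
# delete archived WAL files older than 7 days (no-op if the directory is absent)
archive_dir="${ARCHIVE_DIR:-/data/pg/data/archive}"
if [ -d "$archive_dir" ]; then
    find "$archive_dir" -type f -mtime +7 -print -delete
fi
```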


4. Standby log errors after failover
After a failover, the standby's logfile shows:
2018-07-01 21:08:41.889 PDT [2644] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2018-07-01 21:08:41.889 PDT [2644] LOG:  listening on IPv6 address "::", port 5432
2018-07-01 21:08:41.893 PDT [2644] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2018-07-01 21:08:41.954 PDT [2645] LOG:  database system was shut down at 2018-07-01 21:08:41 PDT
2018-07-01 21:08:42.008 PDT [2644] LOG:  database system is ready to accept connections

That is, the server comes up as a normal primary rather than a standby. The standby's data directory needs a recovery.conf file,
configured as follows:
vi /data/pg/data/recovery.conf
recovery_target_timeline = 'latest'
standby_mode = 'on'    # mark this instance as a standby
primary_conninfo = 'host=192.168.159.152 port=5432 user=replica password=replica'   # primary connection info
trigger_file = '/data/pg/data/trigger.kenyon'     # promotion trigger file




-- To be continued

From the ITPub blog: http://blog.itpub.net/28371090/viewspace-2155459/ — please credit this source when reprinting.
