
Greenplum: gpinitsystem initialization errors and how to fix them

Original | PostgreSQL | Author: 你好我是李白 | Date: 2020-06-19 22:32:25

 

Resolving initialization errors

Error: Unable to resolve mdw on this host

Symptom

[gpadmin@sdw1 ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Locale set to en_US.utf8
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[WARN]:-Master hostname mdw does not match hostname output
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Checking to see if mdw can be resolved on this host
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Master hostname in configuration file is mdw
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Operating system command returns sdw1
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Unable to resolve mdw on this host
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Master hostname in gpinitsystem configuration file must be mdw Script Exiting!


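Cause

gpinitsystem compares MASTER_HOSTNAME from the configuration file (mdw) with the hostname of the machine it is running on (here sdw1). The hostfile passed with -h must not contain the master's hostname, so the master cannot be resolved through the hostfile either. Initialization therefore has to be started from the Master node.

Solution

Run gpinitsystem on the mdw node to initialize Greenplum.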
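
A small pre-flight check can catch this before gpinitsystem is started. The sketch below assumes the configuration file contains a plain MASTER_HOSTNAME=... line, as the log output above suggests. The function names are hypothetical helpers, not part of Greenplum.

```shell
# Print the MASTER_HOSTNAME value from a gpinitsystem config file
# (last matching line wins; quotes and whitespace are stripped).
master_hostname_from_config() {
    sed -n 's/^[[:space:]]*MASTER_HOSTNAME=//p' "$1" | tail -n 1 | tr -d "\"'[:space:]"
}

# Succeed if the configured master hostname equals the local hostname.
# $1 = config file, $2 = hostname to compare (defaults to `hostname` output).
running_on_master() {
    cfg_host=$(master_hostname_from_config "$1")
    [ -n "$cfg_host" ] && [ "$cfg_host" = "${2:-$(hostname)}" ]
}

# Usage, with the paths from the transcript above:
#   running_on_master gpconfigs/gpinitsystem_config \
#     && gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
```

On sdw1 the check fails and gpinitsystem is never started; on mdw it passes.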

Error: Unknown host sdw1: ping: sdw1

Symptom

[gpadmin@mdw ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
20200618:15:35:11:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Checking configuration parameters, please wait...
 
...
 
> y
20200618:15:36:27:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Building the Master instance database, please wait...
20200618:15:36:33:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Starting the Master in admin mode
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[FATAL]:-Unknown host sdw1: ping: sdw1: Name or service not known
ping: sdw1: Name or service not known Script Exiting!
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[WARN]:-Script has left Greenplum Database in an incomplete state
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[WARN]:-Run command bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20200618_153511 to remove these changes
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[INFO]:-End Function BACKOUT_COMMAND


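Cause

The real hostnames of both segment machines must be mapped in the /etc/hosts file. Here mdw, where gpinitsystem runs, cannot resolve sdw1.

Solution

Add the real hostname of each segment host to /etc/hosts.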
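
To find out which names are still missing before rerunning gpinitsystem, the hostfile can be checked line by line. This is a minimal sketch that assumes a Linux host with getent available and one hostname per line in the hostfile. The function name is hypothetical, and skipping "#" lines is only a defensive choice.

```shell
# Print every hostname listed in a hostfile that does not resolve on this
# machine. Blank lines and lines starting with "#" are skipped.
unresolved_hosts() {
    while IFS= read -r host || [ -n "$host" ]; do
        host=$(printf '%s' "$host" | tr -d '[:space:]')
        case $host in ''|'#'*) continue ;; esac
        # getent uses the system resolver, so /etc/hosts entries are honoured.
        getent hosts "$host" >/dev/null 2>&1 || printf '%s\n' "$host"
    done < "$1"
}

# Usage, with the hostfile from the transcript above:
#   unresolved_hosts gpconfigs/hostfile_gpinitsystem
```

Any name it prints needs an entry in /etc/hosts that maps it to that host's IP address.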

Error: No segment started for content: 0

Symptom

Error output of the initialization command:

20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.8.0 build commit:d9b16e3438fc6e01e6083cd82cf76ba99c1b50b5'
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '301908232'
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Setting new master era
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Master Started...
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Shutting down master
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-Process results...
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[ERROR]:-No segment started for content: 0.
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-dumping success segments: ['sdw1:/data2/primary/gpseg1:content=1:dbid=3:role=p:preferred_role=p:mode=n:status=u', 'sdw2:/data2/primary/gpseg3:content=3:dbid=5:role=p:preferred_role=p:mode=n:status=u']
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-DBID:2  FAILED  host:'sdw1' datadir:'/data1/primary/gpseg0' with reason:'PG_CTL failed.'
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-DBID:4  FAILED  host:'sdw2' datadir:'/data1/primary/gpseg2' with reason:'PG_CTL failed.'
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
 
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Failed segment starts                                                = 2   <<<<<<<<
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-Successfully started 2 of 4 segment instances <<<<<<<<
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Segment instance startup failures reported
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Failed start 2 of 4 segment instances <<<<<<<<
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Review /home/gpadmin/gpAdminLogs/gpstart_20200618.log
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-Commencing parallel segment instance shutdown, please wait...
20200618:15:58:36:014010 gpstart:mdw:gpadmin-[ERROR]:-gpstart error: Do not have enough valid segments to start the array.

Errors in gpinitsystem_20200618.log:

20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:-Failed to start Greenplum instance; review gpstart output to
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:- determine why gpstart failed and reinitialize cluster after resolving
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:- issues.  Not all initialization tasks have completed so the cluster
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:- should not be used.
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:-gpinitsystem will now try to stop the cluster
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:
20200618:16:21:06:015525 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function ERROR_EXIT
20200618:16:21:06:015525 gpinitsystem:mdw:gpadmin-[WARN]:-Failed to stop new Greenplum instance Script Exiting!


Diagnosis

gpstart -m -d /data/master/gpseg-1   # start only the master
gpstop -a -M fast   # -a suppresses the y/n confirmation; -M fast/immediate/smart roughly corresponds to Oracle shutdown immediate/abort/normal
gpstart -a -v       # -v prints a verbose startup log


Scroll back through the verbose output to find the start command that failed on the node:

  stderr=''
20200618:16:36:06:016487 gpsegstart.py_sdw2:gpadmin:sdw2:gpadmin-[DEBUG]:-[worker1] finished cmd: Starting seg at dir /data1/primary/gpseg2 
cmdStr='env GPSESSID=0000000000 GPERA=8a0d21cca0b8bbb8_200618163604 
$GPHOME/bin/pg_ctl -D /data1/primary/gpseg2 -l /data1/primary/gpseg2/pg_log/startup.log -w -t 600 -o " -p 6000 " start 2>&1'  
had result: cmd had rc=1 completed=True halted=False
  stdout='waiting for server to start.... stopped waiting
pg_ctl: could not start server


 

On that node, open the startup log of the failed segment, /data1/primary/gpseg2/pg_log/startup.log:

2020-06-18 16:36:05.835068 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"LOG","00000","registering background worker ""sweeper process""",,,,,,,,"RegisterBackgroundWorker","bgworker.c",774,
2020-06-18 16:36:05.835486 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"LOG","XX000","could not bind IPv4 socket: Address already in use",,"Is another postmaster already running on port 6000? If not, wait a few seconds and retry.",,,,,,"StreamServerPort","pqcomm.c",503,
2020-06-18 16:36:05.835741 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"LOG","XX000","could not bind IPv6 socket: Address already in use",,"Is another postmaster already running on port 6000? If not, wait a few seconds and retry.",,,,,,"StreamServerPort","pqcomm.c",503,
2020-06-18 16:36:05.836023 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"WARNING","01000","could not create listen socket for ""*""",,,,,,,,"PostmasterMain","postmaster.c",1202,
2020-06-18 16:36:05.836162 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"FATAL","XX000","could not create any TCP/IP sockets",,,,,,,,"PostmasterMain","postmaster.c",1207,1    0xbe84ec postgres errstart (elog.c:557)
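
The startup log is CSV, so the decisive line is easy to miss. The following sketch pulls out only the FATAL and PANIC messages. It assumes the field layout seen in the excerpt above (severity, SQLSTATE, then message) and does not handle messages that contain escaped quotes. The function name is hypothetical.

```shell
# Print "SEVERITY: message" for each FATAL or PANIC entry in a Greenplum
# CSV log, read from the given file(s) or from standard input.
fatal_messages() {
    sed -nE 's/.*"(FATAL|PANIC)","[^"]*","([^"]*)".*/\1: \2/p' "$@"
}

# Usage on the segment host:
#   fatal_messages /data1/primary/gpseg2/pg_log/startup.log
```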


A check shows that the graphical interface (the X server) is listening on port 6000, which prevents the segment from starting.

[root@sdw2 ~]# netstat -anp|grep 6000
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      3969/X              
tcp6       0      0 :::6000                 :::*                    LISTEN      3969/X              
[root@sdw2 ~]#
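
The same check can be scripted so that a candidate port is tested on every segment host before initialization. This sketch parses netstat -ltn or ss -ltn style output, where the local address is the fourth column. The function name is hypothetical.

```shell
# Read netstat/ss style listener output on stdin and succeed if some
# LISTEN socket is bound to the given local TCP port.
port_is_listening() {
    awk -v port="$1" '
        /LISTEN/ && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on a segment host:
#   netstat -ltn | port_is_listening 6000 && echo "port 6000 is taken"
```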


Solution

Edit the gpinitsystem configuration file and move the primary and mirror segment port ranges away from 6000: PORT_BASE=6500, MIRROR_PORT_BASE=7500.

Then rerun the initialization.
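
The edit can also be made from the command line. This is a sketch, assuming the parameters appear as plain KEY=value lines in the configuration file. set_config_value is a hypothetical helper, and it leaves the previous version of the file next to it with a .bak suffix.

```shell
# Set KEY=value in a gpinitsystem config file: replace the existing line
# if there is one, otherwise append a new line.
# $1 = config file, $2 = key, $3 = value
set_config_value() {
    if grep -q "^$2=" "$1"; then
        sed -i.bak "s|^$2=.*|$2=$3|" "$1"
    else
        printf '%s=%s\n' "$2" "$3" >> "$1"
    fi
}

# Usage, with the values chosen above:
#   set_config_value gpconfigs/gpinitsystem_config PORT_BASE 6500
#   set_config_value gpconfigs/gpinitsystem_config MIRROR_PORT_BASE 7500
```

If the failed run reported a backout script in its log, running that script first, as the log message suggests, removes the partially created instances.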

Error: Inconsistency between number of multi-home hostnames and number of segments per host

Symptom

[gpadmin@mdw gpconfigs]$ gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem -s mdw -S /data/standby/
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Reading Greenplum configuration file gpinitsystem_config
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Locale has not been set in gpinitsystem_config, will set to default value
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Locale set to en_US.utf8
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20200619:21:15:18:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Checking configuration parameters, Completed
20200619:21:15:18:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
....
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Configuring build for multi-home array
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[FATAL]:-Inconsistency between number of multi-home hostnames and number of segments per host
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Have 3 data directories and 2 multi-home hostnames for each host
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-For multi-home configuration, number of segment instance data directories per host must be multiple of
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-the number of multi-home hostnames within the GPDB array
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[FATAL]:-Unable to continue Script Exiting!


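Cause

The hostfile lists only two hostnames (segment interfaces) per host, while DATA_DIRECTORY in the config file specifies three segment instances per host. Three instances cannot be spread evenly across two interfaces, so gpinitsystem aborts.

Solution

Change DATA_DIRECTORY to four segment instances per segment host (any multiple of the interface count works), or change the hostfile so that it lists three interfaces per host.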
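
The rule from the log message (directories per host must be a multiple of the multi-home hostnames per host) can be checked up front. The sketch assumes DATA_DIRECTORY is declared on a single line in the form declare -a DATA_DIRECTORY=(...), and it takes the number of hostnames per host as an argument rather than deriving it from the hostfile. Both function names are hypothetical.

```shell
# Count the entries of a single-line "declare -a DATA_DIRECTORY=(...)"
# declaration in a gpinitsystem config file.
count_data_dirs() {
    sed -n 's/^declare -a DATA_DIRECTORY=(\(.*\)).*$/\1/p' "$1" | tail -n 1 | wc -w
}

# Succeed when the directory count is a positive multiple of the number
# of multi-home hostnames per host.
# $1 = config file, $2 = hostnames (interfaces) per host
multihome_consistent() {
    dirs=$(( $(count_data_dirs "$1") ))
    [ "$2" -gt 0 ] && [ "$dirs" -gt 0 ] && [ $((dirs % $2)) -eq 0 ]
}

# Usage, with two interfaces per host as in the case above:
#   multihome_consistent gpinitsystem_config 2 || echo "fix DATA_DIRECTORY or the hostfile"
```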

Errors when adding a Standby Master

The directory given with gpinitstandby -S already exists

[gpadmin@mdw gpconfigs]$ gpinitstandby -s mdw -S /data/standby/ -P 5433
20200619:21:27:37:011844 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /data/standby/ on mdw
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-Data directory already exists on host mdw
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-If you want to initialize a new standby on the same host as the master (not recommended), use -S and -P to specify a new data directory and port
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-Failed to create standby
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-Error initializing standby master: master data directory exists


Solution

       Check the directory. If it already exists, choose a different one or delete it; gpinitstandby creates the directory itself.
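
A guard in front of the command avoids the failed attempt. This is only a sketch; standby_dir_is_free is a hypothetical helper, and the path and port in the usage line are the ones from the transcript above.

```shell
# Succeed only if the path does not exist yet, so that gpinitstandby can
# create the standby data directory itself.
standby_dir_is_free() {
    [ ! -e "$1" ]
}

# Usage:
#   standby_dir_is_free /data/standby && gpinitstandby -s mdw -S /data/standby/ -P 5433
```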

A Standby Master Instance created on the same machine conflicts with the Master Instance on the default port

[gpadmin@mdw data]$ gpinitstandby -s mdw -S /data/standby/
20200619:21:29:03:012052 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20200619:21:29:03:012052 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /data/standby/ on mdw
20200619:21:29:04:012052 gpinitstandby:mdw:gpadmin-[ERROR]:-Failed to create standby
20200619:21:29:04:012052 gpinitstandby:mdw:gpadmin-[ERROR]:-Error initializing standby master: cannot create standby on the same host and port


Solution

       Use gpinitstandby -P to specify a port different from the Master's.

Rerun after fixing both problems
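
For a standby on the same host, the port choice can be validated before calling gpinitstandby. In this sketch the master port is taken from the second argument, then from PGPORT, and falls back to 5432, the master port used in this cluster. The function name is hypothetical.

```shell
# Succeed if the proposed standby port differs from the master port.
# $1 = standby port, $2 = master port (default: $PGPORT, else 5432)
standby_port_differs() {
    master_port=${2:-${PGPORT:-5432}}
    [ "$1" -ne "$master_port" ]
}

# Usage:
#   standby_port_differs 5532 && gpinitstandby -s mdw -S /data/standby/ -P 5532
```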

[gpadmin@mdw data]$ gpinitstandby -s mdw -S /data/standby/ -P 5532
20200619:21:29:23:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20200619:21:29:23:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /data/standby/ on mdw
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master hostname               = mdw
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master data directory         = /data/master/gpseg-1
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master port                   = 5432
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master hostname       = mdw
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master port           = 5532
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master data directory = /data/standby/
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum update system catalog         = On
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20200619:21:29:27:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-The packages on mdw are consistent.
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Adding standby master to catalog...
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Database catalog updated successfully.
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Updating pg_hba.conf file...
20200619:21:29:51:012192 gpinitstandby:mdw:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20200619:21:29:53:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Starting standby master
20200619:21:29:53:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Checking if standby master is running on host: mdw  in directory: /data/standby/
20200619:21:29:58:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20200619:21:30:07:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20200619:21:30:07:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Successfully created standby master on mdw
[gpadmin@mdw data]$


From the ITPUB blog: http://blog.itpub.net/31439444/viewspace-2699576/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
