Configuring Nagios with check_oracle_health

Original  Oracle  Author: fei890910  Date: 2014-03-12 14:22:18
Environment: 192.168.10.101 (monitoring server)
             192.168.10.10 (monitored host), which runs the Oracle database.

Preparation: create a user in the database and grant it privileges
CREATE USER nagios IDENTIFIED BY oradbmon; 
GRANT CREATE SESSION TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;

GRANT SELECT ON sys.dba_tablespaces TO nagios;
GRANT SELECT ON dba_temp_files TO nagios;
GRANT SELECT ON sys.v_$Temp_extent_pool TO nagios;
GRANT SELECT ON sys.v_$TEMP_SPACE_HEADER  TO nagios;
GRANT SELECT ON sys.v_$session TO nagios;

1. Check whether Perl is installed on the monitored host, and install DBI there
Run perl -v; output like the following means Perl is installed:
This is perl, v5.8.8 built for x86_64-linux-thread-multi
Copyright 1987-2006, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
Download and install DBI:
wget http://search.cpan.org/CPAN/authors/id/T/TI/TIMB/DBI-1.609.tar.gz
tar zxvf DBI-1.609.tar.gz 
cd DBI-1.609
perl Makefile.PL 
make all
make install
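Before moving on, it can help to confirm that Perl can actually load the module that was just installed. The following is a minimal sketch: the function name module_status is made up for this example, and it relies only on perl -M<module> -e 1 exiting non-zero when the module cannot be loaded.

```shell
# Hypothetical helper: report whether a Perl module can be loaded.
module_status() {
    # "perl -MName -e 1" loads the module and runs an empty program;
    # it exits non-zero if the module cannot be found or loaded.
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

# Example usage (commented out so the block has no side effects):
# module_status DBI
# module_status DBD::Oracle
```

The same helper can be reused after the next step to check DBD::Oracle.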
2. If there were no errors, continue by installing DBD-Oracle
wget http://mirrors.neusoft.edu.cn/cpan/authors/id/P/PY/PYTHIAN/DBD-Oracle-1.52.tar.gz
tar zxvf DBD-Oracle-1.52.tar.gz 
cd DBD-Oracle-1.52
perl Makefile.PL
If ORACLE_HOME is not set, the command above fails with an error like this:
Using DBI 1.605 (for perl 5.008005 on i386-linux-thread-multi) installed in /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/DBI/
Configuring DBD::Oracle for perl 5.008005 on linux (i386-linux-thread-multi) 
Remember to actually *READ* the README file! Especially if you have any problems. 
Trying to find an ORACLE_HOME
Your LD_LIBRARY_PATH env var is set to '' 
      The ORACLE_HOME environment variable is not set and I couldn't guess it.
      It must be set to hold the path to an Oracle installation directory
      on this machine (or a machine with a compatible architecture).
      See the appropriate README file for your OS for more information.
      ABORTED!
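Set ORACLE_HOME for the current shell session, using the value from the oracle user's environment, for example:
export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
Then run perl Makefile.PL again; it should now succeed. Continue with: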
make 
make install
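Since the build aborts when ORACLE_HOME is missing, a quick sanity check before running perl Makefile.PL can save a round trip. This is only a sketch under simple assumptions: check_oracle_home is a hypothetical name, and it checks nothing more than that the variable is set and points at an existing directory.

```shell
# Hypothetical helper: sanity-check ORACLE_HOME before building DBD-Oracle.
check_oracle_home() {
    # Fail if the variable is unset or empty.
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set"
        return 1
    fi
    # Fail if it does not point at an existing directory.
    if [ ! -d "$ORACLE_HOME" ]; then
        echo "ORACLE_HOME does not exist: $ORACLE_HOME"
        return 1
    fi
    echo "ORACLE_HOME looks usable: $ORACLE_HOME"
}

# Example usage:
# check_oracle_home && perl Makefile.PL
```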
3. The last step on the monitored host is installing the main component, check_oracle_health
wget http://labs.consol.de/wp-content/uploads/2009/09/check_oracle_health-1.6.3.tar.gz
tar zxvf check_oracle_health-1.6.3.tar.gz 
cd check_oracle_health-1.6.3
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-mymodules-dir=/usr/local/nagios/libexec --with-mymodules-dyndir=/usr/local/nagios/libexec
make all
make install
In the steps above, use your own Nagios installation path.
Then check that the check_oracle_health plugin now exists in /usr/local/nagios/libexec on the monitored host.

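4. Switch to the oracle user and do a trial run of the plugin. The database listener should be running for this.
/usr/local/nagios/libexec/check_oracle_health --connect=<your Oracle SID> --user=<Oracle user> --password=<Oracle password> --mode=tnsping
The following output means everything is working:
OK - connection established to <your Oracle SID>.
You can also replace --mode=tnsping with --mode=tablespace-usage to check that all tablespaces are reported.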
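Nagios decides the service state from the plugin's exit status rather than from the text it prints, following the standard plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). A small sketch for reading that status during a manual test run; nagios_state is a hypothetical helper name, and the commented example reuses the connection values that appear later in this article.

```shell
# Hypothetical helper: map a Nagios plugin exit status to its state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3 and anything unexpected
    esac
}

# Example usage (commented out because it needs a reachable database):
# /usr/local/nagios/libexec/check_oracle_health --connect=prod --user=nagios --password=oradbmon --mode=tnsping
# nagios_state $?
```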

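5. The plugin works when run as the oracle user, but here it is run as root, so all of the oracle user's environment variables (ORACLE_HOME and so on) must be added to root's environment. Then repeat step 4 as root. If it works, you are done; if it fails, the environment variables were not set up correctly.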
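One way to test this without permanently editing root's profile is to run the plugin through a small wrapper that sets the Oracle variables only for that one command. The sketch below is an illustration, not the article's method: with_oracle_env is a hypothetical name, and the lib and bin subdirectories are assumptions about a typical Oracle installation layout.

```shell
# Hypothetical wrapper: run a command with the Oracle environment set,
# without changing the calling shell's own variables.
# $1 is the ORACLE_HOME to use; the remaining arguments are the command.
with_oracle_env() (
    # The function body is a subshell, so these exports do not leak out.
    ORACLE_HOME="$1"
    shift
    # Assumed layout: shared libraries in lib, client tools in bin.
    LD_LIBRARY_PATH="$ORACLE_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    PATH="$ORACLE_HOME/bin:$PATH"
    export ORACLE_HOME LD_LIBRARY_PATH PATH
    "$@"
)

# Example usage (commented out because it needs the plugin and a database):
# with_oracle_env /u01/app/oracle/product/10.2.0/db_1 \
#     /usr/local/nagios/libexec/check_oracle_health --connect=prod --user=nagios --password=oradbmon --mode=tnsping
```

If the plugin succeeds through the wrapper but fails without it, the missing piece is the environment rather than the plugin itself.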

6. The plugin now works locally on the monitored host. To let the monitoring server call it, add the following to nrpe.cfg on the monitored host (I started with three services). Each command definition must be on a single line:
vi /usr/local/nagios/etc/nrpe.cfg
command[check_oracle_health]=/usr/local/nagios/libexec/check_oracle_health --connect=<your Oracle SID> --user=<Oracle user> --password=<Oracle password> --mode=tablespace-usage

command[check_oracle_health_tbs]=/usr/local/nagios/libexec/check_oracle_health --connect=prod --user=nagios --password=oradbmon --mode=tablespace-usage
command[check_oracle_health_tnsping]=/usr/local/nagios/libexec/check_oracle_health --connect=prod --user=nagios --password=oradbmon --mode=tnsping
command[check_oracle_health_soft]=/usr/local/nagios/libexec/check_oracle_health --connect=prod --user=nagios --password=oradbmon --mode=soft-parse-ratio
Save and exit, then restart the NRPE service on the monitored host:
[root@James10g etc]# /etc/rc.d/init.d/xinetd restart
Stopping xinetd:                                           [  OK  ]
Starting xinetd:                                           [  OK  ]

If NRPE runs as a standalone daemon rather than under xinetd, start it like this instead:
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
7. On the monitoring server, edit two files under /usr/local/nagios/etc/objects and add the following:
/usr/local/nagios/etc/objects/hosts.cfg
define host {
        use                      linux-server
        host_name                James10g_oracle
        alias                    Oracle_10g
        address                  192.168.10.10
        }
/usr/local/nagios/etc/objects/services.cfg
define service{
        use                                     generic-service         ; Name of service template to use
        host_name                       James10g_oracle
        service_description          check-oracle-tablespace
        check_command              check_nrpe!check_oracle_health_tbs
        }
define service{
        use                                    generic-service         ; Name of service template to use
        host_name                       James10g_oracle
        service_description          check-oracle-tnsping
        check_command              check_nrpe!check_oracle_health_tnsping
        }
define service{
        use                                    generic-service         ; Name of service template to use
        host_name                       James10g_oracle
        service_description          check-oracle-soft-parse-ratio
        check_command              check_nrpe!check_oracle_health_soft
        }
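A missing closing brace in an object definition is an easy mistake to make when pasting blocks like these. Nagios can verify the whole configuration itself with its -v option (for example /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg, assuming the default layout). As a lighter first pass, the sketch below only compares the number of opening and closing braces in one file; braces_balanced is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if a Nagios object file contains as many
# opening braces as closing braces. It does not validate anything else.
braces_balanced() {
    awk '
        # gsub returns the number of replacements, i.e. braces on this line.
        { opened += gsub(/[{]/, "{"); closed += gsub(/[}]/, "}") }
        END { exit (opened == closed ? 0 : 1) }
    ' "$1"
}

# Example usage:
# braces_balanced /usr/local/nagios/etc/objects/hosts.cfg || echo "unbalanced braces"
```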

8. Now test the plugin from the monitoring server
/usr/local/nagios/libexec/check_nrpe -H <monitored host IP> -c check_oracle_health
[root@node1 objects]# /usr/local/nagios/libexec/check_nrpe -H 192.168.10.10 -c check_oracle_health_tbs
OK - tbs USERS usage is 4.87%
tbs UNDOTBS1 usage is 0.00%
tbs TOOLS usage is 54.94%
tbs TEMP usage is 0.00%
tbs SYSTEM usage is 1.47%
tbs SYSAUX usage is 0.77%
tbs EXAMPLE usage is 0.21% | 'tbs_users_usage_pct'=4.87%;90;98
'tbs_users_usage'=1597MB;29491;32112;0;32767
'tbs_users_alloc'=1601MB;;;0;32767
'tbs_undotbs1_usage_pct'=0.00%;90;98
'tbs_undotbs1_usage'=0MB;29491;32112;0;32767
'tbs_undotbs1_alloc'=1135MB;;;0;32767
'tbs_tools_usage_pct'=54.94%;90;98
'tbs_tools_usage'=164MB;270;294;0;300
'tbs_tools_alloc'=300MB;;;0;300
'tbs_temp_usage_pct'=0.00%;90;98
'tbs_temp_usage'=0MB;29491;32112;0;32767
'tbs_temp_alloc'=462MB;;;0;32767
'tbs_system_usage_pct'=1.47%;90;98
'tbs_system_usage'=481MB;29491;32112;0;32767
'tbs_system_alloc'=490MB;;;0;32767
'tbs_sysaux_usage_pct'=0.77%;90;98
'tbs_sysaux_usage'=253MB;29491;32112;0;32767
'tbs_sysaux_alloc'=260MB;;;0;32767
'tbs_example_usage_pct'=0.21%;90;98
'tbs_example_usage'=68MB;29491;32112;0;32767
'tbs_example_alloc'=100MB;;;0;32767
[root@node1 objects]# /usr/local/nagios/libexec/check_nrpe -H 192.168.10.10 -c check_oracle_health_tnsping
OK - connection established to prod.
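The tablespace output above mixes human-readable lines with performance data after the | character. If you only want a quick name and percentage listing on the command line, something like the following sketch can filter it. The function name parse_tbs_usage is made up for this example, and it assumes the "tbs NAME usage is N%" wording shown in the output above.

```shell
# Hypothetical helper: read check_oracle_health tablespace-usage output on
# stdin and print "NAME PERCENT" for each tablespace line.
parse_tbs_usage() {
    # First drop the performance data (everything from the | onward),
    # then keep only lines of the form "tbs NAME usage is N%".
    sed -e 's/ *|.*$//' |
        sed -n 's/^.*tbs \([A-Za-z0-9_$#]*\) usage is \([0-9.]*\)%.*$/\1 \2/p'
}

# Example usage (commented out because it needs a reachable NRPE host):
# /usr/local/nagios/libexec/check_nrpe -H 192.168.10.10 -c check_oracle_health_tbs | parse_tbs_usage
```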

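If you get the error NRPE: Command 'check_oracle_health_tbs' not defined
    the NRPE service on the monitored host does not know the command that the monitoring server asked for, so the monitoring server cannot run check_oracle_health_tbs remotely.
    Check that the command[check_oracle_health_tbs] line from step 6 is present in nrpe.cfg on the monitored host and that NRPE was restarted after the edit.
    If you pass arguments to commands from the monitoring server, nrpe.cfg on the monitored host also needs:
    dont_blame_nrpe=1
    This allows commands to be run with arguments.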
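A quick way to see which command names NRPE has been given is to read them straight out of nrpe.cfg on the monitored host. The following is a minimal sketch; list_nrpe_commands and nrpe_command_defined are hypothetical helper names, and they only look at lines that start with command[...]= in the file you point them at.

```shell
# Hypothetical helper: print every command name defined in an nrpe.cfg file.
list_nrpe_commands() {
    # $1 = path to nrpe.cfg
    sed -n 's/^command\[\([^]]*\)]=.*$/\1/p' "$1"
}

# Hypothetical helper: succeed if the given command name is defined.
nrpe_command_defined() {
    # $1 = path to nrpe.cfg, $2 = command name
    awk -v prefix="command[$2]=" '
        index($0, prefix) == 1 { found = 1 }
        END { exit !found }
    ' "$1"
}

# Example usage:
# list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
# nrpe_command_defined /usr/local/nagios/etc/nrpe.cfg check_oracle_health_tbs || echo "not defined"
```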
Restart Nagios:
[root@node1 objects]# /etc/init.d/nagios restart

Finally, once the new services show up in the Nagios web interface, the setup is essentially complete.

From ITPUB Blog. Link: http://blog.itpub.net/29108064/viewspace-1108306/. If you repost this article, please credit the source; otherwise legal action may be taken.
