What nproc/nofile Mean for Oracle

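Original post, category: Linux operating system, author: myownstars, date: 2012-04-09 15:36:04
Recently I have been getting frequent complaint emails from developers: connections over JDBC to several test databases often fail, with the following error: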
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:204)
at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForObject(SqlMapClientTemplate.java:271)
at com.yihaodian.central.util.SqlMapClientTemplateMonitor.queryForObject(SqlMapClientTemplateMonitor.java:387)
at com.yihaodian.central.dao.impl.ProvinceDAOImpl.getProvinceByID(ProvinceDAOImpl.java:130)
at sun.reflect.GeneratedMethodAccessor690.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.aop.framework.adapter.AfterReturningAdviceInterceptor.invoke(AfterReturningAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy89.getProvinceByID(Unknown Source)
at com.yihaodian.central.service.impl.ProvinceServiceImpl.getProvinceByID(ProvinceServiceImpl.java:14)
at sun.reflect.GeneratedMethodAccessor689.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.aop.framework.adapter.AfterReturningAdviceInterceptor.invoke(AfterReturningAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy125.getProvinceByID(Unknown Source)
at com.yihaodian.central.action.home.DetailAction.execute(DetailAction.java:566)
at sun.reflect.GeneratedMethodAccessor723.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.opensymphony.xwork2.DefaultActionInvocation.invokeAction(DefaultActionInvocation.java:404)
at com.opensymphony.xwork2.DefaultActionInvocation.invokeActionOnly(DefaultActionInvocation.java:267)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:229)
at com.yihaodian.central.interceptor.ContinuePurchaseInterceptor.intercept(ContinuePurchaseInterceptor.java:58)
at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
at com.yihaodian.central.interceptor.CalInterceptor.intercept(CalInterceptor.java:74)
... 73 more
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at com.yihaodian.front.kernel.spring.YihaodianSplitDbDataSource.getConnection(YihaodianSplitDbDataSource.java:16)
at org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection(AbstractRoutingDataSource.java:133)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:113)
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:190)
... 118 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
at org.apache.commons.dbcp.AbandonedObjectPool.borrowObject(AbandonedObjectPool.java:79)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
... 123 more

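I checked the number of database processes, and it had not reached the configured maximum.
The database was not under much load either, there were no serious wait events, and the server answered ping.
Only at the end did I think of checking the operating system configuration: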
[oracle@test ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 268288
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 268288
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
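The `ulimit -a` output above describes a fresh login shell, not necessarily a process that is already running. As a minimal sketch, assuming a Linux kernel that exposes `/proc/<pid>/limits` (older kernels may not), the hypothetical helper below prints the nofile and nproc rows for a given process ID:

```shell
# Print the nofile and nproc rows from /proc/<pid>/limits for one process.
# Assumption: the kernel provides /proc/<pid>/limits.
show_proc_limits() {
  pid=${1:-$$}
  limits_file="/proc/$pid/limits"
  if [ ! -r "$limits_file" ]; then
    echo "cannot read $limits_file" >&2
    return 1
  fi
  # Columns in that file: limit name, soft limit, hard limit, units.
  grep -E '^(Limit|Max open files|Max processes)' "$limits_file"
}
```

Called without an argument it reports on the current shell; passing the PID of a running database background process would show the limits that process actually started with.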
Looking at /etc/security/limits.conf, I found no entries for the oracle user, so I added the following by hand:
oracle soft nofile 2047
oracle hard nofile 65536
oracle soft nproc 1024
oracle hard nproc 65536
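Save and exit.
I am not sure this adjustment is really what helps, but several times in a row it has seemingly fixed this kind of problem; it may be a coincidence.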
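To double-check which entries a given user ends up with in the file, a small filter can help. This is only a sketch: `limits_for_user` is a hypothetical helper, and it deliberately ignores wildcard (`*`) and `@group` entries as well as any files under /etc/security/limits.d/, so it is not a full reimplementation of what pam_limits does.

```shell
# List the limits.conf entries whose first field is exactly the given user.
# Simplification: "*" and "@group" domains and limits.d/ files are ignored.
limits_for_user() {
  user=$1
  conf=${2:-/etc/security/limits.conf}
  awk -v user="$user" '
    /^[ \t]*#/ { next }              # skip comment lines
    NF >= 4 && $1 == user {          # <domain> <type> <item> <value>
      printf "%s %s %s\n", $2, $3, $4
    }
  ' "$conf"
}
```

For example, `limits_for_user oracle` would list the four lines added above as type, item and value. New values only apply to sessions opened after the change, so processes that are already running keep the limits they started with.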
Regarding the following two parameters:
- nofile - max number of open files
- nproc - max number of processes
I ran an experiment.

First, create a crontab job under the oracle user:
[oracle@testdb ~]$ crontab -l
* */1 * * * sh /home/oracle/tools/test.sh >> /home/oracle/test1.log
The script contents are as follows:
[oracle@testdb tools]$ more test.sh
# Environment definition.
source /home/oracle/tools/SET_JOB_ENV.SH
# The following is the execution part.
sqlplus / as sysdba@justin <<eof
exec test_shell_proc();
exec test_shell_proc_param($test_id);
exit
eof


nproc parameter test
1.
First, set nproc to 10:
oracle soft nofile 131072
oracle hard nofile 131072
oracle soft nproc 10
oracle hard nproc 10

Logging in as the oracle user through SecureCRT just hangs, while root can log in without trouble. As root, go to /var/log:
[root@testdb log]# tail -n 200 -f secure
Nov 8 16:49:40 testdb sshd[16536]: Accepted password for oracle from 192.168.16.173 port 1719 ssh2
Nov 8 16:49:40 testdb sshd[16536]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Nov 8 16:49:40 testdb sshd[16538]: fatal: setresuid 500: Resource temporarily unavailable
Nov 8 16:49:40 testdb sshd[16536]: pam_unix(sshd:session): session closed for user oracle
Nov 8 16:49:45 testdb sshd[16539]: Accepted password for oracle from 192.168.16.173 port 1720 ssh2
Nov 8 16:49:45 testdb sshd[16539]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Nov 8 16:49:45 testdb sshd[16541]: fatal: setresuid 500: Resource temporarily unavailable
Nov 8 16:49:45 testdb sshd[16539]: pam_unix(sshd:session): session closed for user oracle


[root@testdb log]# tail -n 200 -f cron
Nov 8 16:49:01 testdb crond[16535]: (CRON) setuid failed: (Resource temporarily unavailable)
Nov 8 16:49:01 testdb crond[16535]: CRON (oracle) ERROR: failed to open PAM security session: Resource temporarily unavailable
Nov 8 16:49:01 testdb crond[16535]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:50:01 testdb crond[16548]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Nov 8 16:50:01 testdb crond[16547]: (CRON) setuid failed: (Resource temporarily unavailable)
Nov 8 16:50:01 testdb crond[16547]: CRON (oracle) ERROR: failed to open PAM security session: Resource temporarily unavailable
Nov 8 16:50:01 testdb crond[16547]: CRON (oracle) ERROR: cannot set security context


2.
Restore the hard nproc value, but leave soft nproc at 10:
oracle soft nproc 10
oracle hard nproc 131072
Logging in as oracle still fails, with the same messages in the logs.

3.
Set soft nproc to 131072 as well.
Now the oracle user can log in and the crontab job runs:
[root@testdb log]# tail -n 200 -f secure
Nov 8 16:53:33 testdb sshd[16581]: Accepted password for oracle from 192.168.16.173 port 1753 ssh2
Nov 8 16:53:33 testdb sshd[16581]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
[root@testdb log]# tail -n 200 -f cron
Nov 8 16:52:01 testdb crond[16568]: CRON (oracle) ERROR: failed to open PAM security session: Resource temporarily unavailable
Nov 8 16:52:01 testdb crond[16568]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:53:01 testdb crond[16577]: (oracle) CMD (sh /home/oracle/tools/test.sh >> /home/oracle/test1.log)
Nov 8 16:54:01 testdb crond[16625]: (oracle) CMD (sh /home/oracle/tools/test.sh >> /home/oracle/test1.log)

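Summary: if nproc is set too low, you cannot log in to the OS as the oracle user, /var/log/secure reports "Resource temporarily unavailable", and crontab jobs owned by oracle cannot run either. As step 2 shows, the soft value is the one that is actually enforced, so raising only the hard value does not help.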
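To judge how close a user is to the nproc limit, it helps to count what the kernel counts. On Linux the limit applies to tasks (threads as well as processes) owned by the real user ID. The sketch below walks /proc directly instead of relying on a particular `ps` implementation; both function names are hypothetical, and the second one assumes /proc/self/limits is available.

```shell
# Count tasks (processes and threads) whose real UID matches the argument.
count_tasks_for_uid() (
  target_uid=$1
  count=0
  for status_file in /proc/[0-9]*/task/[0-9]*/status; do
    # Tasks can exit while we iterate; skip the ones that vanished.
    real_uid=$(awk '/^Uid:/ {print $2; exit}' "$status_file" 2>/dev/null) || continue
    if [ "$real_uid" = "$target_uid" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
)

# Print the soft nproc limit of the calling environment (a number or "unlimited").
soft_nproc_limit() {
  awk '/^Max processes/ {print $3}' /proc/self/limits
}
```

Running `count_tasks_for_uid "$(id -u)"` as oracle and comparing the result with `soft_nproc_limit` gives a rough idea of the remaining headroom. The count is a snapshot and includes the short-lived helper processes the function itself starts.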

nofile parameter test
1.
Set it to 2:
oracle soft nofile 2
oracle hard nofile 2
oracle soft nproc 131072
oracle hard nproc 131072

Connecting to the database server through SecureCRT now pops up an error window saying "socket: Too many open files".
[root@testdb log]# tail -n 10 -f secure
Nov 8 16:41:59 testdb sshd[16392]: Accepted password for oracle from 192.168.16.173 port 1651 ssh2
Nov 8 16:41:59 testdb sshd[16392]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Nov 8 16:41:59 testdb sshd[16394]: Disconnecting: socket: Too many open files
Nov 8 16:41:59 testdb sshd[16392]: pam_unix(sshd:session): session closed for user oracle

[root@testdb log]# tail -n 200 -f cron
Nov 8 16:44:01 testdb crond[16424]: System error
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: cannot set security context

2.
Change it to 20, and login works:
oracle soft nofile 20
oracle hard nofile 20

[root@testdb log]# tail -n 10 -f secure
Nov 8 16:44:45 testdb sshd[16426]: Accepted password for oracle from 192.168.16.173 port 1671 ssh2
Nov 8 16:44:45 testdb sshd[16426]: pam_unix(sshd:session): session opened for user oracle by (uid=0)

[root@testdb log]# tail -n 200 -f cron
Nov 8 16:44:01 testdb crond[16424]: System error
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:45:01 testdb crond[16468]: (oracle) CMD (sh /home/oracle/tools/test.sh >> /home/oracle/test1.log)
The crontab job can run now.

3.
Set the soft value to 10:
oracle soft nofile 10
oracle hard nofile 20
Logging in as oracle fails again:
[root@testdb log]# tail -n 10 -f secure
Nov 8 16:46:47 testdb sshd[16504]: Accepted password for oracle from 192.168.16.173 port 1694 ssh2
Nov 8 16:47:01 testdb crond[16512]: pam_succeed_if(crond:session): error retrieving information about user 0
Nov 8 16:47:01 testdb crond[16512]: pam_unix(crond:session): session opened for user oracle by (uid=0)
Nov 8 16:47:01 testdb crond[16512]: PAM audit_open() failed: Too many open files
Nov 8 16:47:01 testdb crond[16512]: pam_succeed_if(crond:session): error retrieving information about user 0
Nov 8 16:47:01 testdb crond[16512]: pam_unix(crond:session): session closed for user oracle
Nov 8 16:47:01 testdb crond[16512]: PAM audit_open() failed: Too many open files
[root@testdb log]# tail -n 200 -f cron
Nov 8 16:47:01 testdb crond[16512]: System error
Nov 8 16:47:01 testdb crond[16512]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:47:01 testdb crond[16512]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:48:01 testdb crond[16521]: System error
Nov 8 16:48:01 testdb crond[16521]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:48:01 testdb crond[16521]: CRON (oracle) ERROR: cannot set security context

However, a remote connection to the database from PL/SQL Developer still works.

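Summary: if nofile is set too low, you likewise cannot log in to the OS as the oracle user, the error reported is "Too many open files", and crontab jobs fail as well. A soft value that is too low is enough to keep the affected programs from running, even when the hard value is higher.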
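The soft-limit effect can be reproduced without touching limits.conf. The sketch below, with a hypothetical function name, lowers only the soft nofile limit inside a throwaway subshell and then attempts a single open, so the calling shell keeps its own limits:

```shell
# Run one open() attempt under a temporary soft nofile limit.
# The whole body is a subshell, so the caller's limits are not changed.
try_open_with_soft_nofile() (
  soft_limit=$1
  # Lowering the soft limit needs no privileges; the hard limit is untouched.
  ulimit -S -n "$soft_limit" || exit 2
  # The input redirection needs one free descriptor below the soft limit.
  if ( : </dev/null ); then
    echo "open succeeded"
  else
    echo "open failed"
  fi
)
```

With a limit of 3, only descriptors 0 to 2 fit under the limit, so the redirection cannot get a new descriptor and the shell prints a "Too many open files" style message on stderr. With a more generous value such as 64, the same open is expected to go through.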


From "ITPUB Blog", link: http://blog.itpub.net/15480802/viewspace-720722/. Please credit the source when reposting; otherwise legal liability will be pursued.
