[20200220] Setting keepalive parameters on Windows.txt

Original  Oracle  Author: lfree  Date: 2020-02-20 11:23:15

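--//Yesterday I tested ENABLE=BROKEN in the connect string and found that it turns on the TCP keep-alive feature on the
--//client side. The default tcp_keepalive_time is 7200 seconds, though, which is rather long. Many clients and
--//middle-tier servers run Windows, so how is the registry changed there?

--//A search turned up this link: http://www.cppblog.com/Robertxiao/articles/153510.html
1) On Windows NT platforms, use regedit to edit the system registry and change the following three values under
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters: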

KeepAliveInterval        : set to 1000
KeepAliveTime            : set to 300000 (in milliseconds; 300000 is 5 minutes)
TcpMaxDataRetransmissions: set to 5
--//Let me try this on my work machine. Note: my test environment is Windows 7.

1. Modify the registry:

REGEDIT4

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters]
"KeepAliveTime"=dword:00001770
"KeepAliveInterval"=dword:000003e8
"MaxDataRetries"="5"

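--//KeepAliveTime = 0x1770 = 6000 (milliseconds, i.e. 6 seconds)
--//KeepAliveInterval = 0x000003e8 = 1000 (milliseconds, i.e. 1 second)
--//Note: I did not know whether the value is named MaxDataRetries or TcpMaxDataRetransmissions; Windows technical
--//documentation is scarce. If anyone knows, please tell me. In my final tests neither of the two worked.
--//Or, as https://blog.csdn.net/shenya1314/article/details/70187767 describes, it may simply not be settable on the client.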
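
--//The hex/decimal conversion above is easy to get wrong by hand. A minimal sketch of it follows; the helper names
--//ms_to_reg_dword and reg_dword_to_ms are made up for illustration and are not part of any Windows tool.

```python
def ms_to_reg_dword(ms: int) -> str:
    """Format a millisecond value the way a REGEDIT4 file writes a DWORD."""
    if not 0 <= ms <= 0xFFFFFFFF:
        raise ValueError("a DWORD holds values from 0 to 0xFFFFFFFF")
    return f"dword:{ms:08x}"


def reg_dword_to_ms(text: str) -> int:
    """Parse a 'dword:xxxxxxxx' value from a .reg file back into milliseconds."""
    prefix = "dword:"
    if not text.lower().startswith(prefix):
        raise ValueError("expected a value of the form dword:xxxxxxxx")
    return int(text[len(prefix):], 16)


# Values used in this test: 6 s idle time, 1 s probe interval.
print('"KeepAliveTime"=' + ms_to_reg_dword(6000))
print('"KeepAliveInterval"=' + ms_to_reg_dword(1000))
print(reg_dword_to_ms("dword:00001770"), "ms")
```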

2. Test:
--//Server-side settings:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 75
/proc/sys/net/ipv4/tcp_keepalive_probes: 9
/proc/sys/net/ipv4/tcp_keepalive_time: 7200

$ grep SQLNET.EXPIRE_TIME $ORACLE_HOME/network/admin/sqlnet.ora
#SQLNET.EXPIRE_TIME = 1

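--//The server-side tcp_keepalive_time is kept long (7200 s) and SQLNET.EXPIRE_TIME stays commented out, so that the
--//server does not interfere with the test.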
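
--//To see why the server will not interfere within a test that lasts minutes, here is a rough sketch of the usual
--//estimate: idle time plus probe count times probe interval. The function name is made up for illustration, and the
--//numbers are the /proc values listed above.

```python
def seconds_until_dead_peer_detected(idle_s: int, interval_s: int, probes: int) -> int:
    """Idle time before the first probe, plus one interval per unanswered probe."""
    return idle_s + interval_s * probes


# Server values shown above under /proc/sys/net/ipv4/.
server = {
    "tcp_keepalive_time": 7200,
    "tcp_keepalive_intvl": 75,
    "tcp_keepalive_probes": 9,
}

total = seconds_until_dead_peer_detected(
    server["tcp_keepalive_time"],
    server["tcp_keepalive_intvl"],
    server["tcp_keepalive_probes"],
)
print(total, "seconds before the server would give up on a silent client")
```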

sqlplus scott/book@"(DESCRIPTION=(ENABLE=BROKEN)(CONNECT_DATA=(SERVICE_NAME=book)(SERVER = DEDICATED))(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.78)(PORT=1521)))"

SCOTT@book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
         3       2097 1688:7476                DEDICATED 21518                     24        159 alter system kill session '3,2097' immediate;

# netstat -npo 2>/dev/null | grep 21518
tcp        0      0 192.168.100.78:1521         192.168.98.6:56411          ESTABLISHED 21518/oraclebook    keepalive (7177.06/0/0)
--//So the client port is 56411.

# tcpdump -vvnni eth0  port 56411
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
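--//Waited 1 minute with no output at all. Then it occurred to me that a reboot may be needed. I did not know whether
--//disconnecting and reconnecting the network would be enough, so I first tested disabling and re-enabling the connection.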

3. Continue testing:
--//Disable and then re-enable the network connection; details omitted.

sqlplus scott/book@"(DESCRIPTION=(ENABLE=BROKEN)(CONNECT_DATA=(SERVICE_NAME=book)(SERVER = DEDICATED))(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.78)(PORT=1521)))"

SCOTT@book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
        58       1379 8420:2824                DEDICATED 21635                     28        114 alter system kill session '58,1379' immediate;

# netstat -npo 2>/dev/null | grep 21635
tcp        0      0 192.168.100.78:1521         192.168.98.6:57543          ESTABLISHED 21635/oraclebook    keepalive (7084.23/0/0)
--//So the client port is 57543.

# tcpdump -vvnni eth0  port 57543
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
--//Waited 1 minute, still nothing.

4. Test again:
--//Reboot the client test machine.
sqlplus scott/book@"(DESCRIPTION=(ENABLE=BROKEN)(CONNECT_DATA=(SERVICE_NAME=book)(SERVER = DEDICATED))(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.78)(PORT=1521)))"

SCOTT@book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
        58       1391 4052:5612                DEDICATED 22059                     28        119 alter system kill session '58,1391' immediate;

$ netstat -npo 2>/dev/null | egrep "22059"
tcp        0      0 192.168.100.78:1521         192.168.98.6:49682          ESTABLISHED 22059/oraclebook    keepalive (7163.03/0/0)

# tcpdump -vvnni eth0  port 49682
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
10:06:23.007811 IP (tos 0x0, ttl 127, id 3580, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 3057020812:3057020813(1) ack 1803888726 win 16289
10:06:23.007991 IP (tos 0x0, ttl  64, id 63755, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49682: ., cksum 0x47cc (incorrect (-> 0x63a3), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
10:06:29.010284 IP (tos 0x0, ttl 127, id 3611, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:06:29.010324 IP (tos 0x0, ttl  64, id 63756, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49682: ., cksum 0x47cc (incorrect (-> 0x63a3), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
10:06:35.004759 IP (tos 0x0, ttl 127, id 3656, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:06:35.004797 IP (tos 0x0, ttl  64, id 63757, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49682: ., cksum 0x47cc (incorrect (-> 0x63a3), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
10:06:41.013022 IP (tos 0x0, ttl 127, id 3695, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:06:41.013075 IP (tos 0x0, ttl  64, id 63758, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49682: ., cksum 0x47cc (incorrect (-> 0x63a3), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
--//Finally it works. Note that the probes are exactly 6 seconds apart.
10:09:23.021838 IP (tos 0x0, ttl 127, id 4958, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:09:23.021929 IP (tos 0x0, ttl  64, id 63785, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49682: ., cksum 0x47cc (incorrect (-> 0x63a3), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>

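# iptables -I INPUT 1 -p tcp --dport 49682 -j DROP
--//Oddly, the rule above has no effect, because no incoming packet matches it: 49682 is the client's port, so packets
--//arriving on the INPUT chain carry it as the source port, not the destination port.
--//Packets with destination port 49682 (192.168.100.78.1521 > 192.168.98.6.49682) go through the OUTPUT chain.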

# iptables -D INPUT 1
# iptables -I INPUT 1 -p tcp --sport 49682 -j DROP
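
--//Why --dport fails and --sport works can be shown with a toy model. This is only an illustration of port matching,
--//not how netfilter is implemented; the Packet and Rule classes are made up for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_port: int
    dst_port: int


@dataclass(frozen=True)
class Rule:
    sport: Optional[int] = None  # like --sport; None means "any"
    dport: Optional[int] = None  # like --dport; None means "any"

    def matches(self, pkt: Packet) -> bool:
        sport_ok = self.sport is None or self.sport == pkt.src_port
        dport_ok = self.dport is None or self.dport == pkt.dst_port
        return sport_ok and dport_ok


# A keep-alive probe arriving at the server: client port 49682 -> listener port 1521.
incoming_probe = Packet(src_port=49682, dst_port=1521)

first_attempt = Rule(dport=49682)   # the rule that had no effect
second_attempt = Rule(sport=49682)  # the rule that dropped the probes

print("--dport 49682 matches incoming probe:", first_attempt.matches(incoming_probe))
print("--sport 49682 matches incoming probe:", second_attempt.matches(incoming_probe))
```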

10:15:59.099141 IP (tos 0x0, ttl 127, id 7794, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:15:59.099179 IP (tos 0x0, ttl  64, id 63851, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49682: ., cksum 0x47cc (incorrect (-> 0x63a3), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
--//Normal so far. What follows is the traffic after running iptables -I INPUT 1 -p tcp --sport 49682 -j DROP.
10:16:05.101208 IP (tos 0x0, ttl 127, id 7848, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:06.102474 IP (tos 0x0, ttl 127, id 7850, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:07.103134 IP (tos 0x0, ttl 127, id 7859, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:08.103356 IP (tos 0x0, ttl 127, id 7861, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:09.103903 IP (tos 0x0, ttl 127, id 7870, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:10.102031 IP (tos 0x0, ttl 127, id 7872, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:11.100107 IP (tos 0x0, ttl 127, id 7880, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:12.100189 IP (tos 0x0, ttl 127, id 7884, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:13.100203 IP (tos 0x0, ttl 127, id 7892, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:14.100300 IP (tos 0x0, ttl 127, id 7895, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49682 > 192.168.100.78.1521: ., cksum 0xa6ea (correct), 0:1(1) ack 1 win 16289
10:16:15.100914 IP (tos 0x0, ttl 127, id 7903, offset 0, flags [DF], proto: TCP (6), length: 40) 192.168.98.6.49682 > 192.168.100.78.1521: R, cksum 0xe687 (correct), 1:1(0) ack 1 win 0
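--//Eleven packets appear, 1 second apart, so the interval is correct: 10 unanswered probes followed by a RST.
--//Ten probes rather than 5 shows that the registry value MaxDataRetries is not the right one.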
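
--//The packet count and spacing can be checked mechanically. A small sketch, with the client-to-server timestamps
--//copied by hand from the capture above (the CAPTURE list and variable names are mine): ten unanswered probes plus
--//one RST make the eleven packets.

```python
from datetime import datetime

# (timestamp, TCP flag): "." is a keep-alive probe, "R" is the final reset.
CAPTURE = [
    ("10:15:59.099141", "."),  # last probe the server answered
    ("10:16:05.101208", "."),
    ("10:16:06.102474", "."),
    ("10:16:07.103134", "."),
    ("10:16:08.103356", "."),
    ("10:16:09.103903", "."),
    ("10:16:10.102031", "."),
    ("10:16:11.100107", "."),
    ("10:16:12.100189", "."),
    ("10:16:13.100203", "."),
    ("10:16:14.100300", "."),
    ("10:16:15.100914", "R"),
]


def parse(ts: str) -> datetime:
    return datetime.strptime(ts, "%H:%M:%S.%f")


times = [parse(ts) for ts, _ in CAPTURE]
answered, unanswered, reset = times[0], times[1:-1], times[-1]

idle_gap = (unanswered[0] - answered).total_seconds()
retry_gaps = [(b - a).total_seconds() for a, b in zip(unanswered, unanswered[1:])]
total = (reset - answered).total_seconds()

print("unanswered probes:", len(unanswered))
print("idle gap (s):", round(idle_gap))
print("retry gaps (s):", sorted({round(g) for g in retry_gaps}))
print("last answered probe to RST (s):", round(total))
```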

--//Running a SQL statement on the client now fails immediately.
10:12:18 SCOTT@book> set time on escape on
10:12:20 SCOTT@book> select sysdate from dual ;
select sysdate from dual
*
ERROR at line 1:
ORA-03135: connection lost contact
Process ID: 22059
Session ID: 58 Serial number: 1391

5. Checking which parameter controls the retry count:
https://blog.csdn.net/shenya1314/article/details/70187767

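Here setsockopt enables keepalive mode, but the system's default keepalive parameters may not meet our needs (for
example, the peer is probed only after 2 hours of idle time), so the WSAIoctl function sets these parameters through
the tcp_keepalive structure. This structure is defined in the mstcpip.h header:

struct tcp_keepalive {
  ULONG onoff ;   // whether keepalive is enabled
  ULONG keepalivetime ;  // idle time (ms) with no data before heartbeat probes start
  ULONG keepaliveinterval ; // interval (ms) between heartbeat probes;
// 5 probes are sent (default on 2000/XP/2003), 10 probes (default on Vista and later)
};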

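This structure sets the idle time before probing and the interval between repeated probes. For details see MSDN: http://msdn.microsoft.com/en-us/library/dd877220(VS.85).aspx .
According to MSDN, these parameters can also be set in the registry, at: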

HKLM/SYSTEM/CurrentControlSet/Services/Tcpip/Parameters/KeepAliveTime
HKLM/SYSTEM/CurrentControlSet/Services/Tcpip/Parameters/KeepAliveInterval
 
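Also, as some may have noticed, the tcp_keepalive structure has no field for the retry count. That parameter can be set through the registry, at:
HKLM/SYSTEM/CurrentControlSet/Services/Tcpip/Parameters/TcpMaxDataRetransmissions
As for setting these values in the registry, I could not find them on either XP or Server 2008; MSDN seems to say that only Server 2003 is supported. I have not tested this, so I am not sure.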

REGEDIT4

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters]
"KeepAliveTime"=dword:00001770
"KeepAliveInterval"=dword:000003e8
"MaxDataRetries"="5"
"TcpMaxDataRetransmissions"="5"

--//Rebooted and tested again; the other steps are not shown.

# tcpdump -vvnni eth0  port 49513
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
10:41:35.694775 IP (tos 0x0, ttl 127, id 3395, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 4042311713:4042311714(1) ack 2224808308 win 16289
10:41:35.694958 IP (tos 0x0, ttl  64, id 4842, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49513: ., cksum 0x47cc (incorrect (-> 0xd828), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
10:41:41.692546 IP (tos 0x0, ttl 127, id 3440, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:41:41.692584 IP (tos 0x0, ttl  64, id 4843, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49513: ., cksum 0x47cc (incorrect (-> 0xd828), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
10:41:47.694272 IP (tos 0x0, ttl 127, id 3514, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:41:47.694313 IP (tos 0x0, ttl  64, id 4844, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49513: ., cksum 0x47cc (incorrect (-> 0xd828), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>
10:41:53.697620 IP (tos 0x0, ttl 127, id 3586, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:41:53.697664 IP (tos 0x0, ttl  64, id 4845, offset 0, flags [DF], proto: TCP (6), length: 52) 192.168.100.78.1521 > 192.168.98.6.49513: ., cksum 0x47cc (incorrect (-> 0xd828), 1:1(0) ack 1 win 330 <nop,nop,sack 1 {0:1}>

# iptables -I INPUT 1 -p tcp --sport 49513 -j DROP

10:41:59.696732 IP (tos 0x0, ttl 127, id 3653, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:00.691771 IP (tos 0x0, ttl 127, id 3666, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:01.691836 IP (tos 0x0, ttl 127, id 3675, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:02.691884 IP (tos 0x0, ttl 127, id 3686, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:03.692474 IP (tos 0x0, ttl 127, id 3698, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:04.695563 IP (tos 0x0, ttl 127, id 3717, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:05.695603 IP (tos 0x0, ttl 127, id 3725, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:06.696245 IP (tos 0x0, ttl 127, id 3736, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:07.697502 IP (tos 0x0, ttl 127, id 3744, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:08.698103 IP (tos 0x0, ttl 127, id 3755, offset 0, flags [DF], proto: TCP (6), length: 41) 192.168.98.6.49513 > 192.168.100.78.1521: ., cksum 0x420f (correct), 0:1(1) ack 1 win 16289
10:42:09.697217 IP (tos 0x0, ttl 127, id 3764, offset 0, flags [DF], proto: TCP (6), length: 40) 192.168.98.6.49513 > 192.168.100.78.1521: R, cksum 0x81ac (correct), 1:1(0) ack 1 win 0
--//Still 10 probes before the RST, so the setting has no effect. I gave up testing.

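Summary:
1. Testing on Windows is really tedious: 3 reboots in total. I do not know how to make registry changes take effect quickly.
2. I now know where the relevant registry values are located.
3. The retry count appears to default to 10. My settings "MaxDataRetries"="5" and "TcpMaxDataRetransmissions"="5" had no effect, or the count simply cannot be changed. See this link: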

https://docs.microsoft.com/en-us/previous-versions/windows/desktop/legacy/dd877220(v=vs.85)?redirectedfrom=MSDN

/* Argument structure for SIO_KEEPALIVE_VALS */
struct tcp_keepalive {
    u_long  onoff;
    u_long  keepalivetime;
    u_long  keepaliveinterval;
};

The value specified in the onoff member determines if TCP keep-alive is enabled or disabled. If the onoff member is set
to a nonzero value, TCP keep-alive is enabled and the other members in the structure are used. The keepalivetime member
specifies the timeout, in milliseconds, with no activity until the first keep-alive packet is sent. The
keepaliveinterval member specifies the interval, in milliseconds, between when successive keep-alive packets are sent if
no acknowledgement is received.

The SO_KEEPALIVE option, which is one of the SOL_SOCKET Socket Options, can also be used to enable or disable the TCP
keep-alive on a connection, as well as query the current state of this option. To query whether TCP keep-alive is
enabled on a socket, the getsockopt function can be called with the SO_KEEPALIVE option. To enable or disable TCP
keep-alive, the setsockopt function can be called with the SO_KEEPALIVE option. If TCP keep-alive is enabled with
SO_KEEPALIVE, then the default TCP settings are used for keep-alive timeout and interval unless these values have been
changed using SIO_KEEPALIVE_VALS.
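
--//For programs that own the socket, the per-socket route described above can be sketched in Python, whose socket
--//module exposes SIO_KEEPALIVE_VALS on Windows. This is an assumption-laden sketch, not what the Oracle client does
--//internally: the function name enable_keepalive and the non-Windows fallback (TCP_KEEPIDLE / TCP_KEEPINTVL, in
--//seconds) are my additions.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_ms: int = 6000, interval_ms: int = 1000) -> None:
    """Turn on TCP keep-alive for one socket with per-socket timings."""
    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: the tuple mirrors struct tcp_keepalive
        # (onoff, keepalivetime in ms, keepaliveinterval in ms).
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, idle_ms, interval_ms))
        return
    # Other platforms: enable SO_KEEPALIVE, then set timings in whole seconds
    # where the platform exposes the options.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, max(1, idle_ms // 1000))
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, max(1, interval_ms // 1000))


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_keepalive(s)
    print("keep-alive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
```

--//Note that the structure has no member for the probe count, so this route does not change the number of retries either.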

The default settings when a TCP socket is initialized sets the keep-alive timeout to 2 hours and the keep-alive interval
to 1 second. The default system-wide value of the keep-alive timeout is controllable through the KeepAliveTime registry
setting which takes a value in milliseconds. The default system-wide value of the keep-alive interval is controllable
through the KeepAliveInterval registry setting which takes a value in milliseconds.

On Windows Vista and later, the number of keep-alive probes (data retransmissions) is set to 10 and cannot be changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--//This says the count cannot be changed.

On Windows Server 2003, Windows XP, and Windows 2000, the default setting for number of keep-alive probes is 5. The
number of keep-alive probes is controllable through the TcpMaxDataRetransmissions and PPTPTcpMaxDataRetransmissions
registry settings. The number of keep-alive probes is set to the larger of the two registry key values. If this number
is 0, then keep-alive probes will not be sent. If this number is above 255, then it is adjusted to 255.
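
--//The rules quoted above can be condensed into a few lines. This is only my reading of the MSDN text as a sketch; the
--//function name and its parameters are hypothetical.

```python
def keepalive_probe_count(tcp_max: int, pptp_max: int, vista_or_later: bool) -> int:
    """Number of keep-alive probes per the quoted MSDN description.

    tcp_max and pptp_max stand for the TcpMaxDataRetransmissions and
    PPTPTcpMaxDataRetransmissions registry values.
    """
    if vista_or_later:
        # Fixed at 10; the registry values are not consulted.
        return 10
    # Windows Server 2003 / XP / 2000: larger of the two values, capped at 255.
    # A result of 0 means no probes are sent.
    return min(max(tcp_max, pptp_max), 255)


# Windows 7 with the value set to 5, as in this test: still 10 probes.
print(keepalive_probe_count(5, 5, vista_or_later=True))
# The same registry values on an older system would give 5.
print(keepalive_probe_count(5, 5, vista_or_later=False))
```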


Source: ITPUB Blog, link: http://blog.itpub.net/267265/viewspace-2676353/. If you repost, please credit the source; otherwise legal liability will be pursued.
