ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 某台机器下语句写法不同结果不同

某台机器下语句写法不同结果不同

原创 Linux操作系统 作者:lsl031 时间:2011-08-17 09:09:30 0 删除 编辑

select repolicyno,rectimes   from grreplyshare group by repolicyno,rectimes having sum(sharerate)=90
这个语句的结果直在247-230条记录间变化。
select repolicyno,rectimes   from grreplyshare group by repolicyno,rectimes having sum(sharerate)=90
order by repolicyno
则是保持244条记录不变。
不知道为什么?这三个列是没有null值的。
详细看后,实际上有些select repolicyno,rectimes   from grreplyshare group by repolicyno,rectimes having sum(sharerate)=90 出现在这个结果里的数据实际上sum(sharerate)=100 而不是90

错误语句的执行计划:
select repolicyno,rectimes   from grreplyshare group by repolicyno,rectimes having sum(sharerate)=90
Operation        Object Name        Rows        Bytes        Cost        Object Node        In/Out        PStart        PStop

SELECT STATEMENT Optimizer Mode=ALL_ROWS                253                   6963                                                      
  FILTER                                                                                        
    HASH GROUP BY                253          6 K        6963                                                      
      TABLE ACCESS FULL        HKTESTDB.GRREPLYSHARE        1 M        25 M        6690          
是hash group by


正确结果:
select repolicyno,rectimes   from grreplyshare group by repolicyno,rectimes having sum(sharerate)=90
order by repolicyno
Operation        Object Name        Rows        Bytes        Cost        Object Node        In/Out        PStart        PStop

SELECT STATEMENT Optimizer Mode=ALL_ROWS                253                   6963                                                      
  FILTER                                                                                        
    SORT GROUP BY                253          6 K        6963                                                      
      TABLE ACCESS FULL        HKTESTDB.GRREPLYSHARE        1 M        25 M        6690                                                      
是SORT GROUP BY的。

select * from (select repolicyno,sum(sharerate) a,rectimes from  grreplyshare  group by repolicyno,rectimes )where a=90 order by repolicyno
select * from (select repolicyno,sum(sharerate) a,rectimes from  grreplyshare  group by repolicyno,rectimes )where a=90
两者结果也不一样。
有order by的结果就是224条,是正确的结果,而没有的结果就不确定。有几条记过sum结果实际上是100的都列出来了

我把这个表的数据用create table as 的方式拿到同一台服务器上另外的数据库上,就不会出现这样的结果。
难道是和服务器某个参数相关?相关的参数应该和nls有关(奇怪的是出问题的结果是不确定的),但我比对了两者间的参数,没有找到任何异常。
到metalink上查sql结果不正确的问题,其中有一个是需要做10046的trace跟踪。


Please upload 10046 trace for the query to Metalink.
If you can obtain both the correct and incorrect results with a workaround then please provide 10046 trace for both.

To gather 10046 trace at the session level:

SQL> alter session set timed_statistics = true;
SQL> alter session set statistics_level=all;
SQL> alter session set max_dump_file_size = unlimited;
SQL> alter session set events '10046 trace name context forever,level 12';

SQL> -- run your select(s) --

SQL> select * from dual;
SQL> exit;


按照上面方法做dump后结果如下:
PARSING IN CURSOR #1 len=101 dep=0 uid=55 ct=3 lid=55 tim=1327097076347 hv=2052279411 ad='36830788'
select repolicyno,rectimes from  grreplyshare  group by repolicyno,rectimes having sum(sharerate)=90
END OF STMT
PARSE #1:c=10000,e=11381,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1327097076343
BINDS #1:
EXEC #1:c=0,e=81,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1327097076488
WAIT #1: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=312 tim=1327097076519
WAIT #1: nam='SQL*Net message from client' ela= 68079 driver id=1413697536 #bytes=1 p3=0 obj#=312 tim=1327097144655
WAIT #1: nam='direct path write temp' ela= 511 file number=202 first dba=105097 block cnt=7 obj#=312 tim=1327097215875
WAIT #1: nam='direct path write temp' ela= 496 file number=202 first dba=105111 block cnt=7 obj#=312 tim=1327097216557
WAIT #1: nam='direct path write temp' ela= 279 file number=202 first dba=105125 block cnt=7 obj#=312 tim=1327097217221
WAIT #1: nam='direct path write temp' ela= 481 file number=202 first dba=105139 block cnt=7 obj#=312 tim=1327097217890
WAIT #1: nam='direct path write temp' ela= 530 file number=202 first dba=105153 block cnt=7 obj#=312 tim=1327097218545
WAIT #1: nam='direct path write temp' ela= 597 file number=202 first dba=105167 block cnt=7 obj#=312 tim=1327097219225
WAIT #1: nam='direct path write temp' ela= 410 file number=202 first dba=105181 block cnt=7 obj#=312 tim=1327097219891
WAIT #1: nam='direct path write temp' ela= 434 file number=202 first dba=105195 block cnt=7 obj#=312 tim=1327097220551
WAIT #1: nam='direct path write temp' ela= 407 file number=202 first dba=105209 block cnt=7 obj#=312 tim=1327097221226
WAIT #1: nam='direct path write temp' ela= 5 file number=202 first dba=105609 block cnt=7 obj#=312 tim=1327097222033
WAIT #1: nam='direct path write temp' ela= 437 file number=202 first dba=105623 block cnt=7 obj#=312 tim=1327097222698
WAIT #1: nam='direct path write temp' ela= 390 file number=202 first dba=105637 block cnt=7 obj#=312 tim=1327097223361
WAIT #1: nam='direct path write temp' ela= 245 file number=202 first dba=105651 block cnt=7 obj#=312 tim=1327097224034
WAIT #1: nam='direct path write temp' ela= 613 file number=202 first dba=105665 block cnt=7 obj#=312 tim=1327097224713
WAIT #1: nam='direct path write temp' ela= 128 file number=202 first dba=105679 block cnt=7 obj#=312 tim=1327097225368
WAIT #1: nam='direct path write temp' ela= 5 file number=202 first dba=105693 block cnt=7 obj#=312 tim=1327097228151
WAIT #1: nam='direct path write temp' ela= 9 file number=202 first dba=105707 block cnt=7 obj#=312 tim=1327097279993
WAIT #1: nam='direct path write temp' ela= 5 file number=202 first dba=105118 block cnt=7 obj#=312 tim=1327097281446
WAIT #1: nam='direct path write temp' ela= 6 file number=202 first dba=105728 block cnt=7 obj#=312 tim=1327097282210
WAIT #1: nam='direct path write temp' ela= 445 file number=202 first dba=104969 block cnt=7 obj#=312 tim=1327097282800
WAIT #1: nam='direct path write temp' ela= 270 file number=202 first dba=105202 block cnt=7 obj#=312 tim=1327097283482
WAIT #1: nam='direct path write temp' ela= 758 file number=202 first dba=104990 block cnt=7 obj#=312 tim=1327097284778
WAIT #1: nam='direct path write temp' ela= 0 file number=202 first dba=104997 block cnt=7 obj#=312 tim=1327097284967
WAIT #1: nam='direct path write temp' ela= 326 file number=202 first dba=105146 block cnt=7 obj#=312 tim=1327097285634
WAIT #1: nam='direct path write temp' ela= 347 file number=202 first dba=105160 block cnt=7 obj#=312 tim=1327097286304
WAIT #1: nam='direct path write temp' ela= 732 file number=202 first dba=105700 block cnt=7 obj#=312 tim=1327097287088
WAIT #1: nam='direct path write temp' ela= 156 file number=202 first dba=105132 block cnt=7 obj#=312 tim=1327097287744
WAIT #1: nam='direct path write temp' ela= 159 file number=202 first dba=105174 block cnt=7 obj#=312 tim=1327097288527
WAIT #1: nam='direct path write temp' ela= 5 file number=202 first dba=105046 block cnt=7 obj#=312 tim=1327097289299
WAIT #1: nam='direct path write temp' ela= 319 file number=202 first dba=105053 block cnt=7 obj#=312 tim=1327097289835
WAIT #1: nam='direct path write temp' ela= 1147 file number=202 first dba=105067 block cnt=7 obj#=312 tim=1327097291151
WAIT #1: nam='direct path write temp' ela= 1 file number=202 first dba=105074 block cnt=7 obj#=312 tim=1327097291232
WAIT #1: nam='direct path write temp' ela= 79 file number=202 first dba=105216 block cnt=7 obj#=312 tim=1327097291897
WAIT #1: nam='direct path write temp' ela= 938 file number=202 first dba=105993 block cnt=7 obj#=312 tim=1327097293238
WAIT #1: nam='direct path write temp' ela= 1 file number=202 first dba=106000 block cnt=7 obj#=312 tim=1327097293285
WAIT #1: nam='direct path write temp' ela= 297 file number=202 first dba=105188 block cnt=7 obj#=312 tim=1327097293945
WAIT #1: nam='direct path write temp' ela= 478 file number=202 first dba=105672 block cnt=7 obj#=312 tim=1327097294597
.............

 

看到很多的direct path write temp等待事件(结果集太大,使用到了磁盘排序),为什么不在内存中就排序呢,察看pga的参数
select * from V$PARAMETER where name like '%pga%'
pga_aggregate_target居然只有10M,

alter system set pga_aggregate_target=1024M scope=both;
后结果正常,一直为224条记录。


其中棉花糖的回复:
_gby_hash_aggregation_enabled设置成false看看第一个。

根据这个提示到网上找到以下内容:

在10gR2中,group by由以前的sort group by改成了hash group by,这种算法上的改进,取消了sort group by必须进行的排序操作,即然是用hash算法,就存在碰撞的可能性,itpub的godlessme就碰到这样的问题,应该算是bug吧。

    下面给大家演示一下如何解决这种问题,其实要解决hash group by引起的排序不准确的问题,就是还用以前的sort group by就可以啦,10gR2中引入_gby_hash_aggregation_enabled隐藏参数,该参数默认设置为true,将它改成false即可。

This problem is introduced in the 10.2.0.1 base release.

Wrong results possible from the result of a non-distinct aggregation
with a group by (such as sum(col)) when HASH GROUP BY is used.
The wrong results show as missing values from the aggregate.


Workaround:
Increase the memory used by hash group by
or
Disable the use of hash group by
eg: set optimizer_features_enable to "10.1.0"
or
set "_gby_hash_aggregation_enabled" to FALSE.
Versions confirmed as being affected 10.2.0.1
10.2.0.2

This issue is fixed in 10.2.0.3 (Server Patch Set)
11g (Future version)

 

 SQL> col ksppinm format a39
    SQL> col ksppstvl format a39
    SQL> select ksppinm, ksppstvl
     from x$ksppi pi, x$ksppcv cv
     where cv.indx=pi.indx and pi.ksppinm like '_%' escape ''
     and pi.ksppinm like '%¶meter%';

    Enter value for parameter: gby
    old 4: and pi.ksppinm like '%¶meter%'
    new 4: and pi.ksppinm like '%gby%'

    KSPPINM KSPPSTVL
    -------------------- ------------
    _gby_onekey_enabled TRUE
    _gby_hash_aggregation_enabled TRUE

    SQL> alter session set "_gby_hash_aggregation_enabled"=false;

alter system set "_gby_hash_aggregation_enabled"=false scope=SPFILE;

最终确定是bug。
hash group by, 通过_gby_hash_aggregation_enabled隐藏参数=false可以设置成原来的排序结果集) 但从严谨的角度看还是建议新的代码里最好有order by 子句

解决办法:
1.代码里最好有order by 子句
2.修改pga,增大pga。
3.修改隐含参数。
4.升级版本,打补丁。p4604970_10202_LINUX.zip。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/288166/viewspace-705047/,如需转载,请注明出处,否则将追究法律责任。

下一篇: 蝴蝶效应
请登录后发表评论 登录
全部评论

注册时间:2009-03-24

  • 博文量
    56
  • 访问量
    799681