ITPub博客

首页 > Linux操作系统 > Linux操作系统 > ITPUB SQL大赛第二期(二)

ITPUB SQL大赛第二期(二)

原创 Linux操作系统 作者:yangtingkun 时间:2011-03-29 23:20:31 0 删除 编辑

贴一下第二期的第二道题。

第二期题目参考:http://www.itpub.net/thread-1403356-1-1.html

版主newkid点评参考:http://www.itpub.net/thread-1411980-1-1.html

 

 

简单描述一下问题:

问题2(40)

在一个N*N的棋盘中,每行放置M个球,每列、每个45度的斜线上最多放置M个球,其中5<=N<=6,1<=M<=(N-1),现要求出每个M,N组合中最多摆放球的不同的摆法的个数(包括重复和不重复的,分别输出)。请用最多两条SQL语句得到以下结果:(以M=2, N=5为例)

两条SQL的输出格式:

SQL1:

M      N         AllCnt

2      5          92

 

SQL2:

M      N         NoReptCnt

2      5          xx

 

 

一条SQL的输出格式:

M      N         AllCnt           NoReptCnt

2      5          92                xx

 

Oracle变量定义如下(以M=2, N=5为例):

var m number;

exec :m:=2;

var n number;

exec :n:=5;

 

这道题其实就是在第一题的基础上要求了扩展性,虽然题目要求5<=N<=6,但是真正扩展性好的解题应该是2<=N,当然由于算法和机器性能的限制,可能N最多也就支持到7左右。

SQL> VAR N NUMBER
SQL> VAR M NUMBER
SQL> EXEC :N := 5

PL/SQL 过程已成功完成。

SQL> EXEC :M := 2

PL/SQL 过程已成功完成。

SQL> SET PAGES 100 LINES 120 TIMING ON

SQL> with i as
  2  (select rownum i from dual connect by rownum <= :n),
  3  j as
  4  (select rownum j from dual connect by rownum <= :n),
  5  position as
  6  (select i, j
  7  from i, j),
  8  b as
  9  (select rownum - 1 b from dual connect by rownum <= 2),
 10  b_line as
 11  (select replace(sys_connect_by_path(b, ','), ',', '') line
 12  from b
 13  where level = :n
 14  connect by level <= :n),
 15  lines as
 16  (select rownum, line from b_line
 17  where instr(line, 1, 1, :m) > 0
 18  and instr(line, 0, 1, :n - :m) > 0),
 19  lines_result as
 20  (select replace(sys_connect_by_path(line, ','), ',', '') result
 21  from lines
 22  where level = :n
 23  connect by level <= :n)
 24  select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, res2, res3, res4, res5, res6, res7, res8)) NoReptCnt
 25  from (
 26  select result res1, reverse(result) res2,
 27  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + 1, :n)), ',', '') from (select i from i order by i desc)) res3,
 28  (select replace(wmsys.wm_concat(reverse(substr(result, (i-1)*:n + 1, :n))), ',', '') from (select i from i order by i)) res4,
 29  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j desc, i)) res5,
 30  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j, i)) res6,
 31  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j, i desc)) res7,
 32  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j desc, i desc)) res8
 33  from (
 34  select result, j col, (j-i) l, (j+i) r, sum(substr(result, (i-1)*:n + j, 1)) c
 35  from lines_result, position
 36  group by grouping sets ((result, j), (result, (j-i)), (result, (j+i))))
 37  group by result
 38  having max(case when l is null and r is null then c end) = :m
 39  and max(case when col is null and l is null then c end) <= :m
 40  and max(case when col is null and r is null then c end) <= :m
 41  );

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         2          5         92         14

已用时间:  00: 00: 22.43

这是最初的SQL,以实现为主要目的。主要的思路仍然是仿照第一题的似乎,只不过这里不能确定多少个表进行连接,因此构造的工作由树形查询来完成。

首先仍然是确定每行两个球,然后将符合条件的行构造数组。将数据结果和位置进行关联,分别对jj+ij-i进行聚集,聚集的结果分别对应列、左对角线和右对角线,然后利用having语句过滤不满足条件的列,最终的结果就是符合条件的数组。

在符合条件的数组上,分别进行7种变形,然后最终通过和第一题一样的count distinct greatest方式,获取不冲突的值。

虽然结果满足要求,但是本来满足条件的数组就不少,还多次和N*N的数组进行笛卡尔积,肯定性能不会太好。另外这种方式大量利用了wmsys.wm_concat,性能不但较低,而且这个函数不属于Oracle的标准函数,不一定在所有的数据库中都会存在。

当前这个SQLM=2N=5需要将近23秒,跑M=2N=6基本上就出不来结果了。

SQL> with i as
  2  (select rownum i from dual connect by rownum <= :n),
  3  j as
  4  (select rownum j from dual connect by rownum <= :n),
  5  position as
  6  (select i, j from i, j),
  7  trim as
  8  (select sys_connect_by_path(level -1, ' ') str
  9  from dual
 10  where level = :m + 1
 11  connect by level <= :m + 1),
 12  b as
 13  (select rownum - 1 b from dual connect by rownum <= 2),
 14  b_line as
 15  (select sys_connect_by_path(b, ',') line
 16  from b
 17  where level = :n
 18  connect by level <= :n),
 19  lines as
 20  (select rownum, replace(line, ',', '') line from b_line
 21  where instr(line, 1, 1, :m) > 0
 22  and instr(line, 0, 1, :n - :m) > 0),
 23  lines_result as
 24  (select replace(sys_connect_by_path(line, ','), ',', '') result
 25  from lines
 26  where level = :n
 27  connect by level <= :n)
 28  select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, reverse(res1), res2, reverse(res2), res3, reverse(res3), res4, reverse(res4))) NoReptCnt
 29  from (
 30  select result res1,
 31  (select replace(sys_connect_by_path(reverse(substr(result, (level-1)*:n + 1, :n)), ','), ',', '') from dual where level = :n connect by level <= :n) res2,
 32  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j desc, i)) res3,
 33  (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j, i)) res4
 34  from lines_result, i
 35  group by result
 36  having sum(substr(result, (i-1)*:n + 1, :n)) = to_number(lpad(:m, :n, :m))
 37  and ltrim(sum(substr(result, (i-1)*:n + 1, :n)*power(10, i-1)), (select str from trim)) is null
 38  and ltrim(sum(substr(result, (i-1)*:n + 1, :n)*power(10, :n-i)), (select str from trim)) is null
 39  );

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         2          5         92         14

已用时间:  00: 00: 03.48
SQL> exec :n := 6

PL/SQL 过程已成功完成。

已用时间:  00: 00: 00.00
SQL> /

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         2          6       1097        155

已用时间:  00: 08: 23.27

这是第一次大规模优化后的结果,跑M=2N=5需要3秒多,是原始SQL1/7,而M=2,N=6也可以得到结果,但是用了8分钟之久,速度仍然太慢。

这里优化主要在两个方面,首先去掉了对数组和位置的笛卡尔积,而变成了与行的笛卡尔积,这使得最终结果小了N倍,同时把GROUPING SET聚集方式又修改为ltrim方式。然后优化了RESULT的变形,除了本身结果外,只需要在获得3种变形,然后利用REVERSE获得另外的4种变形。

当前的效率仍然很低,主要原因在于使用树形查询没有办法在运行的中间将不符合条件的结果过滤掉,为了解决这个问题,只能使用11.2的新特性递归WITH语句。

递归WITH11.2刚出来的时候,作为新特性研究了一下,不过并不深入。现在别说递归WITH,就是WITH语句,甚至SELECT语句,写的都少得多了。所以基本上仍然处于现学现卖的过程,经过若干次小的优化之后,提交的最终SQL如下:

SQL> with trim as
  2  (select sys_connect_by_path(level - 1, ' ') str
  3  from dual
  4  where level = :m + 1
  5  connect by level <= :m + 1),
  6  line_count as
  7  (select (select round(power(2, sum(log(2, rownum)))) from dual connect by rownum <= :n)
  8     / (select round(power(2, sum(log(2, rownum)))) from dual connect by rownum <= :m)
  9     / (select round(power(2, sum(log(2, rownum)))) from dual connect by rownum <= :n - :m)
 10  from dual),
 11  b as
 12  (select rownum - 1 b from dual connect by rownum <= 2),
 13  b_line (line, cnt) as
 14  (select b || '', b from b
 15  union all
 16  select b || line, b + cnt
 17  from b, b_line
 18  where cnt <= :m)
 19  cycle line set dup_line to 'y' default 'n',
 20  lines as
 21  (select line
 22  from b_line
 23  where length(line) = :n
 24  and cnt = :m
 25  and rownum <= (select * from line_count)),
 26  line_result (result, cnt, r, l) as
 27  (select line || '', to_number(line), to_number(line), to_number(line)
 28  from lines
 29  union all
 30  select line || result, to_number(line) + cnt, to_number(line) + 10*r, to_number(line) + l/10
 31  from lines, line_result
 32  where ltrim(cnt, (select str from trim)) is null
 33  and ltrim(r, (select str from trim)) is null
 34  and ltrim(l, (select str || '.' from trim)) is null
 35  and instr(case when length(result) >= :n*(:n-:m+1) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '0', 1) = 0
 36  and instr(case when :n-:m+2 < :n and length(result) >= :n*(:n-:m+2) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '1', 1) = 0)
 37  cycle result set dup_result to 'y' default 'n',
 38  result as
 39  (select result, cnt, r, l from line_result
 40  where rownum <= power((select * from line_count), :n))
 41  select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, reverse(res1), res2, reverse(res2), res3, reverse(res3), res4, reverse(res4))) NoReptCnt
 42  from (
 43  select result res1,
 44  (select replace(sys_connect_by_path(reverse(substr(result, (level-1)*:n + 1, :n)), ','), ',', '') from dual where level = :n connect by level <= :n) res2,
 45  (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + ceil(level/:n), 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res3,
 46  (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + :n-ceil(level/:n)+1, 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res4
 47  from result
 48  where length(result) = :n*:n
 49  and cnt = to_number(lpad(:m, :n, :m))
 50  and ltrim(r, (select str from trim)) is null
 51  and ltrim(l, (select str || '.' from trim)) is null
 52  );

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         2          6       1097        155

已用时间:  00: 00: 06.26
SQL> exec :m := 3

PL/SQL 过程已成功完成。

已用时间:  00: 00: 00.00
SQL> /

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         3          6      14412       1811

已用时间:  00: 03: 25.54

采用递归WITH的最大好处就是可以在每层迭代的同时过滤掉不满足条件的数据,使得查询速度有了质的飞跃,运行M=2N=6现在只需要6秒多点,不过最耗时的M=3N=6还需要3分半钟,不是很令人满意。

简单介绍一下思路:trim字句是获取ltrim的字符串;line_count用于计算Cnm的值;两层递归WITH查询分别构造行记录和数组记录,其中第二个构造数组结果的递归WITH,还同时求出了列和两个对角线的和,在递归的过程中可以用LTRIM函数过滤列和对角线来清除不满足条件的记录。

对获得的结果,利用树形查询替代了wmsys.wm_concat对结果进行转换,获取另外三种变形,然后利用REVERSE获取其他四种变形,最终利用COUNT DISTINCT GREATEST来获得不重复的记录。

由于最近工作上的事情很多,导致第二题没有足够的时间进行思考和优化,因此虽然明知道性能不是很满意,也没有办法只能提交了。

不过在做第三题的时候,发现第二题还有很多可以优化的地方,只需要做些简单的修改,在递归的时候就可以过滤更多的无用数据:

SQL> with trim as
  2  (select sys_connect_by_path(level - 1, ' ') str
  3  from dual
  4  where level = :m + 1
  5  connect by level <= :m + 1),
  6  b as
  7  (select rownum - 1 b from dual connect by rownum <= 2),
  8  b_line (n, line, cnt) as
  9  (select 1 n, to_char(b), b from b
 10  union all
 11  select n + 1, line || b, cnt + b
 12  from b, b_line
 13  where n < :n
 14  and cnt <= :m),
 15  lines as
 16  (select line
 17  from b_line
 18  where cnt = :m
 19  and length(line) = :n),
 20  line_result (n, result, cnt, r, l) as
 21  (select 1 n, line, to_number(line), to_number(line), to_number(line)
 22  from lines
 23  union all
 24  select n + 1, result || line, to_number(line) + cnt, to_number(line) + 10*r, to_number(line) + l/10
 25  from lines, line_result
 26  where n < :n
 27  and ltrim(to_number(line) + cnt, (select str from trim)) is null
 28  and ltrim(to_number(line) + 10*r, (select str from trim)) is null
 29  and ltrim(to_number(line) + l/10, (select str || '.' from trim)) is null
 30  and instr(case when length(result) >= :n*(:n-:m+1) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '0', 1) = 0
 31  and instr(case when :n-:m+2 < :n and length(result) >= :n*(:n-:m+2) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '1', 1) = 0)
 32  select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, reverse(res1), res2, reverse(res2), res3, reverse(res3), res4, reverse(res4))) NoReptCnt
 33  from (
 34  select result res1,
 35  (select replace(sys_connect_by_path(reverse(substr(result, (level-1)*:n + 1, :n)), ','), ',', '') from dual where level = :n connect by level <= :n) res2,
 36  (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + ceil(level/:n), 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res3,
 37  (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + :n-ceil(level/:n)+1, 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res4
 38  from line_result
 39  where length(result) = :n*:n
 40  and cnt = to_number(lpad(:m, :n, :m))
 41  and ltrim(r, (select str from trim)) is null
 42  and ltrim(l, (select str || '.' from trim)) is null
 43  );

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         2          6       1097        155

已用时间:  00: 00: 01.37
SQL> exec :m := 3

PL/SQL 过程已成功完成。

已用时间:  00: 00: 00.00
SQL> /

         M          N     ALLCNT  NOREPTCNT
---------- ---------- ---------- ----------
         3          6      14412       1811

已用时间:  00: 00: 25.14

速度又提高了将近8倍。对于M=3N=6而言,26秒的结果应该算是比较满意了,而且代码长度减少了1/5

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-691264/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
暂无介绍

注册时间:2007-12-29

  • 博文量
    1955
  • 访问量
    10526208