Oracle’s Lag in MS SQL Server 2005

Original · Linux Operating System · Author: duckula81 · Date: 2011-03-15 14:44:45

I am currently porting our new database from Oracle 10g to MS SQL Server 2005, and I have it all done except the views that use the Oracle LAG and LEAD functions (non-ANSI).

What these functions provide (for the MS SQL camp) is the ability to get the next or previous rows when sorted. In my case I have a value that is the ‘volume change since the start’ at time intervals, and I want the relative change between each interval.

So the Oracle SQL for the view is:

SELECT t.*,
    t.volume - LAG(volume) OVER (PARTITION BY group_number 
                                 ORDER BY timestamp) AS volume_change
FROM volume_table t;

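The PARTITION BY clause splits the data into buckets by group_number, each bucket is sorted by timestamp, and LAG reads the previous row within the same bucket. All rows are returned, and the first row of each bucket has no previous row, so its volume_change is NULL.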
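To make the bucket behaviour concrete, here is a minimal sketch against made-up data in Oracle syntax. The sample_volumes rows and the ts column are hypothetical stand-ins for volume_table and its timestamp column, not part of the real schema.

```sql
-- Hypothetical sample data: two groups, volumes recorded at increasing ts.
WITH sample_volumes AS (
    SELECT 1 AS group_number, 1 AS ts, 5 AS volume FROM dual UNION ALL
    SELECT 1, 2, 8  FROM dual UNION ALL
    SELECT 1, 3, 15 FROM dual UNION ALL
    SELECT 2, 1, 3  FROM dual UNION ALL
    SELECT 2, 2, 4  FROM dual
)
SELECT s.group_number,
       s.ts,
       s.volume,
       -- Previous volume within the same group, ordered by ts
       s.volume - LAG(s.volume) OVER (PARTITION BY s.group_number
                                      ORDER BY s.ts) AS volume_change
FROM sample_volumes s
ORDER BY s.group_number, s.ts;

-- volume_change per row:
--   group 1: NULL, 3, 7
--   group 2: NULL, 1
```

The first row of each group yields NULL because LAG has no earlier row to read, and NULL propagates through the subtraction.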

Asking on the NZ .Net User Group mailing list, I got a pointer to this MS feedback page, but the solution presented there gives me the error “Incorrect syntax near ‘ROWS’.” when I run this query against SQL 2K5:

SELECT MIN(volume) OVER(PARTITION BY group_number 
                        ORDER BY timestamp
                        ROWS BETWEEN 1 PRECEDING 
                        AND 1 PRECEDING) change
FROM volume_table;
GO

I had planned a side point on why I wanted to avoid a sub-select: in Oracle, a different query ran orders of magnitude faster after I changed it to use the LAG function. Yet that same sub-select query runs just as fast in MS SQL Server as the “improved” Oracle statement, so I’ll stick to the main topic and post about that another day…
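For reference, a sub-select version of the relative-change calculation might look roughly like the following on SQL Server 2005. This is only a sketch and not the query referred to above, which is not shown in this post. It assumes that timestamp is unique within each group_number; the aliases v and p are illustrative.

```sql
-- Sketch: correlated sub-select that fetches the previous volume per row.
-- Assumes (group_number, timestamp) identifies a row uniquely.
SELECT v.*,
       v.volume - (SELECT TOP 1 p.volume
                   FROM volume_table p
                   WHERE p.group_number = v.group_number
                     AND p.timestamp < v.timestamp
                   ORDER BY p.timestamp DESC) AS volume_change
FROM volume_table v;
```

For the earliest row in each group the sub-select returns no row, so volume_change comes out as NULL, matching what LAG produces in Oracle.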

Chris recently showed how to use Common Table Expressions (CTEs), a sort of auto-magic temp table, to find the first entries for a day, which is very close to what I want, but the filtering is hard-coded. I could not see how to make it dynamic, so I took the idea and kept massaging the concept until I finally got what I wanted.

Conceptually, the Oracle solution could be implemented with cursors under the hood to provide the rolling previous (LAG) rows, whereas here I’m doing many look-ups, but the table is not re-created each time as in the nested-select method.

So my code is as follows:

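-- Rows: number the entries of each group in timestamp order.
-- Note that vol_diff holds the raw volume at this stage.
WITH Rows( vol_diff, time, rn, gn ) AS
(
    SELECT v.volume,
        v.timestamp,
        Row_Number() OVER (PARTITION BY group_number
                           ORDER BY timestamp),
        group_number
    FROM volume_table v
),
-- PrevRows: pair each row (a) with the row just before it (b) in the same group.
PrevRows( timestamp, prev_vol, group_number) AS
(
    SELECT a.time, b.vol_diff, a.gn
    FROM Rows a
    LEFT JOIN Rows b 
        ON a.rn = b.rn + 1 
        AND a.gn = b.gn
)
-- Join back to the base table; the first row of each group gets a NULL change.
SELECT v.*, v.volume - p.prev_vol as volume_change
FROM volume_table v
LEFT JOIN PrevRows p
    ON v.timestamp = p.timestamp
    AND v.group_number = p.group_number;
GO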

So I use two CTEs: one to partition and sort the data, and a second to do a lag-based join. I can then select the lagged data by matching the time and group to the current entry. It works a treat, and I will do some performance testing tomorrow once my production data has finished loading into my db.
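One possible simplification, offered as an untested sketch rather than as the query used here, is to carry the volume through a single numbered CTE and self-join it, which avoids the final join back to volume_table on timestamp. The names Numbered, cur and prev are illustrative, and only the three columns named in this post are selected; any other columns of volume_table would need to be added to the CTE.

```sql
-- Sketch: one CTE numbers the rows, then a self-join picks up the previous row.
WITH Numbered AS
(
    SELECT v.group_number,
           v.timestamp,
           v.volume,
           ROW_NUMBER() OVER (PARTITION BY v.group_number
                              ORDER BY v.timestamp) AS rn
    FROM volume_table v
)
SELECT cur.group_number,
       cur.timestamp,
       cur.volume,
       cur.volume - prev.volume AS volume_change  -- NULL for the first row of a group
FROM Numbered cur
LEFT JOIN Numbered prev
    ON prev.group_number = cur.group_number
   AND prev.rn = cur.rn - 1;
GO
```

Changing the join condition to prev.rn = cur.rn + 1 would pick up the following row instead, which is the equivalent of Oracle's LEAD.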

Given the results of the query I did not discuss above, I expect the sub-select to be just as performant.

From “ITPUB Blog”, link: http://blog.itpub.net/222840/viewspace-689505/. If you repost, please credit the source; otherwise legal liability will be pursued.
