ITPub博客

首页 > Linux操作系统 > Linux操作系统 > ORACLE分析函数手册二(转)

ORACLE分析函数手册二(转)

原创 Linux操作系统 作者:zhouwf0726 时间:2019-07-13 11:06:07 0 删除 编辑

REGR_ (Linear Regression) Functions 功能描述:这些线性回归函数适合最小二乘法回归线,9个不同的回归函数可使用。 REGR_SLOPE:返回斜率,等于COVAR_POP(expr1, expr2) / VAR_POP(expr2) REGR_INTERCEPT:返回回归线的y截距,等于 AVG(expr1) - REGR_SLOPE(expr1, expr2) * AVG(expr2) REGR_COUNT:返回用于填充回归线的非空数字对的数目 REGR_R2:返回回归线的决定系数,计算式为: If VAR_POP(expr2) = 0 then return NULL If VAR_POP(expr1) = 0 and VAR_POP(expr2) != 0 then return 1 If VAR_POP(expr1) > 0 and VAR_POP(expr2 != 0 then return POWER(CORR(expr1,expr),2) REGR_AVGX:计算回归线的自变量(expr2)的平均值,去掉了空对(expr1, expr2)后,等于AVG(expr2) REGR_AVGY:计算回归线的应变量(expr1)的平均值,去掉了空对(expr1, expr2)后,等于AVG(expr1) REGR_SXX 返回值等于REGR_COUNT(expr1, expr2) * VAR_POP(expr2) REGR_SYY 返回值等于REGR_COUNT(expr1, expr2) * VAR_POP(expr1) REGR_SXY: 返回值等于REGR_COUNT(expr1, expr2) * COVAR_POP(expr1, expr2)(下面的例子都是在SH用户下完成的)SAMPLE 1:下例计算1998年最后三个星期中两种产品(260270)在周末的销售量中已开发票数量和总数量的累积斜率和回归线的截距SELECT t.fiscal_month_number "Month", t.day_number_in_month "Day", REGR_SLOPE(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_SLOPE, REGR_INTERCEPT(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_ICPT FROM sales s, times tWHERE s.time_id = t.time_id AND s.prod_id IN (270, 260) AND t.fiscal_year=1998 AND t.fiscal_week_number IN (50, 51, 52) AND t.day_number_in_week IN (6,7) ORDER BY t.fiscal_month_desc, t.day_number_in_month; Month Day CUM_SLOPE CUM_ICPT---------- ---------- ---------- ---------- 12 12 -68 1872 12 12 -68 1872 12 13 -20.244898 1254.36735 12 13 -20.244898 1254.36735 12 19 -18.826087 1287 12 20 62.4561404 125.28655 12 20 62.4561404 125.28655 12 20 62.4561404 125.28655 12 20 62.4561404 125.28655 12 26 67.2658228 58.9712313 12 26 67.2658228 58.9712313 12 27 37.5245541 284.958221 12 27 37.5245541 284.958221 12 27 37.5245541 284.958221SAMPLE 2:下例计算19984月每天的累积交易数量SELECT UNIQUE t.day_number_in_month, REGR_COUNT(s.amount_sold, s.quantity_sold) OVER (PARTITION BY t.fiscal_month_number ORDER BY t.day_number_in_month) "Regr_Count"FROM sales s, times tWHERE s.time_id = t.time_id AND t.fiscal_year = 1998 AND t.fiscal_month_number = 4;DAY_NUMBER_IN_MONTH Regr_Count------------------- ---------- 1 825 2 1650 3 2475 4 3300... 26 21450 30 22200SAMPLE 3:下例计算1998年每月销售量中已开发票数量和总数量的累积回归线决定系数SELECT t.fiscal_month_number, REGR_R2(SUM(s.amount_sold), SUM(s.quantity_sold)) OVER (ORDER BY t.fiscal_month_number) "Regr_R2" FROM sales s, times t WHERE s.time_id = t.time_id AND t.fiscal_year = 1998 GROUP BY t.fiscal_month_number ORDER BY t.fiscal_month_number;FISCAL_MONTH_NUMBER Regr_R2------------------- ---------- 1 2 1 3 .927372984 4 .807019972 5 .932745567 6 .94682861 7 .965342011 8 .955768075 9 .959542618 10 .938618575 11 .880931415 12 .882769189SAMPLE 4:下例计算199812月最后两周产品260的销售量中已开发票数量和总数量的累积平均值SELECT t.day_number_in_month, REGR_AVGY(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgY", REGR_AVGX(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgX" FROM sales s, times t WHERE s.time_id = t.time_id AND s.prod_id = 260 AND t.fiscal_month_desc = @#1998-12@# AND t.fiscal_week_number IN (51, 52) ORDER BY t.day_number_in_month;DAY_NUMBER_IN_MONTH Regr_AvgY Regr_AvgX------------------- ---------- ---------- 14 882 24.5 14 882 24.5 15 801 22.25 15 801 22.25 16 777.6 21.6 18 642.857143 17.8571429 18 642.857143 17.8571429 20 589.5 16.375 21 544 15.1111111 22 592.363636 16.4545455 22 592.363636 16.4545455 24 553.846154 15.3846154 24 553.846154 15.3846154 26 522 14.5 27 578.4 16.0666667SAMPLE 5:下例计算产品26027019982月周末销售量中已开发票数量和总数量的累积REGR_SXY, REGR_SXX, and REGR_SYY统计值SELECT t.day_number_in_month, REGR_SXY(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxy", REGR_SYY(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_syy", REGR_SXX(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxx"FROM sales s, times tWHERE s.time_id = t.time_id AND prod_id IN (270, 260) AND t.fiscal_month_desc = @#1998-02@# AND t.day_number_in_week IN (6,7)ORDER BY t.day_number_in_month;DAY_NUMBER_IN_MONTH Regr_sxy Regr_syy Regr_sxx------------------- ---------- ---------- ---------- 1 18870.4 2116198.4 258.4 1 18870.4 2116198.4 258.4 1 18870.4 2116198.4 258.4 1 18870.4 2116198.4 258.4 7 18870.4 2116198.4 258.4 8 18870.4 2116198.4 258.4 14 18870.4 2116198.4 258.4 15 18870.4 2116198.4 258.4 21 18870.4 2116198.4 258.4 22 18870.4 2116198.4 258.4

ROW_NUMBER 功能描述:返回有序组中一行的偏移量,从而可用于按特定标准排序的行号SAMPLE:下例返回每个员工再在每个部门中按员工号排序后的顺序号

SELECT department_id, last_name, employee_id, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY employee_id) AS emp_id FROM employeesWHERE department_id < 50;DEPARTMENT_ID LAST_NAME EMPLOYEE_ID EMP_ID------------- ------------------------- ----------- ---------- 10 Whalen 200 1 20 Hartstein 201 1 20 Fay 202 2 30 Raphaely 114 1 30 Khoo 115 2 30 Baida 116 3 30 Tobias 117 4 30 Himuro 118 5 30 Colmenares 119 6 40 Mavris 203 1

STDDEV 功能描述:计算当前行关于组的标准偏离。(Standard Deviation

SAMPLE:下例返回部门30按雇佣日期排序的薪水值的累积标准偏离SELECT last_name, hire_date,salary, STDDEV(salary) OVER (ORDER BY hire_date) "StdDev" FROM employees WHERE department_id = 30;LAST_NAME HIRE_DATE SALARY StdDev------------------------- ---------- ---------- ----------Raphaely 07-12-94 11000 0Khoo 18-5 -95 3100 5586.14357Tobias 24-7 -97 2800 4650.0896Baida 24-12-97 2900 4035.26125Himuro 15-11-98 2600 3649.2465Colmenares 10-8 -99 2500 3362.58829S

TDDEV_POP 功能描述:该函数计算总体标准偏离,并返回总体变量的平方根,其返回值与VAR_POP函数的平方根相同。(Standard DeviationPopulationSAMPLE:下例返回部门203060的薪水值的总体标准偏差SELECT department_id, last_name, salary, STDDEV_POP(salary) OVER (PARTITION BY department_id) AS pop_std FROM employeesWHERE department_id in (20,30,60);DEPARTMENT_ID LAST_NAME SALARY POP_STD------------- ------------------------- ---------- ---------- 20 Hartstein 13000 3500 20 Fay 6000 3500 30 Raphaely 11000 3069.6091 30 Khoo 3100 3069.6091 30 Baida 2900 3069.6091 30 Colmenares 2500 3069.6091 30 Himuro 2600 3069.6091 30 Tobias 2800 3069.6091 60 Hunold 9000 1722.32401 60 Ernst 6000 1722.32401 60 Austin 4800 1722.32401 60 Pataballa 4800 1722.32401 60 Lorentz 4200 1722.32401

STDDEV_SAMP 功能描述: 该函数计算累积样本标准偏离,并返回总体变量的平方根,其返回值与VAR_POP函数的平方根相同。(Standard DeviationSample

SAMPLE:下例返回部门203060的薪水值的样本标准偏差SELECT department_id, last_name, hire_date, salary, STDDEV_SAMP(salary) OVER (PARTITION BY department_id ORDER BY hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_sdev FROM employeesWHERE department_id in (20,30,60);DEPARTMENT_ID LAST_NAME HIRE_DATE SALARY CUM_SDEV------------- ------------------------- ---------- ---------- ---------- 20 Hartstein 17-2 -96 13000 20 Fay 17-8 -97 6000 4949.74747 30 Raphaely 07-12-94 11000 30 Khoo 18-5 -95 3100 5586.14357 30 Tobias 24-7 -97 2800 4650.0896 30 Baida 24-12-97 2900 4035.26125 30 Himuro 15-11-98 2600 3649.2465 30 Colmenares 10-8 -99 2500 3362.58829 60 Hunold 03-1 -90 9000 60 Ernst 21-5 -91 6000 2121.32034 60 Austin 25-6 -97 4800 2163.33077 60 Pataballa 05-2 -98 4800 1982.42276 60 Lorentz 07-2 -99 4200 1925.61678

SUM 功能描述:该函数计算组中表达式的累积和

SAMPLE:下例计算同一经理下员工的薪水累积值SELECT manager_id, last_name, salary, SUM (salary) OVER (PARTITION BY manager_id ORDER BY salary RANGE UNBOUNDED PRECEDING) l_csum FROM employees WHERE manager_id in (101,103,108);MANAGER_ID LAST_NAME SALARY L_CSUM---------- ------------------------- ---------- ---------- 101 Whalen 4400 4400 101 Mavris 6500 10900 101 Baer 10000 20900 101 Greenberg 12000 44900 101 Higgins 12000 44900 103 Lorentz 4200 4200 103 Austin 4800 13800 103 Pataballa 4800 13800 103 Ernst 6000 19800 108 Popp 6900 6900 108 Sciarra 7700 14600 108 Urman 7800 22400 108 Chen 8200 30600 108 Faviet 9000 39600

VAR_POP功能描述:Variance Population)该函数返回非空集合的总体变量(忽略null),VAR_POP进行如下计算: (SUM(expr2) - SUM(expr)2 / COUNT(expr)) / COUNT(expr)

SAMPLE:下例计算1998年每月销售的累积总体和样本变量(本例在SH用户下运行)SELECT t.calendar_month_desc, VAR_POP(SUM(s.amount_sold)) OVER (ORDER BY t.calendar_month_desc) "Var_Pop", VAR_SAMP(SUM(s.amount_sold)) OVER (ORDER BY t.calendar_month_desc) "Var_Samp" FROM sales s, times tWHERE s.time_id = t.time_id AND t.calendar_year = 1998GROUP BY t.calendar_month_desc;CALENDAR Var_Pop Var_Samp-------- ---------- ----------1998-01 01998-02 6.1321E+11 1.2264E+121998-03 4.7058E+11 7.0587E+111998-04 4.6929E+11 6.2572E+111998-05 1.5524E+12 1.9405E+121998-06 2.3711E+12 2.8453E+121998-07 3.7464E+12 4.3708E+121998-08 3.7852E+12 4.3260E+121998-09 3.5753E+12 4.0222E+121998-10 3.4343E+12 3.8159E+121998-11 3.4245E+12 3.7669E+121998-12 4.8937E+12 5.3386E+12

VAR_SAMP 功能描述:Variance Sample)该函数返回非空集合的样本变量(忽略null),VAR_POP进行如下计算: (SUM(expr*expr)-SUM(expr)*SUM(expr)/COUNT(expr))/(COUNT(expr)-1)SAMPLE:下例计算1998年每月销售的累积总体和样本变量SELECT t.calendar_month_desc, VAR_POP(SUM(s.amount_sold)) OVER (ORDER BY t.calendar_month_desc) "Var_Pop", VAR_SAMP(SUM(s.amount_sold)) OVER (ORDER BY t.calendar_month_desc) "Var_Samp" FROM sales s, times tWHERE s.time_id = t.time_id AND t.calendar_year = 1998GROUP BY t.calendar_month_desc;CALENDAR Var_Pop Var_Samp-------- ---------- ----------1998-01 01998-02 6.1321E+11 1.2264E+121998-03 4.7058E+11 7.0587E+111998-04 4.6929E+11 6.2572E+111998-05 1.5524E+12 1.9405E+121998-06 2.3711E+12 2.8453E+121998-07 3.7464E+12 4.3708E+121998-08 3.7852E+12 4.3260E+121998-09 3.5753E+12 4.0222E+121998-10 3.4343E+12 3.8159E+121998-11 3.4245E+12 3.7669E+121998-12 4.8937E+12 5.3386E+12

VARIANCE 功能描述:该函数返回表达式的变量Oracle计算该变量如下: 如果表达式中行数为1,则返回0 如果表达式中行数大于1,则返回VAR_SAMP

SAMPLE:下例返回部门30按雇佣日期排序的薪水值的累积变化SELECT last_name, salary, VARIANCE(salary) OVER (ORDER BY hire_date) "Variance" FROM employees WHERE department_id = 30;LAST_NAME SALARY Variance------------------------- ---------- ----------Raphaely 11000 0Khoo 3100 31205000Tobias 2800 21623333.3Baida 2900 16283333.3Himuro 2600 13317000Colmenares 2500 11307000


来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/756652/viewspace-242343/,如需转载,请注明出处,否则将追究法律责任。

下一篇: ORACLE回滚段(转)
请登录后发表评论 登录
全部评论

注册时间:2006-02-22

  • 博文量
    458
  • 访问量
    343471