线性回归的具体算法说明如下:
1. 用所给样本求出两个相关变量的(算术)平均值:X_=(x1+x2+x3+...+xn)/n;Y_=(y1+y2+y3+...+yn)/n
2. 分别计算分子和分母,分子=(x1y1+x2y2+x3y3+...+xnyn)-nX_Y_;分母=(x1^2+x2^2+x3^2+...+xn^2)-n*X_^2
3. 计算b = 分子/分母
具体oracle实现过程如下
1 CREATE OR REPLACE PROCEDURE SP_BILINEAR_REGRESSION(NEW_X IN VARCHAR2, 2 NEW_Y IN VARCHAR2, 3 TABLENAME IN VARCHAR2, 4 COLNAME_Y1 IN VARCHAR2, 5 COLNAME_Y2 IN VARCHAR2, 6 COLNAME_X1 IN VARCHAR2, 7 COLNAME_X2 IN VARCHAR2) 8 IS V_AVER_X NUMBER(23,9); 9 V_AVER_Y NUMBER(23,9); 10 CNT PLS_INTEGER; 11 B_FORECAST NUMBER(23,9); 12 A_FORECAST NUMBER(23,9); 13 V_TABLENAME VARCHAR2(50); 14 V_SQL VARCHAR2(2000); 15 BEGIN 16 V_TABLENAME := TABLENAME; 17 V_SQL := 'SELECT COUNT(1) FROM '||V_TABLENAME; 18 EXECUTE IMMEDIATE V_SQL INTO CNT; 19 IF CNT <> 0 THEN --ADD COLUMN TYPE FILTER 20 V_SQL := 'SELECT 1/2*AVG('||COLNAME_X1||'+'||COLNAME_X2||') FROM '||V_TABLENAME;EXECUTE IMMEDIATE V_SQL INTO V_AVER_X; 21 V_SQL := 'SELECT 1/2*AVG('||COLNAME_Y1||'+'||COLNAME_Y2||') FROM '||V_TABLENAME;EXECUTE IMMEDIATE V_SQL INTO V_AVER_Y; 22 V_SQL := 'SELECT (SUM('||COLNAME_X1||' * '||COLNAME_Y1||'+'||COLNAME_X2||' * '||COLNAME_Y2||') - '||CNT||' * 2 * '||V_AVER_X||' * '||V_AVER_Y||')/ 23 (SUM('||COLNAME_X1||' * '||COLNAME_X1||'+'||COLNAME_X2||' * '||COLNAME_X2||') - '||CNT||' * 2 * '||V_AVER_X||' * '||V_AVER_X||') 24 FROM '||V_TABLENAME; 25 EXECUTE IMMEDIATE V_SQL INTO B_FORECAST; 26 V_SQL := 'SELECT '||V_AVER_Y||' - '||B_FORECAST||' * '||V_AVER_X||' FROM DUAL'; 27 EXECUTE IMMEDIATE V_SQL INTO A_FORECAST; 28 END IF; 29 V_SQL := 'UPDATE '||V_TABLENAME||' T SET T.'||NEW_Y||' = T.'||NEW_X||' * '||B_FORECAST||' + '||A_FORECAST; 30 EXECUTE IMMEDIATE V_SQL; 31 COMMIT; 32 END SP_BILINEAR_REGRESSION;
调用如下:
1 CALL SP_BILINEAR_REGRESSION(NEW_X => 'MATURITYDAYS', 2 NEW_Y => 'DISCOUNTFACTOR', 3 TABLENAME => 'TMP_DATA_COLLECT_IRS', 4 COLNAME_Y1 => 'DISCOUNT1', 5 COLNAME_Y2 => 'DISCOUNT2', 6 COLNAME_X1 => 'DAYS1', 7 COLNAME_X2 => 'DAYS2');
其中,第一个参数是线性函数的x值所在字段,第二个是基于x和线性方程计算出来的y值所需要更新的字段,另外四个是传入表和字段名称。