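> ######### Principal component analysis and factor analysis
> setwd("/Users/yaozhilin/Downloads/R_edu/data")
> pt<-read.csv("profile_telecom.csv")
> head(pt,5)
       ID cnt_call cnt_msg cnt_wei cnt_web
1 1964627       46      90      36      31
2 3107769       53       2       0       2
3 3686296       28      24       5       8
4 3961002        9       2       0       4
5 4174839      145       2       0       1
> library(psych)
> # Use fa.parallel() to determine the number of principal components.
> # The ID column is dropped (as in the principal() call) so that only the four count variables enter the analysis.
> fa.parallel(pt[,-1],fa="pc",n.iter = 100)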
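The idea behind the parallel analysis step can be sketched in base R: compare the eigenvalues of the observed correlation matrix with the average eigenvalues obtained from uncorrelated random data of the same shape, and keep the leading components that exceed that reference. This is a simplified illustration, not the psych implementation; fa.parallel() also resamples the actual data and draws a scree plot. The helper parallel_pc() and the demo data are hypothetical and made up for this example.

```r
# Simplified parallel analysis for principal components (base R only).
# parallel_pc() is a hypothetical helper, not part of psych.
parallel_pc <- function(x, n_iter = 100) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  # Eigenvalues of the observed correlation matrix
  observed <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values
  # Eigenvalues from uncorrelated normal data of the same shape (p x n_iter matrix)
  simulated <- replicate(n_iter, {
    noise <- matrix(rnorm(n * p), nrow = n, ncol = p)
    eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
  })
  reference <- rowMeans(simulated)
  # Keep leading components up to the first one that fails to beat the reference
  above <- observed > reference
  n_keep <- if (all(above)) p else which(!above)[1] - 1
  list(observed = observed, reference = reference, n_components = n_keep)
}

# Synthetic demo data (assumption, not the telecom file): two correlated pairs of columns
set.seed(1)
f1 <- rnorm(200)
f2 <- rnorm(200)
demo <- cbind(a = f1 + rnorm(200, sd = 0.3),
              b = f1 + rnorm(200, sd = 0.3),
              c = f2 + rnorm(200, sd = 0.3),
              d = f2 + rnorm(200, sd = 0.3))
res <- parallel_pc(demo, n_iter = 50)
print(res$n_components)
```

On the telecom data the same helper could be called as parallel_pc(pt[,-1]), assuming pt has been loaded as above.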
> # Run the principal component analysis with principal(): nfactors = sets the number of components, rotate = "varimax" sets the rotation method
> Pc_pt<-principal(pt[,-1],nfactors = 3,rotate = "varimax",scores = T)
> Pc_pt
Principal Components Analysis
Call: principal(r = pt[, -1], nfactors = 3, rotate = "varimax", scores = T)
Standardized loadings (pattern matrix) based upon correlation matrix
          RC1  RC3  RC2 h2      u2 com
cnt_call 0.06 0.02 1.00  1 6.5e-08 1.0
cnt_msg  0.33 0.94 0.02  1 3.7e-04 1.2
cnt_wei  0.98 0.19 0.06  1 1.9e-03 1.1
cnt_web  0.88 0.48 0.06  1 3.1e-03 1.6

                       RC1  RC3  RC2
SS loadings           1.84 1.15 1.00
Proportion Var        0.46 0.29 0.25
Cumulative Var        0.46 0.75 1.00
Proportion Explained  0.46 0.29 0.25
Cumulative Proportion 0.46 0.75 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is 0
 with the empirical chi square 0.01 with prob < NA

Fit based upon off diagonal values = 1
> ptpc<-cbind(pt,Pc_pt$scores)
> head(ptpc)
       ID cnt_call cnt_msg cnt_wei cnt_web        RC1        RC3        RC2
1 1964627       46      90      36      31  0.1952344  3.8712835 -0.3726676
2 3107769       53       2       0       2 -0.4219981 -0.6793516 -0.1552081
3 3686296       28      24       5       8 -0.4194772  0.5202526 -0.5541321
4 3961002        9       2       0       4 -0.2943034 -0.6714705 -0.8283602
5 4174839      145       2       0       1 -0.5535192 -0.6802487  1.2451860
6 5068087      186       4       3       1 -0.5413228 -0.6159420  1.8639601
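The summary rows of the printed output can be checked by hand from the loadings table. The sketch below, in base R, recomputes "SS loadings" as the column sums of squared loadings, "Proportion Var" as those sums divided by the number of variables, and the communality h2 as the row sums of squared loadings. The loadings are copied from the printed output, so they are rounded to two decimals and the results should match the printed rows only up to rounding.

```r
# Rotated loadings copied from the printed principal() output (rounded to 2 decimals)
loadings <- matrix(
  c(0.06, 0.02, 1.00,
    0.33, 0.94, 0.02,
    0.98, 0.19, 0.06,
    0.88, 0.48, 0.06),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("cnt_call", "cnt_msg", "cnt_wei", "cnt_web"),
                  c("RC1", "RC3", "RC2"))
)

# Variance carried by each rotated component: column sums of squared loadings
ss_loadings <- colSums(loadings^2)
# Share of the total variance (4 standardized variables, so total variance is 4)
prop_var <- ss_loadings / nrow(loadings)
cum_var <- cumsum(prop_var)
# Communality of each variable: row sums of squared loadings
h2 <- rowSums(loadings^2)

print(round(rbind(ss_loadings, prop_var, cum_var), 2))
print(round(h2, 2))
```

Because every communality is close to 1, the three rotated components reproduce almost all of the variance in the four count variables, which is consistent with the cumulative variance of 1.00 in the output.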