• Python Reference in Data Analysis / Mining Tools


    If you already know how Python loads modules and packages, the entries in the tables below should be easy to locate.

    Each entry in the tables below is given as a Python module path. Some of these modules are not part of the standard library; install them with pip install <package>.
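
    The dotted names in the tables can be read as import paths: everything before the last dot is the module, and the last component is the class or function. A minimal sketch of that reading, using only standard-library entries from the IO table (the helper name `resolve` is hypothetical, introduced here for illustration):

```python
import importlib


def resolve(dotted_name):
    """Turn a dotted path such as 'csv.writer' into the object it names."""
    module_name, _, attr = dotted_name.rpartition(".")
    if not module_name:
        # A bare name such as 'json' is the module itself.
        return importlib.import_module(dotted_name)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


writer_factory = resolve("csv.writer")  # same object as: from csv import writer
json_module = resolve("json")           # same object as: import json
```

    For third-party entries the pip package name can differ from the import name: `pip install scikit-learn` provides `sklearn`, and `pip install Pillow` provides `PIL`.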

    Machine Learning

    Category                      Subcategory                                    Python
    LDA                           -                                              sklearn.discriminant_analysis.LinearDiscriminantAnalysis
    QDA                           -                                              sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis
    SVM (Support Vector Machine)  Support Vector Classifier (SVC)                sklearn.svm.SVC
                                  Nu-Support Vector Classifier (NuSVC)           sklearn.svm.NuSVC
                                  Linear Support Vector Classifier (Linear SVC)  sklearn.svm.LinearSVC
    Nearest Neighbors             K-Nearest Neighbors Classifier                 sklearn.neighbors.KNeighborsClassifier
                                  Radius Neighbors Classifier                    sklearn.neighbors.RadiusNeighborsClassifier
                                  Nearest Centroid Classifier                    sklearn.neighbors.NearestCentroid
    Naive Bayes                   Gaussian Naive Bayes                           sklearn.naive_bayes.GaussianNB
                                  Multinomial Naive Bayes                        sklearn.naive_bayes.MultinomialNB
                                  Bernoulli Naive Bayes                          sklearn.naive_bayes.BernoulliNB
    Decision Tree                 Decision Tree Classifier                       sklearn.tree.DecisionTreeClassifier
                                  Decision Tree Regressor                        sklearn.tree.DecisionTreeRegressor
    Ensemble Method               Bagging: Random Forest Classifier              sklearn.ensemble.RandomForestClassifier
                                  Bagging: Random Forest Regressor               sklearn.ensemble.RandomForestRegressor
                                  Boosting: Gradient Boosting                    xgboost module
                                  Boosting: AdaBoost                             sklearn.ensemble.AdaBoostClassifier
    Clustering                    K-Means                                        scipy.cluster.vq.kmeans
                                  Hierarchical Clustering                        scipy.cluster.hierarchy.fcluster
                                  DBSCAN                                         sklearn.cluster.DBSCAN
                                  BIRCH                                          sklearn.cluster.Birch
                                  K-Medoids                                      pyclust.KMedoids (reliability unknown)
    Association Rule              Apriori Algorithm                              apriori (reliability unknown, no Python 3 support), PyFIM (reliability unknown, cannot be installed with pip)
                                  FP-Growth Algorithm                            fp-growth (reliability unknown, no Python 3 support), PyFIM (reliability unknown, cannot be installed with pip)
    Neural Network                Neural Network                                 neurolab.net, keras.*
                                  Deep Learning                                  keras.*
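
    The scikit-learn classifiers listed above share one estimator interface (`fit`, `predict`, `score`), so trying a different model is usually a one-line change. A minimal sketch, assuming scikit-learn is installed; the iris dataset, the split ratio, and the `random_state` values are illustrative choices rather than part of the original table:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Illustrative data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Every estimator exposes the same fit/score methods.
models = {
    "LDA": LinearDiscriminantAnalysis(),
    "SVC": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```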
     


    Connector & IO

    Database

    Category    Python
    MySQL       mysql-connector-python (official)
    Oracle      cx_Oracle
    Redis       redis
    MongoDB     pymongo
    Neo4j       py2neo
    Cassandra   cassandra-driver
    ODBC        pyodbc
    JDBC        Unknown (Jython only)
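
    Several of the relational drivers above (mysql-connector-python, cx_Oracle, pyodbc) implement the Python DB-API 2.0 (PEP 249), so they share the connect / cursor / execute / fetch pattern. The sketch below uses the standard-library `sqlite3` module, which is not in the table, purely as a stand-in so that it runs without a database server; the table and column names are hypothetical.

```python
import sqlite3

# sqlite3 is a stand-in here: an in-memory database needs no server.
conn = sqlite3.connect(":memory:")
try:
    cur = conn.cursor()
    cur.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")
    # Parameterized insert; sqlite3 uses "?" as its placeholder.
    cur.executemany(
        "INSERT INTO measurements (sensor, value) VALUES (?, ?)",
        [("a", 1.5), ("a", 2.5), ("b", 4.0)],
    )
    conn.commit()

    cur.execute(
        "SELECT sensor, AVG(value) FROM measurements "
        "GROUP BY sensor ORDER BY sensor"
    )
    rows = cur.fetchall()  # list of tuples, one per result row
finally:
    conn.close()

print(rows)
```

    The placeholder style differs between drivers (`?` in sqlite3 and pyodbc, `%s` in mysql-connector-python), so queries need adjusting when switching. The Redis, MongoDB, Neo4j, and Cassandra clients each have their own API and do not follow this pattern.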

    IO

    Category    Python
    Excel       XlsxWriter, pandas.read_excel / pandas.DataFrame.to_excel, openpyxl
    CSV         csv.writer
    JSON        json
    Image       PIL (Pillow)
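
    The `csv` and `json` entries are part of the standard library and need no installation. A minimal round-trip sketch; the sample records are made up for illustration, and an in-memory buffer stands in for a real file:

```python
import csv
import io
import json

records = [
    {"name": "alice", "score": 90},
    {"name": "bob", "score": 85},
]

# csv.writer accepts any object with a write() method; StringIO keeps the
# example in memory. With a real file, open it with newline="".
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "score"])
for record in records:
    writer.writerow([record["name"], record["score"]])
csv_text = buffer.getvalue()

# Reading the CSV back: every field is returned as a string.
rows = list(csv.reader(io.StringIO(csv_text)))

# JSON round trip: dumps serializes, loads parses.
json_text = json.dumps(records)
restored = json.loads(json_text)
```

    Note that CSV does not preserve types (the scores come back as strings), while JSON restores the integers.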


    Statistics

    Category                           Python
    Descriptive statistics summary     scipy.stats.describe
    Mean                               scipy.stats.gmean (geometric mean), scipy.stats.hmean (harmonic mean), numpy.mean, numpy.nanmean, pandas.Series.mean
    Median                             numpy.median, numpy.nanmedian, pandas.Series.median
    Mode                               scipy.stats.mode, pandas.Series.mode
    Quantile                           numpy.percentile, numpy.nanpercentile, pandas.Series.quantile
    Empirical CDF (ECDF)               statsmodels.distributions.empirical_distribution.ECDF
    Standard deviation                 scipy.stats.tstd, numpy.std, numpy.nanstd, pandas.Series.std
    Variance                           numpy.var, pandas.Series.var
    Coefficient of variation           scipy.stats.variation
    Covariance                         numpy.cov, pandas.Series.cov
    (Pearson) correlation coefficient  scipy.stats.pearsonr, numpy.corrcoef, pandas.Series.corr
    Kurtosis                           scipy.stats.kurtosis, pandas.Series.kurt
    Skewness                           scipy.stats.skew, pandas.Series.skew
    Histogram                          numpy.histogram, numpy.histogram2d, numpy.histogramdd
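
    A short sketch of how a few of the NumPy and SciPy functions above are called, assuming both libraries are installed. The sample values are made up so the results are easy to check by hand:

```python
import numpy as np
from scipy import stats

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# One-call summary: nobs, minmax, mean, variance (ddof=1), skewness, kurtosis.
summary = stats.describe(data)

mean = np.mean(data)
median = np.median(data)
population_std = np.std(data)          # ddof=0 by default
sample_std = np.std(data, ddof=1)      # divide by n - 1 instead of n
quartiles = np.percentile(data, [25, 50, 75])
cv = stats.variation(data)             # population std divided by mean

# NaN-aware variants skip missing values instead of returning nan.
with_gap = np.array([1.0, np.nan, 3.0])
nan_mean = np.nanmean(with_gap)
```

    One detail worth keeping in mind when mixing libraries: `numpy.std` defaults to `ddof=0` (population), whereas `pandas.Series.std` defaults to `ddof=1` (sample), so the two give different numbers on the same data unless `ddof` is set explicitly.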

    Regression (including statistics and machine learning)

    Category                                    Python
    Ordinary least squares regression (OLS)     statsmodels.api.OLS, sklearn.linear_model.LinearRegression
    Generalized least squares regression (GLS)  statsmodels.api.GLS
    Quantile regression                         statsmodels.api.QuantReg
    Ridge regression                            sklearn.linear_model.Ridge
    LASSO                                       sklearn.linear_model.Lasso
    Least-angle regression                      sklearn.linear_model.LassoLars
    Robust regression                           statsmodels.api.RLM
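
    To illustrate how the scikit-learn regressors above differ in practice, the sketch below fits ordinary least squares, ridge, and LASSO to the same synthetic data. The data, the true coefficients, and the `alpha` values are assumptions chosen for illustration; scikit-learn and NumPy are assumed to be installed.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data: three features, the second one has no effect on y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([2.0, 0.0, -1.0])
y = X @ true_coef + 0.5 + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)   # no penalty
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty shrinks all coefficients
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty can zero out coefficients

for name, model in [("OLS", ols), ("Ridge", ridge), ("Lasso", lasso)]:
    print(name, np.round(model.coef_, 3), round(float(model.intercept_), 3))
```

    On data like this, OLS should recover coefficients close to the true ones, ridge shrinks them slightly, and LASSO tends to push the irrelevant coefficient to or near zero.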

    Hypothesis Testing

    Category                                  Python
    t-test                                    statsmodels.stats.weightstats.ttest_ind, statsmodels.stats.weightstats.ttost_ind, statsmodels.stats.weightstats.ttost_paired; scipy.stats.ttest_1samp, scipy.stats.ttest_ind, scipy.stats.ttest_ind_from_stats, scipy.stats.ttest_rel
    KS test (distribution test)               scipy.stats.kstest, scipy.stats.ks_2samp
    Wilcoxon (nonparametric difference test)  scipy.stats.wilcoxon, scipy.stats.mannwhitneyu
    Shapiro-Wilk normality test               scipy.stats.shapiro
    Pearson correlation coefficient test      scipy.stats.pearsonr
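
    The SciPy tests above all return a test statistic and a p-value. A minimal sketch, assuming SciPy and NumPy are installed; the two samples are simulated with a known shift of one standard deviation, which is an illustrative choice. SciPy's two-sample KS test is `scipy.stats.ks_2samp`.

```python
import numpy as np
from scipy import stats

# Two simulated groups whose means differ by one standard deviation.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=200)
group_b = rng.normal(loc=1.0, scale=1.0, size=200)

# Parametric comparison of means (independent samples).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Nonparametric alternative that does not assume normality.
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Two-sample Kolmogorov-Smirnov test: do both samples share a distribution?
ks_stat, ks_p = stats.ks_2samp(group_a, group_b)

# Shapiro-Wilk: is a single sample compatible with a normal distribution?
w_stat, w_p = stats.shapiro(group_a)

print(f"t-test p={t_p:.3g}, Mann-Whitney p={u_p:.3g}, KS p={ks_p:.3g}")
print(f"Shapiro-Wilk p={w_p:.3g}")
```

    `scipy.stats.wilcoxon` is the paired counterpart of `mannwhitneyu`: it expects two measurements on the same subjects, whereas `mannwhitneyu` compares independent groups.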

    Time series

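    Category  Python
    AR        statsmodels.tsa.ar_model.AutoReg (statsmodels.tsa.ar_model.AR in older releases)
    ARIMA     statsmodels.tsa.arima.model.ARIMA (statsmodels.tsa.arima_model.ARIMA in older releases)
    VAR       statsmodels.tsa.vector_ar.var_model.VAR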
  • Original article: https://www.cnblogs.com/aiden-liu/p/10773803.html