The pandas module

1. Importing

import pandas as pd
import numpy as np   # the examples below also use numpy

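2. Purpose

pandas is used for processing data files, most often Excel files. It is built on top of numpy and wraps reader modules such as xlrd for Excel input.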

3. pandas data types

3.1 Series()

One-dimensional. It is not commonly used on its own these days.

s = pd.Series(np.array([1, 2, 3, 4]))   # the class name is Series, with a capital S
print(s)
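As a minimal sketch of what a Series holds, the snippet below shows the values, the default integer index, and a lookup by label. The labels "a" to "d" are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# A Series pairs a one-dimensional array of values with an index.
s = pd.Series(np.array([1, 2, 3, 4]))
print(s.values)        # the underlying numpy array
print(list(s.index))   # default integer index starting at 0

# Hypothetical labels, used only for this illustration.
labeled = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
print(labeled["c"])    # look a value up by its label
```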

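3.2 DataFrame() (two-dimensional)

3.2.1 date_range

dates = pd.date_range('20190101', periods=6, freq='M')
print(dates)    # periods=6, freq='M' yields six month-end dates, covering the first six months of 2019 (pandas 2.2 and later prefer the spelling 'ME')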

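Parameter	Description
start	start time
end	end time
periods	number of periods to generate
freq	time frequency; defaults to 'D' (day). Other options include H (hour), W (week), B (business day), SM (semi-month), T (minute), S (second), A (year), …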
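The parameters above can be combined in several ways. The following is a small sketch, assuming daily and weekly frequencies so that it behaves the same across pandas versions; the variable names are illustrative only.

```python
import pandas as pd

# start + periods: six consecutive days (freq defaults to 'D')
by_periods = pd.date_range(start="2019-01-01", periods=6)

# start + end: every day in the closed range
by_end = pd.date_range(start="2019-01-01", end="2019-01-06")

# a non-default frequency: weekly timestamps, anchored on Sunday by default
weekly = pd.date_range(start="2019-01-01", periods=3, freq="W")

print(by_periods)
print(by_end.equals(by_periods))   # both spellings describe the same dates
print(weekly)
```

Passing start with either end or periods is usually enough; freq only needs to be given when the default daily step is not wanted.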

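3.2.2 Attributes and methods

Attribute	Description
dtypes	data type of each column
index	row labels (the index)
columns	column labels
values	the data inside the frame, without the header or index
describe	summary statistics for each column (min and max, mean, median, and so on); numeric data only
transpose	transpose the frame; T is a shorthand
sort_index	sort by row or column labels
sort_values	sort by the data values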
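A short sketch that exercises each entry of the table above. The 3x2 frame, its row labels, and its column names are hypothetical and exist only to make the output easy to read.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 table used only for this illustration.
df = pd.DataFrame(
    np.array([[3.0, 1.0], [1.0, 2.0], [2.0, 3.0]]),
    index=["r2", "r0", "r1"],
    columns=["b", "a"],
)

print(df.dtypes)                  # data type of each column
print(df.index)                   # row labels
print(df.columns)                 # column labels
print(df.values)                  # raw numpy array, no labels
print(df.describe())              # count, mean, std, min, quartiles, max
print(df.T)                       # transpose, same as df.transpose()
print(df.sort_index(axis=0))      # sort by row labels
print(df.sort_index(axis=1))      # sort by column labels
print(df.sort_values(by="b"))     # sort rows by the values in column "b"
```

Note that dtypes, index, columns, values, and T are attributes, while describe, transpose, sort_index, and sort_values are methods and need parentheses.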

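3.2.3 Accessing values

# build a DataFrame
dates = pd.date_range('20190101', periods=6, freq='M')
print(dates)
values = np.random.rand(6, 4) * 10

print(values)
columns = ['c4','c2','c3','c1']
df = pd.DataFrame(values, index=dates, columns=columns)

# the essentials
df.values[1,1]   # read the cell at row index 1, column index 1 (the second row and second column, since indexing starts at 0)
df.iloc[1,1] = 1  # set that same cell to 1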
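To make the positional access concrete, here is a sketch that uses fixed values in place of random ones so the result is predictable. The daily index and the use of loc for label-based access are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2019-01-01", periods=6, freq="D")   # daily index, an assumption
values = np.arange(24, dtype=float).reshape(6, 4)           # fixed values instead of random ones
df = pd.DataFrame(values, index=dates, columns=["c4", "c2", "c3", "c1"])

before = df.iloc[1, 1]                               # by position: second row, second column
same_cell = df.loc[pd.Timestamp("2019-01-02"), "c2"]  # by label: the same cell
print(before, same_cell)

df.iloc[1, 1] = 1                                    # assign through iloc to modify the frame
print(df.iloc[1, 1])
```

Assigning through iloc (or loc) writes into the frame itself, which is generally safer than writing into df.values.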

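3.2.4 Working with tables

1. Handling missing values

df = df.dropna(axis=0)    # drop every row that contains a missing value
df

df = df.dropna(thresh=4)   # keep rows with at least 4 non-missing values; 5 would drop every row, since there are only 4 columns

df = df.dropna(axis=0)  # axis=1 means columns, axis=0 means rows
df  # rows containing missing values have been dropped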
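The effect of axis and thresh is easier to see on a small table with known gaps. The sketch below uses a hypothetical 3x4 frame; the column names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x4 table with a few missing cells.
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],
     [1.0, np.nan, 3.0, 4.0],
     [np.nan, np.nan, np.nan, 4.0]],
    columns=["c1", "c2", "c3", "c4"],
)

rows_complete = df.dropna(axis=0)   # keep only rows with no NaN
cols_complete = df.dropna(axis=1)   # keep only columns with no NaN
at_least_3 = df.dropna(thresh=3)    # keep rows with at least 3 non-NaN values
none_left = df.dropna(thresh=5)     # impossible with 4 columns, so every row is dropped

print(rows_complete)
print(cols_complete)
print(at_least_3)
print(len(none_left))
```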

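2. Combining data

df1 = pd.DataFrame(np.zeros((2,3)))  # a 2x3 frame filled with zeros
df2 = pd.DataFrame(np.ones((2,3)))  # a 2x3 frame filled with ones
pd.concat((df1,df2))  # default axis=0: stack the rows of df2 below df1
pd.concat((df1,df2),axis=1)    # axis=1 places the frames side by side (more columns); axis=0 stacks them (more rows)
pd.concat((df1,df2), ignore_index=True)   # append df2 after df1 with a fresh index; replaces df1.append(df2), which was removed in pandas 2.0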
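Checking the resulting shapes is a quick way to confirm which direction concat works in. A minimal sketch follows; the ignore_index option is an addition for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.zeros((2, 3)))
df2 = pd.DataFrame(np.ones((2, 3)))

stacked = pd.concat((df1, df2))                          # axis=0 (default): 4 rows x 3 columns
side_by_side = pd.concat((df1, df2), axis=1)             # axis=1: 2 rows x 6 columns
renumbered = pd.concat((df1, df2), ignore_index=True)    # rows relabeled from 0

print(stacked.shape, side_by_side.shape)
print(list(stacked.index))       # original row labels are kept, so they repeat
print(list(renumbered.index))
```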

Importing data and reading JSON files: beginners only need a passing familiarity with these.
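For a first look at JSON input, the sketch below reads a small JSON string through an in-memory buffer. The payload and its field names are hypothetical; with a real file, a path would be passed in place of the buffer.

```python
import io
import pandas as pd

# Hypothetical JSON payload standing in for the contents of a file.
json_text = '[{"name": "a", "score": 1}, {"name": "b", "score": 2}]'

df = pd.read_json(io.StringIO(json_text))   # each JSON object becomes one row
print(df)

round_trip = df.to_json(orient="records")   # serialize the frame back to a JSON string
print(round_trip)
```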

Original article: https://www.cnblogs.com/yanjiayi098-001/p/11378066.html