1. data.groupby()  # group-by operation
2. pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')  # pivot-table operation
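
A minimal sketch of the two calls above on a small made-up table. The column names (city, year, sales) and all values are assumptions for illustration only, not part of the original notes.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
data = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 20, 30, 50],
})

# Group rows by city and sum the sales within each group.
total_by_city = data.groupby("city")["sales"].sum()

# Pivot: one row per city, one column per year,
# and the mean of sales in each cell.
table = pd.pivot_table(data, values="sales", index="city",
                       columns="year", aggfunc="mean")

print(total_by_city)
print(table)
```

groupby collapses each group to one value, while pivot_table spreads a second key (here year) across the columns.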
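1. Series: one-dimensional data. DataFrame: data in several dimensions (a two-dimensional table of rows and columns).
A Series holds its values in one array; each value is associated with a label, and the labels are contained in a second array, called the index.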
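2. Create a Series: s = pd.Series([12, -4, 7, 9], index=['a', 'b', 'c', 'd']). If index is omitted, it defaults to integers starting from 0.
View the parts individually: s.values (the values), s.index (the index).
Select individual elements: s[2] or s['b']. s.iloc[2] is the explicit positional form; recent pandas versions deprecate using s[2] as a position on a labeled index.
Select multiple items: s[0:2] or s[['b', 'c']]. Use single brackets to slice one contiguous block, double brackets to pick several separate items.
Assign values: s[1] = 0 or s['b'] = 0.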
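Define a Series from a NumPy array: s = pd.Series(np.array([1, 2, 3, 4])). The values are not copied but passed by reference, so later changes to the array also show up in the new Series object. This may differ in pandas versions that copy the input by default.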
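Filter values: s[s > 8] finds the elements of s greater than 8 (returns a new Series).
Mathematical functions: s / 2, np.log(s). NumPy functions can be applied to a Series.
Evaluate values: s.unique() returns each distinct element of s (duplicates appear only once), as an array.
s.value_counts() returns each distinct element together with the number of times it occurs.
s.isin([0, 3]) tests whether each element of s is 0 or 3 and returns Boolean values.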
Define a missing value: np.nan (NaN, Not a Number).
Identify the index entries without a value: s.isnull() or s.notnull(), both return Boolean values. s[s.notnull()] returns all values that are not NaN.
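
The Series operations in item 2 can be sketched together as follows. The extra Series t and u and their values are made up for illustration. The sketch uses .iloc for positions rather than s[2], which is a choice made here to stay clear of the label/position ambiguity.

```python
import numpy as np
import pandas as pd

s = pd.Series([12, -4, 7, 9], index=["a", "b", "c", "d"])

# Values and index are stored separately.
values = s.values           # NumPy array holding the data
labels = list(s.index)      # the labels 'a' to 'd'

# Selection by label and by explicit position.
by_label = s["b"]
by_position = s.iloc[2]
block = s.iloc[0:2]         # one contiguous slice
picked = s[["b", "c"]]      # several separate labels

# Filtering returns a new Series; s itself is unchanged.
large = s[s > 8]

# Element-wise arithmetic and NumPy functions.
halved = s / 2
logs = np.log(s[s > 0])     # log only of the positive entries

# Distinct values, counts and membership (hypothetical data).
t = pd.Series([1, 0, 2, 1, 2, 3])
distinct = t.unique()
counts = t.value_counts()
mask = t.isin([0, 3])

# Missing values (hypothetical data).
u = pd.Series([5, np.nan, 6])
present = u[u.notnull()]
```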
3. An alternative way to think of a Series is as a dict (dictionary) object.
mydict = {'red': 2000, 'blue': 1000, 'yellow': 500,'orange': 1000}
4. pd.Series(data, index=[ ])
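
A short sketch of items 3 and 4, building a Series from mydict. The names myseries, colors and reordered, and the extra label 'green', are assumptions added for illustration.

```python
import pandas as pd

mydict = {"red": 2000, "blue": 1000, "yellow": 500, "orange": 1000}

# The dict keys become the index labels, the dict values become the data.
myseries = pd.Series(mydict)

# With an explicit index, values are looked up by key in the given order.
# 'green' is a hypothetical label that is not in mydict, so it gets NaN.
colors = ["red", "yellow", "orange", "blue", "green"]
reordered = pd.Series(mydict, index=colors)

print(myseries)
print(reordered)
```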
5. SeriesA + SeriesB: items with the same label are added. Labels present in only one of the two Series still appear in the result, but with a NaN value.
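
A small sketch of this label alignment, with made-up values. The variable names and the fill_value variant at the end are additions for illustration, not part of the original notes.

```python
import pandas as pd

# Hypothetical data: 'red' and 'yellow' are shared, the rest are not.
series_a = pd.Series({"red": 2000, "blue": 1000, "yellow": 500})
series_b = pd.Series({"red": 400, "yellow": 1000, "black": 700})

# Shared labels are summed; labels found in only one Series give NaN.
total = series_a + series_b

# If NaN is not wanted, Series.add can treat a missing side as 0.
filled = series_a.add(series_b, fill_value=0)

print(total)
print(filled)
```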