Let's start with some code:
In [46]: import pandas as pd
In [47]: data = [[1,2,3],[4,5,6]]
In [48]: index = [0,1]
In [49]: columns=['a','b','c']
In [50]: df = pd.DataFrame(data=data, index=index, columns=columns)
In [51]: df
Out[51]:
   a  b  c
0  1  2  3
1  4  5  6
1. loc: select rows by row label
1.1 loc[1] selects the row whose index label is 1 (the index is an integer)
In [52]: df.loc[1]
Out[52]:
a    4
b    5
c    6
Name: 1, dtype: int64
1.2 loc['d'] selects the row labeled 'd' (the index is a string)
In [53]: import pandas as pd
...: data = [[1,2,3],[4,5,6]]
...: index = ['d','e']
...: columns=['a','b','c']
...: df = pd.DataFrame(data=data, index=index, columns=columns)
...:
In [54]: df
Out[54]:
   a  b  c
d  1  2  3
e  4  5  6
In [55]: df.loc['d']
Out[55]:
a    1
b    2
c    3
Name: d, dtype: int64
1.3 Trying to select a column this way raises an error
In [56]: df.loc['a']
Traceback (most recent call last):
File "<ipython-input-56-5dbae926782f>", line 1, in <module>
df.loc['a']
File "E:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
return self._getitem_axis(key, axis=0)
...
KeyError: 'the label [a] is not in the [index]'
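The error arises because a single argument to loc is looked up among the row labels, and 'a' is a column label here. A minimal sketch of how one might check where a label lives and then select the column through loc follows; the names raised and col_a are illustrative, not from the original session.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

# 'a' is a column label, not a row label.
print('a' in df.index)     # membership test against the row labels
print('a' in df.columns)   # membership test against the column labels

# A bare df.loc['a'] therefore raises KeyError.
try:
    df.loc['a']
    raised = False
except KeyError:
    raised = True

# Put the column label in the second slot; ':' keeps every row.
col_a = df.loc[:, 'a']     # a Series indexed by the row labels
print(raised, col_a.tolist())
```

Passing a plain label ('a') returns a Series, while passing a list (['a']) returns a one-column DataFrame.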
1.4 loc can select multiple rows
In [57]: df.loc['d':]
Out[57]:
   a  b  c
d  1  2  3
e  4  5  6
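Besides an open-ended slice, loc also accepts a closed label slice or a list of labels. The sketch below reuses the same df; the variable names rows_slice and rows_list are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

# Closed label slice: both 'd' and 'e' are included.
rows_slice = df.loc['d':'e']

# List of labels: rows come back in the order given in the list.
rows_list = df.loc[['e', 'd']]

print(rows_slice)
print(rows_list)
```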
1.5 Extending loc: select specific columns of a row
In [58]: df.loc['d',['b','c']]
Out[58]:
b    2
c    3
Name: d, dtype: int64
1.6 Extending loc: select a column
In [59]: df.loc[:,['c']]
Out[59]:
   c
d  3
e  6
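Of course, the most direct way to get a column is df[column_label], for example df['c']; the loc form is useful when rows and columns need to be selected in the same expression. (If the column label is not known in advance, select by position with iloc, as in section 2.4.)
Note that label-based slices are inclusive at both ends: df.loc[1:3] returns the rows labeled 1, 2 and 3, unlike ordinary Python slicing (and iloc), where the stop is excluded.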
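To make the endpoint behavior concrete, here is a small sketch comparing a label slice with a position slice. The frame nums and its column x are made up for illustration and do not appear in the original session.

```python
import pandas as pd

# Hypothetical frame with integer row labels 0..4.
nums = pd.DataFrame({'x': [10, 20, 30, 40, 50]}, index=[0, 1, 2, 3, 4])

by_label = nums.loc[1:3]      # label slice: rows labeled 1, 2 and 3
by_position = nums.iloc[1:3]  # position slice: rows at positions 1 and 2

print(by_label['x'].tolist())
print(by_position['x'].tolist())
```

The same rule applies to string labels: on the df used above, df.loc['d':'e'] includes the row labeled 'e'.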
2. iloc: select rows by integer position
2.1 Pass the integer position of the row you want
Recall the df from before:
In [54]: df
Out[54]:
   a  b  c
d  1  2  3
e  4  5  6
Now call iloc:
In [60]: df.iloc[1]  # get the row at position 1 (the second row)
Out[60]:
a    4
b    5
c    6
Name: e, dtype: int64
In [61]: df.iloc[0]  # get the row at position 0 (the first row)
Out[61]:
a    1
b    2
c    3
Name: d, dtype: int64
2.2 Indexing with a label raises an error
In [62]: df.iloc['a']
Traceback (most recent call last):
File "<ipython-input-62-0c5fe4e92254>", line 1, in <module>
df.iloc['a']
File "E:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
return self._getitem_axis(key, axis=0)
...
TypeError: cannot do positional indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [a] of <class 'str'>
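iloc only understands integer positions, so a label has to be translated into a position first. One way to do that, sketched here under the assumption that the row labels are unique, is Index.get_loc; the names pos and row are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

# Translate the row label 'e' into its integer position
# (an int when the index has no duplicate labels).
pos = df.index.get_loc('e')

# Use that position with iloc.
row = df.iloc[pos]
print(pos, row.tolist())
```

When the label is already known, df.loc['e'] gives the same row directly; the translation step is mainly useful when later code needs the position itself.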
2.3 Likewise, row positions can select multiple rows
In [63]: df.iloc[0:]  # get the rows from position 0 onward
Out[63]:
   a  b  c
d  1  2  3
e  4  5  6
2.4 Selecting columns with iloc
In [64]: df.iloc[:,[0]]
Out[64]:
   a
d  1
e  4
In [65]: df.iloc[:,[1]]
Out[65]:
   b
d  2
e  5
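As with loc, iloc accepts a row selector and a column selector together. The following sketch shows a few combinations on the same df; the variable names cell, sub and last_col are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

# Single cell: row position 0, column position 1.
cell = df.iloc[0, 1]

# Position slices exclude the stop: row 0 only, columns 0 and 1.
sub = df.iloc[0:1, 0:2]

# Negative positions count from the end: the last column as a Series.
last_col = df.iloc[:, -1]

print(cell)
print(sub)
print(last_col.tolist())
```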
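3. ix: mixed indexing that combines the previous two (ix is no longer recommended: it was deprecated in pandas 0.20.0 and removed in pandas 1.0, so use loc or iloc instead)
3.1 Indexing by row position
Recall the df from before: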
In [54]: df
Out[54]:
   a  b  c
d  1  2  3
e  4  5  6
Now look at how .ix is used:
In [66]: df.ix[1]
__main__:1: DeprecationWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix
Out[66]:
a    4
b    5
c    6
Name: e, dtype: int64
3.2 Indexing by row label
In [67]: df.ix['e']
Out[67]:
a    4
b    5
c    6
Name: e, dtype: int64
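On pandas versions where .ix is no longer available, the two calls above map onto iloc and loc respectively. The sketch below also shows two possible ways to mix a row position with a column label, which is the case .ix used to cover; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

row_by_pos = df.iloc[1]      # replaces df.ix[1] (positional)
row_by_label = df.loc['e']   # replaces df.ix['e'] (label based)

# Mixed access: row by position, column by label.
# Option 1: turn the position into a label, then use loc.
mixed = df.loc[df.index[1], 'b']
# Option 2: turn the label into a position, then use iloc.
mixed_alt = df.iloc[1, df.columns.get_loc('b')]

print(row_by_pos.equals(row_by_label), mixed, mixed_alt)
```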
Reference: https://blog.csdn.net/roamer314/article/details/52179191