Python Data Analysis

Differences and connections between `loc`, `iloc`, and `ix`

`loc`

.loc is primarily label based, but may also be used with a boolean array.
In other words, `loc` selects data mainly by label.^[1]

A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label of the index. This use is not an integer position along the index)
A list or array of labels ['a', 'b', 'c']
A slice object with labels 'a':'f', (note that contrary to usual python slices, both the start and the stop are included!)
A boolean array

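The general form `df.loc[rows, columns]` still applies. Neither side has to be a slice, but the comma in the middle matters. If you leave out the comma, the selector is read as a row selection; it cannot be used to select a column that way.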

import pandas as pd

import numpy as np

test=pd.DataFrame(np.random.randn(20).reshape(4,5),index=['A','B','C','D'],columns=['E','F','G','H','I'])

test
Out[4]: 
          E         F         G         H         I
A -0.833316 -1.982666  1.055594  0.781759 -0.107631
B -1.514709 -1.422883  0.204399 -0.487639 -1.652785
C -0.424735  0.400529 -0.786582  0.855885  0.059894
D  2.016221 -1.314878 -1.745535 -0.907778  0.834966

test.loc['A']
Out[5]: 
E   -0.833316
F   -1.982666
G    1.055594
H    0.781759
I   -0.107631
Name: A, dtype: float64

test.loc['E']
KeyError: 'the label [E] is not in the [index]'
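The `KeyError` above arises because `'E'` is a column label, and a lone label passed to `.loc` is looked up in the row index. The following is a minimal sketch of selecting a column instead. It assumes a stand-in frame `df` filled with `np.arange` values (not the random data used in this post) so the results are reproducible.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the post (an assumption).
df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=["A", "B", "C", "D"],
    columns=["E", "F", "G", "H", "I"],
)

row_a = df.loc["A"]      # a single label selects a row
col_e = df.loc[:, "E"]   # a column needs an explicit row selector before the comma

try:
    df.loc["E"]          # 'E' is searched for in the row index and not found
    raised = False
except KeyError:
    raised = True

print(row_a.tolist())    # [0, 1, 2, 3, 4]
print(col_e.tolist())    # [0, 5, 10, 15]
print(raised)            # True
```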

# Note that both endpoints are included (a closed interval)
test.loc['A':'B','E':'F']
Out[8]: 
          E         F
A -0.833316 -1.982666
B -1.514709 -1.422883

When slicing by label, the interval is closed: the label after the `:` is included in the result as well.
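The other input forms that `.loc` accepts (a list of labels, a label slice, a boolean array) can be sketched in the same way. This example again assumes a deterministic stand-in frame `df` rather than the random one shown above.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the post (an assumption).
df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=list("ABCD"),
    columns=list("EFGHI"),
)

by_list = df.loc[["A", "C"], ["E", "G"]]   # lists of labels on both axes
by_slice = df.loc["A":"B", "E":"F"]        # label slices include both endpoints
mask = df["E"] > 5                         # boolean Series aligned with the row index
by_mask = df.loc[mask, "E"]                # rows where the mask is True

print(by_list.values.tolist())    # [[0, 2], [10, 12]]
print(by_slice.values.tolist())   # [[0, 1], [5, 6]]
print(by_mask.index.tolist())     # ['C', 'D']
```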

`iloc`

.iloc is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
`iloc` selects mainly by position. Note that positional slices are half-open (start included, stop excluded): `df.iloc[m:n]` selects only rows m through n-1.

An integer e.g. 5
A list or array of integers [4, 3, 0]
A slice object with ints 1:7
A boolean array

import pandas as pd

import numpy as np

test=pd.DataFrame(np.random.randn(20).reshape(4,5),index=['A','B','C','D'],columns=['E','F','G','H','I'])

test
Out[4]: 
          E         F         G         H         I
A -0.833316 -1.982666  1.055594  0.781759 -0.107631
B -1.514709 -1.422883  0.204399 -0.487639 -1.652785
C -0.424735  0.400529 -0.786582  0.855885  0.059894
D  2.016221 -1.314878 -1.745535 -0.907778  0.834966

# Note that this is a half-open interval: the stop position is excluded
test.iloc[0:1,0:1]
Out[10]: 
          E
A -0.833316
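For comparison, here is a short sketch of the input forms that `.iloc` accepts (an integer, a list of integers, an integer slice, a boolean array). It assumes a deterministic stand-in frame `df` instead of the random data above.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the post (an assumption).
df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=list("ABCD"),
    columns=list("EFGHI"),
)

second_row = df.iloc[1]                # a single integer selects the row at position 1
by_list = df.iloc[[3, 0], [0, 4]]      # lists of positions, in the order given
by_slice = df.iloc[0:2, 0:2]           # half-open: positions 0 and 1 only
by_mask = df.iloc[np.array([True, False, True, False])]  # boolean array over rows

print(second_row.tolist())        # [5, 6, 7, 8, 9]
print(by_list.values.tolist())    # [[15, 19], [0, 4]]
print(by_slice.values.tolist())   # [[0, 1], [5, 6]]
print(by_mask.index.tolist())     # ['A', 'C']
```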

`ix`

.ix supports mixed integer and label based access. It is primarily label based, but will fall back to integer positional access unless the corresponding axis is of integer type.
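`ix` is the catch-all selector: it supports both position-based and label-based selection, with label-based selection taking priority. Note that `.ix` was deprecated in pandas 0.20.0 and removed in pandas 1.0, so the examples in this section only run on older versions; current code should use `loc` or `iloc`.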

import pandas as pd

import numpy as np

test=pd.DataFrame(np.random.randn(20).reshape(4,5),index=['A','B','C','D'],columns=['E','F','G','H','I'])

test
Out[4]: 
          E         F         G         H         I
A -0.833316 -1.982666  1.055594  0.781759 -0.107631
B -1.514709 -1.422883  0.204399 -0.487639 -1.652785
C -0.424735  0.400529 -0.786582  0.855885  0.059894
D  2.016221 -1.314878 -1.745535 -0.907778  0.834966

# The `ix` call below behaves much like `loc`
test.ix['A':'B','E':'F']
Out[12]: 
          E         F
A -0.833316 -1.982666
B -1.514709 -1.422883

# The call below behaves much like `iloc`
test.ix[0:1,0:1]
Out[11]: 
          E
A -0.833316

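Note, however, that when the index or the columns are integers, `ix` actually selects by label, so the slice is a closed interval that includes both endpoints.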
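The same label-versus-position distinction on an integer index can be reproduced with `loc` and `iloc`, which is useful on pandas versions where `.ix` is not available. The sketch below assumes a small hypothetical frame with an integer index; the column name `x` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with an integer index (values are illustrative only).
df = pd.DataFrame({"x": [10, 20, 30, 40]}, index=[0, 1, 2, 3])

by_label = df.loc[0:1]       # labels 0 and 1: both endpoints included
by_position = df.iloc[0:1]   # position 0 only: stop excluded

print(by_label["x"].tolist())      # [10, 20]
print(by_position["x"].tolist())   # [10]
```

Writing the intent explicitly as `loc` or `iloc` avoids having to remember which rule applies to an integer axis.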

References

The official documentation turns out to be the most detailed source. I hope to spend more time reading it in the future.

[1] Official pandas documentation: Indexing and Selecting Data

Original post: https://www.cnblogs.com/michael-xiang/p/10466866.html
