• The difference between loc, iloc, and ix in pandas


    Start with the code:

    In [46]: import pandas as pd
    
    In [47]: data = [[1,2,3],[4,5,6]]
    
    In [48]: index = [0,1]
    
    In [49]: columns=['a','b','c']
    
    In [50]: df = pd.DataFrame(data=data, index=index, columns=columns)
    
    In [51]: df
    Out[51]: 
       a  b  c
    0  1  2  3
    1  4  5  6
    

    1. loc: select rows by row label

    1.1 loc[1] selects the row labeled 1 (the index is an integer)

    
    In [52]: df.loc[1]
    Out[52]: 
    a    4
    b    5
    c    6
    Name: 1, dtype: int64
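
    Because the labels above happen to be 0 and 1, it is easy to mistake loc for positional indexing. The sketch below uses a hypothetical frame, df2, whose integer labels (10 and 20) are chosen only for illustration and do not match the row positions.

```python
import pandas as pd

# Hypothetical frame: integer labels that do not match row positions.
df2 = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=[10, 20], columns=['a', 'b', 'c'])

row = df2.loc[10]          # label 10 is the first row
print(row.tolist())        # [1, 2, 3]

try:
    df2.loc[1]             # no row carries the label 1
    missing = False
except KeyError:
    missing = True
print(missing)             # True
```

    Here loc[1] fails even though a row exists at position 1, because loc only looks at labels.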
    

    1.2 loc['d'] selects the row labeled 'd' (the index is a string)

    In [53]: import pandas as pd    
        ...: data = [[1,2,3],[4,5,6]]    
        ...: index = ['d','e']    
        ...: columns=['a','b','c']    
        ...: df = pd.DataFrame(data=data, index=index, columns=columns)
        ...: 
    
    In [54]: df
    Out[54]: 
       a  b  c
    d  1  2  3
    e  4  5  6
    
    In [55]: df.loc['d']
    Out[55]: 
    a    1
    b    2
    c    3
    Name: d, dtype: int64
    

    1.3 Trying to select a column this way raises an error

    In [56]: df.loc['a']
    Traceback (most recent call last):
    
      File "<ipython-input-56-5dbae926782f>", line 1, in <module>
        df.loc['a']
    
      File "E:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
        return self._getitem_axis(key, axis=0)
        ...
    KeyError: 'the label [a] is not in the [index]'
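
    The error arises because a single key passed to loc is looked up among the row labels. A minimal sketch, rebuilding the same df, of catching the error and then reaching the column either directly or through loc with an explicit "all rows" slice:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

try:
    df.loc['a']                 # 'a' is a column label, not a row label
    found = True
except KeyError:
    found = False

col_direct = df['a']            # plain column lookup
col_loc = df.loc[:, 'a']        # all rows, column 'a'
print(found, col_direct.tolist(), col_loc.tolist())
```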
    

    1.4 loc can select multiple rows

    
    In [57]: df.loc['d':]
    Out[57]: 
       a  b  c
    d  1  2  3
    e  4  5  6
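
    Besides a label slice, loc also accepts a list of labels or a boolean mask. The following is only an illustrative sketch on the same df; the condition on column 'a' is an arbitrary example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

by_list = df.loc[['e', 'd']]     # rows come back in the order requested
by_mask = df.loc[df['a'] > 1]    # rows where column 'a' is greater than 1

print(by_list.index.tolist())    # ['e', 'd']
print(by_mask.index.tolist())    # ['e']
```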
    

    1.5 Extending loc: select specific columns of a row

    In [58]: df.loc['d',['b','c']]
    Out[58]: 
    b    2
    c    3
    Name: d, dtype: int64
    

    1.6 Extending loc: select a column

    In [59]: df.loc[:,['c']]
    Out[59]: 
       c
    d  3
    e  6
    

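    Of course, the most direct way to get a column is df[column_label]. Note that loc still requires the column label; when the label is unknown, select the column by position with iloc instead (see 2.4).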
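
    One detail worth checking when selecting columns is the return type: a bare label and a one-element list of labels give different objects. A small sketch on the same df:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

as_series = df.loc[:, 'c']      # bare label: returns a Series
as_frame = df.loc[:, ['c']]     # list of labels: returns a DataFrame
cell = df.loc['d', 'c']         # row label and column label: single value

print(type(as_series).__name__, type(as_frame).__name__, cell)
```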

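    Note that a label slice in loc, such as df.loc[1:3], includes both endpoints (labels 1, 2, and 3), unlike an ordinary Python slice. Position slices in iloc exclude the end, as usual.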
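
    The difference in slice endpoints can be seen side by side. The frame df3 and its column 'x' below are hypothetical, built with the default integer labels 0 to 4 so that labels and positions coincide.

```python
import pandas as pd

# Hypothetical frame with default integer labels 0..4.
df3 = pd.DataFrame({'x': [10, 11, 12, 13, 14]})

label_slice = df3.loc[1:3]       # labels 1, 2, 3 (end included)
position_slice = df3.iloc[1:3]   # positions 1, 2 (end excluded)

print(label_slice['x'].tolist())      # [11, 12, 13]
print(position_slice['x'].tolist())   # [11, 12]
```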

    2. iloc: select rows by integer position

    2.1 Pass the integer position of the row you want

    First, recall the df from before:

    In [54]: df
    Out[54]: 
       a  b  c
    d  1  2  3
    e  4  5  6
    

    Now call iloc:

    In [60]: df.iloc[1]  # row at position 1 (the second row)
    Out[60]: 
    a    4
    b    5
    c    6
    Name: e, dtype: int64
    
    In [61]: df.iloc[0]  # row at position 0 (the first row)
    Out[61]: 
    a    1
    b    2
    c    3
    Name: d, dtype: int64
    

    2.2 Indexing with a label raises an error

    In [62]: df.iloc['a']
    Traceback (most recent call last):
    
      File "<ipython-input-62-0c5fe4e92254>", line 1, in <module>
        df.iloc['a']
    
      File "E:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
        return self._getitem_axis(key, axis=0)
      ...
    
    TypeError: cannot do positional indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [a] of <class 'str'>
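
    When only a label is known but iloc is needed, one option is to translate the label into a position first. The sketch below assumes a unique index, in which case Index.get_loc returns a single integer.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

try:
    df.iloc['e']                # iloc accepts positions only
    raised = False
except TypeError:
    raised = True

pos = df.index.get_loc('e')     # position of the row labeled 'e'
row = df.iloc[pos]
print(raised, pos, row.tolist())
```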
    

    2.3 Integer positions can likewise select multiple rows

    In [63]: df.iloc[0:]   # rows from position 0 onward
    Out[63]: 
       a  b  c
    d  1  2  3
    e  4  5  6
    

    2.4 Select columns with iloc

    In [64]: df.iloc[:,[0]]
    Out[64]: 
       a
    d  1
    e  4
    
    In [65]: df.iloc[:,[1]]
    Out[65]: 
       b
    d  2
    e  5
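
    iloc follows ordinary Python positional rules on the column axis too, so negative positions and end-exclusive slices work. A short sketch on the same df:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

last_col = df.iloc[:, -1]       # last column, returned as a Series
first_two = df.iloc[:, 0:2]     # columns at positions 0 and 1
cell = df.iloc[1, 2]            # row position 1, column position 2

print(last_col.tolist(), first_two.columns.tolist(), cell)
```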
    

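    3. ix: mixed indexing that combines the two above (ix is no longer recommended: it was deprecated in pandas 0.20.0 and removed in pandas 1.0, so use loc or iloc instead; this has nothing to do with the Python version)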

    3.1 Index by integer position

    First, recall the df from before:

    In [54]: df
    Out[54]: 
       a  b  c
    d  1  2  3
    e  4  5  6
    

    Now look at how .ix is used:

    In [66]: df.ix[1]
    __main__:1: DeprecationWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
    
    See the documentation here: http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix
    Out[66]: 
    a    4
    b    5
    c    6
    Name: e, dtype: int64
    

    3.2 Index by row label

    In [67]: df.ix['e']
    Out[67]: 
    a    4
    b    5
    c    6
    Name: e, dtype: int64
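
    On pandas versions where .ix is no longer available, the two calls above can be written with iloc and loc. The mixed case (row by position, column by label) is sketched in two possible ways; both are just one reasonable approach, not the only one.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

by_position = df.iloc[1]        # stands in for df.ix[1]
by_label = df.loc['e']          # stands in for df.ix['e']

# Mixed case: row by position, column by label.
mixed_a = df.loc[df.index[1], 'b']              # turn the position into a label
mixed_b = df.iloc[1, df.columns.get_loc('b')]   # turn the label into a position

print(by_position.equals(by_label), mixed_a, mixed_b)
```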
    
    

    Reference: https://blog.csdn.net/roamer314/article/details/52179191

  • Original article: https://www.cnblogs.com/HongjianChen/p/8849981.html