• Learning the Pandas Library, Part 2: Lookup


    Lookup

    isin(...)

        def isin(self, values) -> "DataFrame":
            """
            Examples
            --------
            >>> df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
            ...                   index=['falcon', 'dog'])
            >>> df
                    num_legs  num_wings
            falcon         2          2
            dog            4          0
    
            When ``values`` is a list check whether every value in the DataFrame
            is present in the list (which animals have 0 or 2 legs or wings)
    
            >>> df.isin([0, 2])
                    num_legs  num_wings
            falcon      True       True
            dog        False       True
    
            When ``values`` is a dict, we can pass values to check for each
            column separately:
    
            >>> df.isin({'num_wings': [0, 3]})
                    num_legs  num_wings
            falcon     False      False
            dog        False       True
    
            When ``values`` is a Series or DataFrame the index and column must
            match. Note that 'dog' is absent from the index of ``other``, so
            its row is False in every column.
    
            >>> other = pd.DataFrame({'num_legs': [8, 2], 'num_wings': [0, 2]},
            ...                      index=['spider', 'falcon'])
            >>> df.isin(other)
                    num_legs  num_wings
            falcon      True       True
            dog        False      False
            """
            if isinstance(values, dict):
                from pandas.core.reshape.concat import concat
    
                values = collections.defaultdict(list, values)
                return concat(
                    (
                        self.iloc[:, [i]].isin(values[col])
                        for i, col in enumerate(self.columns)
                    ),
                    axis=1,
                )
            elif isinstance(values, Series):
                if not values.index.is_unique:
                    raise ValueError("cannot compute isin with a duplicate axis.")
                return self.eq(values.reindex_like(self), axis="index")
            elif isinstance(values, DataFrame):
                if not (values.columns.is_unique and values.index.is_unique):
                    raise ValueError("cannot compute isin with a duplicate axis.")
                return self.eq(values.reindex_like(self))
            else:
                if not is_list_like(values):
                    raise TypeError(
                        "only list-like or dict-like objects are allowed "
                        "to be passed to DataFrame.isin(), "
                        f"you passed a '{type(values).__name__}'"
                    )
                return self._constructor(
                    algorithms.isin(self.values.ravel(), values).reshape(self.shape),
                    self.index,
                    self.columns,
                )
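
    The branches in the source above can be exercised directly. The following is a minimal sketch, not part of the original post: it reuses the docstring's `df`, and the Series passed in the second call is an assumed example chosen only to show index alignment.

```python
import pandas as pd

df = pd.DataFrame({"num_legs": [2, 4], "num_wings": [2, 0]},
                  index=["falcon", "dog"])

# dict branch: columns missing from the dict are looked up through
# defaultdict(list), i.e. checked against an empty list, so they are all False.
by_column = df.isin({"num_wings": [0, 3]})

# Series branch: the Series is reindexed like df and compared with eq()
# along the row index, so each row is compared to the value with its label.
# (This Series is an assumed example, not taken from the post.)
per_row = df.isin(pd.Series([2, 0], index=["falcon", "dog"]))

# Fallback branch: a scalar is not list-like, so a TypeError is raised.
try:
    df.isin(2)
    scalar_error = None
except TypeError as exc:
    scalar_error = str(exc)

print(by_column)
print(per_row)
print(scalar_error)
```

    Note that the Series and DataFrame branches test equality at matching labels rather than membership, which is why the labels have to line up.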
    

    PS: Filtering a DataFrame by specific conditions

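    1. Return the rows that satisfy a condition, e.g. the rows whose column 'A' value is 4 or 8 (negating the mask with ~ returns the complement, which effectively deletes the matched rows)
    In [1]: data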
    Out[1]: 
       A  B   C   D
    0  0  1   2   3
    1  4  5   6   7
    2  8  9  10  11
     
    In [2]: data[data['A'].isin([4,8])] # keep the rows whose 'A' value is 4 or 8
    Out[2]: 
       A  B   C   D
    1  4  5   6   7
    2  8  9  10  11
     
    In [3]: data[~data['A'].isin([4,8])] # negate the mask: drop the rows whose 'A' value is 4 or 8
    Out[3]: 
       A  B  C  D
    0  0  1  2  3
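
    The session above can be reproduced as a standalone script. This is a minimal sketch: the post never shows how `data` was built, so the `np.arange` construction below is an assumption that happens to produce the same table.

```python
import numpy as np
import pandas as pd

# Assumed construction of `data`; it matches the table printed in Out[1].
data = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("ABCD"))

mask = data["A"].isin([4, 8])  # boolean Series, one entry per row
kept = data[mask]              # rows whose 'A' value is 4 or 8
dropped = data[~mask]          # the complement: all remaining rows

print(kept)
print(dropped)
```

    Storing the mask in a variable keeps the selection and its negation in sync, since both are derived from the same condition.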
    
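    2. Return the rows that satisfy several conditions, e.g. the rows where column 'A' is 4 and column 'B' is 5 (negating the combined mask with ~ returns the complement, which effectively deletes the matched rows)
    In [4]: data[data['A'].isin([4]) & data['B'].isin([5])] # keep the rows where 'A' is 4 and 'B' is 5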
    Out[4]: 
       A  B  C  D
    1  4  5  6  7
     
    # Negation: drop the rows where 'A' is 4 and 'B' is 5. When negating several conditions, always wrap the combined condition in ()
    In [5]: data[~(data['A'].isin([4]) & data['B'].isin([5]))] 
    Out[5]: 
       A  B   C   D
    0  0  1   2   3
    2  8  9  10  11
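
    The parentheses matter because ~ binds more tightly than &. The sketch below illustrates the difference; it assumes the same `data` as above, and the value list [5, 9] for column 'B' is a hypothetical choice made so that the two expressions give different results.

```python
import numpy as np
import pandas as pd

# Assumed construction of `data`, matching the table in the post.
data = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("ABCD"))

in_a = data["A"].isin([4])
in_b = data["B"].isin([5, 9])  # hypothetical values, for illustration only

negate_both = ~(in_a & in_b)   # NOT (A matches AND B matches)
negate_first = ~in_a & in_b    # (NOT A matches) AND (B matches)

print(data[negate_both])
print(data[negate_first])
```

    With these values the first expression keeps rows 0 and 2, while the second keeps only row 2, so dropping the parentheses silently changes the filter.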
    
    3. Return the index labels of the rows that satisfy a condition
    In [6]: list(data[data['A'].isin([4,8])].index)
    Out[6]: [1, 2]
        
    print(type(data[data['A'].isin([4])].index), data[data['A'].isin([4])].index)
    # <class 'pandas.core.indexes.numeric.Int64Index'> Int64Index([1], dtype='int64')
    # (pandas 2.x removed Int64Index; there this prints a plain Index, e.g. Index([1], dtype='int64'))
    print(list(data[data['A'].isin([4])].index))
    # [1]
    print(data[data['A'].isin([4])].index[0])
    # 1
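
    The same labels can also be read from the index itself, without building the filtered DataFrame first. This is a sketch under the same assumed `data`; the names `labels`, `positions` and `first_missing` are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Assumed construction of `data`, matching the table in the post.
data = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("ABCD"))

mask = data["A"].isin([4, 8])

# Index labels of the matching rows.
labels = data.index[mask.to_numpy()].tolist()

# 0-based row positions; these equal the labels only for a default index.
positions = np.flatnonzero(mask.to_numpy()).tolist()

# Taking [0] from an empty result raises IndexError, so check the length first.
missing = data.index[data["A"].isin([99]).to_numpy()]
first_missing = missing[0] if len(missing) else None

print(labels, positions, first_missing)
```

    Labels and positions coincide here only because `data` uses the default integer index; with a custom index, .index returns the labels rather than row numbers.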
    
  • Original article: https://www.cnblogs.com/854594834-YT/p/14204472.html