• Python data cleaning (using pandas)


    Given the following sample data:
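
The sample file itself is not reproduced in this post. As a stand-in, the table can be rebuilt inline from the values shown in the "original data" printout in the results section. This is only a sketch for following along without food.xlsx, and it assumes that printout reflects the whole file.

```python
import numpy as np
import pandas as pd

# Values copied from the "original data" printout in the results section.
# Building the frame inline is an assumption standing in for food.xlsx.
df = pd.DataFrame({
    "food": ["bacon", "pulled pork", "bacon", "Pastrami", "corned beef",
             "Bacon", "pastrami", "honey ham", "nova lox"],
    "ounces": [4.0, 3.0, np.nan, 6.0, 7.5, 8.0, -3.0, 5.0, 6.0],
    "animal": ["pig", "pig", "pig", "cow", "cow", "pig", "cow", "pig", "salmon"],
})
print(df)
```

With this frame in place, the `pd.read_excel` call in the script can be skipped and the remaining steps run unchanged.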

    Fill in the missing values, split the food names, and drop the duplicate rows:

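    import pandas as pd

    # Raw string keeps "\f" in the Windows path from being read as a form feed
    df = pd.read_excel(r"F:\python入门\数据1\food.xlsx")
    print('Original data:\n', df)

    # Fill missing values with the column mean
    # (assigning back avoids calling inplace=True on a column selection)
    df['ounces'] = df['ounces'].fillna(df['ounces'].mean())
    print('Data after mean fill:\n', df)

    # Split the food column into two columns
    df[['first_name', 'last_name']] = df['food'].str.split(expand=True)
    df.drop('food', axis=1, inplace=True)
    print('Data after splitting the food names:\n', df)

    # Drop duplicate rows, judged on the two name columns
    df.drop_duplicates(['first_name', 'last_name'], inplace=True)
    print('Data after dropping duplicates:\n', df)
    # df.to_excel(r"F:\python入门\数据1\food_new.xlsx")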
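
The split step assigns its result to exactly two columns, which works here because every name in the sample has at most two words. A minimal sketch, assuming longer names might appear (the three-word value is hypothetical and not in the sample): passing `n=1` limits the split to the first run of whitespace, so no more than two pieces come back.

```python
import pandas as pd

# The last value is hypothetical; the sample data has no three-word names.
foods = pd.Series(["bacon", "pulled pork", "smoked pulled pork"])

# n=1 splits on the first run of whitespace only, so at most two pieces are returned
parts = foods.str.split(n=1, expand=True)
parts.columns = ["first_name", "last_name"]
print(parts)
```

Without `n=1`, the three-word name would produce a third column and the two-column assignment in the script would raise an error.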

    Output:

    Original data:
               food  ounces  animal
    0        bacon     4.0     pig
    1  pulled pork     3.0     pig
    2        bacon     NaN     pig
    3     Pastrami     6.0     cow
    4  corned beef     7.5     cow
    5        Bacon     8.0     pig
    6     pastrami    -3.0     cow
    7    honey ham     5.0     pig
    8     nova lox     6.0  salmon
    Data after mean fill:
               food  ounces  animal
    0        bacon  4.0000     pig
    1  pulled pork  3.0000     pig
    2        bacon  4.5625     pig
    3     Pastrami  6.0000     cow
    4  corned beef  7.5000     cow
    5        Bacon  8.0000     pig
    6     pastrami -3.0000     cow
    7    honey ham  5.0000     pig
    8     nova lox  6.0000  salmon
    Data after splitting the food names:
        ounces  animal first_name last_name
    0  4.0000     pig      bacon      None
    1  3.0000     pig     pulled      pork
    2  4.5625     pig      bacon      None
    3  6.0000     cow   Pastrami      None
    4  7.5000     cow     corned      beef
    5  8.0000     pig      Bacon      None
    6 -3.0000     cow   pastrami      None
    7  5.0000     pig      honey       ham
    8  6.0000  salmon       nova       lox
    Data after dropping duplicates:
        ounces  animal first_name last_name
    0     4.0     pig      bacon      None
    1     3.0     pig     pulled      pork
    3     6.0     cow   Pastrami      None
    4     7.5     cow     corned      beef
    5     8.0     pig      Bacon      None
    6    -3.0     cow   pastrami      None
    7     5.0     pig      honey       ham
    8     6.0  salmon       nova       lox
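
In the final printout, `bacon` and `Bacon` (and likewise `Pastrami` and `pastrami`) both survive, because `drop_duplicates` compares strings exactly. If those should count as the same food, one possible approach is to compare on lower-cased keys. The sketch below is an assumption about the intended cleanup, not part of the original script, and it rebuilds the two name columns inline.

```python
import pandas as pd

# Name columns as they appear after the split step in the printout above
df = pd.DataFrame({
    "first_name": ["bacon", "pulled", "bacon", "Pastrami", "corned",
                   "Bacon", "pastrami", "honey", "nova"],
    "last_name": [None, "pork", None, None, "beef", None, None, "ham", "lox"],
})

# Compare on lower-cased keys, but keep the original spelling in the result
key = df[["first_name", "last_name"]].apply(lambda col: col.str.lower())
deduped = df[~key.duplicated()]
print(deduped)
```

Building a separate key frame leaves the stored names untouched; lower-casing the columns in place would be simpler if the original capitalization does not matter.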

  • Original article: https://www.cnblogs.com/xiao02fang/p/13451507.html