• 使用python处理excel表格


         在我的工作中,有时候要处理大批量的数据的excel表格,通常是几万行的,如果手工做的话,大概需要半天时间,而且还不一定能保证不出错,如果是筛选数据的话,还不能对筛选好的数据直接进行求取平均值的需求,必须拷贝到另一张表里进行平均值操作。下面我针对具体的需求,来说一下python是如何处理的。

         需求:对OPER、STATENAME两列进行筛选,得到CPU、ELAPSE两列的数据,然后求出它们的平均值,最好有几个OPER生成几张表

         代码:

    代码
    1 import xlrd,sys
    2  from pyExcelerator import*
    3 OPER =5
    4 CPU =6
    5 ELAPSE =8
    6 STATENAME =15
    7 FILENAME ="WUPHG1HTC.xls"
    8  class CpuTestTime:
    9 StateTable = {}
    10 OperList = []
    11 def__init__(self,filename):
    12 self.fname = filename
    13
    14 def openExcelFile(self,index):
    15 self.bk = xlrd.open_workbook(self.fname)
    16 self.sh = self.bk.sheets()[index]
    17 self.nrows = self.sh.nrows
    18 self.dic3 = {}
    19
    20 def cpuTestTime(self):
    21 for i in range(1,self.nrows):
    22 ifnot str(self.sh.row(i)[OPER])[7:-1] in CpuTestTime.OperList: #oper
    23   CpuTestTime.OperList.append(str(self.sh.row(i)[OPER])[7:-1])
    24 CpuTestTime.StateTable[str(self.sh.row(i)[OPER])[7:-1]]=[]
    25 for i in range(1,self.nrows):
    26 ifnot str(self.sh.row(i)[STATENAME])[7:-1] in CpuTestTime.StateTable[str(self.sh.row(i)[OPER])[7:-1]]:#statename
    27   CpuTestTime.StateTable[str(self.sh.row(i)[OPER])[7:-1]].append(str(self.sh.row(i)[STATENAME])[7:-1])
    28
    29 dic1 ={}
    30 dic2 ={}
    31 for key,values in CpuTestTime.StateTable.items():
    32 for value in values:
    33 for tt in ['ELAPSED_TIME','CPU_ELAPSED_TIME','COUNT']:
    34 dic1.update({tt:0})
    35 dic2.update({value:dic1.copy()})
    36 self.dic3.update({key:dic2.copy()})
    37
    38 for i in xrange(1,self.nrows):
    39 self.dic3[str(self.sh.row(i)[OPER])[7:-1]][str(self.sh.row(i)[STATENAME])[7:-1]]['ELAPSED_TIME'] += self.sh.row(i)[ELAPSE].value
    40 self.dic3[str(self.sh.row(i)[OPER])[7:-1]][str(self.sh.row(i)[STATENAME])[7:-1]]['CPU_ELAPSED_TIME'] += self.sh.row(i)[CPU].value
    41 self.dic3[str(self.sh.row(i)[OPER])[7:-1]][str(self.sh.row(i)[STATENAME])[7:-1]]['COUNT'] +=1
    42
    43 for key,values in CpuTestTime.StateTable.items():
    44 for value in values:
    45 for tt in ['ELAPSED_TIME','CPU_ELAPSED_TIME']:
    46 self.dic3[key][value][tt]=self.dic3[key][value][tt]/self.dic3[key][value]['COUNT']
    47
    48 def saveResult(self):
    49 wb = Workbook()
    50 for key,values in CpuTestTime.StateTable.items():
    51 loop =1
    52 ws = wb.add_sheet(key)
    53 ws.write(0,0,'STATE')
    54 ws.write(0,1,'ELAPSED_TIME')
    55 ws.write(0,2,'CPU_ELAPSED_TIME')
    56 ws.write(0,3,'COUNT')
    57 for value in values:
    58 ws.write(loop,0,value)
    59 ws.write(loop,1,round(self.dic3[key][value]['ELAPSED_TIME'],2))
    60 ws.write(loop,2,round(self.dic3[key][value]['CPU_ELAPSED_TIME'],2))
    61 ws.write(loop,3,self.dic3[key][value]['COUNT'])
    62 loop = loop+1
    63 wb.save('result'+FILENAME)
    64
    65  if"__main__"==__name__:
    66 excelFile = CpuTestTime(FILENAME)
    67 excelFile.openExcelFile(0)
    68 excelFile.cpuTestTime()
    69 excelFile.saveResult()

          首先讲一下两个库:xlrd 和 pyExcelerator,xlrd从名字上来看就是读excel比较方便的,而pyExcelerator对于生成excel并写excel很方便。让我们分析一下上述代码的两个函数:

    1 def openExcelFile(self,index):
    2 self.bk = xlrd.open_workbook(self.fname)
    3 self.sh = self.bk.sheets()[index]
    4 self.nrows = self.sh.nrows
    5 self.dic3 = {}

          行2是打开一个文件名为self.fname的excel工作薄,行3是获取工作薄中的第(index+1)张表,因为一个工作薄里可以新建好几张表,它们的索引从0开始,行4记录了这张表里数据的行数。下面读取每个表格的值就很容易了,sh.row(i)[3]这个就代表第i行的第4列.特别注意sh.row(i)[3][7:-1]才能取到正确的字符串值,和excel的编码有关。

    代码
    1 def saveResult(self):
    2 wb = Workbook()
    3 for key,values in CpuTestTime.StateTable.items():
    4 loop =1
    5 ws = wb.add_sheet(key)
    6 ws.write(0,0,'STATE')
    7 ws.write(0,1,'ELAPSED_TIME')
    8 ws.write(0,2,'CPU_ELAPSED_TIME')
    9 ws.write(0,3,'COUNT')
    10 for value in values:
    11 ws.write(loop,0,value)
    12 ws.write(loop,1,round(self.dic3[key][value]['ELAPSED_TIME'],2))
    13 ws.write(loop,2,round(self.dic3[key][value]['CPU_ELAPSED_TIME'],2))
    14 ws.write(loop,3,self.dic3[key][value]['COUNT'])
    15 loop = loop+1
    16 wb.save('result'+FILENAME)

              上面这段代码,行2创建一个excel工作薄,行5增加一个表,行6写cell(0,0)单元格,值为最后一个参数'STATE'。行16保存到文件。

              这种方式处理excel表比较好懂,也比较简单,当然,也可以用windows的COM来处理,不过就没有这么简单了~

  • 相关阅读:
    1-Java类结构和main函数
    0-java概述
    2-python元组和列表
    SSH密码暴力破解及防御实战----防
    SQL注入攻击及防御详解
    XSS跨站攻防安全
    文件包含渗透----当我们无法进行上传渗透时另一种黑客攻击
    jspgou商城部署时报错:Could not open Hibernate Session for transaction; nested exception is org.hibernate.ex
    上传漏洞----看完之后你也是黑客(中国菜刀和kali)
    部署jenkins服务器出现Please wait while Jenkins is getting ready to work ...一直进不去该怎么办?
  • 原文地址:https://www.cnblogs.com/jurkymaomao/p/1749952.html
Copyright © 2020-2023  润新知