5. 数据存储
5.1 CSV格式数据存取(Comma-Separated Values)
5.1.1 将DataFrame保存到csv文件中:df.to_csv('xxx.csv')
# -*- coding: utf-8 -*- """ Created on Tue Jan 24 13:07:01 2017 @author: Wayne """ from matplotlib.finance import quotes_historical_yahoo_ochl from datetime import date import pandas as pd today = date.today() start = (today.year-1, today.month, today.day) quotes = quotes_historical_yahoo_ochl('IBM',start,today) quotesdf = pd.DataFrame(quotes) quotesdf.to_csv('stockIBM.csv')
5.1.2 读取CSV文件:pd.read_csv('xxx.csv')
In [2]:pd.read_csv('stockIBM.csv') Out[2]: Unnamed: 0 0 1 2 3 4 5 0 0 735988.0 117.618075 117.598813 119.669894 117.174961 5446000.0 1 1 735989.0 117.762571 118.090087 119.072651 117.752936 4617800.0 2 2 735990.0 118.224954 116.519922 118.841462 116.221304 5026400.0 ... 251 251 736352.0 170.080002 171.029999 171.250000 170.009995 5327300.0 [252 rows x 7 columns] In [3]:pd.read_csv('stockIBM.csv')['2'] Out[3]: 0 117.598813 1 118.090087 2 116.519922 ... 251 171.029999 Name: 2, dtype: float64
5.2 XLS格式数据存取
5.2.1 将DataFrame保存到xlsx文件中:quotesdf.to_excel('stockIBM.xlsx',sheet_name='IBM')
# -*- coding: utf-8 -*- """ Created on Tue Jan 24 13:07:01 2017 @author: Wayne """ from matplotlib.finance import quotes_historical_yahoo_ochl from datetime import date import pandas as pd today = date.today() start = (today.year-1, today.month, today.day) quotes = quotes_historical_yahoo_ochl('IBM',start,today) quotesdf = pd.DataFrame(quotes) quotesdf.to_excel('stockIBM.xlsx',sheet_name='IBM')
5.2.2 读取XLSX文件:pd.read_excel('stockIBM.xlsx',sheet_name='IBM')
In [7]: result = pd.read_excel('stockIBM.xlsx',sheet_name='IBM') In [8]: result Out[8]: 0 1 2 3 4 5 0 735988 117.618075 117.598813 119.669894 117.174961 5446000 1 735989 117.762571 118.090087 119.072651 117.752936 4617800 2 735990 118.224954 116.519922 118.841462 116.221304 5026400 ... 251 736352 170.080002 171.029999 171.250000 170.009995 5327300 [252 rows x 6 columns] In [9]: result[2] Out[9]: 0 117.598813 1 118.090087 2 116.519922 ... 251 171.029999 Name: 2, dtype: float64
5.3 练习:
已知有excel表格内容(请自行创建):
1.创建一个DataFrame将已有表格内容读出并存入此DataFrame中;
2.添加字段Sum_score,其值为相应学生的Python和Math两门课程分数之和;
3.将更新过的DataFrame中的内容写回students.xlsx文件的scores工作表中。
# -*- coding: utf-8 -*- """ Created on Tue Jan 24 13:07:01 2017 @author: Wayne """ import pandas as pd df = pd.read_excel('test.xlsx') df['sum'] = df['Python'] + df['Math'] df.to_excel('test.xlsx',sheet_name='scores')