Python File Operations

File categories:

(1) Text files: .txt (characters are encoded and stored as bytes)

(2) Binary files: .mp3 / .wmv / .doc

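I. Getting a File Object

1. Format:

    open([path]file, mode)

        file: the file to open (open() cannot open a directory); path may be absolute or relative.

            Absolute path: a path that starts from the drive letter, i.e. the root directory

            Relative path: a path that starts from the current working directory

        mode:

            r:  read-only (default)

            w:  write, truncating any existing content

            a:  append

            t:  text mode, data is read and written as str; for text files only (default)

            b:  binary mode, data is read and written as bytes

            +:  read and write (update)

                r+ overwrites in place from the start of the file, without truncating it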





Mode               r    r+   w    w+   a    a+
Read               +    +         +         +
Write                   +    +    +    +    +
Create                       +    +    +    +
Truncate                     +    +
Pointer at start   +    +    +    +
Pointer at end                         +    +
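
As a quick check of the mode table, here is a minimal sketch that exercises w, a and r+ on a throwaway file. The temporary directory and the file name demo.txt are assumptions made purely for illustration.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')

    # w: creates the file (or truncates it) and writes from position 0
    with open(path, 'w', encoding='utf-8') as f:
        f.write('hello')

    # a: the pointer starts at the end, so new data is appended
    with open(path, 'a', encoding='utf-8') as f:
        f.write(' world')

    # r+: the pointer starts at 0 and nothing is truncated,
    # so writing replaces existing characters in place
    with open(path, 'r+', encoding='utf-8') as f:
        f.write('J')

    with open(path, 'r', encoding='utf-8') as f:
        result = f.read()

print(result)  # Jello world
```

Note that r+ only replaced the first character; had the file been opened with w instead, the rest of the content would have been lost.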

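II. Closing Files

    (1) Call close() directly      --- not recommended: if an exception is raised first, close() is never reached

    (2) Close in try-finally       --- the variable holding the file must be assigned before the try block, so that the finally clause can refer to it

    (3) Open and close with with

        with open(path, mode) as file_object_name:

            file_object_name.write('dsafas')

        > Separate multiple open() calls with commas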
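
The difference between options (2) and (3) might look like the sketch below. The scratch file created with tempfile.mkstemp is an assumption for the demo; any writable path would do.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

# (2) try-finally: f is assigned before try, so finally can always see it
f = open(path, 'w', encoding='utf-8')
try:
    f.write('first\n')
finally:
    f.close()
closed_by_finally = f.closed

# (3) with: the file is closed automatically when the block ends,
# even if the block raises an exception
with open(path, 'a', encoding='utf-8') as g:
    g.write('second\n')
closed_by_with = g.closed

with open(path, encoding='utf-8') as h:
    content = h.read()
os.remove(path)

print(closed_by_finally, closed_by_with)  # True True
```

Both forms close the file reliably; with is simply shorter and harder to get wrong.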



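III. Reading Files

1. read(size=-1)

    --- Reads the whole file by default. If size is given, reads at most size characters in text mode, or at most size bytes in binary mode (a Chinese character encoded as UTF-8 takes 3 bytes)

    > A file object is an iterator: once it has been read to the end, further reads return an empty string

2. readline()

    --- Returns one line of the file, keeping the trailing '\n'

3. readlines()

    --- Reads all remaining lines and returns them as a list

4. for

    --- For large files, iterate over the file object with a for loop, which reads one line at a time

    for i in f:

        print(i, end='')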
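
The four ways of reading can be compared side by side. This is only a sketch: the sample text and the temporary file are made up for the example.

```python
import os
import tempfile

# Hypothetical sample file with one non-ASCII line
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with open(path, 'w', encoding='utf-8') as f:
    f.write('中文abc\nline2\nline3\n')

with open(path, 'r', encoding='utf-8') as f:
    head = f.read(3)             # text mode counts characters: '中文a'
    rest_of_line = f.readline()  # continues from the pointer: 'bc\n'
    remaining = f.readlines()    # ['line2\n', 'line3\n']
    at_eof = f.read()            # the iterator is exhausted: ''

with open(path, 'rb') as f:
    head_bytes = f.read(3)       # binary mode counts bytes: just '中' in UTF-8

# Line-by-line iteration keeps only one line in memory at a time
with open(path, 'r', encoding='utf-8') as f:
    lines = [line.rstrip('\n') for line in f]

os.remove(path)
```

The same size argument therefore covers three characters in text mode but only a single Chinese character in binary mode.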

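IV. Writing Files

(1) write(content)
```
# path is the source file and aid is the destination file; both must be defined beforehand
with open(path, 'r', encoding='utf-8') as f, open(aid, 'w', encoding='utf-8') as g:
    for i in f:
        print(i, end='')  # echo the line to the console
        g.write(i)        # copy the line to the destination file
```
(2) writelines(list)

    --- Writes every string in list to the file. No line separators are added, so each element must end with '\n' if it should appear on its own line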
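
Because writelines() is easy to misread as "one element per line", a small sketch may help. The list contents and the temporary file are assumptions for illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.txt')  # hypothetical scratch file
os.close(fd)
items = ['alpha', 'beta', 'gamma']

# Without explicit newlines the elements run together
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(items)
with open(path, encoding='utf-8') as f:
    joined = f.read()

# Appending '\n' to each element gives one element per line
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(item + '\n' for item in items)
with open(path, encoding='utf-8') as f:
    separated = f.read()

os.remove(path)
print(joined)  # alphabetagamma
```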



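V. File Positioning

1. The file pointer:

    In r mode, the pointer starts at the first character of the file

    In a mode, the pointer starts just past the last character of the file

    In w mode, the file is truncated first, so the pointer starts at position 0 of an empty file

    (1) tell()

        --- Returns the pointer position, i.e. where the next read or write will happen. In binary mode this is a byte offset; in text mode it is an opaque number that is only meaningful when passed back to seek()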
```
>>> with open('c:/test.txt', 'wt') as f:
...     f.write('1234567')
...     print(f.tell())
...
7
7
>>> with open('c:/test.txt', 'rt') as f:
...     print(f.read(1))
...     print(f.tell())
...
1
1
```
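    (2) seek(offset, whence)

        offset: the offset to move by

        whence: the reference point, available as constants via

            from os import SEEK_SET, SEEK_CUR, SEEK_END

            ① 0: relative to the start of the file (SEEK_SET)

            ② 1: relative to the current position (SEEK_CUR)

            ③ 2: relative to the end of the file (SEEK_END)

            > Opened in binary mode (b): any combination of offset and whence is supported

            > Opened in text mode (t): with whence=0, offset should be 0 or a value returned by tell()

                                       with whence=1 or 2, offset must be 0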
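
The three whence values, and the text-mode restriction, can be tried out as follows. This is a sketch on a hypothetical ten-byte scratch file.

```python
import os
import tempfile
from os import SEEK_SET, SEEK_CUR, SEEK_END

fd, path = tempfile.mkstemp()  # hypothetical scratch file
os.close(fd)
with open(path, 'wb') as f:
    f.write(b'0123456789')

# Binary mode: any offset works with any whence
with open(path, 'rb') as f:
    f.seek(3, SEEK_SET)    # 3 bytes from the start
    a = f.read(2)          # b'34', pointer is now at 5
    f.seek(2, SEEK_CUR)    # skip 2 bytes forward, pointer at 7
    b = f.read(1)          # b'7'
    f.seek(-2, SEEK_END)   # 2 bytes before the end
    c = f.read()           # b'89'
    end_pos = f.tell()     # 10

# Text mode: a non-zero offset relative to the end is rejected
with open(path, 'r') as f:
    f.seek(0, SEEK_END)    # allowed: offset is 0
    try:
        f.seek(-2, SEEK_END)
        text_error = None
    except OSError as exc:  # io.UnsupportedOperation is a subclass of OSError
        text_error = type(exc).__name__

os.remove(path)
```

When arbitrary positioning is needed, opening the file in binary mode and decoding the bytes afterwards is usually the simpler route.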

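VI. Path Operations

1. os

import os

(1) mkdir

    --- Creates a directory. The parent directory must exist and the directory itself must not; otherwise an error is raised

(2) makedirs

    --- Creates a directory, creating any missing parent directories along the way

(3) rmdir

    --- Removes an empty directory

(4) removedirs

    --- Removes an empty directory, then keeps removing parent directories for as long as they are empty

(5) remove

    --- Removes a file; raises FileNotFoundError if the file does not exist

(6) rename(old, new)

    --- Renames a file or directory; the parent directory of new must already exist

(7) renames(old, new)

    --- Renames a file or directory, creating any missing intermediate directories of new, so it can move and rename in one step. Remember to include the file extension

(8) getcwd()

    --- Returns the current working directory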
```
>>> os.getcwd()
'C:\\Users\\aura-bd\\AppData\\Local\\Programs\\Python\\Python35'
```
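(9) walk

    --- Walks the directory tree under a path, yielding a 3-tuple for each directory:

    (1) dirpath:    string  path of the current directory

    (2) dirnames:   list    names of the subdirectories in dirpath

    (3) filenames:  list    names of the non-directory files in dirpath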
```
# The Acrobat installer package is used as the example below

# walk searches recursively: root is the current directory path, dirs is the list of subdirectory names, files is the list of file names
>>> for root, dirs, files in os.walk(r'D:\Acrobat'):
...     for name in files:
...         r = os.path.join(root, name)
...         print(r)
...
D:\Acrobat\Berime.htm
D:\Acrobat\Leame.htm
D:\Acrobat\LeesMij.htm
D:\Acrobat\Leggimi.htm
D:\Acrobat\LeiaMe.htm
D:\Acrobat\Liesmich.htm
D:\Acrobat\Lisezmoi.htm
D:\Acrobat\LueMinut.htm
D:\Acrobat\ReadMe.htm
D:\Acrobat\ReadMeCS.htm
D:\Acrobat\ReadMeCT.htm
D:\Acrobat\ReadMeCZE.htm
D:\Acrobat\ReadMeHUN.htm
D:\Acrobat\ReadMeJ.htm
D:\Acrobat\ReadMeK.htm
D:\Acrobat\ReadMeMEA.htm
D:\Acrobat\ReadMeMEH.htm
D:\Acrobat\ReadMePOL.htm
D:\Acrobat\ReadMeRUS.htm
D:\Acrobat\ReadMeSKY.htm
D:\Acrobat\ReadMeTUR.htm
D:\Acrobat\ReadMeUKR.htm
D:\Acrobat\Vigtigt.htm
D:\Acrobat\Viktig.htm
D:\Acrobat\Viktigt.htm
D:\Acrobat\Adobe Acrobat\ABCPY.INI
D:\Acrobat\Adobe Acrobat\AcrobatDCUpd1801120035.msp
D:\Acrobat\Adobe Acrobat\AcroPro.msi
D:\Acrobat\Adobe Acrobat\Data1.cab
D:\Acrobat\Adobe Acrobat\Setup.exe
D:\Acrobat\Adobe Acrobat\setup.ini
D:\Acrobat\Adobe Acrobat\WindowsInstaller-KB893803-v2-x86.exe
D:\Acrobat\Adobe Acrobat\Transforms\1025.mst
D:\Acrobat\Adobe Acrobat\Transforms\1028.mst
D:\Acrobat\Adobe Acrobat\Transforms\1029.mst
D:\Acrobat\Adobe Acrobat\Transforms\1030.mst
D:\Acrobat\Adobe Acrobat\Transforms\1031.mst
D:\Acrobat\Adobe Acrobat\Transforms\1033.mst
D:\Acrobat\Adobe Acrobat\Transforms\1034.mst
D:\Acrobat\Adobe Acrobat\Transforms\1035.mst
D:\Acrobat\Adobe Acrobat\Transforms\1036.mst
D:\Acrobat\Adobe Acrobat\Transforms\1037.mst
D:\Acrobat\Adobe Acrobat\Transforms\1038.mst
D:\Acrobat\Adobe Acrobat\Transforms\1040.mst
D:\Acrobat\Adobe Acrobat\Transforms\1041.mst
D:\Acrobat\Adobe Acrobat\Transforms\1042.mst
D:\Acrobat\Adobe Acrobat\Transforms\1043.mst
D:\Acrobat\Adobe Acrobat\Transforms\1044.mst
D:\Acrobat\Adobe Acrobat\Transforms\1045.mst
D:\Acrobat\Adobe Acrobat\Transforms\1046.mst
D:\Acrobat\Adobe Acrobat\Transforms\1049.mst
D:\Acrobat\Adobe Acrobat\Transforms\1051.mst
D:\Acrobat\Adobe Acrobat\Transforms\1053.mst
D:\Acrobat\Adobe Acrobat\Transforms\1055.mst
D:\Acrobat\Adobe Acrobat\Transforms\1058.mst
D:\Acrobat\Adobe Acrobat\Transforms\1060.mst
D:\Acrobat\Adobe Acrobat\Transforms\2052.mst
D:\Acrobat\Adobe Acrobat\Transforms\6156.mst
D:\Acrobat\Adobe Acrobat\VCRT_x64\cab1.cab
D:\Acrobat\Adobe Acrobat\VCRT_x64\vc_runtimeMinimum_x64.msi
D:\Acrobat\GB18030\ReadMe.htm
D:\Acrobat\GB18030\ReadMeCS.htm
```
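(10) listdir(path)

    --- Lists the names of the entries (files and directories) directly under path
```
# Get the names of all files and directories directly under the given path
>>> os.listdir(r'D:\Acrobat')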


['Adobe Acrobat', 'Berime.htm', 'GB18030', 'Leame.htm', 'LeesMij.htm', 'Leggimi.htm', 
'LeiaMe.htm', 'Liesmich.htm', 'Lisezmoi.htm', 'LueMinut.htm', 'ReadMe.htm', 'ReadMeCS.htm',
 'ReadMeCT.htm', 'ReadMeCZE.htm', 'ReadMeHUN.htm', 'ReadMeJ.htm', 'ReadMeK.htm', 
'ReadMeMEA.htm', 'ReadMeMEH.htm', 'ReadMePOL.htm', 'ReadMeRUS.htm', 'ReadMeSKY.htm', 
'ReadMeTUR.htm', 'ReadMeUKR.htm', 'Vigtigt.htm', 'Viktig.htm', 'Viktigt.htm']
```
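
The directory functions above can be tried safely inside a temporary directory. The sketch below assumes made-up names (one, x/y, moved/deep, a.txt, b.txt) and checks the error cases described in the list.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, 'one'))

    # mkdir fails when the parent directory is missing ...
    try:
        os.mkdir(os.path.join(base, 'x', 'y'))
        mkdir_error = None
    except FileNotFoundError:
        mkdir_error = 'FileNotFoundError'
    # ... while makedirs creates the parents as needed
    os.makedirs(os.path.join(base, 'x', 'y'))

    src = os.path.join(base, 'one', 'a.txt')
    with open(src, 'w') as f:
        f.write('data')

    # renames creates moved/deep, moves the file, and prunes the
    # now-empty source directory 'one'
    dst = os.path.join(base, 'moved', 'deep', 'b.txt')
    os.renames(src, dst)
    top_level = sorted(os.listdir(base))

    # walk visits every directory; collect file paths relative to base
    walked = []
    for root, dirs, files in os.walk(base):
        for name in files:
            walked.append(os.path.relpath(os.path.join(root, name), base))

    # remove raises if the file is already gone
    os.remove(dst)
    try:
        os.remove(dst)
        remove_error = None
    except FileNotFoundError:
        remove_error = 'FileNotFoundError'

    # removedirs deletes x/y, then x, and stops at the non-empty base
    os.removedirs(os.path.join(base, 'x', 'y'))
    after_removedirs = sorted(os.listdir(base))

print(top_level)         # ['moved', 'x']
print(after_removedirs)  # ['moved']
```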
2. os.path

(1) abspath(path)

    --- Returns the absolute path

(2) basename(path)

    --- Returns the last component of the path, i.e. everything after the final path separator
```
>>> os.path.basename('C:/abc/def/i.jpg')

'i.jpg'
```
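(3) commonpath([path1, path2, ...])

    --- Returns the longest common sub-path of the given paths, which are passed as a single sequence

(4) exists(path)

    --- Tests whether the path exists

(5) getatime(path) / getmtime(path)

    --- Returns the last access time / last modification time of a file or directory, in seconds since the epoch

(6) getsize(path)

    --- Returns the file size in bytes

(7) isdir(path)

    --- Tests whether the path is an existing directory

(8) join(str1, str2, str3)

    --- Joins path components

        If a component is an absolute path, everything before it is discarded; empty components are skipped

(9) split(path)

    --- Splits a path into (dirname, basename)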
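
A short sketch of the os.path helpers follows. The path components (abc, def, i.jpg, root) and the five-byte file are invented for the example, and paths are built with os.path.join so the code behaves the same on Windows and POSIX.

```python
import os
import tempfile

p = os.path.join('abc', 'def', 'i.jpg')

base_name = os.path.basename(p)    # 'i.jpg'
head, tail = os.path.split(p)      # ('abc/def' or 'abc\\def', 'i.jpg')
common = os.path.commonpath([p, os.path.join('abc', 'xyz')])  # 'abc'

# Empty components are skipped
skipped_empty = os.path.join('abc', '', 'def')

# An absolute component discards everything before it
absolute = os.path.abspath(os.sep + 'root')
restarted = os.path.join('ignored', absolute, 'x')

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, 'five.bin')
    with open(file_path, 'wb') as f:
        f.write(b'12345')
    size = os.path.getsize(file_path)   # 5, measured in bytes
    checks = (os.path.exists(file_path),
              os.path.isdir(file_path),
              os.path.isdir(tmp))
    mtime = os.path.getmtime(file_path)  # float, seconds since the epoch

print(base_name, size, checks)  # i.jpg 5 (True, False, True)
```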

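3. The shutil module

import shutil

(1) copy('file', 'new_path')

    --- Copies the file to new_path. If new_path is a directory it must already exist, and the copy keeps the original file name

(2) copy2

    --- Like copy, but also tries to preserve file metadata such as timestamps

(3) copytree

    --- Copies a whole directory tree, including its files and subdirectories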
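
To see the three copy functions in action, the sketch below builds a tiny directory tree in a temporary folder. All names (src, dst, tree, a.txt, a2.txt, sub/b.txt) and the fixed timestamp are assumptions for the demo.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small source tree: src/a.txt and src/sub/b.txt
    src_dir = os.path.join(tmp, 'src')
    os.makedirs(os.path.join(src_dir, 'sub'))
    src_file = os.path.join(src_dir, 'a.txt')
    with open(src_file, 'w') as f:
        f.write('hello')
    with open(os.path.join(src_dir, 'sub', 'b.txt'), 'w') as f:
        f.write('nested')

    # Give the source file a recognisable, old modification time
    os.utime(src_file, (1000000000, 1000000000))

    dst_dir = os.path.join(tmp, 'dst')
    os.mkdir(dst_dir)

    # copy into an existing directory: the file name is kept
    copied = shutil.copy(src_file, dst_dir)

    # copy2 additionally carries over metadata such as the mtime
    copied2 = shutil.copy2(src_file, os.path.join(dst_dir, 'a2.txt'))
    mtime_kept = int(os.path.getmtime(copied2)) == 1000000000

    # copytree copies the whole tree; the destination must not exist yet
    tree = shutil.copytree(src_dir, os.path.join(tmp, 'tree'))
    tree_files = sorted(
        os.path.relpath(os.path.join(root, name), tree)
        for root, _, files in os.walk(tree)
        for name in files
    )

print(os.path.basename(copied), mtime_kept)  # a.txt True
```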

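VII. Serialization

    --- ① fast to transmit;

        ② parsing is left to the client

7.1 CSV (Comma Separated Values)

    --- Comma-separated values (also called character-separated values, since the delimiter need not be a comma) is a common plain-text format for storing tabular data, both numbers and text.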
```
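import csv

# Numbers and numeric strings both work
datas = [['name', 'age'],
         ['Bob', 14],
         ['Tom', 23],
         ['Jerry', '18']]

# newline='' stops the file object from translating the line endings that
# csv.writer emits; without it, an extra blank line follows each row on Windows
with open('C:/test.csv', 'w', newline='') as f:
    # Create the writer object first
    writer = csv.writer(f)
    # Write row by row
    for row in datas:
        writer.writerow(row)
    # Or write all rows at once
    # writer.writerows(datas)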
```
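
Reading the data back is the mirror image of writing it. The sketch below reuses the same sample rows but writes them to a temporary file (an assumption, to avoid touching C:/) and reads them with csv.reader and csv.DictReader.

```python
import csv
import os
import tempfile

datas = [['name', 'age'], ['Bob', 14], ['Tom', 23], ['Jerry', '18']]

fd, path = tempfile.mkstemp(suffix='.csv')  # hypothetical scratch file
os.close(fd)
with open(path, 'w', newline='', encoding='utf-8') as f:
    csv.writer(f).writerows(datas)

# csv.reader yields each row as a list of strings
with open(path, newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))

# csv.DictReader uses the first row as the keys
with open(path, newline='', encoding='utf-8') as f:
    records = [dict(row) for row in csv.DictReader(f)]

os.remove(path)
print(rows[1])     # ['Bob', '14']
print(records[0])  # {'name': 'Bob', 'age': '14'}
```

CSV carries no type information, so the integer 14 comes back as the string '14' and has to be converted by the caller.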
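7.2 json

    --- JSON (JavaScript Object Notation) is a lightweight text format for data interchange.

    > Data is organised as key-value pairs separated by commas

        Object - {}

        Array - []

        String - "" (double quotes only)

        Boolean - true and false

        Number - integers and floating-point numbers

    > Python's data types line up closely with JSON's, so converting between the two is easy:

        (1) Python >>> JSON: use json.dump

        Format:

            json.dump(data, file[, ensure_ascii=True])

            # Chinese text is written as-is only when ensure_ascii is set to False; otherwise it is escaped as \uXXXX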
```
# e.g.
import json

dic = {
    "bg": "green",
    "title": {
        "data": ["data1", "data2", "data3", "data4"],
        "align": "左对齐"  # "left-aligned"; non-ASCII on purpose
    }
}
with open('c:/test.json', 'wt', encoding='utf-8') as f:
    json.dump(dic, f, ensure_ascii=False)
```
        (2) JSON >>> Python: use json.load

            with open('c:/test.json', 'r', encoding='utf-8') as f:

                # Read the JSON file and return the corresponding Python object (a dict here)

                d = json.load(f)
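
A complete dump/load round trip might look like this. The sketch writes to a temporary file instead of c:/test.json, which is an assumption made so that it runs anywhere.

```python
import json
import os
import tempfile

dic = {
    "bg": "green",
    "title": {"data": ["data1", "data2"], "align": "左对齐"},
}

fd, path = tempfile.mkstemp(suffix='.json')  # hypothetical scratch file
os.close(fd)

# Write: ensure_ascii=False keeps the Chinese text readable in the file
with open(path, 'w', encoding='utf-8') as f:
    json.dump(dic, f, ensure_ascii=False)

# Inspect the raw text that was written
with open(path, 'r', encoding='utf-8') as f:
    raw_text = f.read()

# Read: json.load rebuilds the Python object
with open(path, 'r', encoding='utf-8') as f:
    loaded = json.load(f)

os.remove(path)
print(loaded == dic)  # True
```

Passing encoding='utf-8' explicitly matters here: with ensure_ascii=False the file contains non-ASCII characters, and the platform default encoding may not be able to represent them.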



7.3 Serializing to and from Strings

    (1) Serialization: Python dict >>> string

                --- json.dumps

        Format:

            temp = json.dumps(data[, ensure_ascii=True])

    (2) Deserialization: string >>> Python dict

                --- json.loads

        Format:     temp = json.loads(str)
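
The string-based pair works the same way as dump/load, only without a file. A minimal sketch, using a made-up record:

```python
import json

data = {'name': '张三', 'age': 14}  # hypothetical sample record

# Default: non-ASCII characters are escaped as \uXXXX
escaped = json.dumps(data)

# ensure_ascii=False keeps them as-is
readable = json.dumps(data, ensure_ascii=False)

# Both forms decode back to the same dict
restored = json.loads(escaped)

print(readable)  # {"name": "张三", "age": 14}
```

Note that dumps takes no file argument; that is the only difference in signature from dump.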

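7.4 Mapping between Python dicts and JSON types
```
# Output from Python 3.5, where dict key order is arbitrary
>>> data = {
...     '布尔': True,          # boolean
...     '空值': None,          # null
...     '浮点': 1.2,           # float
...     '整数': 1,             # integer
...     '字符串': 'asdf',      # string
...     '列表': [1, 23, 4],    # list
...     '字典': {"one": 1},    # dict
... }
>>> json.dumps(data, ensure_ascii=False)
'{"列表": [1, 23, 4], "空值": null, "整数": 1, "浮点": 1.2, "布尔": true, "字典": {"one": 1}, "字符串": "asdf"}'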
```
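
The mapping can also be checked type by type. The sketch below pairs each Python value with the JSON text that json.dumps produces for it; the sample values are arbitrary.

```python
import json

# (Python value, expected JSON text)
pairs = [
    ({'a': 1}, '{"a": 1}'),  # dict  -> object
    ([1, 2], '[1, 2]'),      # list  -> array
    ((1, 2), '[1, 2]'),      # tuple -> array as well
    ('s', '"s"'),            # str   -> string (always double-quoted)
    (1, '1'),                # int   -> number
    (1.5, '1.5'),            # float -> number
    (True, 'true'),          # True  -> true
    (False, 'false'),        # False -> false
    (None, 'null'),          # None  -> null
]
encoded = [json.dumps(value) for value, _ in pairs]

# The mapping is not fully symmetric: tuples come back as lists,
# and non-string keys come back as strings
round_trip = json.loads(json.dumps({'t': (1, 2), 3: 'int key'}))
print(round_trip)  # {'t': [1, 2], '3': 'int key'}
```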
7.5 Custom serialization

    > JSONEncoder: handles serialization of the built-in types

    > To support other types, subclass JSONEncoder and override its default method
```
import json

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p = Person('Bob', 14)  # sample values

class My_Encoder(json.JSONEncoder):
    def default(self, o):
        # Turn a Person into a dict
        if isinstance(o, Person):
            return {'name': o.name, 'age': o.age}  # defines the output
        else:
            return super().default(o)

# Serialize with the custom encoder
json.dumps(p, cls=My_Encoder, ensure_ascii=False)
```
7.6 pickle (Python-specific)
```
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p = Person('Bob', 14)  # sample values
# Write the data
with open('c:/test.pickle', 'wb') as f:  # binary mode
    pickle.dump(p, f)
# Read the data
with open('c:/test.pickle', 'rb') as f:
    p2 = pickle.load(f)
```
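
pickle also has string-style counterparts, pickle.dumps and pickle.loads, which work on bytes. The sketch below uses a made-up dict containing types that JSON cannot represent, to show why pickle is described as Python-specific. Only unpickle data from a source you trust, since loading a pickle can execute arbitrary code.

```python
import json
import pickle

# Hypothetical data with a tuple, a set and bytes
data = {'point': (1, 2), 'tags': {'a', 'b'}, 'raw': b'\x00\x01', 'n': None}

blob = pickle.dumps(data)      # serialized form is bytes, not text
restored = pickle.loads(blob)  # types are preserved exactly

# json cannot encode a set or bytes and raises TypeError
try:
    json.dumps(data)
    json_ok = True
except TypeError:
    json_ok = False

print(restored == data, json_ok)  # True False
```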
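VIII. Context Managers

1. By implementing the magic methods

    --- A context manager defines an entry part and an exit part,

    which run when the with statement is entered and exited respectively:

    __enter__   : runs on entering the block;  its return value is bound to the name after as

    __exit__    : runs on leaving the block;   returning None (or any false value) lets an exception propagate, returning True suppresses it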
```
# eg.

class File(object):
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        print("entering")
        self.f = open(self.filename, self.mode)
        return self.f

    def __exit__(self, *args):
        print("will exit")
        self.f.close()

with File('a.txt', 'w') as file:
    file.write('ssssss')
```
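
The effect of the __exit__ return value can be seen with a small sketch. The class name Suppress and its attributes are hypothetical; the standard library offers contextlib.suppress for real use.

```python
class Suppress:
    """Hypothetical context manager that swallows one exception type."""

    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None

    def __enter__(self):
        return self  # bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the block finished without an exception
        if exc_type is not None and issubclass(exc_type, self.exc_type):
            self.caught = exc_value
            return True   # suppress the exception
        return False      # let anything else propagate

# The ZeroDivisionError is swallowed and execution continues after the block
with Suppress(ZeroDivisionError) as s:
    1 / 0

# A different exception type is not suppressed
propagated = False
try:
    with Suppress(ZeroDivisionError):
        raise KeyError('k')
except KeyError:
    propagated = True

print(type(s.caught).__name__, propagated)  # ZeroDivisionError True
```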
2. Implementing a context manager with a decorator
```
from contextlib import contextmanager
class MyResource:
    def query(self):
        print('query data')

@contextmanager
def make_myresource():
    print('start to connect')
    yield MyResource()
    print('end connect')
    pass

with make_myresource() as r:
     r.query()
```
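A function decorated this way has three parts:
the code before yield runs before the body of the with statement
the value yielded is assigned to the variable after as
the code after yield runs once the body of the with statement has finished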
```
start to connect
query data
end connect
```
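The same mapping written as a skeleton, first without and then with exception handling:

@contextmanager

def gen():

    print('__enter__ part runs')

    yield 'value returned by __enter__, i.e. the object bound by the with statement'

    print('__exit__ part runs')



@contextmanager

def gen():

    print('__enter__ part runs')

    try:

        yield 'value returned by __enter__, i.e. the object bound by the with statement'

    except:  # swallowing the exception is equivalent to __exit__ returning True

        pass

    finally:

        print('__exit__ part runs')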
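
If the exception should not be swallowed, try/finally on its own is enough to guarantee that the exit part runs. A minimal sketch, where the function name managed and the events list are assumptions used to record the order of execution:

```python
from contextlib import contextmanager

events = []  # records the order in which things happen

@contextmanager
def managed(name):
    events.append('enter ' + name)     # plays the role of __enter__
    try:
        yield name.upper()             # value bound by `as`
    finally:
        events.append('exit ' + name)  # runs even if the body raises

# Normal case
with managed('a') as value:
    events.append('body ' + value)

# Failing case: cleanup still runs, and the exception propagates
try:
    with managed('b'):
        raise ValueError('boom')
except ValueError:
    events.append('propagated')

print(events)
# ['enter a', 'body A', 'exit a', 'enter b', 'exit b', 'propagated']
```

Without the try/finally, an exception raised in the with body would surface at the yield and the code after it would never run.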
Original source: https://www.cnblogs.com/geoffreyone/p/9899759.html
