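- Basics recap
- List and tuple operations
- String operations
- Dictionary operations
- Set operations
- File operations
- Character encoding and transcoding
I. Basics Recap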
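1. The bytes type
Perhaps the most important new feature of Python 3 is a much clearer distinction between text and binary data. Text is always Unicode and is represented by the str type; binary data is represented by the bytes type. Python 3 never mixes str and bytes implicitly, and that is what makes the distinction so clear. You cannot concatenate a string and a bytes object, you cannot search for a string inside a bytes object (or the other way round), and you cannot pass a string to a function that expects bytes (or the other way round).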
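A minimal sketch of the str/bytes split described above. The sample string is borrowed from the encoding section of these notes; the variable names are only for illustration.

```python
text = "我爱北京"                 # str: Unicode text
data = text.encode("utf-8")       # bytes: the UTF-8 encoding of that text

print(type(text), len(text))      # length counted in characters
print(type(data), len(data))      # length counted in bytes

# Mixing the two types is an error rather than a silent conversion.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Going back from bytes to text needs an explicit decode.
round_trip = data.decode("utf-8")
```

Converting explicitly with encode() and decode() is the only way to move between the two types.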
2. The ternary expression
result = value1 if condition else value2
If the condition is true: result = value1
If the condition is false: result = value2
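A small example of the expression above next to the equivalent if/else statement. The variable names and the threshold are made up for illustration.

```python
age = 20
status = "adult" if age >= 18 else "minor"

# The same logic written as a regular if/else statement.
if age >= 18:
    status_long = "adult"
else:
    status_long = "minor"

print(status, status_long)
```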
3. Number bases
- Binary: digits 01
- Octal: digits 01234567
- Decimal: digits 0123456789
- Hexadecimal: digits 0123456789ABCDEF
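Python's built-in functions can convert between the bases listed above. A short sketch, with 255 chosen arbitrarily:

```python
n = 255
as_bin = bin(n)   # binary string with a 0b prefix
as_oct = oct(n)   # octal string with a 0o prefix
as_hex = hex(n)   # hexadecimal string with a 0x prefix

# int() parses a string written in a given base back into an integer.
from_bin = int("11111111", 2)
from_oct = int("377", 8)
from_hex = int("FF", 16)

print(as_bin, as_oct, as_hex, from_bin, from_oct, from_hex)
```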
4. Everything is an object
In Python everything is an object, and every object is created from a class.
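II. List and Tuple Operations
The list is one of the most commonly used data types; it makes storing and modifying data very convenient.
Defining a list
names = ['faker', 'insec', 'pdd']
Elements are accessed by index, counting from 0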
>>> names[0]
'faker'
>>> names[2]
'pdd'
>>> names[-1]
'pdd'
>>> names[-2]  # negative indexes count from the end
'insec'
Slicing: taking several elements
>>> names = ['faker', 'insec', 'pdd', 'kid', 'madlife', 'illusion']
>>> names[1:4]  # from index 1 up to index 4: includes 1, excludes 4
['insec', 'pdd', 'kid']
>>> names[1:-1]  # from index 1 up to -1, excluding -1
['insec', 'pdd', 'kid', 'madlife']
>>> names[0:3]
['faker', 'insec', 'pdd']
>>> names[:3]  # from the start up to (not including) index 3; the 0 can be omitted
['faker', 'insec', 'pdd']
>>> names[3:]  # from index 3 to the last element
['kid', 'madlife', 'illusion']
>>> names[3:-1]  # from index 3 to the second-to-last element; the last one is left out
['kid', 'madlife']
>>> names[0::2]  # 0: means from the start to the end, :2 means take every second element
['faker', 'pdd', 'madlife']
>>> names[::2]  # same as above
['faker', 'pdd', 'madlife']
>>> names[1::2]  # from index 1 (the second element) to the end, taking every second element
['insec', 'kid', 'illusion']
Appending
>>> names
['faker', 'insec', 'pdd', 'kid', 'madlife', 'illusion']
>>> names.append('kakao')
>>> names
['faker', 'insec', 'pdd', 'kid', 'madlife', 'illusion', 'kakao']
Inserting
>>> names
['faker', 'insec', 'pdd', 'kid', 'madlife', 'illusion', 'kakao']
>>> names.insert(2, 'shy')
>>> names
['faker', 'insec', 'shy', 'pdd', 'kid', 'madlife', 'illusion', 'kakao']
>>> names.insert(5, 'jacklove')
>>> names
['faker', 'insec', 'shy', 'pdd', 'kid', 'jacklove', 'madlife', 'illusion', 'kakao']
Modifying
>>> names
['faker', 'insec', 'shy', 'pdd', 'kid', 'jacklove', 'madlife', 'illusion', 'kakao']
>>> names[3] = 'looper'
>>> names
['faker', 'insec', 'shy', 'looper', 'kid', 'jacklove', 'madlife', 'illusion', 'kakao']
Deleting
>>> names
['faker', 'insec', 'shy', 'looper', 'kid', 'jacklove', 'madlife', 'illusion', 'kakao']
>>> del names[4]
>>> names
['faker', 'insec', 'shy', 'looper', 'jacklove', 'madlife', 'illusion', 'kakao']
>>> names.remove("shy")  # remove the given element
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion', 'kakao']
>>> names.pop()  # remove and return the last element of the list
'kakao'
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion']
Extending
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion']
>>> i = [1, 2, 3, 4]
>>> names.extend(i)
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion', 1, 2, 3, 4]
Copying
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion', 1, 2, 3, 4]
>>> name_copy = names[:]  # a full slice creates a shallow copy
>>> name_copy
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion', 1, 2, 3, 4]
Counting
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion', 1, 2, 3, 'insec']
>>> names.count("insec")
2
Sorting and reversing
>>> names
['faker', 'insec', 'looper', 'jacklove', 'madlife', 'illusion', 1, 2, 3, 'insec']
>>> names.sort()  # Python 2 output shown; Python 3 raises TypeError when a list mixes int and str
>>> names
[1, 2, 3, 'faker', 'illusion', 'insec', 'insec', 'jacklove', 'looper', 'madlife']
>>> names.reverse()  # reverse in place
>>> names
['madlife', 'looper', 'jacklove', 'insec', 'insec', 'illusion', 'faker', 3, 2, 1]
Getting an index
>>> names
['madlife', 'looper', 'jacklove', 'insec', 'insec', 'illusion', 'faker', 3, 2, 1]
>>> names.index('insec')  # returns only the index of the first match
3
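Tuples
A tuple is similar to a list in that it stores a group of values, but once created it cannot be modified, so it is also called a read-only list.
Syntax
names = ("faker", "insec", "illusion")
A tuple has only two methods: count and index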
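A quick sketch of the two tuple methods and of what "read-only" means in practice. The extra 'insec' entry is added here only so that count has something to show.

```python
names = ("faker", "insec", "illusion", "insec")

occurrences = names.count("insec")   # how many times the value appears
first_index = names.index("insec")   # index of the first match

# Item assignment is rejected because tuples are immutable.
try:
    names[0] = "pdd"
    modified = True
except TypeError:
    modified = False

print(occurrences, first_index, modified)
```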
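Exercise
Program: a shopping cart
Requirements:
- After the program starts, ask the user for their salary, then print the product list
- Let the user buy products by product number
- After the user picks a product, check whether the balance is sufficient; if so, deduct the price, otherwise show a warning
- The user can quit at any time; on exit, print the purchased products and the remaining balance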
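One possible shape for this exercise, as a sketch rather than a reference solution. The product list, the function names buy and run, and the 'q' quit key are all assumptions made here; to keep the example runnable without a keyboard, the choices are passed in as a list where a real program would call input() in a loop.

```python
# Hypothetical product list: (name, price) pairs.
products = [("keyboard", 300), ("mouse", 120), ("headset", 450)]

def buy(balance, cart, index):
    """Try to buy products[index]; return the new balance."""
    name, price = products[index]
    if price <= balance:
        cart.append(name)
        return balance - price
    print("Not enough balance for %s (short by %d)" % (name, price - balance))
    return balance

def run(salary, choices):
    """Process a sequence of inputs; 'q' quits. Returns (cart, balance)."""
    balance, cart = salary, []
    for number, (name, price) in enumerate(products):
        print(number, name, price)
    for choice in choices:
        if choice == "q":
            break
        if choice.isdigit() and int(choice) < len(products):
            balance = buy(balance, cart, int(choice))
        else:
            print("Invalid choice:", choice)
    print("Purchased:", cart, "Balance:", balance)
    return cart, balance

cart, balance = run(500, ["0", "2", "1", "q"])
```

In an interactive version, the salary would come from input() as well and the loop would read one choice per iteration.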
III. String Operations
Property: strings are immutable
name = "Insec Jungle"
name.capitalize()         # capitalize the first letter and lowercase the rest
name.casefold()           # convert all uppercase letters to lowercase
name.center(50, "-")      # '-------------------Insec Jungle-------------------'
name.count('Insec')       # count the occurrences of 'Insec'
name.encode()             # encode the string to bytes
name.endswith("Jungle")   # whether the string ends with 'Jungle'
"Insec\tJungle".expandtabs(10)   # 'Insec     Jungle'; sets how many columns each \t expands to
name.find('A')            # index of 'A' if found, -1 if not found

format:
>>> msg = "my name is {}, and age is {}"
>>> msg.format("Insec", 20)
'my name is Insec, and age is 20'
>>> msg = "my name is {1}, and age is {0}"
>>> msg.format("Insec", 20)
'my name is 20, and age is Insec'
>>> msg = "my name is {name}, and age is {age}"
>>> msg.format(age=20, name="Insec")
'my name is Insec, and age is 20'

format_map:
>>> msg.format_map({'name': 'Insec', 'age': 20})
'my name is Insec, and age is 20'

msg.index('a')            # index of the first 'a' in the string; raises ValueError if absent
'9aA'.isalnum()           # True
'9'.isdigit()             # whether every character is a digit
name.isnumeric()
name.isprintable()
name.isspace()
name.istitle()
name.isupper()
"|".join(['insec', 'faker', 'illusion'])   # 'insec|faker|illusion'

maketrans:
>>> intab = "aeiou"   # the characters to replace
>>> outtab = "12345"  # the characters they map to
>>> trantab = str.maketrans(intab, outtab)
>>> s = "this is string example....wow!!!"
>>> s.translate(trantab)
'th3s 3s str3ng 2x1mpl2....w4w!!!'

>>> msg.partition('is')
('my name ', 'is', ' {name}, and age is {age}')
>>> "insec C, LCK name is Crs".replace("C", "LI", 1)
'insec LI, LCK name is Crs'
msg.swapcase()            # swap upper and lower case
>>> msg.zfill(40)
'00000my name is {name}, and age is {age}'
>>> n4 = "Hello 2orld"
>>> n4.ljust(40, "-")
'Hello 2orld-----------------------------'
>>> n4.rjust(40, "-")
'-----------------------------Hello 2orld'
>>> b = "ddefdsdff_有意义"
>>> b.isidentifier()  # whether the string could be used as an identifier, i.e. follows variable naming rules
True
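IV. Dictionary Operations
A dictionary is a key-value data type. It works like the paper dictionaries we used at school, where a stroke count or a letter leads you to the page with the full entry.
info = {
    'stu1101': "Insec C",
    'stu1102': "Faker L",
    'stu1103': "looper Z",
}
Properties of a dictionary:
- A dict is unordered in older versions; since Python 3.7 it preserves insertion order
- Keys must be unique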
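The two properties can be checked directly. This sketch assumes Python 3.7 or later for the ordering part; the extra key 'stu1100' is invented for the demonstration.

```python
info = {"stu1101": "Insec C", "stu1102": "Faker L"}

# Assigning to an existing key overwrites its value instead of adding a duplicate.
info["stu1101"] = "Looper Z"

# A key repeated inside a literal keeps only the last value.
dup = {"a": 1, "a": 2}

# Since Python 3.7, iteration follows insertion order.
info["stu1100"] = "Pray"
keys_in_order = list(info)

print(info, dup, keys_in_order)
```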
Adding
>>> info
{'st1': 'Insec C', 'st2': 'Faker L', 'st3': 'illusion Z'}
>>> info["st4"] = "Shy J"
>>> info  # display order varies in older Python versions
{'st4': 'Shy J', 'st1': 'Insec C', 'st2': 'Faker L', 'st3': 'illusion Z'}
Modifying
>>> info
{'st4': 'Shy J', 'st1': 'Insec C', 'st2': 'Faker L', 'st3': 'illusion Z'}
>>> info['st4'] = "Looper Z"
>>> info
{'st4': 'Looper Z', 'st1': 'Insec C', 'st2': 'Faker L', 'st3': 'illusion Z'}
Deleting
>>> info
{'st4': 'Looper Z', 'st1': 'Insec C', 'st2': 'Faker L', 'st3': 'illusion Z'}
>>> info.pop('st3')  # remove the given key and return its value
'illusion Z'
>>> info
{'st4': 'Looper Z', 'st1': 'Insec C', 'st2': 'Faker L'}
>>> del info['st2']  # second way to remove a given key
>>> info
{'st4': 'Looper Z', 'st1': 'Insec C'}
>>> info.popitem()  # remove and return an arbitrary item (the most recently inserted one since Python 3.7)
('st4', 'Looper Z')
>>> info
{'st1': 'Insec C'}
Looking up
>>> info
{'st4': 'Illusion Z', 'st1': 'Faker L', 'st2': 'Looper Z', 'st3': 'Insec C'}
>>> "st1" in info  # check whether info has the key st1
True
>>> info.get("st1")  # get the value; returns None if the key does not exist
'Faker L'
>>> info['st1']  # get the value; raises an error if the key does not exist
'Faker L'
>>> info['st11']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'st11'
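Nested dictionaries and how to work with them
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import json

loler = {
    "Korea": {
        "Faker": ["李相赫, Mid", "The greatest LoL player of all time, bar none"],
        "Insec": ["崔仁石, Jungle", "Magician of the jungle"],
        "Looper": ["张亨硕, Top", "The man as steady as a turret"],
        "Madlife": ["洪珉绮, Support", "The player known as the 'God of Support'"],
        "Pray": ["金钟仁, Adc", "Better with age, a rare talent"]
    },
    "EU/NA": {
        "Dyrus": ["TSirDiesAlot, Top", "Old soldiers never die, they just fade away"],
        "Xpeke": ["Enrique, Mid", "Europe's king of mages"],
        "Alexey": ["Alex Ich, Mid", "Tactics innovator"]
    },
    "China": {
        "IG": ["可见影子, Jungle", "Humble, hard-working people never do too badly"]
    }
}

# Reach into the nested structure one key (or index) at a time.
loler["China"]["IG"][1] += ", Go ahead"
# Python 3: json.dumps no longer takes an encoding argument.
print(json.dumps(loler["China"], ensure_ascii=False))
# output
# {"IG": ["可见影子, Jungle", "Humble, hard-working people never do too badly, Go ahead"]}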
Other operations
# values
>>> info.values()
dict_values(['Faker', 'Insec'])

# keys
>>> info.keys()
dict_keys(['st1', 'st2'])

# setdefault: insert the key only if it is missing, then return its value
>>> info.setdefault("st3", "Shy")
'Shy'
>>> info
{'st1': 'Faker', 'st2': 'Insec', 'st3': 'Shy'}
>>> info.setdefault("st1", "Li")
'Faker'
>>> info
{'st1': 'Faker', 'st2': 'Insec', 'st3': 'Shy'}

# update: merge another dict in, overwriting existing keys
>>> info
{'st1': 'Faker', 'st2': 'Insec', 'st3': 'Shy'}
>>> i = {"st4": "Madlife", "st5": "Pray", "st1": "Dade"}
>>> info.update(i)
>>> info
{'st4': 'Madlife', 'st5': 'Pray', 'st1': 'Dade', 'st2': 'Insec', 'st3': 'Shy'}

# items
>>> info.items()
dict_items([('st4', 'Madlife'), ('st5', 'Pray'), ('st1', 'Dade'), ('st2', 'Insec'), ('st3', 'Shy')])

# Build a dict with a default value from a list of keys.
# Pitfall: every key shares the same value object, which matters when the value is mutable.
>>> dict.fromkeys([1, 2, 3], 'testd')
{1: 'testd', 2: 'testd', 3: 'testd'}
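Looping over a dict
# Method 1
for key in info:
    print(key, info[key])

# Method 2
# In Python 2, items() first converts the dict to a list, so avoid it on large dicts;
# in Python 3 it returns a view and no list is built.
for k, v in info.items():
    print(k, v)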
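Exercise
Program: a three-level menu
Requirements:
- Print a three-level menu of province, city and county
- Allow going back to the previous level
- Allow quitting the program at any time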
Sample code
#!/usr/bin/env python
# -*- coding:utf-8 -*-
menu = {
    'Beijing': {
        'Haidian': {
            'Wudaokou': {
                'soho': {},
                'NetEase': {},
                'google': {}
            },
            'Zhongguancun': {
                'iQiyi': {},
                'Autohome': {},
                'youku': {},
            },
            'Shangdi': {
                'Baidu': {},
            },
        },
        'Liangmaqiao': {
            'Consulate': {
                'IMORA': {},
                'Langang': {},
            },
            'Tiantongyuan': {},
            'Huilongguan': {},
        },
        'Chaoyang': {},
        'Dongcheng': {},
    },
    'Shanghai': {
        'Minhang': {
            "Square": {
                'Flower shop': {}
            }
        },
        'Zhabei': {
            'Railway station': {
                'Ctrip': {}
            }
        },
        'Pudong': {},
    },
    'Shandong': {},
}

exit_flag = False
current_layer = menu
layers = []  # stack of parent levels, used to go back

while not exit_flag:
    for k in current_layer:
        print(k)
    choice = input(">>:").strip()
    if choice == "q":  # quit at any time
        exit_flag = True
    elif choice == "b":  # go back one level; ignored at the top level
        if layers:
            current_layer = layers.pop()
    elif choice not in current_layer:
        continue
    else:
        layers.append(current_layer)
        current_layer = current_layer[choice]
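V. Set Operations
A set is an unordered collection of unique items. Its main uses are:
- Deduplication: turning a list into a set removes duplicates automatically
- Relationship testing: checking the intersection, difference, union and so on between two groups of data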
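A short sketch of both uses. The sample names are taken from the list section of these notes; dict.fromkeys is shown as one order-preserving alternative (Python 3.7+), since a set does not keep the original order.

```python
names = ["faker", "insec", "faker", "pdd", "insec"]

unique = set(names)                           # duplicates collapse; order is not kept
unique_ordered = list(dict.fromkeys(names))   # keeps first-seen order

team_a = {"faker", "insec", "pdd"}
team_b = {"insec", "kid"}
both = team_a & team_b      # intersection
only_a = team_a - team_b    # difference
either = team_a | team_b    # union

print(unique_ordered, both, only_a, either)
```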
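Common operations
s = set([3, 5, 9, 10])   # create a set of numbers
t = set("Hello")         # create a set of unique characters

a = t | s   # union of t and s
b = t & s   # intersection of t and s
c = t - s   # difference (items in t but not in s)
d = t ^ s   # symmetric difference (items in t or s, but not in both)

Basic operations:
t.add('x')               # add one item
s.update([10, 37, 42])   # add several items to s
t.remove('H')            # remove() deletes one item

len(s)                               # number of items in the set
x in s                               # test whether x is a member of s
x not in s                           # test whether x is not a member of s
s.issubset(t)              s <= t    # test whether every element of s is in t
s.issuperset(t)            s >= t    # test whether every element of t is in s
s.union(t)                 s | t     # new set with every element of s and t
s.intersection(t)          s & t     # new set with the elements common to s and t
s.difference(t)            s - t     # new set with the elements in s but not in t
s.symmetric_difference(t)  s ^ t     # new set with the elements in exactly one of s and t
s.copy()                             # shallow copy of s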
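VI. File Operations
The workflow for working with a file
- Open the file, get a file handle and assign it to a variable
- Operate on the file through the handle
- Close the file
Given a file poetry.txt with the following content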
Somehow, it seems the love I knew was always the most destructive kind 不知为何,我经历的爱情总是最具毁灭性的的那种 Yesterday when I was young 昨日当我年少轻狂 The taste of life was sweet 生命的滋味是甜的 As rain upon my tongue 就如舌尖上的雨露 I teased at life as if it were a foolish game 我戏弄生命 视其为愚蠢的游戏 The way the evening breeze 就如夜晚的微风 May tease the candle flame 逗弄蜡烛的火苗 The thousand dreams I dreamed 我曾千万次梦见 The splendid things I planned 那些我计划的绚丽蓝图 I always built to last on weak and shifting sand 但我总是将之建筑在易逝的流沙上 I lived by night and shunned the naked light of day 我夜夜笙歌 逃避白昼赤裸的阳光 And only now I see how the time ran away 事到如今我才看清岁月是如何匆匆流逝 Yesterday when I was young 昨日当我年少轻狂 So many lovely songs were waiting to be sung 有那么多甜美的曲儿等我歌唱 So many wild pleasures lay in store for me 有那么多肆意的快乐等我享受 And so much pain my eyes refused to see 还有那么多痛苦 我的双眼却视而不见 I ran so fast that time and youth at last ran out 我飞快地奔走 最终时光与青春消逝殆尽 I never stopped to think what life was all about 我从未停下脚步去思考生命的意义 And every conversation that I can now recall 如今回想起的所有对话 Concerned itself with me and nothing else at all 除了和我相关的 什么都记不得了 The game of love I played with arrogance and pride 我用自负和傲慢玩着爱情的游戏 And every flame I lit too quickly, quickly died 所有我点燃的火焰都熄灭得太快 The friends I made all somehow seemed to slip away 所有我交的朋友似乎都不知不觉地离开了 And only now I'm left alone to end the play, yeah 只剩我一个人在台上来结束这场闹剧 Oh, yesterday when I was young 噢 昨日当我年少轻狂 So many, many songs were waiting to be sung 有那么那么多甜美的曲儿等我歌唱 So many wild pleasures lay in store for me 有那么多肆意的快乐等我享受 And so much pain my eyes refused to see 还有那么多痛苦 我的双眼却视而不见 There are so many songs in me that won't be sung 我有太多歌曲永远不会被唱起 I feel the bitter taste of tears upon my tongue 我尝到了舌尖泪水的苦涩滋味 The time has come for me to pay for yesterday 终于到了付出代价的时间 为了昨日 When I was young 当我年少轻狂
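Basic operations
f = open('poetry.txt', encoding='utf-8')  # open the file (it contains Chinese text, so state the encoding)
first_line = f.readline()  # read one line
print('first line:', first_line)
print('divider'.center(50, '-'))
data = f.read()  # read everything that is left; avoid this on large files
print(data)  # print the rest of the file
f.close()  # close the file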
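The modes for opening a file are:
- r, read-only mode (the default).
- w, write-only mode. [not readable; creates the file if it does not exist; truncates it if it does]
- a, append mode. [not readable; creates the file if it does not exist; only appends if it does]
"+" means the file can be read and written at the same time
- r+, read and write. [readable; writable; appendable; the file must already exist]
- w+, write and read. [truncates the file first]
- a+, like a, but also readable
"U" means that \r, \n and \r\n are converted to \n automatically when reading (used together with r or r+). It is a Python 2 feature: Python 3 text mode already does this by default, and the U flag was removed in Python 3.11.
- rU
- r+U
"b" means binary mode (for example when sending or uploading an ISO image over FTP). In Python 2 it could be ignored on Linux but was required on Windows for binary files; in Python 3 it always matters, because reads return bytes instead of str.
- rb
- wb
- ab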
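The modes above can be tried out on a scratch file. This is a small sketch for Python 3; the file name demo.txt and the temporary directory are assumptions made so the example does not touch any real file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w", encoding="utf-8") as f:    # w: create or truncate
    f.write("line 1\n")

with open(path, "a", encoding="utf-8") as f:    # a: append at the end
    f.write("line 2\n")

with open(path, "r", encoding="utf-8") as f:    # r: text mode returns str
    text = f.read()

with open(path, "rb") as f:                     # rb: binary mode returns bytes
    raw = f.read()

with open(path, "r+", encoding="utf-8") as f:   # r+: read and write, no truncation
    f.write("LINE")                             # overwrites the first 4 characters
    f.seek(0)
    updated = f.read()

print(repr(text), type(raw), repr(updated))
```

Note that r+ starts writing at the beginning of the file, so it overwrites existing content in place rather than inserting.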
Other methods
def close(self):  # real signature unknown; restored from __doc__
    """
    Close the file.

    A closed file cannot be used for further I/O operations.  close() may be
    called more than once without error.
    """
    pass

def fileno(self, *args, **kwargs):  # real signature unknown
    """ Return the underlying file descriptor (an integer). """
    pass

def isatty(self, *args, **kwargs):  # real signature unknown
    """ True if the file is connected to a TTY device. """
    pass

def read(self, size=-1):  # known case of _io.FileIO.read
    """
    Note: this may not read everything back.

    Read at most size bytes, returned as bytes.

    Only makes one system call, so less data may be returned than requested.
    In non-blocking mode, returns None if no data is available.
    Return an empty bytes object at EOF.
    """
    return ""

def readable(self, *args, **kwargs):  # real signature unknown
    """ True if file was opened in a read mode. """
    pass

def readall(self, *args, **kwargs):  # real signature unknown
    """
    Read all data from the file, returned as bytes.

    In non-blocking mode, returns as much as is immediately available,
    or None if no data is available.  Return an empty bytes object at EOF.
    """
    pass

def readinto(self):  # real signature unknown; restored from __doc__
    """ Same as RawIOBase.readinto(). """
    pass  # rarely needed: reads bytes into a pre-allocated buffer

def seek(self, *args, **kwargs):  # real signature unknown
    """
    Move to new file position and return the file position.

    Argument offset is a byte count.  Optional argument whence defaults to
    SEEK_SET or 0 (offset from start of file, offset should be >= 0); other values
    are SEEK_CUR or 1 (move relative to current position, positive or negative),
    and SEEK_END or 2 (move relative to end of file, usually negative, although
    many platforms allow seeking beyond the end of a file).

    Note that not all file objects are seekable.
    """
    pass

def seekable(self, *args, **kwargs):  # real signature unknown
    """ True if file supports random-access. """
    pass

def tell(self, *args, **kwargs):  # real signature unknown
    """
    Current file position.

    Can raise OSError for non seekable files.
    """
    pass

def truncate(self, *args, **kwargs):  # real signature unknown
    """
    Truncate the file to at most size bytes and return the truncated size.

    Size defaults to the current file position, as returned by tell().
    The current file position is changed to the value of size.
    """
    pass

def writable(self, *args, **kwargs):  # real signature unknown
    """ True if file was opened in a write mode. """
    pass

def write(self, *args, **kwargs):  # real signature unknown
    """
    Write bytes b to file, return number written.

    Only makes one system call, so not all of the data may be written.
    The number of bytes actually written is returned.  In non-blocking mode,
    returns None if the write would block.
    """
    pass
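The with statement
To avoid forgetting to close a file after opening it, let a context manager handle it:
with open('log', 'r') as f:
    ...
When the with block finishes, the file is closed and its resources are released automatically.
Since Python 2.7, with can also manage the contexts of several files at once:
with open('log1') as obj1, open('log2') as obj2:
    pass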
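A small check of the behaviour described above, written as a sketch: the scratch file and the variable names are assumptions, and the deliberate ValueError only stands in for any error raised inside the block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log")

with open(path, "w") as f:
    f.write("hello\n")
closed_after_block = f.closed          # closed on leaving the block

# The file is closed even when the block raises.
try:
    with open(path) as g:
        raise ValueError("something went wrong")
except ValueError:
    pass
closed_after_error = g.closed

# Two context managers in one statement: copy one file to another.
copy_path = path + ".copy"
with open(path) as src, open(copy_path, "w") as dst:
    for line in src:
        dst.write(line)

with open(copy_path) as f:
    copied = f.read()

print(closed_after_block, closed_after_error, repr(copied))
```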
Exercises
Program 1: implement a simple version of the shell's sed replace feature
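One way to approach Program 1, as a sketch under assumptions: plain substring replacement rather than sed's regular expressions, a hypothetical helper named replace_in_file, and a scratch file so nothing real is modified. It writes to a temporary copy line by line and then swaps the copy into place, which avoids loading a large file into memory.

```python
import os
import tempfile

def replace_in_file(path, old, new, encoding="utf-8"):
    """Replace every occurrence of old with new; return the number of lines changed."""
    tmp_path = path + ".tmp"
    changed = 0
    with open(path, encoding=encoding) as src, \
         open(tmp_path, "w", encoding=encoding) as dst:
        for line in src:
            if old in line:
                line = line.replace(old, new)
                changed += 1
            dst.write(line)
    os.replace(tmp_path, path)   # swap the rewritten copy into place
    return changed

demo = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo, "w", encoding="utf-8") as f:
    f.write("foo bar\nbar baz\nqux\n")

changed = replace_in_file(demo, "bar", "BAR")
with open(demo, encoding="utf-8") as f:
    result = f.read()

print(changed, repr(result))
```

A command-line version could take the file name and the two strings from sys.argv.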
Program 2: modify an haproxy configuration file
Requirements:
1. Query
    Input: www
    Return all records under that backend
2. Add
    Input:
        arg = {
            'backend': 'www',
            'record': {
                'server': '10.1.1.1',
                'weight': 22,
                'maxconn': 30
            }
        }
3. Delete
    Input:
        arg = {
            'backend': 'www',
            'record': {
                'server': '100.1.1.1',
                'weight': 22,
                'maxconn': 30
            }
        }
Original configuration file
global
        log 127.0.0.1 local2
        daemon
        maxconn 256
        log 127.0.0.1 local2 info
defaults
        log global
        mode http
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms
        option dontlognull

listen stats :8888
        stats enable
        stats uri /admin
        stats auth admin:1234

frontend org
        bind 0.0.0.0:80
        option httplog
        option httpclose
        option forwardfor
        log global
        acl www hdr_reg(host) -i www.m.com
        use_backend www.m.com if www

backend www.m.com
        server 100.1.7.9 100.1.7.9 weight 20 maxconn 3000
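VII. Character Encoding and Transcoding
Further reading:
http://www.cnblogs.com/yuanchenqi/articles/5956943.html
http://www.diveintopython3.net/strings.html
Things to know:
1. In Python 2 the default encoding is ASCII; in Python 3 strings are Unicode and the default source encoding is UTF-8
2. Unicode has several encodings: UTF-32 (4 bytes per character), UTF-16 (2 or 4 bytes) and UTF-8 (1 to 4 bytes). UTF-16 is common for in-memory text, but files are usually stored as UTF-8 because it saves space
3. In Python 3, encode converts a string to the bytes type as it encodes, and decode converts bytes back to a string as it decodes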
The figure above applies only to Python 2. A Python 2 example:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import sys
print(sys.getdefaultencoding())

msg = "我爱北京"
# Python 2: a str holds bytes, so decode to unicode first, then encode to the target
msg_gb2312 = msg.decode("utf-8").encode("gb2312")
gb2312_to_gbk = msg_gb2312.decode("gbk").encode("gbk")

print(msg)
print(msg_gb2312)
print(gb2312_to_gbk)

In Python 3:
# -*- coding:gb2312 -*-   # this line can also be removed
import sys
print(sys.getdefaultencoding())

msg = "我爱北京"
# msg_gb2312 = msg.decode("utf-8").encode("gb2312")
msg_gb2312 = msg.encode("gb2312")  # str is already Unicode, so no decode is needed
gb2312_to_unicode = msg_gb2312.decode("gb2312")
gb2312_to_utf8 = msg_gb2312.decode("gb2312").encode("utf-8")

print(msg)
print(msg_gb2312)
print(gb2312_to_unicode)
print(gb2312_to_utf8)
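The Python 3 rule from point 3 can be summarised as: str is the hub, and every transcoding goes bytes -> str -> bytes. A minimal sketch with the same sample string (variable names are illustrative):

```python
msg = "我爱北京"

utf8_bytes = msg.encode("utf-8")      # str -> bytes
gbk_bytes = msg.encode("gbk")         # a different byte sequence for the same text

# Transcoding UTF-8 bytes to GBK always goes through str (Unicode).
transcoded = utf8_bytes.decode("utf-8").encode("gbk")

# Decoding with the wrong codec fails for this input.
try:
    gbk_bytes.decode("utf-8")
    wrong_codec_ok = True
except UnicodeDecodeError:
    wrong_codec_ok = False

print(len(utf8_bytes), len(gbk_bytes), wrong_codec_ok)
```

The byte counts differ because UTF-8 uses three bytes for each of these characters while GBK uses two.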