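Encoding and decoding in Python 2
In Python 2, the default string type (str) is a byte string, and the default codec is ASCII. To convert data from one encoding to another, first decode the bytes into a unicode object, then encode that object into the target encoding (gbk, utf-8, gb18030, gb2312). The terminal then decodes those bytes with its own encoding to display them on screen.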
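The decode-then-encode chain described above can be sketched as follows. This is a minimal illustration written in Python 3 so that it runs on current interpreters; a bytes object stands in for a Python 2 str, and the helper name transcode is hypothetical, not part of any library.

```python
# Python 3 sketch: `bytes` plays the role of a Python 2 byte string.
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode `data` from `source` to `target`, going through Unicode."""
    text = data.decode(source)   # bytes -> Unicode text
    return text.encode(target)   # Unicode text -> bytes in the target encoding


gbk_bytes = b'\xc4\xe3\xba\xc3'  # the two characters "你好" encoded as GBK
utf8_bytes = transcode(gbk_bytes, 'gbk', 'utf-8')
print(utf8_bytes)                # b'\xe4\xbd\xa0\xe5\xa5\xbd'
```

The intermediate Unicode step is what makes the conversion work: the two byte encodings are never mapped to each other directly.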
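Encoding and decoding in Python 3
In Python 3, the default string type (str) is Unicode, so the data can be encoded directly into the target encoding (gbk, utf-8, gb18030, gb2312). The terminal then decodes those bytes to display them on screen.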
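As a rough illustration of the Python 3 case, the sketch below encodes one string into each of the encodings mentioned and decodes it back. The sample text and the name encoded_forms are chosen only for this example.

```python
text = '中文'  # a Python 3 str is already Unicode, so no decode step is needed

# Encode the same text into each target encoding (str -> bytes).
encoded_forms = {name: text.encode(name)
                 for name in ('gbk', 'gb2312', 'gb18030', 'utf-8')}

for name, data in encoded_forms.items():
    # Decoding with the matching codec restores the original text.
    print(name, data, data.decode(name) == text)
```

Decoding with a codec other than the one used for encoding is what produces garbled output or a UnicodeDecodeError.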
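base64
Base64 is an encoding, not encryption: as the saying goes, it "keeps out gentlemen but not villains", because anyone can reverse it without a key. It is widely used in MIME as a transfer encoding for e-mail. The output consists only of ASCII characters and ends with zero, one, or two "=" padding characters.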
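The padding and reversibility properties can be checked with a few short inputs. This is a small sketch using only the standard base64 module; the sample inputs are arbitrary.

```python
import base64

# Input lengths of 1, 2 and 3 bytes give two, one and zero "=" characters.
for raw in (b'a', b'ab', b'abc'):
    encoded = base64.b64encode(raw)
    print(raw, '->', encoded)
# b'a' -> b'YQ=='
# b'ab' -> b'YWI='
# b'abc' -> b'YWJj'

# The output is plain ASCII, and decoding needs no key at all.
token = base64.b64encode(b'secret')
print(token.isascii(), base64.b64decode(token))
```

Because decoding requires nothing but the encoded text, Base64 should not be relied on to protect sensitive data.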
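Base64 operates on bytes, so it is somewhat easier to use in Python 2, where an ordinary string is already a byte string. In Python 3 a string is Unicode and has to be converted to bytes first, which takes an extra step. The examples below show both: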
- Python2
    def b64encode(s, altchars=None):
        """Encode a string using Base64.

        s is the string to encode.  Optional altchars must be a string of at least
        length 2 (additional characters are ignored) which specifies an
        alternative alphabet for the '+' and '/' characters.  This allows an
        application to e.g. generate url or filesystem safe Base64 strings.

        The encoded string is returned.
        """
        # Strip off the trailing newline
        encoded = binascii.b2a_base64(s)[:-1]
        if altchars is not None:
            return encoded.translate(string.maketrans(b'+/', altchars[:2]))
        return encoded


    def b64decode(s, altchars=None):
        """Decode a Base64 encoded string.

        s is the string to decode.  Optional altchars must be a string of at least
        length 2 (additional characters are ignored) which specifies the
        alternative alphabet used instead of the '+' and '/' characters.

        The decoded string is returned.  A TypeError is raised if s is
        incorrectly padded.  Characters that are neither in the normal base-64
        alphabet nor the alternative alphabet are discarded prior to the padding
        check.
        """
        if altchars is not None:
            s = s.translate(string.maketrans(altchars[:2], '+/'))
        try:
            return binascii.a2b_base64(s)
        except binascii.Error, msg:
            # Transform this exception for consistency
            raise TypeError(msg)
Here s is a str object, which in Python 2 is a byte string.
    import base64

    s = 'Hello, python'
    # A Python 2 str is already a byte string, so it can be encoded directly.
    b = base64.b64encode(s)
    print 'b is:', b

    c = base64.b64decode(b)
    print 'c is:', c

    # output
    # b is: SGVsbG8sIHB5dGhvbg==
    # c is: Hello, python
- Python3
    def b64encode(s, altchars=None):
        """Encode the bytes-like object s using Base64 and return a bytes object.

        Optional altchars should be a byte string of length 2 which specifies an
        alternative alphabet for the '+' and '/' characters.  This allows an
        application to e.g. generate url or filesystem safe Base64 strings.
        """
        encoded = binascii.b2a_base64(s, newline=False)
        if altchars is not None:
            assert len(altchars) == 2, repr(altchars)
            return encoded.translate(bytes.maketrans(b'+/', altchars))
        return encoded


    def b64decode(s, altchars=None, validate=False):
        """Decode the Base64 encoded bytes-like object or ASCII string s.

        Optional altchars must be a bytes-like object or ASCII string of length 2
        which specifies the alternative alphabet used instead of the '+' and '/'
        characters.

        The result is returned as a bytes object.  A binascii.Error is raised if
        s is incorrectly padded.

        If validate is False (the default), characters that are neither in the
        normal base-64 alphabet nor the alternative alphabet are discarded prior
        to the padding check.  If validate is True, these non-alphabet characters
        in the input result in a binascii.Error.
        """
        s = _bytes_from_decode_data(s)
        if altchars is not None:
            altchars = _bytes_from_decode_data(altchars)
            assert len(altchars) == 2, repr(altchars)
            s = s.translate(bytes.maketrans(altchars, b'+/'))
        if validate and not re.match(b'^[A-Za-z0-9+/]*={0,2}$', s):
            raise binascii.Error('Non-base64 digit found')
        return binascii.a2b_base64(s)
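Here s is a bytes object, so a string must first be encoded with encode(). The results of b64encode and b64decode are also bytes objects, so turning them into a Unicode string requires a further call to decode(). (As its docstring says, b64decode also accepts an ASCII string as input.)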
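The type behaviour described above can be observed directly. This is a small sketch for Python 3; the variable names are only illustrative.

```python
import base64

text = 'Hello, Python!'

# Passing a str to b64encode fails, because a str is not bytes-like.
rejected = False
try:
    base64.b64encode(text)
except TypeError as exc:
    rejected = True
    print('b64encode rejected str:', exc)

# Encoding the str to bytes first works, and the result is bytes again.
encoded = base64.b64encode(text.encode('utf-8'))
print(type(encoded).__name__, encoded)

# b64decode accepts bytes or an ASCII-only str, and always returns bytes.
decoded = base64.b64decode(encoded.decode('ascii'))
print(type(decoded).__name__, decoded.decode('utf-8'))
```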
    import base64

    s = 'Hello, Python!'
    # str -> UTF-8 bytes -> Base64 bytes -> str
    b = base64.b64encode(s.encode('utf-8')).decode('utf-8')
    print(b)

    # str -> bytes -> decoded bytes -> str
    c = base64.b64decode(b.encode('utf-8')).decode('utf-8')
    print(c)

    # output
    # SGVsbG8sIFB5dGhvbiE=
    # Hello, Python!
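The altchars and validate parameters that appear in the Python 3 source quoted earlier are not used in the example above. The sketch below shows one way they behave; the sample bytes are chosen only because they happen to encode to '+' and '/' characters.

```python
import base64
import binascii

raw = b'\xfb\xff\xfe'

standard = base64.b64encode(raw)                  # b'+//+'
url_safe = base64.b64encode(raw, altchars=b'-_')  # b'-__-'
print(standard, url_safe)

# Decoding needs the same alternative alphabet to recover the input.
restored = base64.b64decode(url_safe, altchars=b'-_')
print(restored == raw)

# By default, characters outside the alphabet (here a space) are discarded.
lenient = base64.b64decode(b'SGVs bG8=')
print(lenient)                                    # b'Hello'

# With validate=True the same input raises binascii.Error instead.
strict_failed = False
try:
    base64.b64decode(b'SGVs bG8=', validate=True)
except binascii.Error:
    strict_failed = True
print(strict_failed)
```

Passing validate=True is a reasonable choice when the input comes from an untrusted source and silently dropped characters would hide a problem.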