Python: Encoding

Note: everything here applies to Python 3.

Encoding

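Python's common built-in data types are: int, str, bytes, bool, list, tuple, dict. Of these, str and bytes are the two that matter for text encoding.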

bytes, literal form: s = b'alex'. Internally it stores the raw bytes produced by a specific encoding (UTF-8, GBK, GB2312, ...).
str, literal form: s = 'alex'. Internally it stores Unicode code points.
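
The difference between the two types can be checked interactively. The following is a minimal sketch assuming CPython 3; the variable names are illustrative only.

```python
# str holds Unicode code points; bytes holds raw byte values.
text = 'alex'
raw = b'alex'

print(type(text))   # <class 'str'>
print(type(raw))    # <class 'bytes'>

# Indexing a str gives a one-character str; indexing bytes gives an int (0-255).
first_char = text[0]
first_byte = raw[0]
print(first_char, first_byte)

# The two types never compare equal, even when they look alike.
same = (text == raw)
print(same)

# A bytes literal may only contain ASCII characters, so non-ASCII text
# has to be encoded explicitly to become bytes.
encoded = '晓梅'.encode('utf-8')
print(encoded)
```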

s1 = '晓梅'
b11 = s1.encode('utf-8')
print(b11)
b'\xe6\x99\x93\xe6\xa2\x85'
Literal form: b'\xe6\x99\x93\xe6\xa2\x85' (UTF-8: three bytes per character, six bytes in total)
11100110 10011001 10010011 11100110 10100010 10000101
s2 = '晓梅'
b22 = s2.encode('gbk')
print(b22)
b'\xcf\xfe\xc3\xb7'
Literal form: b'\xcf\xfe\xc3\xb7' (GBK: two bytes per character, four bytes in total)
11001111 11111110 11000011 10110111
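
To see the actual bit pattern behind an encoded string, each byte can be formatted as binary. This is a small sketch; `to_bits` is a hypothetical helper introduced here for illustration, not part of the standard library.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as an 8-digit binary string, separated by spaces."""
    return ' '.join(f'{byte:08b}' for byte in data)


name = '晓梅'
utf8_bytes = name.encode('utf-8')
gbk_bytes = name.encode('gbk')

# The same two characters take a different number of bytes per encoding.
print(len(utf8_bytes), to_bits(utf8_bytes))
print(len(gbk_bytes), to_bits(gbk_bytes))
```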

py3:
unicode A : 00000000 00000000 00000000 01000001 four bytes (shown as fixed-width UTF-32)
中 : 00000000 00000000 01001110 00101101 four bytes

utf-8 A : 01000001 one byte
European, e.g. é : 11000011 10101001 two bytes
East Asian, e.g. 中 : 11100100 10111000 10101101 three bytes
中国 : 11100100 10111000 10101101 11100101 10011011 10111101 six bytes

gbk A : 01000001 one byte (ASCII characters stay single-byte)
中 : 11010110 11010000 two bytes
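
The per-character byte counts can be checked directly. This is a rough sketch: `encoded_size` is a hypothetical helper, and `utf-32-be` is used as a stand-in for "fixed-width Unicode" because it adds no byte-order mark. Note that an ASCII character such as 'A' takes a single byte in GBK as well as in UTF-8.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


for ch in ('A', '中'):
    for enc in ('utf-32-be', 'utf-8', 'gbk'):
        print(repr(ch), enc, encoded_size(ch, enc))

# A Latin letter with a diacritic needs two bytes in UTF-8.
print(repr('é'), 'utf-8', encoded_size('é', 'utf-8'))
```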
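Different encodings cannot read each other's bytes: decoding with the wrong encoding either raises an error or produces garbled text (mojibake).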
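
A short sketch of what goes wrong when the encodings do not match, using the same sample string as elsewhere in this post:

```python
original = '晓梅'
gbk_bytes = original.encode('gbk')

# Decoding with the matching codec recovers the text.
round_trip = gbk_bytes.decode('gbk')

# Decoding GBK bytes as UTF-8 fails here, because this byte sequence
# is not valid UTF-8.
try:
    gbk_bytes.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True

# A lenient decode suppresses the exception but yields U+FFFD
# replacement characters instead of the original text.
garbled = gbk_bytes.decode('utf-8', errors='replace')

print(round_trip, failed, repr(garbled))
```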

Data that is stored or transmitted is always encoded bytes: UTF-8, GBK, GB2312, or some other encoding, never the in-memory Unicode str itself.
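
For storage, this means a str is encoded on the way out and decoded on the way back in. The sketch below writes to a temporary file (the file name `name.txt` is arbitrary) and reads the raw bytes back to show what actually lands on disk.

```python
import os
import tempfile

text = '晓梅'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'name.txt')

    # Text mode with an explicit encoding: str is encoded to UTF-8 on write.
    with open(path, 'w', encoding='utf-8') as fh:
        fh.write(text)

    # Binary mode shows the bytes that were actually stored.
    with open(path, 'rb') as fh:
        on_disk = fh.read()

    # Reading in text mode with the same encoding restores the str.
    with open(path, 'r', encoding='utf-8') as fh:
        read_back = fh.read()

print(on_disk)
print(read_back)
```

Passing `encoding=` explicitly avoids depending on the platform default, which differs between systems.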

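Converting a Unicode str to another encoding is done with encode(), which returns bytes; the reverse direction, bytes back to str, uses decode().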

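s = 'Q'
# An ASCII character encodes to the same single byte in UTF-8 and GBK.
b1 = s.encode('utf-8')
print(b1)  # b'Q'

b2 = s.encode('gbk')
print(b2)  # b'Q'

s1 = '晓梅'
b11 = s1.encode('utf-8')  # six bytes, three per character
s2 = '晓梅'
b22 = s2.encode('gbk')  # four bytes, two per character
print(b22)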
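
Because bytes in one encoding cannot be reinterpreted as another, converting between two encodings goes through str: decode with the source encoding, then encode with the target. A minimal sketch, where `transcode` is a hypothetical helper name:

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert encoded bytes from one encoding to another via str."""
    return data.decode(source).encode(target)


gbk_bytes = '晓梅'.encode('gbk')

# GBK bytes -> str -> UTF-8 bytes
utf8_bytes = transcode(gbk_bytes, 'gbk', 'utf-8')
print(utf8_bytes)

# And back again: UTF-8 bytes -> str -> GBK bytes
back_to_gbk = transcode(utf8_bytes, 'utf-8', 'gbk')
print(back_to_gbk == gbk_bytes)
```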


Original article: https://www.cnblogs.com/dwenwen/p/7749572.html