Character Encoding in Python

1. String encoding in Python 2

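Python 2 has two string types: str and unicode. A str holds raw bytes (the object does not record which encoding they are in; that depends on how the source or input was encoded), while a unicode holds Unicode text.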
```
Python 2.7.10 (default, Oct  6 2017, 22:29:07)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a='中国'
>>> print a
中国
>>> print type(a)                     # the type is str
<type 'str'>
>>> print repr(a)                     # a str holds bytes in whatever encoding the input used (UTF-8 here); Python 2's own default codec is ASCII
'\xe4\xb8\xad\xe5\x9b\xbd'
```
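In Python 2, encoding (encode) and decoding (decode) convert between str and unicode. encode: unicode --> str, decode: str --> unicode.

encode turns unicode data into a str of bytes in the specified encoding.

decode turns a str of bytes in the specified encoding into unicode.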
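The round trip can be sketched with escape sequences, so that the snippet does not depend on the encoding of the source file. This is an illustrative example rather than code from the original post. It is written to run unchanged on Python 2.7 and on Python 3, both of which accept the `u''` and `b''` prefixes. On Python 2 the result of `encode` is a `str`; on Python 3 the same call returns `bytes`.

```python
# Illustrative sketch, not code from the original post.
# The source is pure ASCII, so Python 2 needs no coding declaration for it.
text = u'\u4e2d\u56fd'                   # Unicode text (py2: unicode, py3: str)

utf8_bytes = text.encode('utf-8')        # encode: text -> bytes (py2: str, py3: bytes)
round_trip = utf8_bytes.decode('utf-8')  # decode: bytes -> text

print(repr(utf8_bytes))                  # the six UTF-8 bytes of the two characters
print(round_trip == text)                # decoding restores the original text
```

The two characters occupy six bytes in UTF-8, which is why `len()` on the byte string and on the text give different answers.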

2. String encoding in Python 3

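Python 3 also has two types: str and bytes. A str holds Unicode text and a bytes object holds raw bytes. Roughly speaking, the types were renamed: Python 2's unicode became str, and Python 2's str became bytes. One practical difference is that Python 3 no longer converts between the two implicitly.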
```
import json

s='苑昊'
print(type(s))       # <class 'str'>
print(json.dumps(s)) # "\u82d1\u660a"

b=s.encode('utf8')
print(type(b))       # <class 'bytes'>
print(b)             # b'\xe8\x8b\x91\xe6\x98\x8a'

u=b.decode('utf8')
print(type(u))       # <class 'str'>
print(u)             # 苑昊
print(json.dumps(u)) # "\u82d1\u660a"

print(len('苑昊'))   # 2
```
In Python 3, encoding (encode) and decoding (decode) convert between str and bytes. encode: str --> bytes, decode: bytes --> str.

encode turns a Unicode str into bytes in the specified encoding.

decode turns bytes in the specified encoding into a Unicode str.
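Decoding only works when the codec matches the one used for encoding. The following Python 3 sketch illustrates this; it is not from the original post, and the helper name `try_decode` is hypothetical.

```python
# Illustrative sketch for Python 3; try_decode is a hypothetical helper.
def try_decode(data, encoding):
    """Return the decoded text, or None if data is not valid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

text = '中国'
gbk_bytes = text.encode('gbk')       # 2 bytes per character in GBK
utf8_bytes = text.encode('utf-8')    # 3 bytes per character in UTF-8

right = try_decode(gbk_bytes, 'gbk')     # matching codec: the text comes back
wrong = try_decode(gbk_bytes, 'utf-8')   # mismatched codec: decoding fails, so None

# errors='replace' substitutes U+FFFD for bytes that cannot be decoded
# instead of raising, at the cost of losing the original text.
lossy = gbk_bytes.decode('utf-8', errors='replace')
```

Whether to let the exception propagate or to decode with `errors='replace'` depends on whether silently damaged text is acceptable for the task at hand.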

3. Encoding when the interpreter reads a source file into memory

Checking the interpreter's default encoding:

The default is ASCII in Python 2 and UTF-8 in Python 3. You can query it like this:

import sys

print(sys.getdefaultencoding())
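`sys.getdefaultencoding()` reports the codec that `encode` and `decode` fall back on when called without an argument. It is a separate setting from the encodings used for file names, file contents, or the terminal. A minimal Python 3 sketch that collects these related settings side by side follows; it is an illustration, not part of the original post, and every value except the first depends on the platform and locale, so no output is shown.

```python
import locale
import sys

default_codec = sys.getdefaultencoding()              # 'utf-8' on Python 3
fs_codec = sys.getfilesystemencoding()                # used for file names
locale_codec = locale.getpreferredencoding(False)     # usually what open() uses in text mode
stdout_codec = getattr(sys.stdout, 'encoding', None)  # used when print() writes text

for name, value in [
    ('default', default_codec),
    ('filesystem', fs_codec),
    ('locale', locale_codec),
    ('stdout', stdout_codec),
]:
    print(name, value)
```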

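A Python 2 interpreter reads a source file as UTF-8 only if the file carries a declaration like the one below. Python 3 needs no declaration, because it reads source files as UTF-8 by default.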

#coding:utf8

Demonstration: what a str holds after the Python 2 interpreter reads a source file into memory:

(1) Declaring utf8
```
#a.py
#coding=utf8

import sys
print sys.getdefaultencoding()

a='中国'
print type(a),repr(a)

# Run:
# $ python a.py
# ascii
# <type 'str'> '\xe4\xb8\xad\xe5\x9b\xbd'
#
# So the str holds UTF-8 encoded bytes.
```
(2) Declaring gbk
```
#a.py
#coding=gbk
# The str then holds GBK-encoded bytes (provided the file itself is saved as GBK).
```
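The effect of the declared source encoding on a Python 2 str literal can be imitated in Python 3 by encoding the same text explicitly. This is only an analogy, not the original author's code, and the helper name `hex_bytes` is hypothetical.

```python
# Illustrative sketch for Python 3; hex_bytes is a hypothetical helper.
def hex_bytes(text, encoding):
    """Encode text and return the bytes as space-separated hex pairs."""
    return ' '.join('{:02x}'.format(b) for b in text.encode(encoding))

as_utf8 = hex_bytes('中国', 'utf-8')
as_gbk = hex_bytes('中国', 'gbk')

print(as_utf8)  # e4 b8 ad e5 9b bd
print(as_gbk)   # d6 d0 b9 fa
```

The same two characters take six bytes in UTF-8 and four in GBK, so the bytes stored in a Python 2 str differ depending on how the file was saved and declared.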
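How do you tell Python which encoding to use when opening a file? Pass encoding='utf8' to open(). The built-in open() in Python 2 has no such parameter (io.open and codecs.open accept it there); in Python 3 it does.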
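A minimal Python 3 sketch of the `encoding` parameter, assuming a throwaway file in a temporary directory (the file name `demo.txt` is hypothetical). It writes text as GBK, inspects the raw bytes, and then reads the file back with a matching and a mismatched encoding.

```python
import os
import tempfile

text = '中国'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')  # hypothetical file name

    # Write the text as GBK, then look at the raw bytes on disk.
    with open(path, 'w', encoding='gbk') as f:
        f.write(text)
    with open(path, 'rb') as f:
        raw = f.read()

    # Reading with the matching encoding restores the text.
    with open(path, encoding='gbk') as f:
        restored = f.read()

    # Reading with a mismatched encoding raises UnicodeDecodeError.
    try:
        with open(path, encoding='utf8') as f:
            f.read()
        mismatch_failed = False
    except UnicodeDecodeError:
        mismatch_failed = True
```

On Python 2.6 and later, `io.open` takes the same `encoding` argument, although the text written through it must be a `unicode` object.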
Original article: https://www.cnblogs.com/mao3714/p/8900173.html
