Converting String Encodings in Python 3

While using subprocess to call a Windows command, I ran into a problem: the Chinese text in the command's output could not be displayed. The source code is as follows:

#-*-coding:utf-8-*-
__author__ = '$USER'

import subprocess
p = subprocess.Popen('nslookup www.qq.com', stdout=subprocess.PIPE)
# communicate() reads all output and waits for exit; calling wait() first can deadlock on a full pipe
out = p.communicate()
print('returncode：%d' % p.returncode)
for i in out:
    if i is not None:
        s = str(i, encoding='utf-8')
        print(s)

The output is:

returncode：0
File "F:/TECH/python/LearnPython100Days/subprocessSample.py", line 11, in <module>
s = str(i, encoding='utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb7 in position 0: invalid start byte

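The traceback shows that decoding the output as UTF-8 failed. This is because the command line on a Chinese-language Windows system uses the GBK encoding (code page 936; you can check this in the console window's properties) rather than UTF-8, so the bytes cannot be decoded as UTF-8 directly. The fix is to change the decoding line to: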

s = str(i, encoding='GBK')

which gives the correct output:

returncode：0
服务器:  UnKnown
Address:  211.137.130.3

名称:    https.qq.com
Addresses:  2402:4e00:8030:1::7d
	  121.51.142.21
Aliases:  www.qq.com
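
The failure and the fix can be reproduced without running nslookup. The sketch below is only an illustration: the helper name decode_console_bytes is made up for this example, and falling back to locale.getpreferredencoding(False) assumes the child process writes in the locale's preferred encoding. That holds on a Chinese-language Windows system (code page 936), but it is not guaranteed everywhere.

```python
import locale


def decode_console_bytes(data, encoding=None):
    """Decode bytes captured from a child process into a str.

    Hypothetical helper: if no encoding is given, assume the child
    wrote its output in the locale's preferred encoding.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    # errors='replace' substitutes U+FFFD for undecodable bytes
    # instead of raising UnicodeDecodeError.
    return data.decode(encoding, errors='replace')


# The first six bytes of the nslookup output: "服务器" encoded as GBK.
gbk_bytes = b'\xb7\xfe\xce\xf1\xc6\xf7'

# 0xb7 is a UTF-8 continuation byte, so it cannot start a character.
try:
    gbk_bytes.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

print(utf8_failed)
print(decode_console_bytes(gbk_bytes, 'gbk'))
```

Passing the encoding explicitly, as in the last line, is the safer choice whenever the encoding of the child's output is known.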

In a project, the best way to avoid garbled text is to normalize all output to UTF-8. How can that be done?

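1. Decode the byte string with GBK to get the Chinese string;

2. Encode that Chinese string with UTF-8 to get a new byte string;

3. Decode that byte string with UTF-8 to get a string, which is the same Chinese text.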

The relevant code is as follows:

import chardet  # third-party package: pip install chardet

for i in out:
    if i is not None:
        # chardet's guess is only a hint; it is printed here for comparison
        print('Original bytes (%s):\n%s' % (chardet.detect(i)['encoding'], i))
        s = str(i, encoding='GBK')  # step 1: decode the GBK bytes into a str
        print('Decoded string:\n%s' % s)
        utf8_bytes = s.encode('UTF-8', 'ignore')  # step 2: re-encode as UTF-8
        print('Re-encoded bytes (%s):\n%s' % (chardet.detect(utf8_bytes)['encoding'], utf8_bytes))
        utf8_str = utf8_bytes.decode('UTF-8')  # step 3: decode the UTF-8 bytes
        print('String decoded from UTF-8:\n%s' % utf8_str)

The output is:

returncode：0
Original bytes (ISO-8859-9):
b'\xb7\xfe\xce\xf1\xc6\xf7:  UnKnown\r\nAddress:  211.137.130.3\r\n\r\n\xc3\xfb\xb3\xc6:    https.qq.com\r\nAddresses:  2402:4e00:8030:1::7d\r\n\t  121.51.142.21\r\nAliases:  www.qq.com\r\n\r\n'
Decoded string:
服务器:  UnKnown
Address:  211.137.130.3

名称:    https.qq.com
Addresses:  2402:4e00:8030:1::7d
	  121.51.142.21
Aliases:  www.qq.com


Re-encoded bytes (utf-8):
b'\xe6\x9c\x8d\xe5\x8a\xa1\xe5\x99\xa8:  UnKnown\r\nAddress:  211.137.130.3\r\n\r\n\xe5\x90\x8d\xe7\xa7\xb0:    https.qq.com\r\nAddresses:  2402:4e00:8030:1::7d\r\n\t  121.51.142.21\r\nAliases:  www.qq.com\r\n\r\n'
String decoded from UTF-8:
服务器:  UnKnown
Address:  211.137.130.3

名称:    https.qq.com
Addresses:  2402:4e00:8030:1::7d
	  121.51.142.21
Aliases:  www.qq.com
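
The same three steps can also be tried without Windows or chardet. The following is a minimal sketch, assuming the input really is GBK; the function name gbk_to_utf8 is hypothetical, and the sample bytes are one line taken from the nslookup output above.

```python
def gbk_to_utf8(raw):
    """Return (text, utf8_bytes, round_tripped) for GBK-encoded bytes."""
    text = raw.decode('gbk')                    # step 1: GBK bytes -> str
    utf8_bytes = text.encode('utf-8')           # step 2: str -> UTF-8 bytes
    round_tripped = utf8_bytes.decode('utf-8')  # step 3: UTF-8 bytes -> str
    return text, utf8_bytes, round_tripped


# "名称:    https.qq.com" as the Windows console emitted it (GBK).
raw = b'\xc3\xfb\xb3\xc6:    https.qq.com'
text, utf8_bytes, round_tripped = gbk_to_utf8(raw)

print(utf8_bytes)             # the byte values differ from raw
print(text == round_tripped)  # the text itself is unchanged
print(raw == utf8_bytes)
```

Only the non-ASCII characters change representation: each Chinese character takes two bytes in GBK and three bytes in UTF-8, while the ASCII part is identical in both.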

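Notes:

1. After the byte string is decoded with GBK and re-encoded with UTF-8, its byte values are different, even though the text they represent is the same;

2. The chardet module can detect the encoding of a byte string, but its result is only a guess and is not guaranteed to be correct. Here it detected the first (GBK) byte string as 'ISO-8859-9';

3. In Python 3, a string (str) has an encode() method that returns a byte string, but no decode() method; correspondingly, a byte string (bytes) has a decode() method that returns a string, but no encode() method. This differs from Python 2.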
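
Note 3 is easy to check in an interpreter. This short sketch (the variable names are chosen only for illustration) uses hasattr to confirm which methods each type provides in Python 3:

```python
text = '名称'
data = text.encode('utf-8')  # str.encode() returns bytes

print(type(data).__name__)
print(hasattr(text, 'decode'))       # str has no decode() in Python 3
print(hasattr(data, 'encode'))       # bytes has no encode() in Python 3
print(data.decode('utf-8') == text)  # bytes.decode() returns the str
```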

Original article: https://www.cnblogs.com/pzy4447/p/11144338.html