Python 3: garbled output when scraped content contains Chinese

Requirement: the user enters the title of a movie they like, and the program scrapes the matching download links from 电影天堂 (https://www.ygdy8.com) and prints them.

Problem: the magnet link that comes back contains Chinese characters, and they are garbled when printed.
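To see where the garbling comes from, the short sketch below reproduces it without any network access. It assumes the page is sent as GBK bytes and that the client decodes them as ISO-8859-1, which is what requests falls back to when the Content-Type header declares a text type without a charset. The sample string is only an illustration.

```python
# Assumption: the site sends GBK bytes, and the client wrongly decodes them
# as ISO-8859-1 because the HTTP header names no charset.
title = '沉睡魔咒'                      # sample text containing Chinese characters
raw = title.encode('gbk')              # the bytes as a GBK page would deliver them

garbled = raw.decode('iso-8859-1')     # wrong codec: each byte becomes one Latin-1 character
restored = raw.decode('gbk')           # right codec: the original text comes back

print(repr(garbled))
print(restored)
```

No data is lost by the wrong decode, since ISO-8859-1 maps every byte to a character; the text only looks wrong, which is why re-decoding the raw bytes with the correct codec is enough to fix it.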

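Solution: choose the encoding explicitly. requests falls back to ISO-8859-1 when the Content-Type header names no charset, so in that case use the charset declared in the HTML itself, or res.apparent_encoding if none is declared: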

if res.encoding == 'ISO-8859-1':
    encodings = requests.utils.get_encodings_from_content(res.text)
    if encodings:
        encoding = encodings[0]
    else:
        encoding = res.apparent_encoding
else:
    encoding = res.encoding
encode_content = res.content.decode(encoding, 'replace').encode('utf-8', 'replace')
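The same decision can be tried offline with a small stand-in. This is a sketch only: pick_encoding is a hypothetical helper, the regular expression is a simplified substitute for requests.utils.get_encodings_from_content, and a fixed fallback replaces res.apparent_encoding (which relies on byte-level detection).

```python
import re

def pick_encoding(header_encoding, body_bytes, fallback='utf-8'):
    """Choose a codec for body_bytes (hypothetical helper, standard library only)."""
    # A real charset from the HTTP header is trusted as-is.
    if header_encoding and header_encoding.upper() != 'ISO-8859-1':
        return header_encoding
    # Otherwise look for a charset declared in a <meta> tag of the HTML.
    match = re.search(rb'<meta[^>]*charset=["\']?([\w-]+)', body_bytes, re.I)
    if match:
        return match.group(1).decode('ascii')
    # Nothing declared: use the caller-supplied fallback (an assumption of this sketch).
    return fallback

# Sample page: GBK bytes whose only charset hint is the <meta> tag.
html = '<html><head><meta charset="gbk"></head><body>沉睡魔咒</body></html>'.encode('gbk')
encoding = pick_encoding('ISO-8859-1', html)
text = html.decode(encoding, 'replace')
print(encoding, text)
```

Passing 'replace' to decode, as the original snippet does, keeps a stray invalid byte from raising an exception and aborting the whole page.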

# Goal: the user enters a movie title; the program scrapes the matching download links from 电影天堂 (https://www.ygdy8.com) and prints them

import requests
from bs4 import BeautifulSoup
from urllib.request import pathname2url

# Send a browser-like User-Agent header to get past basic anti-scraping checks
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36 OPR/65.0.3467.78 (Edition Baidu)'}

# Fetch the magnet link for one movie page
def getMovieDownloadLink(filmlink):
    res = requests.get(filmlink, headers=headers)
    if res.status_code == 200:

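        # Handling garbled Chinese in the response:
        # requests reports 'ISO-8859-1' when the response header names no charset.
        # In that case, use the charset declared in the HTML itself; if there is
        # none, fall back to the encoding detected from the bytes (apparent_encoding).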
        if res.encoding == 'ISO-8859-1':
            encodings = requests.utils.get_encodings_from_content(res.text)
            if encodings:
                encoding = encodings[0]
            else:
                encoding = res.apparent_encoding
        else:
            encoding = res.encoding
        encode_content = res.content.decode(encoding, 'replace').encode('utf-8', 'replace')

        soup = BeautifulSoup(encode_content, 'html.parser')
        Zoom = soup.select_one('#Zoom')
        fileurl = Zoom.find('table').find('a').text
        # Write as UTF-8 so the Chinese text is stored the same way on every platform
        with open('./17-电影天堂磁力.txt', 'a', newline='', encoding='utf-8') as file:
            file.write(fileurl + '\n')

    else:
        print('Request for movie link {} failed!'.format(filmlink))

def main():
    dyurl = 'https://www.ygdy8.com'
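    # movie = input('Enter a movie title: ')
    movie = '沉睡魔咒'  # "Maleficent"; kept in Chinese because it is the search keyword
    movie = movie.encode('gbk')  # this script assumes the search page expects GBK bytes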
    url = 'http://s.ygdy8.com/plus/s0.php?typeid=1&keyword={0}'.format(pathname2url(movie))
    res = requests.get(url, headers=headers)
    if res.status_code == 200:
        htmltext = res.text
        soup = BeautifulSoup(htmltext, 'html.parser')
        co_content8 = soup.find('div', class_='co_content8')
        tables = co_content8.find('ul').find_all('table')
        if len(tables) <= 0:
            print('No matching resources found; try searching on the site: {0}'.format(dyurl))
        else:
            for table in tables:
                filmlink = dyurl + table.find('a')['href']
                getMovieDownloadLink(filmlink)

    else:
        print('Request failed!')

main()
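The script builds the search URL with pathname2url, which is intended for file paths. As a possible alternative, urllib.parse.quote accepts an encoding argument and can percent-encode the keyword as GBK in one step. The sketch below only builds and checks the URL; it sends no request, and it takes on trust the script's assumption that the search page expects GBK.

```python
from urllib.parse import quote, unquote

keyword = '沉睡魔咒'

# quote() encodes the string with the given codec, then percent-encodes the bytes,
# so no separate .encode('gbk') call is needed.
quoted = quote(keyword, encoding='gbk')
url = 'http://s.ygdy8.com/plus/s0.php?typeid=1&keyword={0}'.format(quoted)
print(url)

# Round trip: decoding the escapes as GBK should give back the keyword.
roundtrip = unquote(quoted, encoding='gbk')
print(roundtrip == keyword)
```

Calling quote(keyword) without the encoding argument would percent-encode the UTF-8 bytes instead, which a GBK-only search page would likely read as a different keyword.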

Result:

References:

https://blog.csdn.net/guoxinian/article/details/82978067

http://blog.csdn.net/a491057947/article/details/47292923

http://docs.python-requests.org/en/latest/user/quickstart/#response-content

Original article: https://www.cnblogs.com/KeenLeung/p/12160712.html