• Python scraping notes


    Making requests

    Building a URL query string

    from urllib import parse
    url = "http://www.baidu.com/s?"
    info = {"wd":"kidd"}
    url = url + parse.urlencode(info)
    print(url)  # http://www.baidu.com/s?wd=kidd
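    urlencode also joins several parameters and escapes special characters in the values, and parse_qs reverses the process. A minimal sketch; the "pn" key and both values are illustrative assumptions, not taken from the original post:

```python
from urllib import parse

base = "http://www.baidu.com/s?"
# Illustrative parameters; "pn" is only an example key.
params = {"wd": "kidd 1412", "pn": 10}

# Values are converted to strings; a space becomes "+".
query = parse.urlencode(params)
url = base + query
print(url)

# Split the URL again and recover the parameters as a dict of lists.
parts = parse.urlsplit(url)
recovered = parse.parse_qs(parts.query)
print(recovered)
```

    Note that parse_qs returns every value as a list of strings, because a key may appear more than once in a query string.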

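    URL encoding and decoding

    Why is this needed?

    Characters such as ?, =, / and + have special meaning in a URL, so a request whose parameters contain them can be misinterpreted. If you put the raw text straight into the URL, as in http://www.baidu.com/s?wd=/a+b=?/, the search will not be for the text you intended.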

    from urllib import parse
    # Encode
    url = "http://www.baidu.com/s?wd="
    info = parse.quote("/a+b=?/")
    url += info
    print(url)  # http://www.baidu.com/s?wd=/a%2Bb%3D%3F/

    # Decode
    parse_url = parse.unquote(url)
    print(parse_url)  # http://www.baidu.com/s?wd=/a+b=?/
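    The "/" characters survive in the encoded output above because quote treats "/" as safe by default. The sketch below shows the safe argument and the quote_plus variant; the sample string "a b+c" is an illustrative assumption:

```python
from urllib import parse

raw = "/a+b=?/"

default = parse.quote(raw)           # "/" is in the default safe set
strict = parse.quote(raw, safe="")   # encode "/" as well
plus = parse.quote_plus("a b+c")     # space -> "+", literal "+" -> "%2B"

print(default)
print(strict)
print(plus)

# Each encoder has a matching decoder.
print(parse.unquote(strict))
print(parse.unquote_plus(plus))
```

    For a value that goes into a query string, safe="" (or quote_plus) is usually the safer choice, since a literal "/" or "+" in the value is then never left ambiguous.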

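    With requests there is usually no need to encode by hand: values passed through the params argument are percent-encoded automatically when the URL is built.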
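    On the requests side, one way to get the same encoding is to hand the raw value to the params argument. The sketch below only prepares the request locally (nothing is sent) so the final URL can be inspected; it assumes the requests package is installed:

```python
from urllib import parse

import requests

# prepare() builds the request locally; no network traffic happens here.
prepared = requests.Request(
    "GET",
    "http://www.baidu.com/s",
    params={"wd": "/a+b=?/"},
).prepare()

url = prepared.url
print(url)

# Decoding the query string gives back the original value.
decoded = parse.parse_qs(parse.urlsplit(url).query)
print(decoded)
```

    The same params argument works with requests.get, which encodes the query in the same way before sending.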

    POST requests with requests

    When data is not a dict

    import requests

    # A plain string is sent as the raw request body, with no Content-Type header
    data = "name=kidd"
    response = requests.post("http://httpbin.org/post", data=data)
    print(response.text)

    Response: the payload ends up in "data"

    {
      "args": {}, 
      "data": "name=kidd", 
      "files": {}, 
      "form": {}, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Content-Length": "9", 
        "Host": "httpbin.org", 
        "User-Agent": "python-requests/2.23.0", 
        "X-Amzn-Trace-Id": "Root=1-5edeee36-d00dd8b083c14254ec60605a"
      }, 
      "json": null, 
      "origin": "39.77.220.193", 
      "url": "http://httpbin.org/post"
    }

    When data is a dict

    data = {"name":"kidd"}
    response = requests.post("http://httpbin.org/post",data=data)
    print(response.text)

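    Response: the payload ends up in "form". requests sends a dict with Content-Type: application/x-www-form-urlencoded, so the server parses it as form fields, which is what a typical form endpoint expects. The string version carried no Content-Type header, so the server left it as raw data.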

    {
      "args": {}, 
      "data": "", 
      "files": {}, 
      "form": {
        "name": "kidd"
      }, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Content-Length": "9", 
        "Content-Type": "application/x-www-form-urlencoded", 
        "Host": "httpbin.org", 
        "User-Agent": "python-requests/2.23.0", 
        "X-Amzn-Trace-Id": "Root=1-5edeeee5-f0544530bbb1b22824acd930"
      }, 
      "json": null, 
      "origin": "39.77.220.193", 
      "url": "http://httpbin.org/post"
    }
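    The two responses differ only because of the Content-Type header, which can be checked without sending anything. The sketch below prepares both requests locally and also shows a third variant, an assumption not in the original post, where the header is set by hand for string data; it assumes the requests package is installed:

```python
import requests

URL = "http://httpbin.org/post"
FORM_TYPE = "application/x-www-form-urlencoded"

# prepare() builds each request locally; nothing is sent over the network.
as_string = requests.Request("POST", URL, data="name=kidd").prepare()
as_dict = requests.Request("POST", URL, data={"name": "kidd"}).prepare()
# Hypothetical variant: string body plus an explicit form Content-Type.
with_header = requests.Request(
    "POST", URL, data="name=kidd", headers={"Content-Type": FORM_TYPE}
).prepare()

for name, req in [("string", as_string), ("dict", as_dict), ("header", with_header)]:
    print(name, req.body, req.headers.get("Content-Type"))
```

    All three bodies are the same nine bytes; only the header changes. With the explicit header, a server such as httpbin should treat the string body as form data too.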
  • Original article: https://www.cnblogs.com/py-peng/p/13070837.html