• Fetching iTunes page information with PHP


     

    Goal: use PHP curl to fetch the iTunes page data for a given package and output it in a structured format.

     

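    Steps: build the curl request (the demo is written for Laravel, so config() and response() are Laravel helpers; adapt them for other frameworks)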

     

    // $url must be a valid iTunes URL
    public static function creativeGet($url, $post_data = false, $ignore_ssl = true, $dataType = 'text')
    {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_USERAGENT, 'Chrome 42.0.2311.135 Pentamob');
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl, CURLOPT_URL, $url);

        if ($ignore_ssl) {
            curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); // trust any certificate (testing only)
            curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); // 0 = do not check the host name in the certificate
        }

        // My config: 'proxy' => ['host' => '127.0.0.1', 'port' => '1080', 'type' => CURLPROXY_SOCKS5],
        $proxy = config('app.proxy');
        if ($proxy) {
            curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, true);
            curl_setopt($curl, CURLOPT_PROXYAUTH, CURLAUTH_BASIC);
            curl_setopt($curl, CURLOPT_PROXYTYPE, $proxy['type']);
            curl_setopt($curl, CURLOPT_PROXY, $proxy['host']);
            curl_setopt($curl, CURLOPT_PROXYPORT, $proxy['port']);
        }

        if ($post_data) {
            curl_setopt($curl, CURLOPT_POST, 1);
            curl_setopt($curl, CURLOPT_POSTFIELDS, $post_data);
        }

        $data = curl_exec($curl);
        $status = curl_getinfo($curl);
        $error_info = [ // assemble the error details
            'error_no' => curl_errno($curl),
            'error_info' => $status,
            'error_msg' => curl_error($curl),
            'result' => $data
        ];

        curl_close($curl);
        // Success returns the body (decoded when $dataType is 'json'); anything else returns the error array
        if (isset($status['http_code']) && $status['http_code'] == 200) {
            if ($dataType == 'json') {
                $data = json_decode($data, true);
            }
            return $data;
        } else {
            return $error_info;
        }
    }
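
    Because creativeGet returns the response body on HTTP 200 and an error array otherwise, callers need to tell the two apart before using the result. Below is a minimal sketch of one way to do that. The helper names isCurlError and describeCurlError are hypothetical (they are not part of the original demo), and the array keys are assumed to match the ones assembled in creativeGet.

```php
<?php

/**
 * Hypothetical helper: true when $response looks like the error array
 * returned by creativeGet (it carries 'error_no' and 'error_msg' keys).
 */
function isCurlError($response): bool
{
    return is_array($response)
        && array_key_exists('error_no', $response)
        && array_key_exists('error_msg', $response);
}

/**
 * Hypothetical helper: turn the error array into a one-line message.
 */
function describeCurlError(array $error): string
{
    $httpCode = $error['error_info']['http_code'] ?? 0;

    return sprintf(
        'curl error %d (HTTP %d): %s',
        $error['error_no'],
        $httpCode,
        $error['error_msg']
    );
}

// Hand-built error array for illustration, so no network call is needed.
$failed = [
    'error_no'   => 6,
    'error_info' => ['http_code' => 0],
    'error_msg'  => 'Could not resolve host',
    'result'     => false,
];

if (isCurlError($failed)) {
    echo describeCurlError($failed), PHP_EOL;
}
```

    One caveat: when $dataType is 'json', a successful response is also an array, so this check relies on the decoded payload not containing both 'error_no' and 'error_msg' keys itself.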

    Build the formatted output:

      public static function dealiTune($package_name = '')
      {
          // I tested with package ID 297606951
          if (!$package_name) {
              return 'package_name must not be empty';
          }

          $url = 'https://itunes.apple.com/us/lookup?id='.$package_name;
          $html_doc = self::creativeGet($url);
          if (is_array($html_doc)) {
              // creativeGet returns an error array when the request fails
              return 'Request failed: '.$html_doc['error_msg'];
          }
          $html_json_data = json_decode($html_doc, true);

          $result = [];
          if (empty($html_json_data['resultCount'])) {
              return 'Package '.$package_name.' was not found';
          }
          $result['name'] = $html_json_data['results'][0]['trackCensoredName'];
          $result['icon'] = $html_json_data['results'][0]['artworkUrl100'];
          $result['description'] = $html_json_data['results'][0]['description'];
          $result['min_os_vs'] = $html_json_data['results'][0]['minimumOsVersion'];
          $result['category'] = $html_json_data['results'][0]['primaryGenreName'];

          return response()->json($result); // Laravel's response()->json
      }
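
    The field mapping in dealiTune can also be written as a standalone function that takes the raw JSON string, which makes it easy to try without a network call. The sketch below is only an illustration under stated assumptions: parseLookupResponse is a hypothetical name, the sample payload is hand-written for this example (it is not real API output), and only the five keys already read in dealiTune are used.

```php
<?php

/**
 * Hypothetical helper: map an iTunes lookup JSON string to the five
 * fields used in this post. Returns null when the JSON is invalid or
 * contains no usable result.
 */
function parseLookupResponse(string $json): ?array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !is_array($data['results'][0] ?? null)) {
        return null;
    }
    $app = $data['results'][0];

    // Missing keys become null instead of raising a warning.
    return [
        'name'        => $app['trackCensoredName'] ?? null,
        'icon'        => $app['artworkUrl100'] ?? null,
        'description' => $app['description'] ?? null,
        'min_os_vs'   => $app['minimumOsVersion'] ?? null,
        'category'    => $app['primaryGenreName'] ?? null,
    ];
}

// Hand-written sample payload, for illustration only.
$sample = '{"resultCount":1,"results":[{"trackCensoredName":"Demo App",'
        . '"primaryGenreName":"Shopping","minimumOsVersion":"9.0"}]}';

$info = parseLookupResponse($sample);
echo $info['name'], ' / ', $info['category'], PHP_EOL;
```

    Returning null for "no result" keeps the function's return type uniform, so the caller decides how to report a missing package.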
    Result:

      {
        "name": "Amazon - Shopping made easy",
        "icon": "https://is3-ssl.mzstatic.com/image/thumb/Purple118/v4/43/5e/87/435e87d8-d948-1678-7027-21f1570a1b41/source/100x100bb.jpg",
        "description": "International Shopping Browse.............",
        "min_os_vs": "9.0",
        "category": "Shopping"
      }


    Final note: this is a simple way to fetch an app's data from iTunes with curl. The next post will do the same for the Google Play store.

    Note: if you repost this article, please credit the source.
  • Original article: https://www.cnblogs.com/onlyzc/p/9519330.html