• libcurl programming


    Compiling

    On Windows, go to the folder where you unpacked the source, such as d:/libcurl/curl, and find the winbuild directory. Open a Visual Studio command prompt there and build with “nmake /f Makefile.vc”. The following example builds libcurl as an x86 debug static library.

    nmake /f Makefile.vc VC=10 mode=static MACHINE=x86 DEBUG=yes

    Linking

    If you want to link the libcurl static library into your project, you need to define the macro “CURL_STATICLIB” before including any related header, such as “<curl/curl.h>” or “<curl/easy.h>”.
    #define CURL_STATICLIB
    #include <curl/curl.h>
     

    Macro

    LIBCURL_VERSION_NUM: the libcurl version packed as 0xXXYYZZ (major, minor, patch); the value “0x070c03” means version 7.12.3
    CURL_STATICLIB: define it when using the static library
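
    As a rough illustration of how the packed version number reads, the sketch below splits a 0xXXYYZZ value into its three parts. The helper name decode_curl_version is hypothetical and not part of libcurl; it only shows the arithmetic.

```cpp
#include <string>

// Hypothetical helper (not a libcurl function): split a packed 0xXXYYZZ
// version number into its "major.minor.patch" text form.
std::string decode_curl_version(unsigned long num)
{
    unsigned long major = (num >> 16) & 0xFF;  // XX
    unsigned long minor = (num >> 8) & 0xFF;   // YY
    unsigned long patch = num & 0xFF;          // ZZ
    return std::to_string(major) + "." + std::to_string(minor) + "." +
           std::to_string(patch);
}
```

    In a real project you would more likely compare the macro directly, for example #if LIBCURL_VERSION_NUM >= 0x070c03, to guard code that needs a newer libcurl.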

    With C++

    There is basically only one thing to keep in mind when using libcurl from C++ instead of C:
    The callbacks CANNOT be non-static class member functions.
     

    API

    libcurl offers three interfaces: easy, multi and share.
    #include <curl/easy.h>
    #include <curl/multi.h>

    Easy interface

    curl_easy_init

    This must be the first easy-interface function you call. It returns a CURL easy handle that you must pass as input to the other easy functions. If you have not yet called curl_global_init, curl_easy_init calls it automatically. This can be lethal in multi-threaded programs, since curl_global_init is not thread-safe.

    You are strongly advised not to rely on this automatic behaviour; call curl_global_init yourself at the proper time.

    If it returns NULL, something went wrong and you cannot use the other functions.

    curl_easy_cleanup

    It is the opposite of curl_easy_init and should be the last function you call for an easy session. Any use of the handle after this function has been called and has returned is illegal. This function returns nothing (void).

    curl_easy_escape
    URL encoding. Every character in the input string that is not a-z, A-Z, 0-9, '-', '.', '_' or '~' is converted to its "URL escaped" version (%NN, where NN is a two-digit hexadecimal number). Here is sample code for it.
    CURL * curl = nullptr;
    char * esc = nullptr;
    CURLcode ret;
    ret = curl_global_init(CURL_GLOBAL_ALL);
    curl = curl_easy_init();
    // A length of 0 makes libcurl call strlen() on the input string.
    esc = curl_easy_escape(curl, "http://localhost", 0);
    if (esc) // curl_easy_escape returns NULL on failure
    {
        cout<<esc<<endl;
        curl_free(esc);
    }
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    Output is: http%3A%2F%2Flocalhost
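
    To make the escaping rule itself concrete, here is a minimal sketch that applies it in plain C++. It is an illustration only: percent_encode is a hypothetical name, this is not libcurl's implementation, and real code should keep calling curl_easy_escape.

```cpp
#include <string>

// Illustration only: mimics the escaping rule described above.
// Not libcurl's implementation; use curl_easy_escape in real code.
std::string percent_encode(const std::string &in)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        bool unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);   // kept as-is
        } else {
            out += '%';                    // %NN with two hex digits
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

    Note that the whole string is escaped, including ':' and '/', so this kind of encoding is meant for individual URL components rather than for a complete URL.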

    curl_easy_getinfo

    curl_easy_setopt
    CURLOPT_URL 
    Pass in a pointer to the actual URL to deal with. The parameter should be a char * to a zero terminated string which must be URL-encoded in the following format:
    scheme://host:port/path
     

    Practice

    Using C++ non-static functions for callbacks?

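    libcurl is a C library; it doesn't know anything about C++ member functions.
    You can overcome this "limitation" with relative ease by using a static member function that is passed a pointer to the object:

    // f is the pointer to your object (the value set with CURLOPT_WRITEDATA).
    // Declared inside the class as:
    //   static size_t func(char *buffer, size_t sz, size_t n, void *f);
    size_t YourClass::func(char *buffer, size_t sz, size_t n, void *f)
    {
      // Call non-static member function.
      static_cast<YourClass*>(f)->nonStaticFunction();
      // Report all sz * n bytes as handled; any other value aborts the transfer.
      return sz * n;
    }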
    
    // This is how you pass pointer to the static function:
    curl_easy_setopt(hcurl, CURLOPT_WRITEFUNCTION, YourClass::func);
    curl_easy_setopt(hcurl, CURLOPT_WRITEDATA, this);
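
    The same pattern can be tried without libcurl at all. In the sketch below, deliver_chunk is a hypothetical stand-in for the library side (it only knows a plain function pointer and an opaque void *), and Collector is an illustrative class name. It is a sketch of the trampoline idea, not of libcurl's internals.

```cpp
#include <cstddef>
#include <string>

// Same shape as a libcurl write callback.
using WriteCallback = std::size_t (*)(char *buffer, std::size_t sz,
                                      std::size_t n, void *userdata);

// Hypothetical stand-in for the C library: it calls back through a plain
// function pointer and passes the opaque pointer along untouched.
std::size_t deliver_chunk(WriteCallback cb, void *userdata, const std::string &chunk)
{
    std::string copy = chunk;  // callbacks receive a mutable buffer
    return cb(&copy[0], 1, copy.size(), userdata);
}

class Collector
{
public:
    // Static trampoline: matches the C signature and forwards to the object.
    static std::size_t onWrite(char *buffer, std::size_t sz, std::size_t n, void *userdata)
    {
        return static_cast<Collector *>(userdata)->append(buffer, sz * n);
    }

    const std::string &body() const { return body_; }

private:
    // The non-static member that does the real work.
    std::size_t append(const char *data, std::size_t len)
    {
        body_.append(data, len);
        return len;  // tell the caller that every byte was consumed
    }

    std::string body_;
};
```

    With libcurl, Collector::onWrite would be passed as CURLOPT_WRITEFUNCTION and the address of a Collector object as CURLOPT_WRITEDATA, exactly as in the two curl_easy_setopt calls above.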

    Get the remote file’s size

    Option setting:
    curl_easy_setopt(handle, CURLOPT_URL, url);
    curl_easy_setopt(handle, CURLOPT_PROXY, szProxy); // for proxy
    curl_easy_setopt(handle, CURLOPT_NOBODY, 1L); // request the headers only, no body
    curl_easy_setopt(handle, CURLOPT_ERRORBUFFER, szErr);
    ret = curl_easy_perform(handle);
    Get the information:
    double dval;
    if (CURLE_OK != curl_easy_getinfo(handle, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &dval))
    {
        cerr<<"Failed to get CURLINFO_CONTENT_LENGTH_DOWNLOAD"<<endl;
        curl_easy_cleanup(handle);
        return;
    }
    // dval is -1 when the size is unknown.
    cout<<"CURLINFO_CONTENT_LENGTH_DOWNLOAD :"<<dval<<endl;

    Get the remote file’s modification time

    Option setting:
    curl_easy_setopt(handle, CURLOPT_FILETIME, 1L); // required for curl_easy_getinfo() with CURLINFO_FILETIME
    Get the information (after curl_easy_perform):
    long lval;
    if (CURLE_OK != curl_easy_getinfo(handle, CURLINFO_FILETIME, &lval) || lval == -1)
    {
        cerr<<"Failed to get CURLINFO_FILETIME"<<endl;
        curl_easy_cleanup(handle);
        return;
    }
    time_t t(lval);
    tm * ptm = localtime(&t);
    cout<<"CURLINFO_FILETIME :"<<asctime(ptm)<<endl;
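
    If you prefer a fixed, locale-independent text form over asctime, the conversion can be sketched as follows. format_filetime is a hypothetical helper, and treating every negative value as "unknown" is an assumption based on the -1 check above.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: turn a CURLINFO_FILETIME-style value (seconds since
// the Unix epoch, or -1 when unknown) into readable UTC text.
std::string format_filetime(long filetime)
{
    if (filetime < 0)
        return "unknown";

    std::time_t t = static_cast<std::time_t>(filetime);
    std::tm *utc = std::gmtime(&t);  // not thread-safe; acceptable for a sketch
    if (!utc)
        return "unknown";

    char buf[32];
    std::size_t len = std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S UTC", utc);
    return std::string(buf, len);
}
```

    The value passed in would be the lval obtained from curl_easy_getinfo in the snippet above.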

    Show progress for downloading

    Option setting: use cURL's default progress display. 0L shows the progress meter, 1L hides it.
    curl_easy_setopt(handle, CURLOPT_NOPROGRESS, 0L);
    change to use 
     
    Check/confirm that the remote file exists
    Two values can show whether the remote file exists at the URL: CURLINFO_RESPONSE_CODE and CURLINFO_CONTENT_LENGTH_DOWNLOAD.
    If the file doesn’t exist at the URL, CURLINFO_RESPONSE_CODE is 404 and CURLINFO_CONTENT_LENGTH_DOWNLOAD is -1. Otherwise, CURLINFO_RESPONSE_CODE is 200.
     
    Resume uploading a file
    Set the option CURLOPT_RESUME_FROM with curl_easy_setopt; for large offsets use:
    CURLOPT_RESUME_FROM_LARGE
    Pass a curl_off_t as the parameter. It contains the offset, in bytes, from which you want the transfer to start. (Added in 7.11.0)
     

    AGet is similar to FlashGet for win32. It can download large files using multiple threads. Here is the theory behind aget:
    Aget first sends a HEAD request to retrieve the length of the file, and divides it into equal segments according to the number the user has requested. Then, for each segment, it connects to the server and downloads only that part. Therefore, if you are downloading a small file (less than 512K), a small number of segments is suggested, e.g. 4-5; if you are downloading a large file, you are urged to increase the number of segments (threads).
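
    The splitting step can be sketched as follows. This is not aget's actual code: split_ranges is a hypothetical helper, and handing the leftover bytes to the first segments is an assumption about how "equal segments" could be rounded. The "first-last" text form matches the inclusive byte ranges that libcurl's CURLOPT_RANGE option accepts.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: divide file_size bytes into `segments` nearly equal,
// inclusive byte ranges formatted as "first-last".
std::vector<std::string> split_ranges(std::uint64_t file_size, unsigned segments)
{
    std::vector<std::string> ranges;
    if (file_size == 0 || segments == 0)
        return ranges;                                // nothing to split
    if (segments > file_size)
        segments = static_cast<unsigned>(file_size);  // at least one byte each

    std::uint64_t base = file_size / segments;
    std::uint64_t extra = file_size % segments;       // first `extra` segments get one more byte
    std::uint64_t start = 0;
    for (unsigned i = 0; i < segments; ++i) {
        std::uint64_t len = base + (i < extra ? 1 : 0);
        std::uint64_t end = start + len - 1;
        ranges.push_back(std::to_string(start) + "-" + std::to_string(end));
        start = end + 1;
    }
    return ranges;
}
```

    Each download thread would then use its own easy handle and request one of these ranges, with the file length taken from the earlier HEAD request.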
     

    Todo
    Use libcurl as a spider (web crawler)
  • Original article: https://www.cnblogs.com/shenlanzifa/p/5288770.html