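• doubleclick cookie, dynamic scripts, user profiling, user behavior analysis, and massive data storage and retrieval; recommended search terms; JD.com; e-commerce; information upload; black hole https://blackhole.m.jd.com/getinfo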


    doubleclick cookie 

    https://mp.weixin.qq.com/s/vZUj-Z9FGSSWXOodGqbYkA

    Demystifying Google's Online Advertising Technology: An Internet Big Data Perspective

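     Everyone who goes online is pestered by web ads, which steadily eat into our data allowance. Look a little closer, though, and you may notice that the ads served on different sites tend to match your own preferences, which suggests that the technology behind them is far from simple. The Internet big data techniques involved include cookies, dynamic scripts, user profiling, user behavior analysis, and massive data storage and retrieval.

          Suppose you click on laptops on JD.com. A few days later, when you browse a site you have never visited before, you may well find a laptop ad on the page.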

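    Figure 1

           As someone who studies Internet big data technology, my first instinct is to look at the page source. Sure enough, the relevant script can be found there, and the "-ad-" in it presumably indicates that an ad is embedded at this point.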

     

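    Figure 2

            Because the script is dynamic, however, the source alone does not reveal which site actually serves the ad. To find out, open the browser's developer tools (the Sources panel) and locate the script structure that corresponds to the ad banner.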

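    Figure 3

            Then look at the URL produced once this dynamic script has finished running. As the figure below shows, the ad URL points to googleads.g.doubleclick.net, and the domain name indicates that it is a Google ad.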
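The host check described above can be sketched in a few lines of Python. This is only an illustration: the `AD_DOMAINS` tuple and the function name `ad_network_of` are made up for the example, and the only domain taken from the article is doubleclick.net.

```python
from typing import Optional
from urllib.parse import urlsplit

# Only doubleclick.net comes from the article; extend the tuple as needed.
AD_DOMAINS = ("doubleclick.net",)


def ad_network_of(url: str) -> Optional[str]:
    """Return the ad domain that the URL's host belongs to, or None."""
    host = (urlsplit(url).hostname or "").lower()
    for domain in AD_DOMAINS:
        # Match the domain itself or any subdomain, but not look-alike hosts.
        if host == domain or host.endswith("." + domain):
            return domain
    return None


# The path and query below are placeholders, not taken from the article.
print(ad_network_of("https://googleads.g.doubleclick.net/some/path?x=1"))
print(ad_network_of("https://www.jd.com/"))
```

Comparing against the parsed hostname, rather than searching the whole URL string, avoids false matches when an ad domain merely appears in a path or query parameter.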

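    Figure 4

            Indeed, DoubleClick is an Internet advertising company that Google acquired in 2008. It offers a range of ad management and ad serving solutions that help businesses buy, create, or sell online ads, and that let users plan, run, monitor, and track online ad campaigns centrally. From this we can sketch the architecture of Google's online advertising platform.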

     

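    Figure 5

    The whole flow follows the steps numbered 1-5 in the figure.

    1 Advertisers who want to run ads sign up and register with DoubleClick;

    2 Sites that join the ad network obtain from DoubleClick a dynamic script for embedding ads, similar to the one shown in Figure 2, and embed this code in their pages;

    3 Internet users visit the page in a browser; the dynamic script runs in the user's browser and obtains a URL pointing to DoubleClick;

    4 When the browser connects to DoubleClick, DoubleClick generates a unique identifier for the user and writes it to a local cookie file;

    5 From then on, every time we visit a page that contains the ad script, the DoubleClick cookie is read automatically and DoubleClick selects a suitable ad. In this way each person's unique identity ends up recorded in its database. This step clearly draws on the behavioral data from our ad clicks and page views, which is a massive data set. Precise ad targeting therefore requires big data mining and user profiling.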
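Steps 4 and 5 can be sketched as a toy, in-memory ad server. Everything here is an assumption made for illustration: the cookie name `uid`, the `ToyAdServer` class, the ad names, and the "most frequently seen category" rule are hypothetical and say nothing about how DoubleClick actually stores identifiers or picks ads.

```python
import uuid
from collections import Counter, defaultdict
from http.cookies import SimpleCookie

COOKIE_NAME = "uid"  # hypothetical cookie name, not DoubleClick's real one


class ToyAdServer:
    """In-memory stand-in for an ad server that tracks visitors by cookie."""

    def __init__(self, ads_by_category, default_ad):
        self.ads_by_category = ads_by_category
        self.default_ad = default_ad
        self.history = defaultdict(Counter)  # visitor id -> category counts

    def handle(self, cookie_header, page_category):
        """Serve one ad request; return (ad, Set-Cookie value or None)."""
        cookie = SimpleCookie(cookie_header or "")
        set_cookie = None
        if COOKIE_NAME in cookie:
            # Step 5: a returning browser sends the identifier back.
            visitor = cookie[COOKIE_NAME].value
        else:
            # Step 4: first contact, so generate a unique identifier.
            visitor = uuid.uuid4().hex
            set_cookie = f"{COOKIE_NAME}={visitor}; Path=/"
        seen = self.history[visitor]
        if seen:
            # Pick the ad for the category this visitor has viewed most.
            top_category = seen.most_common(1)[0][0]
            ad = self.ads_by_category.get(top_category, self.default_ad)
        else:
            ad = self.default_ad
        seen[page_category] += 1  # record behavior for later requests
        return ad, set_cookie


server = ToyAdServer({"laptops": "laptop-ad"}, default_ad="generic-ad")
first_ad, set_cookie = server.handle(None, "laptops")       # shopping site
cookie_header = set_cookie.split(";")[0]                    # browser keeps it
second_ad, _ = server.handle(cookie_header, "news")         # unrelated site
print(first_ad, second_ad)
```

The second request comes from a page in a different category, yet the server can still answer with the laptop ad because the cookie links both requests to the same visitor record.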

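           Cookies play a major role in this flow, and almost every computer has a DoubleClick cookie file on it. For IE on Windows 7, it is usually under C:\Users\Administrator\AppData\Local\Microsoft\Windows\Temporary Internet Files; in Chrome, go to Chrome Settings -> Privacy -> Content settings. Once found, the cookie can be cleared.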

    Request URL:https://blackhole.m.jd.com/getinfo
    Request Method:POST
    Status Code:200 OK
    Remote Address:124.200.54.26:443
    Referrer Policy:no-referrer-when-downgrade
    Response Headers
    Access-Control-Allow-Origin:*
    Connection:keep-alive
    Content-Length:95
    Content-Type:text/plain
    Date:Sun, 28 Apr 2019 02:17:39 GMT
    Server:jfe
    Request Headers
    Provisional headers are shown
    Content-Type:application/x-www-form-urlencoded
    Origin:https://www.jd.com
    Referer:https://www.jd.com/
    User-Agent:Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36
    Form Data
    body:
        {"appname":"jdwebm_hf","jdkey":"","whwswswws":"","businness":"pcHome","body":{"browser_info":"5a6de6eb4239d72a591cd732fcf557bc","client_time":1556417862694,"period":24,"shshshfpa":"93048091-c96c-4aff-40ec-0c8bb237d983-1556417862","whwswswws":"","cookie_pin":"","jdu":"1556417860906768591982","mba_muid":"","visitkey":"","msdk_version":"2.3.4","wid":"","language":"en-US","color_depth":24,"pixel_ratio":1,"resolution":"1280;800","available_resolution":"1227;800","session_storage":1,"local_storage":1,"indexed_db":1,"open_database":1,"cpu_class":"unknown","navigator_platform":"Win32","regular_plugins":"Chrome PDF Plugin::Portable Document Format::application/x-google-chrome-pdf~pdf;Chrome PDF Viewer::::application/pdf~pdf;Native Client::::application/x-nacl~,application/x-pnacl~","adblock":false,"touch_support":0,"app_code_name":"Mozilla","app_name":"Netscape","app_version":"5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36","cookie_enabled":true,"regular_mimetypes":"::::;::Portable Document Format::;::Native Client Executable::;::Portable Native Client Executable::","online":"unknown","hardwareConcurrency":2,"product":"Gecko","productSub":"20030107","vendor":"Google Inc.","vendorSub":"unknown","devicePixelRatio":1,"updateInterval":"unknown","orientationType":"landscape-primary","doNotTrack":0,"canvas":"canvas winding:yes~canvas 
fp:231bea7a22d38c7771b1fd991affdfc6","webgl":"fp:cba0abf4b20cd68cd9fb42b1524d0708~extensions:ANGLE_instanced_arrays;EXT_blend_minmax;EXT_color_buffer_half_float;EXT_frag_depth;EXT_shader_texture_lod;EXT_texture_filter_anisotropic;WEBKIT_EXT_texture_filter_anisotropic;OES_element_index_uint;OES_standard_derivatives;OES_texture_float;OES_texture_half_float;OES_texture_half_float_linear;OES_vertex_array_object;WEBGL_color_buffer_float;WEBGL_compressed_texture_s3tc;WEBKIT_WEBGL_compressed_texture_s3tc;WEBGL_debug_renderer_info;WEBGL_debug_shaders;WEBGL_depth_texture;WEBKIT_WEBGL_depth_texture;WEBGL_lose_context;WEBKIT_WEBGL_lose_context~aliased line width range:[1, 1]~aliased point size range:[1, 256]~alpha bits:8~antialiasing:yes~blue bits:8~depth bits:24~green bits:8~max anisotropy:16~max combined texture image units:20~max cube map texture size:4096~max fragment uniform vectors:221~max render buffer size:4096~max texture image units:16~max texture size:4096~max varying vectors:9~max vertex attribs:16~max vertex texture image units:4~max vertex uniform vectors:253~max viewport dims:[4096, 4096]~red bits:8~renderer:WebKit WebGL~shading language version:WebGL GLSL ES 1.0 (OpenGL ES GLSL ES 1.0 Chromium)~stencil bits:0~vendor:WebKit~version:WebGL 1.0 (OpenGL ES 2.0 Chromium)~unmasked vendor:Google Inc.~unmasked renderer:ANGLE (Mobile Intel(R) 4 Series Express Chipset Family Direct3D9Ex vs_3_0 ps_3_0)~vertex high float:23(127,127)~vertex medium float:23(127,127)~vertex low float:23(127,127)~fragment high float:23(127,127)~fragment medium float:23(127,127)~fragment low float:23(127,127)~vertex high int:0(24,24)~vertex medium int:0(24,24)~vertex low int:0(24,24)~fragment high int:0(24,24)~fragment medium int:0(24,24)~fragment low int:0(24,24)","device_memory":8,"is_headless_browser":0}}
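The captured request sends a form field named body whose value is JSON text, with the browser attributes nested under an inner "body" key. The sketch below decodes such a payload and shows, purely as an illustration, how a set of attributes could be reduced to one stable digest. The sample is an abbreviated copy of the capture; `decode_form` and `toy_fingerprint` are hypothetical helpers, and SHA-256 is an arbitrary choice, since the capture does not show how JD derives values such as `browser_info`.

```python
import hashlib
import json
from urllib.parse import parse_qs, urlencode

# Abbreviated copy of the captured payload: only a handful of fields are kept.
sample = {
    "appname": "jdwebm_hf",
    "body": {
        "language": "en-US",
        "color_depth": 24,
        "resolution": "1280;800",
        "navigator_platform": "Win32",
        "hardwareConcurrency": 2,
        "device_memory": 8,
    },
}

# Re-create the wire format: one form field named "body" holding JSON text.
form_body = urlencode({"body": json.dumps(sample)})


def decode_form(raw: str) -> dict:
    """Decode an x-www-form-urlencoded string and parse its JSON 'body' field."""
    fields = parse_qs(raw)
    return json.loads(fields["body"][0])


def toy_fingerprint(attrs: dict) -> str:
    """Hash the attributes in a canonical order (illustrative scheme only)."""
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


decoded = decode_form(form_body)
for name, value in decoded["body"].items():
    print(f"{name}: {value}")
print(toy_fingerprint(decoded["body"]))
```

None of these attributes identifies a person on its own, but the combination tends to stay the same across visits from one browser, which is why such a digest can act as an identifier even when no cookie is present.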

     When a uuid has already been saved:

    Request URL:https://floor.jd.com/user/hotwords/get?pin=&uuid=1550390246822668439123&callback=jsonpHotWords
    Request Method:GET
    Status Code:200 OK
    Remote Address:211.144.24.170:443
    Referrer Policy:no-referrer-when-downgrade
    Response Headers
    Connection:close
    Content-Encoding:gzip
    Content-Type:text/html; charset=utf-8
    Date:Sun, 28 Apr 2019 02:31:52 GMT
    Server:jfe
    Transfer-Encoding:chunked
    Vary:Accept-Encoding
    Request Headers
    Accept:*/*
    Accept-Encoding:gzip, deflate, sdch, br
    Accept-Language:zh-CN,zh;q=0.8
    Connection:keep-alive
    Cookie:shshshfpa=e27a5d69-e1c0-282d-ea23-9695b1e69510-1550390256; TrackID=1SQI5uj2G1r220dR6ifodPmI8KRO5dZs3OmsiX1SfPCYPCDefRrnEfXWxtXJXAoVZxMHISD56FXkht7-BTmb0iK9S9AT1-UppuX4Q7Pf0u1M; pinId=aR72BDnyHaxs0LHNbV6fLg; __jdv=122270672|direct|-|none|-|1556382041032; areaId=11; ipLoc-djd=11-799-0; PCSYCityID=1137; __jda=122270672.1550390246822668439123.1550390247.1556382041.1556417842.9; __jdb=122270672.3.1550390246822668439123|9.1556417842; __jdc=122270672; shshshfp=8e7ec1c7e67d1ed4944e451a3574168a; shshshsID=6c6284d43d4431f049e3a6f3152e5d03_3_1556418662038; shshshfpb=z8d2uPw45jLHaI6jyaglJIw%3D%3D; __jdu=1550390246822668439123
    Host:floor.jd.com
    Referer:https://www.jd.com/
    User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0
    Query String Parameters
    pin:
    uuid:1550390246822668439123
    callback:jsonpHotWords
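In this capture the uuid query parameter matches values already stored in the cookies. The following sketch checks that with the standard library, using the URL and two of the cookies copied from the request above. The meaning of the other dot-separated fields in `__jda` is not stated in the capture, so they are left uninterpreted.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit

# Values copied from the captured request; other cookies are omitted.
request_url = (
    "https://floor.jd.com/user/hotwords/get"
    "?pin=&uuid=1550390246822668439123&callback=jsonpHotWords"
)
cookie_header = (
    "__jda=122270672.1550390246822668439123.1550390247."
    "1556382041.1556417842.9; __jdu=1550390246822668439123"
)

# keep_blank_values retains the empty "pin" parameter (user not logged in).
query = parse_qs(urlsplit(request_url).query, keep_blank_values=True)
cookies = SimpleCookie(cookie_header)

uuid_param = query["uuid"][0]
jdu = cookies["__jdu"].value
jda_fields = cookies["__jda"].value.split(".")

print("pin is empty:", query["pin"][0] == "")
print("uuid equals __jdu:", uuid_param == jdu)
print("uuid equals second field of __jda:", uuid_param == jda_fields[1])
```

With pin empty, the saved uuid is the only visitor identifier in this request, so any personalization of the hot-words response would have to be keyed on it.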

     
     
     
     
  • Original article: https://www.cnblogs.com/rsapaper/p/10782334.html