• ClickHouse's windowFunnel (funnel analysis)


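    1. windowFunnel

    The official documentation describes the return value as follows:

    Returned value:Integer. The maximum number of consecutive triggered conditions from the chain within the sliding time window. All the chains in the selection are analyzed.
    In other words, the function returns an integer: the largest number of conditions from the chain that were triggered consecutively, in order, within the specified sliding window. Every event chain in the selected data is analyzed.
    A concrete example:
    Create the following table: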

    CREATE TABLE funnel.funnel_test
    (
        uid String,
        eventid String,
        eventTime UInt64
    )
    ENGINE = MergeTree
    PARTITION BY (uid, eventTime)
    ORDER BY (uid, eventTime)
    SETTINGS index_granularity = 8192

    The table has three columns:
    uid: user id
    eventid: event id
    eventTime: time at which the event occurred (Unix timestamp, in seconds)


    Insert the following rows as test data (uid, eventid, eventTime):
    uid1 event1 1551398404
    uid1 event2 1551398406
    uid1 event3 1551398408
    uid2 event2 1551398412
    uid2 event3 1551398415
    uid3 event3 1551398410
    uid3 event4 1551398413


    1. Sliding window of 4, condition chain event2 -> event3

    select uid,windowFunnel(4)(toDateTime(eventTime),eventid = 'event2',eventid = 'event3') as funnel from funnel_test group by uid;

    With a sliding window of 4 seconds and the condition chain event2 -> event3, the query above returns:
    uid funnel
    uid1 2
    uid2 2
    uid3 0


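    Let's see how this result comes about. First, all rows are grouped by uid and sorted by time (the sorting is done inside windowFunnel itself), which gives:
    uid1: (event1,1551398404) -> (event2,1551398406) -> (event3,1551398408)
    uid2: (event2,1551398412) -> (event3,1551398415)
    uid3: (event3,1551398410) -> (event4,1551398413)
    Among these sorted sequences, only uid1 and uid2 contain the chain event2 -> event3. Their time differences are 2 (1551398408 - 1551398406) and 3 (1551398415 - 1551398412) seconds, both within the 4-second window, so the chain is satisfied and both uid1 and uid2 return 2 (event2, event3). uid3 returns 0: it has no event2, so not even the first condition of the chain is met.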
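The grouping, sorting, and matching steps described above can be modeled in a few lines of Python. This is only a simplified sketch of the default matching mode, not ClickHouse's actual implementation: the names `ROWS`, `window_funnel`, and `funnel_by_uid` are made up for illustration, and the inclusive `<=` comparison against the window is an assumption chosen to match the results shown in this article.

```python
from collections import defaultdict

# Test rows from the article: (uid, eventid, eventTime in seconds).
ROWS = [
    ("uid1", "event1", 1551398404),
    ("uid1", "event2", 1551398406),
    ("uid1", "event3", 1551398408),
    ("uid2", "event2", 1551398412),
    ("uid2", "event3", 1551398415),
    ("uid3", "event3", 1551398410),
    ("uid3", "event4", 1551398413),
]


def window_funnel(window, events, chain):
    """Simplified model: how many leading steps of `chain` are reached in order
    within `window` seconds of the chain's first event.

    `events` is a list of (eventid, timestamp) pairs for a single user.
    """
    # start[i] is the timestamp of the first chain event for a partial chain
    # that has reached step i, or None if step i has not been reached.
    start = [None] * len(chain)
    for eventid, ts in sorted(events, key=lambda e: e[1]):
        # Walk the chain backwards so one row cannot advance two steps at once.
        for i in reversed(range(len(chain))):
            if eventid != chain[i]:
                continue
            if i == 0:
                start[0] = ts
            elif start[i - 1] is not None and ts <= start[i - 1] + window:
                start[i] = start[i - 1]
    reached = [i + 1 for i, s in enumerate(start) if s is not None]
    return max(reached, default=0)


def funnel_by_uid(window, chain):
    """Group ROWS by uid (like GROUP BY uid) and apply the model to each group."""
    grouped = defaultdict(list)
    for uid, eventid, ts in ROWS:
        grouped[uid].append((eventid, ts))
    return {uid: window_funnel(window, evs, chain)
            for uid, evs in sorted(grouped.items())}


print(funnel_by_uid(4, ["event2", "event3"]))
# {'uid1': 2, 'uid2': 2, 'uid3': 0}
```

Calling `funnel_by_uid` with a different window or chain makes it easy to experiment with other parameter combinations on the same test data.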

    2. Sliding window changed to 2

    select uid,windowFunnel(2)(toDateTime(eventTime),eventid = 'event2',eventid = 'event3') as funnel from funnel_test group by uid;

    From the sorted sequences above, the result is:
    uid funnel
    uid1 2
    uid2 1
    uid3 0
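    Why did uid2 drop to 1? For uid2 the gap between event2 and event3 is 3 seconds, which exceeds the 2-second window, so only the first condition (event2) is satisfied and the result is 1. For uid1 the gap is exactly 2 seconds, equal to the window; it still returns 2, which shows that the window boundary is inclusive.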
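The arithmetic behind this case can be checked directly. The snippet below is a small illustration, not ClickHouse code: `within_window` is a hypothetical helper, and its inclusive upper bound is an assumption that is consistent with uid1 still returning 2 when the gap equals the window.

```python
# Timestamps (seconds) of event2 and event3 for each user, from the test data.
pairs = {
    "uid1": (1551398406, 1551398408),
    "uid2": (1551398412, 1551398415),
}
window = 2


def within_window(first_ts, next_ts, window):
    # Assumed rule: the next event must not precede the first one and must
    # occur no later than first_ts + window (boundary included).
    return first_ts <= next_ts <= first_ts + window


for uid, (t2, t3) in pairs.items():
    print(uid, t3 - t2, within_window(t2, t3, window))
# uid1 2 True
# uid2 3 False
```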

    3. Sliding window of 4, condition chain changed to event3 -> event4

    select uid,windowFunnel(4)(toDateTime(eventTime),eventid = 'event3',eventid = 'event4') as funnel from funnel_test group by uid;

    The query returns:
    uid funnel
    uid1 1
    uid2 1
    uid3 2


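    uid1 and uid2 each have an event3 but no event4, so they only satisfy the first condition and return 1.
    uid3 has both event3 and event4, and the gap between them (3 seconds) is within the 4-second window, so uid3 returns 2.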

  • Original article: https://www.cnblogs.com/f-society/p/12505843.html