设计思路
基于输入的标签表达式转换成clickhouse的查询sql进行bitmap查询
1.1. 实体关系分析
同4.2.2
1.2 实现分析
标签查询:在标签服务前端,根据标签目录筛选标签,对标签进行组合,提交标签查询,返回查询结果。
查询语法定义:(基础标签 optA value optB 基础标签 optA value)optB [NOT] 派生标签
a) optA 取值‘>,<,=,>=,<=,IN’
b) optB 取值 ‘AND,OR’
c) value取值 数字,‘字符串’,[v1,v2,v3]
d) 基础标签必须配套optA value使用
e) 派生标签没有取值,可以通过前缀NOT表示取反
f) 优先级:()> optA > optB
g) 标签名称约束不允许包含optA、optB、NOT、空白符、()
举例:
派生标签:女工程师 :性别=‘女’ AND 职业=‘工程师’
查询:女工程师 OR((年龄<55 and 年龄>17)AND 收入 > 40000 AND 职业=‘医生’)
查询逻辑:
a) 派生标签转换成基础标签表达式。
(性别=‘女’ AND 职业=‘工程师’)OR((年龄<55 and 年龄>17)AND 收入 > 40000 AND 职业=‘医生’)
b) 解析表达式,生成clickhouse的SQL。
派生标签: ext_tag_male_engineer <======>(sex='F' and duty='engineer')
上述表达式为:
(ext_tag_male_engineer or ((age<55 and age>17) and salary>40000 and duty='doctor'))
即:
( (sex='F' and duty='engineer') or ((age<55 and age>17) and salary>40000 and duty='doctor') )
转成clickhouse语法
select bitmapToArray(bitmapOr(bitmapAnd(a, b), bitmapAnd(bitmapAnd(c, d), e))) as users
from (
select a AS j1,
groupBitmapMergeState(bitmap_value) AS a
from user_labels
where (label = 'sex')
AND (value = 'F')
)
inner join
(
select 1 AS j2
groupBitmapMergeState(bitmap_value) AS b
from user_labels
where (label = 'duty')
and (value = 'engineer')
)
on j1 = j2
inner join
(
select 1 AS j3,
groupBitmapMergeState(bitmap_value) AS c
from user_labels
where (label = 'age')
and ((value > 17) AND (value < 55))
) on j2 = j3
inner join
(
select 1 AS j4,
groupBitmapMergeState(bitmap_value) AS d
from user_labels
where (label = 'salary')
AND (value > 40000)
) on j3 = j4
inner join
(
select 1 AS j5
groupBitmapMergeState(bitmap_value) AS e
from user_labels
where (label = 'duty')
and (value = 'doctor')
) on j4 = j5
根据每一个基础标签的数据周期查询指定分区的数据。
根据标签间的关系进行bitmap操作;