• saiku执行速度慢


    使用saiku的过程中发现一个重要问题,速度慢!下面是跟踪和优化过程

    一、首先抓包,发现ajax请求:http://l-tdata2.tkt.cn6.qunar.com:8080/saiku/rest/saiku/api/query/execute

    里面的参数不少,下面是截屏

    二、看日志:发现了mdx语句

    WITH
    SET [~ROWS_create_date_create_date] AS
        {[create_date].[create_date].[2016-04-12]}
    SET [~ROWS_dimPartner_dimPartner] AS
        Hierarchize({{[dimPartner].[dimPartner].[All dimPartners]}, {[dimPartner].[dimPartner].[name].Members}})
    SET [~ROWS_in_track_in_track] AS
        {[in_track].[in_track].[All in_tracks]}
    SET [~ROWS_product_product] AS
        {[product].[product].[All products]}
    SET [~ROWS_self_self] AS
        {[self].[self].[All selfs]}
    SET [~ROWS_sight_sight] AS
        {[sight].[sight].[All sights]}
    SET [~ROWS_ticket_type_ticket_type] AS
        {[ticket_type].[ticket_type].[All ticket_types]}
    SET [~ROWS_order_status_order_status] AS
        {[order_status].[order_status].[All order_statuss]}
    SET [~ROWS_refund_status_refund_status] AS
        {[refund_status].[refund_status].[All refund_statuss]}
    SELECT
    NON EMPTY {[Measures].[money], [Measures].[quantity], [Measures].[qunar_income], [Measures].[order_num]} ON COLUMNS,
    NON EMPTY Order(NonEmptyCrossJoin([~ROWS_create_date_create_date], NonEmptyCrossJoin([~ROWS_dimPartner_dimPartner], NonEmptyCrossJoin([~ROWS_in_track_in_track], NonEmptyCrossJoin([~ROWS_product_product], NonEmptyCrossJoin([~ROWS_self_self], NonEmptyCrossJoin([~ROWS_sight_sight], NonEmptyCrossJoin([~ROWS_ticket_type_ticket_type], NonEmptyCrossJoin([~ROWS_order_status_order_status], [~ROWS_refund_status_refund_status])))))))), [Measures].[money], BDESC) ON ROWS
    FROM [com_order_detail_cube]
    2016-04-15 17:02:55,948 INFO  [org.saiku.datasources.connection.SaikuOlapConnection] Clearing cache
    2016-04-15 17:03:26,603 WARN  [mondrian.rolap.RolapSchema] Model is in legacy format
    2016-04-15 17:04:09,714 INFO  [org.saiku.datasources.connection.SaikuOlapConnection] Catalogs:1
    2016-04-15 17:06:03,660 DEBUG [org.saiku.service.olap.ThinQueryService] Query End
    2016-04-15 17:06:03,661 INFO  [org.saiku.service.olap.ThinQueryService] RUN#:84 Size: 14/7      Execute:        190420ms        Format: 0ms     Totals: 0ms      Total: 190420ms

    观察日志,发现前端一直执行不返回。分析主要原因是执行mdx需要很长时间,190秒

    3、找代码:org.saiku.web.rest.resources.Query2Resource的execute方法

    继续追踪代码:org.saiku.service.olap.ThinQueryService的execute方法()。下面是核心重点:

        private CellDataSet execute(ThinQuery tq, ICellSetFormatter formatter) {
            try {
    
                Long start = (new Date()).getTime();
                log.debug("Query Start");
                CellSet cellSet =  executeInternalQuery(tq); //这是执行mdx语句的地方,需要较长时间
                log.debug("Query End");
                String runId = "RUN#:" + ID_GENERATOR.get();
                Long exec = (new Date()).getTime();
    
                CellDataSet result = OlapResultSetUtil.cellSet2Matrix(cellSet,formatter);
                Long format = (new Date()).getTime();
    
                if (ThinQuery.Type.QUERYMODEL.equals(tq.getType()) && formatter instanceof FlattenedCellSetFormatter && tq.hasAggregators()) {
                    calculateTotals(tq, result, cellSet, formatter);
                }
                Long totals = (new Date()).getTime();
                log.info(runId + "	Size: " + result.getWidth() + "/" + result.getHeight() + "	Execute:	" + (exec - start)
                        + "ms	Format:	" + (format - exec) + "ms	Totals:	" + (totals - format) + "ms	 Total: " + (totals - start) + "ms");
    
                result.setRuntime(new Double(format - start).intValue());
                return result;
            } catch (Exception | Error e) {
                throw new SaikuServiceException("Can't execute query: " + tq.getName(),e);
            }
        }

    4、查看数据执行的sql,看看为什么执行的很慢

    4.1 选择情况

    首先任何的筛选都是对立方体内的字段进行全表的扫描,比如我的立方体对应的数据表是:com_order_detail_view,时间对应的字段是create_date,那么选择时间的时候,捕获执行的sql如下:

     select "com_order_detail_view"."create_date" as "c0" from "com_order_detail_view" as "com_order_detail_view" group by "com_order
    _detail_view"."create_date" order by "com_order_detail_view"."create_date" ASC NULLS LAST

    发现根本没有where条件。好吧,这个可以理解!

    4.2 执行情况

    筛选的时候,为了提升效率,选择了一个日期,并且只是选择了name字段作为区分。执行时间:190s

    抓取的sql如下:

    4.2.1

    select "dim_partner"."name" as "c0", sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view", "dim_partner" as "dim_partner" where "com_order_detail_view"."partner" = "dim_partner"."code" group by "dim_partner"."name"

    4.2.2

     select sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view"

    没有发现where条件。猜测可能是选择日期没有在过滤条件里面,所以全表扫描,那么将日期放入过滤条件,mdx被修改为:

    WITH
    SET [~FILTER] AS
        {[create_date].[create_date].[2016-04-01]}
    SET [~ROWS_dimPartner_dimPartner] AS
        Hierarchize({{[dimPartner].[dimPartner].[All dimPartners]}, {[dimPartner].[dimPartner].[name].Members}})
    SET [~ROWS_in_track_in_track] AS
        {[in_track].[in_track].[All in_tracks]}
    SET [~ROWS_product_product] AS
        {[product].[product].[All products]}
    SET [~ROWS_self_self] AS
        {[self].[self].[All selfs]}
    SET [~ROWS_sight_sight] AS
        {[sight].[sight].[All sights]}
    SET [~ROWS_ticket_type_ticket_type] AS
        {[ticket_type].[ticket_type].[All ticket_types]}
    SET [~ROWS_order_status_order_status] AS
        {[order_status].[order_status].[All order_statuss]}
    SET [~ROWS_refund_status_refund_status] AS
        {[refund_status].[refund_status].[All refund_statuss]}
    SELECT
    NON EMPTY {[Measures].[money], [Measures].[quantity], [Measures].[qunar_income], [Measures].[order_num]} ON COLUMNS,
    NON EMPTY Order(NonEmptyCrossJoin([~ROWS_dimPartner_dimPartner], NonEmptyCrossJoin([~ROWS_in_track_in_track], NonEmptyCrossJoin([~ROWS_product_product], NonEmptyCrossJoin([~ROWS_self_self], NonEmptyCrossJoin([~ROWS_sight_sight], NonEmptyCrossJoin([~ROWS_ticket_type_ticket_type], NonEmptyCrossJoin([~ROWS_order_status_order_status], [~ROWS_refund_status_refund_status]))))))), [Measures].[money], BDESC) ON ROWS
    FROM [com_order_detail_cube]
    WHERE [~FILTER]
    2016-04-15 17:13:02,448 DEBUG [org.saiku.service.olap.ThinQueryService] Query End
    2016-04-15 17:13:02,449 INFO  [org.saiku.service.olap.ThinQueryService] RUN#:86 Size: 13/8      Execute:        20679ms Format: 1ms     Totals: 0ms      Total: 20680ms

    发现有了效果,执行时间:20s。下面是抓取的sql

    4.2.3

    select "com_order_detail_view"."create_date" as "c0", "dim_partner"."name" as "c1", sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view", "dim_partner" as "dim_partner" where "com_order_detail_view"."create_date" = DATE '2016-04-01' and "com_order_detail_view"."partner" = "dim_partner"."code" group by "com_order_detail_view"."create_date", "dim_partner"."name"

    4.2.4

    select "com_order_detail_view"."create_date" as "c0", sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view" where "com_order_detail_view"."create_date" = DATE '2016-04-01' group by "com_order_detail_view"."create_date"

    总结:使用saiku的时候,将时间条件放在《行》或者《列》里面,基本不起作用。最好放入在《过滤》里面

  • 相关阅读:
    深入理解TCP协议及其源代码
    Socket与系统调用深度分析
    构建调试Linux内核网络代码的环境MenuOS系统
    创新产品的需求分析:未来的图书会是什么样子?
    socket通信的原理与实践
    案例分析:设计模式与代码的结构特性
    网络相关的命令工具研究报告:ssh
    业务领域建模Domain Modeling
    ubuntu 设置静态ip,但显示scope global secondary ens33
    解决Python查询Mysql中文乱码问题
  • 原文地址:https://www.cnblogs.com/liqiu/p/5396169.html
Copyright © 2020-2023  润新知