• Skywalking-06: OAL Basics


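    OAL Fundamentals

    Introduction

    OAL (Observability Analysis Language) is a language for analyzing streaming data.

    Because OAL focuses on metrics for Service, Service Instance, and Endpoint, it is easy to learn and use.

    OAL uses ANTLR and Javassist to turn OAL scripts into dynamically generated class files.

    Since version 6.3, the OAL engine has been embedded in the OAP server and can be regarded as oal-rt (OAL Runtime). OAL scripts are located in the OAL configuration directory (/config/oal); users can modify a script and restart the server for the change to take effect. Note: OAL is still a compiled language; oal-rt generates Java code dynamically.

    If you set the environment variable SW_OAL_ENGINE_DEBUG=Y, the generated class files can be found in the oal-rt directory under the working directory.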

    Syntax

    // Declare a metric
    METRICS_NAME = from(SCOPE.(* | [FIELD][,FIELD ...])) // read data from a SCOPE
    [.filter(FIELD OP [INT | STRING])] // optionally filter out some of the data
    .FUNCTION([PARAM][, PARAM ...]) // aggregate the data with an aggregation function
    
    // Disable a metric
    disable(METRICS_NAME);
    

    Syntax example

    oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal

    // Read the used field of ServiceInstanceJVMMemory, keep only the records whose heapStatus is true, and take the long average
    instance_jvm_memory_heap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg();
    

    org.apache.skywalking.oap.server.core.source.ServiceInstanceJVMMemory

    @ScopeDeclaration(id = SERVICE_INSTANCE_JVM_MEMORY, name = "ServiceInstanceJVMMemory", catalog = SERVICE_INSTANCE_CATALOG_NAME)
    @ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
    public class ServiceInstanceJVMMemory extends Source {
        @Override
        public int scope() {
            return DefaultScopeDefine.SERVICE_INSTANCE_JVM_MEMORY;
        }
    
        @Override
        public String getEntityId() {
            return String.valueOf(id);
        }
    
        @Getter @Setter
        private String id;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
        private String name;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
        private String serviceName;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
        private String serviceId;
        @Getter @Setter
        private boolean heapStatus;
        @Getter @Setter
        private long init;
        @Getter @Setter
        private long max;
        @Getter @Setter
        private long used;
        @Getter @Setter
        private long committed;
    }
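
To make the semantics of the statement above concrete, here is a minimal plain-Java sketch of what `from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg()` computes for one time bucket. `LongAvgSketch` and `MemorySample` are hypothetical names used only for illustration; they are not the classes that oal-rt generates, and the real metrics classes also handle persistence and merging across nodes.

```java
import java.util.List;

class LongAvgSketch {
    // Hypothetical stand-in for a ServiceInstanceJVMMemory record; only the
    // two fields referenced by the OAL statement are modelled.
    static class MemorySample {
        final boolean heapStatus;
        final long used;

        MemorySample(boolean heapStatus, long used) {
            this.heapStatus = heapStatus;
            this.used = used;
        }
    }

    private long summation;
    private long count;

    // Counterpart of .filter(heapStatus == true): non-matching samples are skipped.
    void accept(MemorySample sample) {
        if (!sample.heapStatus) {
            return;
        }
        summation += sample.used;
        count++;
    }

    // Counterpart of .longAvg(): integer average of the accepted samples.
    long value() {
        return count == 0 ? 0 : summation / count;
    }

    public static void main(String[] args) {
        LongAvgSketch heap = new LongAvgSketch();
        List<MemorySample> samples = List.of(
                new MemorySample(true, 100),
                new MemorySample(false, 900), // non-heap sample, filtered out
                new MemorySample(true, 300));
        for (MemorySample s : samples) {
            heap.accept(s);
        }
        System.out.println(heap.value()); // (100 + 300) / 2
    }
}
```

The sketch keeps a running sum and a count rather than the individual samples, which is what allows a streaming aggregation to run in constant memory.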
    

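    Official documentation for reference: Observability Analysis Language

    Analyzing how OAL works, starting from a case study

    The missing class-loading monitoring

    The default APM/Instance page lacks information about JVM classes (see the screenshot below), so this case study fills in that information and uses the work to analyze how OAL operates.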

    [Screenshot: the default APM/Instance page, without JVM Class metrics]

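    Skywalking-04: Extending Metric Monitoring Information covered how to add metrics when the Source class already exists.

    This time we define everything ourselves, including the Source class and the OAL lexer and parser keywords.

    Official documentation for reference: Source and Scope extension for new metrics

    Deciding which metrics to add

    Based on the article "Java ManagementFactory Explained", the metrics to monitor are the number of currently loaded classes, the number of unloaded classes, and the total number of classes loaded.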

    ClassLoadingMXBean classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
    // Number of classes currently loaded
    int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
    // Number of classes unloaded so far
    long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
    // Total number of classes loaded since the JVM started
    long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
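
The three calls above can be tried directly in any JVM. The following is a small runnable sketch that takes one snapshot of the counters; the class name `ClassCountSnapshot` is made up for this example and is not part of SkyWalking.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

class ClassCountSnapshot {
    final long loaded;
    final long unloaded;
    final long totalLoaded;

    ClassCountSnapshot(long loaded, long unloaded, long totalLoaded) {
        this.loaded = loaded;
        this.unloaded = unloaded;
        this.totalLoaded = totalLoaded;
    }

    // Reads the three counters from the platform ClassLoadingMXBean.
    // The reads are not atomic, so the values may be slightly out of step
    // with each other while classes are being loaded concurrently.
    static ClassCountSnapshot take() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long loaded = bean.getLoadedClassCount();       // int, widened to long
        long unloaded = bean.getUnloadedClassCount();
        long totalLoaded = bean.getTotalLoadedClassCount();
        return new ClassCountSnapshot(loaded, unloaded, totalLoaded);
    }

    public static void main(String[] args) {
        ClassCountSnapshot s = take();
        System.out.printf("loaded=%d unloaded=%d total=%d%n",
                s.loaded, s.unloaded, s.totalLoaded);
    }
}
```

Widening `getLoadedClassCount()` to `long` here mirrors the article's protocol definition, where all three fields are declared as int64.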
    

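    Define the message exchanged between the agent and the OAP server

    Add the following definitions to the protocol file apm-protocol/apm-network/src/main/proto/language-agent/JVMMetric.proto.

    Running mvn clean package -DskipTests=true in the apm-protocol/apm-network directory generates the corresponding Java classes; org.apache.skywalking.apm.network.language.agent.v3.Class is the class we actually work with in code.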

    message Class {
      int64 loadedClassCount = 1;
      int64 unloadedClassCount = 3;
      int64 totalLoadedClassCount = 2;
    }
    
    message JVMMetric {
        int64 time = 1;
        CPU cpu = 2;
        repeated Memory memory = 3;
        repeated MemoryPool memoryPool = 4;
        repeated GC gc = 5;
        Thread thread = 6;
        // Add the Class message to the JVM metrics
        Class clazz = 7;
    }
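
Note that the Class message above numbers its fields 1, 3, 2 rather than in declaration order. That is valid, because on the wire a protobuf field is identified by its number, not by its position in the .proto file. The sketch below hand-encodes a single int64 field the way the protobuf wire format specifies (key = field number shifted left by 3, combined with wire type 0 for varint). `VarintFieldSketch` is a hypothetical helper written for illustration; real code should use the generated classes instead.

```java
import java.io.ByteArrayOutputStream;

class VarintFieldSketch {
    // Writes an unsigned base-128 varint, least significant 7-bit group first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // set the continuation bit
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one int64 field: key = (fieldNumber << 3) | wireType, wire type 0 = varint.
    static byte[] encodeInt64Field(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | 0);
        writeVarint(out, value);
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // unloadedClassCount is field 3 in the message above
        System.out.println(hex(encodeInt64Field(3, 5)));   // prints 1805
        // totalLoadedClassCount is field 2
        System.out.println(hex(encodeInt64Field(2, 300))); // prints 10ac02
    }
}
```

Because the number is what travels on the wire, renumbering a field after agents have been deployed would break compatibility between old agents and a new OAP server.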
    

    Collect the information in the agent and send it to the OAP server

    Collect the Class-related metrics

    package org.apache.skywalking.apm.agent.core.jvm.clazz;
    
    import org.apache.skywalking.apm.network.language.agent.v3.Class;
    
    import java.lang.management.ClassLoadingMXBean;
    import java.lang.management.ManagementFactory;
    
    public enum ClassProvider {
        /**
         * instance
         */
        INSTANCE;
    
        private final ClassLoadingMXBean classLoadingMXBean;
    
        ClassProvider() {
            this.classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
        }
    	
        // Build the class metrics
        public Class getClassMetrics() {
            int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
            long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
            long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
            return Class.newBuilder().setLoadedClassCount(loadedClassCount)
                    .setUnloadedClassCount(unloadedClassCount)
                    .setTotalLoadedClassCount(totalLoadedClassCount)
                    .build();
        }
    
    }
    

    In org.apache.skywalking.apm.agent.core.jvm.JVMService#run, set the class metrics on the JVM metric object

        @Override
        public void run() {
            long currentTimeMillis = System.currentTimeMillis();
            try {
                JVMMetric.Builder jvmBuilder = JVMMetric.newBuilder();
                jvmBuilder.setTime(currentTimeMillis);
                jvmBuilder.setCpu(CPUProvider.INSTANCE.getCpuMetric());
                jvmBuilder.addAllMemory(MemoryProvider.INSTANCE.getMemoryMetricList());
                jvmBuilder.addAllMemoryPool(MemoryPoolProvider.INSTANCE.getMemoryPoolMetricsList());
                jvmBuilder.addAllGc(GCProvider.INSTANCE.getGCList());
                jvmBuilder.setThread(ThreadProvider.INSTANCE.getThreadMetrics());
                // Set the class metrics
                jvmBuilder.setClazz(ClassProvider.INSTANCE.getClassMetrics());
                // Put the JVM metrics into the blocking queue;
                // org.apache.skywalking.apm.agent.core.jvm.JVMMetricsSender#run sends them to the OAP server
                sender.offer(jvmBuilder.build());
            } catch (Exception e) {
                LOGGER.error(e, "Collect JVM info fail.");
            }
        }
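
The run method above only hands the metric to a queue; a separate sender drains that queue and ships the data. A minimal sketch of this hand-off, assuming a bounded queue, could look as follows. `MetricBufferSketch` is a hypothetical class, and the drop-oldest behaviour on overflow is an assumption of this sketch rather than a statement about what JVMMetricsSender does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

class MetricBufferSketch<T> {
    private final LinkedBlockingQueue<T> queue;

    MetricBufferSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Called by the collecting thread. It never blocks: when the buffer is
    // full, the oldest entry is discarded to make room (sketch assumption).
    void offer(T metric) {
        while (!queue.offer(metric)) {
            queue.poll();
        }
    }

    // Called by the sending thread: removes everything buffered so far
    // and returns it as one batch.
    List<T> drain() {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        MetricBufferSketch<String> buffer = new MetricBufferSketch<>(2);
        buffer.offer("t1");
        buffer.offer("t2");
        buffer.offer("t3"); // buffer is full, so "t1" is dropped
        System.out.println(buffer.drain()); // prints [t2, t3]
    }
}
```

Decoupling collection from sending this way keeps a slow or unreachable backend from stalling the periodic collection task.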
    

    Create the Source class

    public class DefaultScopeDefine {
        public static final int SERVICE_INSTANCE_JVM_CLASS = 11000;
    
        /** Catalog of scope, the metrics processor could use this to group all generated metrics by oal rt. */
        public static final String SERVICE_INSTANCE_CATALOG_NAME = "SERVICE_INSTANCE";
    }
    
    package org.apache.skywalking.oap.server.core.source;
    
    import lombok.Getter;
    import lombok.Setter;
    
    import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_CATALOG_NAME;
    import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_JVM_CLASS;
    
    @ScopeDeclaration(id = SERVICE_INSTANCE_JVM_CLASS, name = "ServiceInstanceJVMClass", catalog = SERVICE_INSTANCE_CATALOG_NAME)
    @ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
    public class ServiceInstanceJVMClass extends Source {
        @Override
        public int scope() {
            return SERVICE_INSTANCE_JVM_CLASS;
        }
    
        @Override
        public String getEntityId() {
            return String.valueOf(id);
        }
    
        @Getter @Setter
        private String id;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
        private String name;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
        private String serviceName;
        @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
        private String serviceId;
        @Getter @Setter
        private long loadedClassCount;
        @Getter @Setter
        private long unloadedClassCount;
        @Getter @Setter
        private long totalLoadedClassCount;
    }
    

    Send the information received from the agent to the SourceReceiver

    Modify org.apache.skywalking.oap.server.analyzer.provider.jvm.JVMSourceDispatcher as follows

        public void sendMetric(String service, String serviceInstance, JVMMetric metrics) {
            long minuteTimeBucket = TimeBucket.getMinuteTimeBucket(metrics.getTime());
    
            final String serviceId = IDManager.ServiceID.buildId(service, NodeType.Normal);
            final String serviceInstanceId = IDManager.ServiceInstanceID.buildId(serviceId, serviceInstance);
    
            this.sendToCpuMetricProcess(
                service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getCpu());
            this.sendToMemoryMetricProcess(
                service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getMemoryList());
            this.sendToMemoryPoolMetricProcess(
                service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getMemoryPoolList());
            this.sendToGCMetricProcess(
                service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getGcList());
            this.sendToThreadMetricProcess(
                    service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getThread());
            // Process the class metrics
            this.sendToClassMetricProcess(
                    service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getClazz());
        }
    
        private void sendToClassMetricProcess(String service,
                String serviceId,
                String serviceInstance,
                String serviceInstanceId,
                long timeBucket,
                Class clazz) {
            // Assemble the Source object
            ServiceInstanceJVMClass serviceInstanceJVMClass = new ServiceInstanceJVMClass();
            serviceInstanceJVMClass.setId(serviceInstanceId);
            serviceInstanceJVMClass.setName(serviceInstance);
            serviceInstanceJVMClass.setServiceId(serviceId);
            serviceInstanceJVMClass.setServiceName(service);
            serviceInstanceJVMClass.setLoadedClassCount(clazz.getLoadedClassCount());
            serviceInstanceJVMClass.setUnloadedClassCount(clazz.getUnloadedClassCount());
            serviceInstanceJVMClass.setTotalLoadedClassCount(clazz.getTotalLoadedClassCount());
            serviceInstanceJVMClass.setTimeBucket(timeBucket);
            // Hand the Source object to the SourceReceiver for processing
            sourceReceiver.receive(serviceInstanceJVMClass);
        }
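
The value returned by scope() is what lets the receiving side decide which metric processors should see a given Source. The sketch below shows that routing idea in plain Java with a map from scope id to handlers. `ScopeDispatchSketch` and its nested `Source` are simplified, hypothetical stand-ins and do not reproduce the OAP server's actual receiver and dispatcher classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class ScopeDispatchSketch {
    // Hypothetical minimal Source: a scope id plus a single value.
    static class Source {
        final int scope;
        final long value;

        Source(int scope, long value) {
            this.scope = scope;
            this.value = value;
        }
    }

    private final Map<Integer, List<Consumer<Source>>> handlersByScope = new HashMap<>();

    // Registers a handler (for example, one generated metric) for a scope id.
    void register(int scope, Consumer<Source> handler) {
        handlersByScope.computeIfAbsent(scope, k -> new ArrayList<>()).add(handler);
    }

    // Forwards the source to every handler registered for its scope and
    // returns how many handlers received it.
    int receive(Source source) {
        List<Consumer<Source>> handlers =
                handlersByScope.getOrDefault(source.scope, List.of());
        for (Consumer<Source> handler : handlers) {
            handler.accept(source);
        }
        return handlers.size();
    }

    public static void main(String[] args) {
        ScopeDispatchSketch receiver = new ScopeDispatchSketch();
        // 11000 is the SERVICE_INSTANCE_JVM_CLASS id chosen in this article
        receiver.register(11000, s -> System.out.println("loaded classes: " + s.value));
        receiver.receive(new Source(11000, 4200L)); // handled
        receiver.receive(new Source(1, 7L));        // no handler registered, ignored
    }
}
```

This is also why the scope id has to be unique: two Source classes sharing an id would be routed to the same handlers.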
    

    Add the Source to the OAL lexer and parser definitions

    Define the Class keyword in oap-server/oal-grammar/src/main/antlr4/org/apache/skywalking/oal/rt/grammar/OALLexer.g4

    // Keywords
    
    FROM: 'from';
    FILTER: 'filter';
    DISABLE: 'disable';
    SRC_ALL: 'All';
    SRC_SERVICE: 'Service';
    SRC_SERVICE_INSTANCE: 'ServiceInstance';
    SRC_ENDPOINT: 'Endpoint';
    SRC_SERVICE_RELATION: 'ServiceRelation';
    SRC_SERVICE_INSTANCE_RELATION: 'ServiceInstanceRelation';
    SRC_ENDPOINT_RELATION: 'EndpointRelation';
    SRC_SERVICE_INSTANCE_JVM_CPU: 'ServiceInstanceJVMCPU';
    SRC_SERVICE_INSTANCE_JVM_MEMORY: 'ServiceInstanceJVMMemory';
    SRC_SERVICE_INSTANCE_JVM_MEMORY_POOL: 'ServiceInstanceJVMMemoryPool';
    SRC_SERVICE_INSTANCE_JVM_GC: 'ServiceInstanceJVMGC';
    SRC_SERVICE_INSTANCE_JVM_THREAD: 'ServiceInstanceJVMThread';
    SRC_SERVICE_INSTANCE_JVM_CLASS: 'ServiceInstanceJVMClass'; // Add the Class keyword to the OAL lexer definition
    SRC_DATABASE_ACCESS: 'DatabaseAccess';
    SRC_SERVICE_INSTANCE_CLR_CPU: 'ServiceInstanceCLRCPU';
    SRC_SERVICE_INSTANCE_CLR_GC: 'ServiceInstanceCLRGC';
    SRC_SERVICE_INSTANCE_CLR_THREAD: 'ServiceInstanceCLRThread';
    SRC_ENVOY_INSTANCE_METRIC: 'EnvoyInstanceMetric';
    

    Add the Class keyword to oap-server/oal-grammar/src/main/antlr4/org/apache/skywalking/oal/rt/grammar/OALParser.g4

    source
        : SRC_ALL | SRC_SERVICE | SRC_DATABASE_ACCESS | SRC_SERVICE_INSTANCE | SRC_ENDPOINT |
          SRC_SERVICE_RELATION | SRC_SERVICE_INSTANCE_RELATION | SRC_ENDPOINT_RELATION |
          SRC_SERVICE_INSTANCE_JVM_CPU | SRC_SERVICE_INSTANCE_JVM_MEMORY | SRC_SERVICE_INSTANCE_JVM_MEMORY_POOL | 
          SRC_SERVICE_INSTANCE_JVM_GC | SRC_SERVICE_INSTANCE_JVM_THREAD | SRC_SERVICE_INSTANCE_JVM_CLASS | // Add the keyword defined in the lexer to the OAL parser rule
          SRC_SERVICE_INSTANCE_CLR_CPU | SRC_SERVICE_INSTANCE_CLR_GC | SRC_SERVICE_INSTANCE_CLR_THREAD |
          SRC_ENVOY_INSTANCE_METRIC |
          SRC_BROWSER_APP_PERF | SRC_BROWSER_APP_PAGE_PERF | SRC_BROWSER_APP_SINGLE_VERSION_PERF |
          SRC_BROWSER_APP_TRAFFIC | SRC_BROWSER_APP_PAGE_TRAFFIC | SRC_BROWSER_APP_SINGLE_VERSION_TRAFFIC
        ;
    

    Running mvn clean package -DskipTests=true in the oap-server/oal-grammar directory generates the corresponding Java classes.
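
To see why the source name has to be registered as a keyword, consider the following hand-rolled sketch. It uses a regular expression rather than ANTLR, covers only the simplest statement shape, and treats a fixed set of names as the known sources; a statement that refers to any other source is rejected, much as the grammar rejects a source it has no token for. `OalStatementSketch` and its `SOURCES` set are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OalStatementSketch {
    // Stand-in for the SRC_* keywords; only a few are listed here.
    static final Set<String> SOURCES = Set.of(
            "ServiceInstanceJVMMemory",
            "ServiceInstanceJVMThread",
            "ServiceInstanceJVMClass");

    // METRICS_NAME = from(SOURCE.FIELD)[.filter(...)].FUNCTION();
    static final Pattern STATEMENT = Pattern.compile(
            "^\\s*(\\w+)\\s*=\\s*from\\((\\w+)\\.(\\w+|\\*)\\)"
            + "(?:\\.filter\\(([^)]*)\\))?\\.(\\w+)\\(\\)\\s*;\\s*$");

    // Returns {metricName, source, field, function} or throws if the
    // statement does not match or names an unknown source.
    static String[] parse(String line) {
        Matcher m = STATEMENT.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("not an OAL metric statement: " + line);
        }
        if (!SOURCES.contains(m.group(2))) {
            throw new IllegalArgumentException("unknown source: " + m.group(2));
        }
        return new String[] {m.group(1), m.group(2), m.group(3), m.group(5)};
    }

    public static void main(String[] args) {
        String line = "instance_jvm_class_loaded_class_count = "
                + "from(ServiceInstanceJVMClass.loadedClassCount).longAvg();";
        System.out.println(Arrays.toString(parse(line)));
    }
}
```

The real toolchain differs in that the keyword list lives in the grammar files, so adding a source means regenerating the lexer and parser rather than editing a set at runtime.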

    Define the OAL metrics

    Add the Class-related metric definitions, written in OAL syntax, to oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal

    // Number of classes currently loaded
    instance_jvm_class_loaded_class_count = from(ServiceInstanceJVMClass.loadedClassCount).longAvg();
    // Number of classes unloaded so far
    instance_jvm_class_unloaded_class_count = from(ServiceInstanceJVMClass.unloadedClassCount).longAvg();
    // Total number of classes loaded since the JVM started
    instance_jvm_class_total_loaded_class_count = from(ServiceInstanceJVMClass.totalLoadedClassCount).longAvg();
    

    Configure the UI dashboard

    Import the following dashboard configuration into the APM dashboard

    {
      "name": "Instance",
      "children": [{
          "width": "3",
          "title": "Service Instance Load",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_cpm",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "CPM - calls per minute"
        },
        {
          "width": 3,
          "title": "Service Instance Throughput",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_throughput_received,service_instance_throughput_sent",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "Bytes"
        },
        {
          "width": "3",
          "title": "Service Instance Successful Rate",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_sla",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "%",
          "aggregation": "/",
          "aggregationNum": "100"
        },
        {
          "width": "3",
          "title": "Service Instance Latency",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "service_instance_resp_time",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "ms"
        },
        {
          "width": 3,
          "title": "JVM CPU (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_cpu",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "%",
          "aggregation": "+",
          "aggregationNum": ""
        },
        {
          "width": 3,
          "title": "JVM Memory (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_memory_heap, instance_jvm_memory_heap_max,instance_jvm_memory_noheap, instance_jvm_memory_noheap_max",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "MB",
          "aggregation": "/",
          "aggregationNum": "1048576"
        },
        {
          "width": 3,
          "title": "JVM GC Time",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_young_gc_time, instance_jvm_old_gc_time",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "ms"
        },
        {
          "width": 3,
          "title": "JVM GC Count",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartBar",
          "metricName": "instance_jvm_young_gc_count, instance_jvm_old_gc_count"
        },
        {
          "width": 3,
          "title": "JVM Thread Count (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "metricName": "instance_jvm_thread_live_count, instance_jvm_thread_daemon_count, instance_jvm_thread_peak_count,instance_jvm_thread_deadlocked,instance_jvm_thread_monitor_deadlocked"
        },
        {
          "width": 3,
          "title": "JVM Thread State Count (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_thread_new_thread_count,instance_jvm_thread_runnable_thread_count,instance_jvm_thread_blocked_thread_count,instance_jvm_thread_wait_thread_count,instance_jvm_thread_time_wait_thread_count,instance_jvm_thread_terminated_thread_count",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartBar"
        },
        {
          "width": 3,
          "title": "JVM Class Count (Java Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_jvm_class_loaded_class_count,instance_jvm_class_unloaded_class_count,instance_jvm_class_total_loaded_class_count",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartArea"
        },
        {
          "width": 3,
          "title": "CLR CPU  (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_clr_cpu",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "%"
        },
        {
          "width": 3,
          "title": "CLR GC (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_clr_gen0_collect_count, instance_clr_gen1_collect_count, instance_clr_gen2_collect_count",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartBar"
        },
        {
          "width": 3,
          "title": "CLR Heap Memory (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "metricName": "instance_clr_heap_memory",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "unit": "MB",
          "aggregation": "/",
          "aggregationNum": "1048576"
        },
        {
          "width": 3,
          "title": "CLR Thread (.NET Service)",
          "height": "250",
          "entityType": "ServiceInstance",
          "independentSelector": false,
          "metricType": "REGULAR_VALUE",
          "queryMetricType": "readMetricsValues",
          "chartType": "ChartLine",
          "metricName": "instance_clr_available_completion_port_threads,instance_clr_available_worker_threads,instance_clr_max_completion_port_threads,instance_clr_max_worker_threads"
        }
      ]
    }
    

    Verifying the result

    The imported dashboard now shows the Class-related metrics

    [Screenshot: the APM/Instance page showing the JVM Class Count panel]


  • Original article: https://www.cnblogs.com/switchvov/p/15146092.html