• Hadoop Metrics2


    Source: Hadoop Metrics2

       Metrics are collections of information about Hadoop daemons, events and measurements; for example, data nodes collect metrics such as the number of blocks replicated, number of read requests from clients, and so on. For that reason, metrics are an invaluable resource for monitoring Apache Hadoop services and an indispensable tool for debugging system problems. 

    This blog post focuses on the features and use of the Metrics2 system for Hadoop, which allows multiple metrics output plugins to be used in parallel, supports dynamic reconfiguration of metrics plugins, provides metrics filtering, and allows all metrics to be exported via JMX.

    Metrics vs. MapReduce Counters

    When speaking about metrics, a question about their relationship to MapReduce counters usually arises. The differences can be described in two ways: First, Hadoop daemons and services are generally the scope for metrics, whereas MapReduce applications are the scope for MapReduce counters (which are collected for MapReduce tasks and aggregated for the whole job). Second, whereas Hadoop administrators are the main audience for metrics, MapReduce users are the audience for MapReduce counters.

    Contexts and Prefixes

    For organizational purposes, metrics are grouped into named contexts – e.g., jvm for Java virtual machine metrics or dfs for distributed file system metrics. There are different sets of contexts supported by Hadoop-1 and Hadoop-2; the table below highlights the ones supported for each of them.

    Branch-1 contexts:

    – jvm
    – rpc
    – rpcdetailed
    – metricssystem
    – mapred
    – dfs
    – ugi

    Branch-2 contexts:

    – yarn
    – jvm
    – rpc
    – rpcdetailed
    – metricssystem
    – mapred
    – dfs
    – ugi
    A Hadoop daemon collects metrics in several contexts. For example, data nodes collect metrics for the “dfs”, “rpc” and “jvm” contexts. The daemons that collect different metrics in Hadoop (for Hadoop-1 and Hadoop-2) are listed below:

    Branch-1 daemons/prefixes:

    – namenode
    – datanode
    – jobtracker
    – tasktracker
    – maptask
    – reducetask

    Branch-2 daemons/prefixes:

    – namenode
    – secondarynamenode
    – datanode
    – resourcemanager
    – nodemanager
    – mrappmaster
    – maptask
    – reducetask
    System Design

    The Metrics2 framework is designed to collect and dispatch per-process metrics to monitor the overall status of the Hadoop system. Producers register the metrics sources with the metrics system, while consumers register the sinks. The framework marshals metrics from sources to sinks based on (per source/sink) configuration options. This design is depicted below.

    [Figure: Metrics2 system design: metrics sources feed the metrics system, which dispatches their metrics to the configured sinks]

    Here is an example class implementing the MetricsSource:

    import org.apache.hadoop.metrics2.MetricsCollector;
    import org.apache.hadoop.metrics2.MetricsSource;
    import static org.apache.hadoop.metrics2.lib.Interns.info;

    class MyComponentSource implements MetricsSource {
        @Override
        public void getMetrics(MetricsCollector collector, boolean all) {
            // Add a record named "MyComponentSource" in context "MyContext"
            // containing a single gauge metric.
            collector.addRecord("MyComponentSource")
                     .setContext("MyContext")
                     .addGauge(info("MyMetric", "My metric description"), 42);
        }
    }

    The “MyMetric” in the listing above could be, for example, the number of open connections for a specific server.
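    As a purely illustrative sketch (the class, method, and metric names below are hypothetical, not from the original example), such a source could report an open-connection count that the owning component maintains itself:

    import java.util.concurrent.atomic.AtomicInteger;
    import org.apache.hadoop.metrics2.MetricsCollector;
    import org.apache.hadoop.metrics2.MetricsSource;
    import static org.apache.hadoop.metrics2.lib.Interns.info;

    class ConnectionCountSource implements MetricsSource {
        // Counter updated by the component as connections open and close.
        private final AtomicInteger openConnections = new AtomicInteger();

        void connectionOpened() { openConnections.incrementAndGet(); }
        void connectionClosed() { openConnections.decrementAndGet(); }

        @Override
        public void getMetrics(MetricsCollector collector, boolean all) {
            // Report the current value as a gauge each time the system polls this source.
            collector.addRecord("ConnectionCountSource")
                     .setContext("MyContext")
                     .addGauge(info("OpenConnections", "Number of currently open connections"),
                               openConnections.get());
        }
    }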

    Here is an example class implementing the MetricsSink:

    import org.apache.commons.configuration.SubsetConfiguration;
    import org.apache.hadoop.metrics2.MetricsRecord;
    import org.apache.hadoop.metrics2.MetricsSink;

    public class MyComponentSink implements MetricsSink {
        @Override
        public void putMetrics(MetricsRecord record) {
            // Write each incoming metrics record to standard output.
            System.out.println(record);
        }
        public void init(SubsetConfiguration conf) {}
        public void flush() {}
    }

    To use the Metrics2 framework, the system needs to be initialized and the sources and sinks registered. Here is an example initialization:

    MetricsSystem ms = DefaultMetricsSystem.initialize("datanode");
    ms.register("source1", "source1 description", new MyComponentSource());
    ms.register("sink2", "sink2 description", new MyComponentSink());

    The Metrics2 framework uses PropertiesConfiguration from the Apache Commons Configuration library.

    Sinks are specified in a configuration file (e.g., “hadoop-metrics2-test.properties”), as:

    test.sink.mysink0.class=com.example.hadoop.metrics.MySink

    The general configuration syntax is:

    [prefix].[source|sink|jmx].[instance].[option]

    In the previous example, test is the prefix and mysink0 is an instance name. DefaultMetricsSystem would try to load hadoop-metrics2-[prefix].properties first and, if not found, the default hadoop-metrics2.properties in the classpath. Note that [instance] is an arbitrary name that uniquely identifies a particular sink instance. The asterisk (*) can be used to specify default options.

    Here is an example with inline comments to identify the different configuration sections:
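    A minimal sketch of such a file, assuming a FileSink instance named file0 attached to the namenode prefix; the instance name, file name, period, and context values are illustrative choices:

    # syntax: [prefix].[source|sink|jmx].[instance].[option]
    # "*" can be used in place of a prefix or instance to supply defaults.

    # Default sampling period (in seconds) for all sink instances.
    *.period=10

    # A file sink instance named "file0" for the namenode daemon.
    namenode.sink.file0.class=org.apache.hadoop.metrics2.sink.FileSink
    namenode.sink.file0.filename=namenode-metrics.out

    # Only dispatch records from the "dfs" context to this sink instance.
    namenode.sink.file0.context=dfs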

     Here is an example set of NodeManager metrics that are dumped into the NodeManager sink file:
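    The lines below are not taken from a real cluster; they are a hypothetical illustration of what a FileSink dump of NodeManager metrics can look like (timestamps, hostnames, and values are made up):

    1422490032353 jvm.JvmMetrics: Context=jvm, ProcessName=NodeManager, Hostname=node-1.example.com, MemHeapUsedM=24.5, GcCount=7, ThreadsRunnable=12
    1422490032353 yarn.NodeManagerMetrics: Context=yarn, Hostname=node-1.example.com, ContainersLaunched=3, ContainersRunning=2, AllocatedGB=4, AvailableGB=4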

     Each line starts with a timestamp, followed by the context and record name and then a name=value pair for each tag and metric in the record.

    Filtering

    By default, filtering can be done by source, context, record and metrics. More discussion of different filtering strategies can be found in the Javadoc and wiki.

    Example:
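    A minimal sketch of filtering configuration, assuming the glob-based filter class bundled with Metrics2 and the test prefix and file0 sink instance from the earlier examples; the include/exclude patterns are illustrative:

    # Use the glob-style filter implementation for source, record, and metric filters.
    *.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
    *.record.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
    *.metric.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter

    # Source-level filtering: keep sources matching foo*, but drop foo10.
    test.*.source.filter.include=foo*
    test.*.source.filter.exclude=foo10

    # Metric-level filtering for the sink instance "file0" only.
    test.sink.file0.metric.filter.exclude=MyMetric2*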

    Conclusion

    The Metrics2 system for Hadoop provides a gold mine of real-time and historical data that helps monitor and debug problems associated with Hadoop services and jobs.

    Ahmed Radwan is a software engineer at Cloudera, where he contributes to various platform tools and open-source projects.

  • Original link: https://www.cnblogs.com/wxquare/p/6511138.html