• [Original] Uncle's Experience Notes (21): Viewing the memory and CPU each application currently occupies in YARN


    On the application details page in YARN

    http://resourcemanager/cluster/app/$applicationId

    or through the application command

    yarn application -status $applicationId

    you can only see resource-time totals accumulated since the application started, for example:

    Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds
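    To make the unit concrete, here is a minimal sketch of the arithmetic behind MB-seconds: memory allocated (in MB) multiplied by the seconds it was held, summed over containers. The class name, method names and container sizes are hypothetical and chosen only for illustration. The point is that the aggregate yields at best an average over time, not the amount held right now.

```java
public class AggregateAllocation {
    // MB-seconds contributed by one container: allocated MB times seconds held.
    static long memorySeconds(long allocatedMb, long secondsHeld) {
        return allocatedMb * secondsHeld;
    }

    // Average MB held over a time window, derived from the aggregate for that window.
    static double averageMb(long mbSeconds, long elapsedSeconds) {
        return (double) mbSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        // Hypothetical application: two 2048 MB containers, held for 60 s and 30 s.
        long total = memorySeconds(2048, 60) + memorySeconds(2048, 30);
        System.out.println(total + " MB-seconds");
        // The aggregate only gives an average; it cannot tell whether the
        // application currently holds 4096 MB, 2048 MB, or nothing at all.
        System.out.println(averageMb(total, 60) + " MB on average over 60 s");
    }
}
```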

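    Neither view shows the application's current, real-time resource usage, such as how much memory and how many cores it holds right now. Tracing through the YARN code shows that this statistic does in fact exist: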

    org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport

      public static ApplicationResourceUsageReport newInstance(
          int numUsedContainers, int numReservedContainers, Resource usedResources,
          Resource reservedResources, Resource neededResources, long memorySeconds,
          long vcoreSeconds) {
        ApplicationResourceUsageReport report =
            Records.newRecord(ApplicationResourceUsageReport.class);
        report.setNumUsedContainers(numUsedContainers);
        report.setNumReservedContainers(numReservedContainers);
        report.setUsedResources(usedResources);
        report.setReservedResources(reservedResources);
        report.setNeededResources(neededResources);
        report.setMemorySeconds(memorySeconds);
        report.setVcoreSeconds(vcoreSeconds);
        return report;
      }

    Here usedResources is the current real-time resource usage, covering both memory and CPU. This statistic is returned through the YarnScheduler interface:

    org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler

      /**
       * Get a resource usage report from a given app attempt ID.
       * @param appAttemptId the id of the application attempt
       * @return resource usage report for this given attempt
       */
      @LimitedPrivate("yarn")
      @Evolving
      ApplicationResourceUsageReport getAppResourceUsageReport(
          ApplicationAttemptId appAttemptId);

    The getAppResourceUsageReport method is called by RMAppAttemptImpl.getApplicationResourceUsageReport:

    org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl

      @Override
      public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
        this.readLock.lock();
        try {
          ApplicationResourceUsageReport report =
              scheduler.getAppResourceUsageReport(this.getAppAttemptId());
          if (report == null) {
            report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
          }
          AggregateAppResourceUsage resUsage =
              this.attemptMetrics.getAggregateAppResourceUsage();
          report.setMemorySeconds(resUsage.getMemorySeconds());
          report.setVcoreSeconds(resUsage.getVcoreSeconds());
          return report;
        } finally {
          this.readLock.unlock();
        }
      }

    RMAppAttemptImpl.getApplicationResourceUsageReport is called from two places:

    First caller

    org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl

      public ApplicationReport createAndGetApplicationReport(String clientUserName,
          boolean allowAccess) {
    ...
              appUsageReport = currentAttempt.getApplicationResourceUsageReport();
    ...

    RMAppImpl.createAndGetApplicationReport is called by ClientRMService.getApplications and ClientRMService.getApplicationReport, which correspond to the following commands respectively

    yarn application -list
    yarn application -status $applicationId

    Neither command displays usedResources in its output; perhaps the author considered this real-time usage statistic less important.

    For details, see:
    org.apache.hadoop.yarn.server.resourcemanager.ClientRMService

    Second caller

    org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo

      public AppInfo(RMApp app, Boolean hasAccess, String schemePrefix) {
    ...
              ApplicationResourceUsageReport resourceReport = attempt
                  .getApplicationResourceUsageReport();
              if (resourceReport != null) {
                Resource usedResources = resourceReport.getUsedResources();
                allocatedMB = usedResources.getMemory();
                allocatedVCores = usedResources.getVirtualCores();
                runningContainers = resourceReport.getNumUsedContainers();
              }
    ...

    This constructor is invoked by RMWebServices.getApp and RMWebServices.getApps. These are web service endpoints, served at the following URLs respectively:

    http://resourcemanager/ws/v1/cluster/apps/$applicationId
    http://resourcemanager/ws/v1/cluster/apps?state=RUNNING
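    A minimal sketch of calling these two endpoints from plain Java, assuming Java 11+ for java.net.http. The class name, the helper names and the rmBase argument are hypothetical; the URL paths are the ones given above, and the Accept header asks for an XML response. Substitute your own ResourceManager address for the placeholder host.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RmAppsQuery {
    // URL of the single-application endpoint.
    static URI appUri(String rmBase, String applicationId) {
        return URI.create(rmBase + "/ws/v1/cluster/apps/" + applicationId);
    }

    // URL of the endpoint listing running applications.
    static URI runningAppsUri(String rmBase) {
        return URI.create(rmBase + "/ws/v1/cluster/apps?state=RUNNING");
    }

    // Issues a GET and returns the response body as a string.
    static String fetchXml(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/xml")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("HTTP " + response.statusCode() + " from " + uri);
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No address given: only show the URL that would be queried.
            System.out.println(runningAppsUri("http://resourcemanager"));
            return;
        }
        // args[0] is the ResourceManager base address (an assumption of this sketch).
        System.out.println(fetchXml(runningAppsUri(args[0])));
    }
}
```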

    The responses from these two endpoints include the real-time resource usage, as follows:

    <allocatedMB>56320</allocatedMB>
    <allocatedVCores>21</allocatedVCores>

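    These are the memory (in MB) and the number of vcores currently allocated to the application, that is, its real-time memory and CPU usage.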
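    As a rough sketch, the two fields can be pulled out of a response body with the JDK's built-in XML parser. The class and method names are hypothetical, and the sample document is trimmed to just the two fields shown above inside an app element; a real response carries many more fields. For a multi-application response this sketch reads only the first match.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class AllocatedResources {
    // Reads the text of the first element with the given tag name as a long.
    static long readLong(Document doc, String tag) {
        return Long.parseLong(
                doc.getElementsByTagName(tag).item(0).getTextContent().trim());
    }

    // Returns {allocatedMB, allocatedVCores} from an application's XML.
    static long[] parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return new long[] {
            readLong(doc, "allocatedMB"),
            readLong(doc, "allocatedVCores")
        };
    }

    public static void main(String[] args) throws Exception {
        // Trimmed sample built from the two fields quoted in the text.
        String xml = "<app>"
                + "<allocatedMB>56320</allocatedMB>"
                + "<allocatedVCores>21</allocatedVCores>"
                + "</app>";
        long[] usage = parse(xml);
        System.out.println(usage[0] + " MB, " + usage[1] + " vcores");
    }
}
```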

    For details, see:
    org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices

    If you find that a Spark application occupies more memory than you allocated to it, see: https://www.cnblogs.com/barneywill/p/10102353.html

  • Original article: https://www.cnblogs.com/barneywill/p/10251010.html