• 2.1.5 Creating the MapOutputTracker in SparkEnv


    How SparkEnv creates the MapOutputTracker:

        def registerOrLookupEndpoint(
            name: String, endpointCreator: => RpcEndpoint):
          RpcEndpointRef = {
          if (isDriver) {
            logInfo("Registering " + name)
            rpcEnv.setupEndpoint(name, endpointCreator)
          } else {
            RpcUtils.makeDriverRef(name, conf, rpcEnv)
          }
        }
    
        val broadcastManager = new BroadcastManager(isDriver, conf, securityManager)
    
        // Create the MapOutputTracker; the driver and the executors get different implementations
        val mapOutputTracker = if (isDriver) {
          // The driver-side tracker needs the BroadcastManager
          new MapOutputTrackerMaster(conf, broadcastManager, isLocal)
        } else {
          new MapOutputTrackerWorker(conf)
        }
    
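        // Have to assign trackerEndpoint after initialization as MapOutputTrackerEndpoint
        // requires the MapOutputTracker itself.
        // endpointCreator is a by-name parameter, so the endpoint (and the cast to
        // MapOutputTrackerMaster) is only evaluated on the driver.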
        mapOutputTracker.trackerEndpoint = registerOrLookupEndpoint(MapOutputTracker.ENDPOINT_NAME,
          new MapOutputTrackerMasterEndpoint(
            rpcEnv, mapOutputTracker.asInstanceOf[MapOutputTrackerMaster], conf))
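    The cast to MapOutputTrackerMaster in the listing above is safe on executors only because endpointCreator is declared by-name (`=> RpcEndpoint`). The following is a minimal sketch of that evaluation rule, not Spark code: `Endpoint`, `EndpointRef`, `registerOrLookup`, and `makeEndpoint` are hypothetical stand-ins invented for illustration.

```scala
object ByNameEndpointDemo {
  // Hypothetical stand-ins for Spark's RpcEndpoint / RpcEndpointRef.
  final case class Endpoint(name: String)
  final case class EndpointRef(name: String, local: Boolean)

  // Counts how many times an endpoint is actually constructed.
  var created = 0

  def makeEndpoint(name: String): Endpoint = {
    created += 1
    Endpoint(name)
  }

  // Same shape as registerOrLookupEndpoint: `creator` is a by-name parameter.
  def registerOrLookup(isDriver: Boolean, name: String, creator: => Endpoint): EndpointRef =
    if (isDriver) {
      val endpoint = creator // the by-name argument is evaluated here, on the driver path only
      EndpointRef(endpoint.name, local = true)
    } else {
      EndpointRef(name, local = false) // `creator` is never evaluated on this path
    }
}
```

    Calling `registerOrLookup` with `isDriver = false` leaves `created` at 0, because the expression passed as `creator` is never run. This is why an executor never attempts the cast of its MapOutputTrackerWorker to MapOutputTrackerMaster.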

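    MapOutputTracker tracks the output status of map-stage tasks. Reduce-stage tasks use this status to find the addresses of the intermediate map output they need to fetch. Each map or reduce task has a unique identifier (mapId, reduceId).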

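    MapOutputTracker follows a master/worker architecture. The master (on the driver) stores the map output metadata of every shuffle in the current application, while each worker (on an executor) queries the master's map output status over RPC.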
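    The query flow described above can be sketched with a toy model. This is an illustration under stated assumptions, not Spark's implementation: `MapOutput`, `Master`, and `Worker` are hypothetical names, the RPC is replaced by a plain function call, and the worker-side cache is an assumption of the sketch.

```scala
import scala.collection.mutable

object TrackerSketch {
  // Hypothetical, simplified record of where one map task wrote its output.
  final case class MapOutput(mapId: Int, host: String)

  // Driver side: holds the map output metadata of every shuffle in the application.
  class Master {
    private val outputs = mutable.Map.empty[Int, Vector[MapOutput]]

    def register(shuffleId: Int, out: MapOutput): Unit =
      outputs(shuffleId) = outputs.getOrElse(shuffleId, Vector.empty) :+ out

    def lookup(shuffleId: Int): Vector[MapOutput] =
      outputs.getOrElse(shuffleId, Vector.empty)
  }

  // Executor side: asks the master (a function call here, an RPC in Spark)
  // and remembers the answer so repeated lookups stay local.
  class Worker(askMaster: Int => Vector[MapOutput]) {
    private val cache = mutable.Map.empty[Int, Vector[MapOutput]]
    var remoteCalls = 0

    def locations(shuffleId: Int): Vector[MapOutput] =
      cache.getOrElseUpdate(shuffleId, { remoteCalls += 1; askMaster(shuffleId) })
  }
}
```

    A reduce task would call `locations(shuffleId)` on its local worker to learn which hosts hold the map output it has to fetch. Only the first call per shuffle reaches the master in this sketch.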

    The driver and the executors use different MapOutputTracker implementations:

        // Create the MapOutputTracker; the driver and the executors get different implementations
        val mapOutputTracker = if (isDriver) {
          // The driver-side tracker needs the BroadcastManager
          new MapOutputTrackerMaster(conf, broadcastManager, isLocal)
        } else {
          new MapOutputTrackerWorker(conf)
        }

    Both classes extend MapOutputTracker:

    /**
     * Class that keeps track of the location of the map output of
     * a stage. This is abstract because different versions of MapOutputTracker
     * (driver and executor) use different HashMap to store its metadata.
     */
    private[spark] abstract class MapOutputTracker(conf: SparkConf) extends Logging
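    As a rough sketch of the pattern in that doc comment, an abstract base can leave the choice of backing map to its subclasses while callers depend only on the base type. Everything below is hypothetical (`Tracker`, `MasterTracker`, `WorkerTracker`, `create`), and the particular map types are illustrative choices, not the ones Spark uses.

```scala
import scala.collection.concurrent.TrieMap
import scala.collection.mutable

object TrackerKinds {
  // Hypothetical, simplified mirror of the abstract base class:
  // each subclass decides which map implementation stores the metadata.
  abstract class Tracker {
    protected val statuses: mutable.Map[Int, String]

    def put(shuffleId: Int, status: String): Unit = statuses.update(shuffleId, status)
    def get(shuffleId: Int): Option[String] = statuses.get(shuffleId)
  }

  // Driver-side variant backed by a concurrent map (illustrative choice).
  class MasterTracker extends Tracker {
    protected val statuses: mutable.Map[Int, String] = TrieMap.empty[Int, String]
  }

  // Executor-side variant backed by a plain hash map (illustrative choice).
  class WorkerTracker extends Tracker {
    protected val statuses: mutable.Map[Int, String] = mutable.HashMap.empty[Int, String]
  }

  // Same selection logic as in SparkEnv: pick the implementation by role.
  def create(isDriver: Boolean): Tracker =
    if (isDriver) new MasterTracker else new WorkerTracker
}
```

    Code that receives the result of `create` sees only `Tracker`, which is how the rest of SparkEnv can hold a single `mapOutputTracker` value regardless of whether it runs on the driver or on an executor.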
  • Original article: https://www.cnblogs.com/chengbao/p/10625501.html