• Akka Source Code Analysis: local DeathWatch


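      Lifecycle monitoring, also known as DeathWatch, is a commonly used mechanism in Akka programming. For example, once we hold the ActorRef of some actor, we may want to receive a corresponding message after that actor dies. The watch function achieves exactly that.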

    class WatchActor extends Actor {
      val child = context.actorOf(Props.empty, "child")
      context.watch(child) // <-- this is the only call needed for registration
      var lastSender = context.system.deadLetters
    
      def receive = {
        case "kill" ⇒
          context.stop(child); lastSender = sender()
        case Terminated(`child`) ⇒ lastSender ! "finished"
      }
    }
    

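       We start from an example in the official documentation. DeathWatch is very convenient to use: call context.watch, and once the watched actor stops for whatever reason, the watcher receives a Terminated message whose main field is the ActorRef of the stopped actor. It looks simple, but how is it actually implemented?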

      /**
       * Registers this actor as a Monitor for the provided ActorRef.
       * This actor will receive a Terminated(subject) message when watched
       * actor is terminated.
       *
       * `watch` is idempotent if it is not mixed with `watchWith`.
       *
       * It will fail with an [[IllegalStateException]] if the same subject was watched before using `watchWith`.
       * To clear the termination message, unwatch first.
       *
       * *Warning*: This method is not thread-safe and must not be accessed from threads other
       * than the ordinary actor message processing thread, such as [[java.util.concurrent.CompletionStage]] and [[scala.concurrent.Future]] callbacks.
       *
       * @return the provided ActorRef
       */
      def watch(subject: ActorRef): ActorRef
    

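       Above is the official Scaladoc for watch on ActorContext. It is very simple: watch an actor and you will receive the corresponding Terminated message. It also notes that this method is not thread-safe.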

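      Readers of my earlier source code analysis articles will know that context is an instance of ActorContext, and that ActorContext is one facet of ActorCell, so the concrete implementation of watch should live in ActorCell. Because ActorCell mixes in quite a few traits, I will skip how to locate the one that implements watch and give the answer directly: dungeon.DeathWatch.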

    private[akka] trait DeathWatch { this: ActorCell ⇒
    

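       First, it is a trait with a self-type annotation. I have complained about this style before and will not elaborate here. Let's look at how watch is implemented.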

    override final def watch(subject: ActorRef): ActorRef = subject match {
        case a: InternalActorRef ⇒
          if (a != self) {
            if (!watchingContains(a))
              maintainAddressTerminatedSubscription(a) {
                a.sendSystemMessage(Watch(a, self)) // ➡➡➡ NEVER SEND THE SAME SYSTEM MESSAGE OBJECT TO TWO ACTORS ⬅⬅⬅
                updateWatching(a, None)
              }
            else
              checkWatchingSame(a, None)
          }
          a
      }
    

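       From the source above we can pick out a few simple points: 1. an actor cannot watch itself; 2. if the subject is already being watched, checkWatchingSame is called; 3. if it has not been watched yet, the Watch system message is sent to the watched actor; 4. in that same case, the watching information is then updated.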
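
The four points above can be condensed into a small model. This is a minimal sketch in plain Scala, not Akka's code: `Ref`, `Outcome` and `Watcher` are hypothetical stand-ins, and the exception thrown when a plain watch is mixed with a custom message only mirrors the `IllegalStateException` mentioned in the Scaladoc of watch.

```scala
object WatchDecisionDemo {
  // Hypothetical stand-in for ActorRef.
  final case class Ref(name: String)

  sealed trait Outcome
  case object IgnoredSelf extends Outcome     // point 1: watching oneself is a no-op
  case object AlreadyWatching extends Outcome // point 2: repeated, compatible watch
  case object SentWatch extends Outcome       // points 3 and 4: first watch of this subject

  final class Watcher(self: Ref) {
    // None = plain watch, Some(msg) = watchWith-style custom message.
    private var watching: Map[Ref, Option[Any]] = Map.empty

    def watch(subject: Ref, message: Option[Any] = None): Outcome =
      if (subject == self) IgnoredSelf
      else watching.get(subject) match {
        case None =>
          // The real code sends Watch(subject, self) here, then records the subject.
          watching = watching.updated(subject, message)
          SentWatch
        case Some(previous) if previous == message =>
          AlreadyWatching
        case Some(previous) =>
          throw new IllegalStateException(
            s"$subject is already watched with $previous; unwatch first")
      }
  }
}
```

Calling `watch` twice with the same arguments changes nothing the second time, which is the idempotence the Scaladoc describes.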

    /**
       * This map holds a [[None]] for actors for which we send a [[Terminated]] notification on termination,
       * ``Some(message)`` for actors for which we send a custom termination message.
       */
      private var watching: Map[ActorRef, Option[Any]] = Map.empty
    
      //   when all actor references have uid, i.e. actorFor is removed
      private def watchingContains(subject: ActorRef): Boolean =
        watching.contains(subject) || (subject.path.uid != ActorCell.undefinedUid &&
          watching.contains(new UndefinedUidActorRef(subject)))
    

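       The check for whether the subject is already watched is the interesting part. watching is a Map. The check first tests whether the Map contains the ActorRef. If it does not, and the ActorRef has a uid, it builds an UndefinedUidActorRef and tests whether watching contains that instead. Isn't that strange? If the Map does not contain the original ActorRef, how could it contain a freshly created UndefinedUidActorRef? It does make sense once we look at how ActorRef defines equals.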

    /**
       * Equals takes path and the unique id of the actor cell into account.
       */
      final override def equals(that: Any): Boolean = that match {
        case other: ActorRef ⇒ path.uid == other.path.uid && path == other.path
        case _               ⇒ false
      }
    

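       The logic above is clear: two ActorRefs are equal only if their paths are equal and their uids are equal. I will not analyze equality of ActorPath here; it simply compares every level of the path.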

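      So can two ActorRefs have the same path but different uids? Certainly. If an actor is stopped and a new one is then created with the same actorOf arguments, the uid differs while the path stays the same.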

    private[akka] class UndefinedUidActorRef(ref: ActorRef) extends MinimalActorRef {
      override val path = ref.path.withUid(ActorCell.undefinedUid)
      override def provider = throw new UnsupportedOperationException("UndefinedUidActorRef does not provide")
    }
    

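       UndefinedUidActorRef is a new ActorRef with the same path as the original one but with ActorCell.undefinedUid as its uid. The second lookup in watchingContains can therefore match an entry that was registered through a reference carrying no uid, which, judging by the comment in the source, is the case for references obtained via actorFor.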
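
To see why the second lookup can succeed when the first one fails, here is a minimal sketch in plain Scala. `Ref` is a hypothetical stand-in for ActorRef whose equality covers both path and uid, and `UndefinedUid` is an assumed placeholder for ActorCell.undefinedUid; none of this is Akka's actual code.

```scala
object UidEqualityDemo {
  // Assumed placeholder for ActorCell.undefinedUid.
  val UndefinedUid = 0

  // Hypothetical stand-in for ActorRef: case-class equality compares path AND uid.
  final case class Ref(path: String, uid: Int) {
    def withUndefinedUid: Ref = copy(uid = UndefinedUid)
  }

  // Same shape as watchingContains: exact match first, then a match ignoring the uid.
  def watchingContains(watching: Map[Ref, Option[Any]], subject: Ref): Boolean =
    watching.contains(subject) ||
      (subject.uid != UndefinedUid && watching.contains(subject.withUndefinedUid))
}
```

With an entry stored under `Ref("/user/child", UndefinedUid)`, a plain `contains` for `Ref("/user/child", 42)` fails because the uids differ, while `watchingContains` finds it through the uid-less copy.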

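      maintainAddressTerminatedSubscription checks whether the given actor is local. For a local actor it simply runs the block that follows. For a remote actor it does some extra work around the block, which is not analyzed here.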

      private def updateWatching(ref: InternalActorRef, newMessage: Option[Any]): Unit =
        watching = watching.updated(ref, newMessage)
    

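       updateWatching is simple: it inserts the ActorRef to be watched into the watching Map. As for the value stored for that ActorRef, I will not go into it here. The comment on watching says it is None for a plain Terminated notification and Some(message) for a custom one, so have a look at how watchWith is used. Next, let's analyze how the watched actor responds after receiving the Watch message.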

    case Watch(watchee, watcher) ⇒ addWatcher(watchee, watcher)
    

       It hits the branch shown above in ActorCell.systemInvoke.

    protected def addWatcher(watchee: ActorRef, watcher: ActorRef): Unit = {
        val watcheeSelf = watchee == self
        val watcherSelf = watcher == self
    
        if (watcheeSelf && !watcherSelf) {
          if (!watchedBy.contains(watcher)) maintainAddressTerminatedSubscription(watcher) {
            watchedBy += watcher
            if (system.settings.DebugLifecycle) publish(Debug(self.path.toString, clazz(actor), s"now watched by $watcher"))
          }
        } else if (!watcheeSelf && watcherSelf) {
          watch(watchee)
        } else {
          publish(Warning(self.path.toString, clazz(actor), "BUG: illegal Watch(%s,%s) for %s".format(watchee, watcher, self)))
        }
      }
    

       In the normal case the first branch of the if is taken. It is also quite simple: it checks whether watcher is already stored in watchedBy and, if not, adds it.

    private var watchedBy: Set[ActorRef] = ActorCell.emptyActorRefSet
    

       watchedBy is a Set, so the ActorRefs in it are unique. Once this actor is stopped, when are the actors in watchedBy notified? This question is actually fairly complicated.

      To know when watchedBy gets notified, we need to understand the stop logic. So how is stop implemented in ActorCell?

    // ➡➡➡ NEVER SEND THE SAME SYSTEM MESSAGE OBJECT TO TWO ACTORS ⬅⬅⬅
      final def stop(): Unit = try dispatcher.systemDispatch(this, Terminate()) catch handleException
    

       stop is implemented in the Dispatch trait. It is simple: it uses the current dispatcher to send a Terminate system message to the actor itself.

    case Terminate() ⇒ terminate()
    

       After receiving the Terminate message, the terminate method is called.

    protected def terminate() {
        setReceiveTimeout(Duration.Undefined)
        cancelReceiveTimeout
    
        // prevent Deadletter(Terminated) messages
        unwatchWatchedActors(actor)
    
        // stop all children, which will turn childrenRefs into TerminatingChildrenContainer (if there are children)
        children foreach stop
    
        if (systemImpl.aborting) {
          // separate iteration because this is a very rare case that should not penalize normal operation
          children foreach {
            case ref: ActorRefScope if !ref.isLocal ⇒ self.sendSystemMessage(DeathWatchNotification(ref, true, false))
            case _                                  ⇒
          }
        }
    
        val wasTerminating = isTerminating
    
        if (setChildrenTerminationReason(ChildrenContainer.Termination)) {
          if (!wasTerminating) {
            // do not process normal messages while waiting for all children to terminate
            suspendNonRecursive()
            // do not propagate failures during shutdown to the supervisor
            setFailed(self)
            if (system.settings.DebugLifecycle) publish(Debug(self.path.toString, clazz(actor), "stopping"))
          }
        } else {
          setTerminated()
          finishTerminate()
        }
      }
    

       The logic of terminate is clear: it tells the child actors to stop. So how are the children stopped?

    final def stop(actor: ActorRef): Unit = {
        if (childrenRefs.getByRef(actor).isDefined) {
          @tailrec def shallDie(ref: ActorRef): Boolean = {
            val c = childrenRefs
            swapChildrenRefs(c, c.shallDie(ref)) || shallDie(ref)
          }
    
          if (actor match {
            case r: RepointableRef ⇒ r.isStarted
            case _                 ⇒ true
          }) shallDie(actor)
        }
        actor.asInstanceOf[InternalActorRef].stop()
      }
    

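       This is fairly simple: it checks whether the given actor is one of the current children. If it is, and it has already been started, swapChildrenRefs is called through shallDie. Finally stop() is called on the child actor, so stopping proceeds recursively.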

    override def shallDie(actor: ActorRef): ChildrenContainer = TerminatingChildrenContainer(c, Set(actor), UserRequest)
    

       shallDie simply creates a TerminatingChildrenContainer, which then replaces childrenRefs.

    @tailrec final protected def setChildrenTerminationReason(reason: ChildrenContainer.SuspendReason): Boolean = {
        childrenRefs match {
          case c: ChildrenContainer.TerminatingChildrenContainer ⇒
            swapChildrenRefs(c, c.copy(reason = reason)) || setChildrenTerminationReason(reason)
          case _ ⇒ false
        }
      }
    

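       The last if statement in terminate calls setChildrenTerminationReason. If there are children, childrenRefs is already a TerminatingChildrenContainer at this point, so it returns true and the actor waits for its children to terminate. Without children it returns false, and setTerminated and finishTerminate are called right away.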

    private def finishTerminate() {
        val a = actor
        /* The following order is crucial for things to work properly. Only change this if you're very confident and lucky.
         *
         * Please note that if a parent is also a watcher then ChildTerminated and Terminated must be processed in this
         * specific order.
         */
        try if (a ne null) a.aroundPostStop()
        catch handleNonFatalOrInterruptedException { e ⇒ publish(Error(e, self.path.toString, clazz(a), e.getMessage)) }
        finally try dispatcher.detach(this)
        finally try parent.sendSystemMessage(DeathWatchNotification(self, existenceConfirmed = true, addressTerminated = false))
        finally try stopFunctionRefs()
        finally try tellWatchersWeDied()
        finally try unwatchWatchedActors(a) // stay here as we expect an emergency stop from handleInvokeFailure
        finally {
          if (system.settings.DebugLifecycle)
            publish(Debug(self.path.toString, clazz(a), "stopped"))
    
          clearActorFields(a, recreate = false)
          clearActorCellFields(this)
          actor = null
        }
      }
    

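       So finishTerminate is eventually called, either immediately when there are no children or once the children have terminated, and inside finishTerminate the method tellWatchersWeDied is invoked.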
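
The ordering described so far (children are told to stop first, and a parent only finishes after its last child has finished) can be illustrated with a toy model. This is a minimal, synchronous sketch in plain Scala, assuming that a parent completes termination once its last child reports back. `Cell`, `spawn` and `childTerminated` are hypothetical names; real ActorCells do this asynchronously via system messages.

```scala
import scala.collection.mutable

object TerminateOrderDemo {
  // Hypothetical stand-in for ActorCell; `log` records the order of finishTerminate calls.
  final class Cell(val name: String, parent: Option[Cell], log: mutable.Buffer[String]) {
    private var children = Vector.empty[Cell]
    private var terminating = false

    def spawn(childName: String): Cell = {
      val child = new Cell(childName, Some(this), log)
      children = children :+ child
      child
    }

    def stop(): Unit = {
      terminating = true
      if (children.isEmpty) finishTerminate() // no children: finish right away
      else children.foreach(_.stop())         // otherwise stop them and wait
    }

    private def childTerminated(child: Cell): Unit = {
      children = children.filterNot(_ eq child)
      if (terminating && children.isEmpty) finishTerminate()
    }

    private def finishTerminate(): Unit = {
      log += name
      parent.foreach(_.childTerminated(this)) // models the notification sent to the parent
    }
  }
}
```

Stopping the root of a small tree logs the leaves first and the root last, so by the time an actor's watchers hear about its death, its whole subtree is already gone.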

    protected def tellWatchersWeDied(): Unit =
        if (!watchedBy.isEmpty) {
          try {
            // Don't need to send to parent parent since it receives a DWN by default
            def sendTerminated(ifLocal: Boolean)(watcher: ActorRef): Unit =
              if (watcher.asInstanceOf[ActorRefScope].isLocal == ifLocal && watcher != parent)
                watcher.asInstanceOf[InternalActorRef].sendSystemMessage(DeathWatchNotification(self, existenceConfirmed = true, addressTerminated = false))
    
            /*
             * It is important to notify the remote watchers first, otherwise RemoteDaemon might shut down, causing
             * the remoting to shut down as well. At this point Terminated messages to remote watchers are no longer
             * deliverable.
             *
             * The problematic case is:
             *  1. Terminated is sent to RemoteDaemon
             *   1a. RemoteDaemon is fast enough to notify the terminator actor in RemoteActorRefProvider
             *   1b. The terminator is fast enough to enqueue the shutdown command in the remoting
             *  2. Only at this point is the Terminated (to be sent remotely) enqueued in the mailbox of remoting
             *
             * If the remote watchers are notified first, then the mailbox of the Remoting will guarantee the correct order.
             */
            watchedBy foreach sendTerminated(ifLocal = false)
            watchedBy foreach sendTerminated(ifLocal = true)
          } finally {
            maintainAddressTerminatedSubscription() {
              watchedBy = ActorCell.emptyActorRefSet
            }
          }
        }
    

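       What does tellWatchersWeDied do? It sends a DeathWatchNotification message to every ActorRef in watchedBy except the parent, which has already been sent one in finishTerminate. Note that the first argument of DeathWatchNotification is self, that is, the actor being stopped.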
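
The notification order in tellWatchersWeDied (remote watchers first, then local ones, parent skipped) can be sketched as a pure function. This is a simplified illustration in plain Scala: `Watcher` and `notificationOrder` are hypothetical names, and the order within each group is not meaningful.

```scala
object NotifyOrderDemo {
  // Hypothetical stand-in for a watcher ActorRef with its locality flag.
  final case class Watcher(name: String, isLocal: Boolean)

  // Returns the watchers in the order they would be notified.
  def notificationOrder(watchedBy: Set[Watcher], parent: Watcher): List[Watcher] = {
    // The parent is skipped because it is notified separately.
    val candidates = watchedBy.toList.filterNot(_ == parent)
    val (local, remote) = candidates.partition(_.isLocal)
    remote ++ local // remote watchers first, as the source comment requires
  }
}
```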

    case DeathWatchNotification(a, ec, at) ⇒ watchedActorTerminated(a, ec, at)
    

       And how does the watcher respond when it receives the DeathWatchNotification?

    /**
       * When this actor is watching the subject of [[akka.actor.Terminated]] message
       * it will be propagated to user's receive.
       */
      protected def watchedActorTerminated(actor: ActorRef, existenceConfirmed: Boolean, addressTerminated: Boolean): Unit = {
        watchingGet(actor) match {
          case None ⇒ // We're apparently no longer watching this actor.
          case Some(optionalMessage) ⇒
            maintainAddressTerminatedSubscription(actor) {
              watching = removeFromMap(actor, watching)
            }
            if (!isTerminating) {
              self.tell(optionalMessage.getOrElse(Terminated(actor)(existenceConfirmed, addressTerminated)), actor)
              terminatedQueuedFor(actor)
            }
        }
        if (childrenRefs.getByRef(actor).isDefined) handleChildTerminated(actor)
      }
    

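       Clearly, when the current actor is in a normal state (not terminating) and is still watching the given actor, watchedActorTerminated sends itself either Terminated(actor) or, if one was registered, the custom message. That is how the watcher receives the Terminated message for the watched actor.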
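
The choice of message on the watcher side can be written down as a small function. This is a minimal sketch in plain Scala, with `Ref`, `Terminated` and `messageFor` as hypothetical stand-ins. It only models which message, if any, ends up in the watcher's mailbox.

```scala
object TerminationMessageDemo {
  // Hypothetical stand-ins for ActorRef and akka.actor.Terminated.
  final case class Ref(name: String)
  final case class Terminated(ref: Ref)

  // watching: None = plain watch, Some(msg) = custom message registered for the subject.
  def messageFor(
      watching: Map[Ref, Option[Any]],
      subject: Ref,
      isTerminating: Boolean): Option[Any] =
    watching.get(subject) match {
      case Some(custom) if !isTerminating =>
        Some(custom.getOrElse(Terminated(subject)))
      case _ =>
        None // not watching any more, or the watcher itself is shutting down
    }
}
```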

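      Setting aside the maintenance of child actor state and other complicated operations, it boils down to this: the watcher records which actors it watches, the watched actor records which actors watch it, and at the very end of its stop the watched actor sends a DeathWatchNotification to its watchers, which they turn into a Terminated message. Of course, remote mode is also involved and is more complicated. That will be analyzed later.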
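
That summary fits into a few lines of plain Scala. The following is a toy, synchronous model rather than Akka's implementation: `Node`, `inbox` and `subjectDied` are hypothetical names, and direct method calls stand in for the Watch and DeathWatchNotification system messages.

```scala
import scala.collection.mutable

object DeathWatchModel {
  final case class Terminated(name: String)

  final class Node(val name: String) {
    private var watching = Set.empty[Node]  // actors this node watches
    private var watchedBy = Set.empty[Node] // actors watching this node
    val inbox = mutable.Buffer.empty[Any]   // stands in for the user mailbox

    def watch(subject: Node): Unit =
      if ((subject ne this) && !watching.contains(subject)) {
        watching += subject
        subject.watchedBy += this // models the Watch system message
      }

    def stop(): Unit = {
      val watchers = watchedBy
      watchedBy = Set.empty
      watchers.foreach(_.subjectDied(this)) // models DeathWatchNotification
    }

    private def subjectDied(subject: Node): Unit =
      if (watching.contains(subject)) {
        watching -= subject
        inbox += Terminated(subject.name)
      }
  }
}
```

Watching the same node twice registers only once, and stopping it delivers exactly one `Terminated` to the watcher's inbox.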

  • Original article: https://www.cnblogs.com/gabry/p/9441736.html