• Akka源码分析-Akka-Streams-Materializer(1)


      本博客逐步分析Akka Streams的源码,当然必须循序渐进,且估计会分很多篇,毕竟Akka Streams还是比较复杂的。

    implicit val system = ActorSystem("QuickStart")
    
    implicit val materializer = ActorMaterializer()
    

       在使用Streams相关的API时,上面两个对象是必须创建的。ActorSystem不再说了,我们来看ActorMaterializer。

    /**
     * An ActorMaterializer takes a stream blueprint and turns it into a running stream.
     */
    abstract class ActorMaterializer extends Materializer with MaterializerLoggingProvider 
    

       ActorMaterializer把一个流式计算的blueprint(大纲、蓝本?)转换成一个运行的流,简单来说这就是用来编译akka提供的流式API的。它继承一个类和一个特质。MaterializerLoggingProvider就不看了,就是提供日志相关的功能的。

    /**
     * Materializer SPI (Service Provider Interface)
     *
     * Binary compatibility is NOT guaranteed on materializer internals.
     *
     * Custom materializer implementations should be aware that the materializer SPI
     * is not yet final and may change in patch releases of Akka. Please note that this
     * does not impact end-users of Akka streams, only implementors of custom materializers,
     * with whom the Akka team co-ordinates such changes.
     *
     * Once the SPI is final this notice will be removed.
     */
    abstract class Materializer {
    
      /**
       * The `namePrefix` shall be used for deriving the names of processing
       * entities that are created during materialization. This is meant to aid
       * logging and failure reporting both during materialization and while the
       * stream is running.
       */
      def withNamePrefix(name: String): Materializer
    
      /**
       * This method interprets the given Flow description and creates the running
       * stream. The result can be highly implementation specific, ranging from
       * local actor chains to remote-deployed processing networks.
       */
      def materialize[Mat](runnable: Graph[ClosedShape, Mat]): Mat
    
      /**
       * This method interprets the given Flow description and creates the running
       * stream using an explicitly provided [[Attributes]] as top level (least specific) attributes that
       * will be defaults for the materialized stream.
       * The result can be highly implementation specific, ranging from local actor chains to remote-deployed
       * processing networks.
       */
      def materialize[Mat](
        runnable:                                              Graph[ClosedShape, Mat],
        @deprecatedName('initialAttributes) defaultAttributes: Attributes): Mat
    
      /**
       * Running a flow graph will require execution resources, as will computations
       * within Sources, Sinks, etc. This [[scala.concurrent.ExecutionContextExecutor]]
       * can be used by parts of the flow to submit processing jobs for execution,
       * run Future callbacks, etc.
       *
       * Note that this is not necessarily the same execution context the stream operator itself is running on.
       */
      implicit def executionContext: ExecutionContextExecutor
    
      /**
       * Interface for operators that need timer services for their functionality. Schedules a
       * single task with the given delay.
       *
       * @return A [[akka.actor.Cancellable]] that allows cancelling the timer. Cancelling is best effort, if the event
       *         has been already enqueued it will not have an effect.
       */
      def scheduleOnce(delay: FiniteDuration, task: Runnable): Cancellable
    
      /**
       * Interface for operators that need timer services for their functionality. Schedules a
       * repeated task with the given interval between invocations.
       *
       * @return A [[akka.actor.Cancellable]] that allows cancelling the timer. Cancelling is best effort, if the event
       *         has been already enqueued it will not have an effect.
       */
      def schedulePeriodically(initialDelay: FiniteDuration, interval: FiniteDuration, task: Runnable): Cancellable
    
    }
    

       Materializer非常重要,所以这里贴出完整代码。Materializer是一个SPI(服务提供接口),Materializer内部不保证二进制兼容,也就是说版本可能不兼容。但这个定义比较奇怪,既然所有的方法都没有实现,那是不是用trait比较合适呢?为啥是一个抽象类呢?

      Materializer一共6个方法,我们一个个看。

      withNamePrefix在派生处理实体的时候用到,简单来说就是流中的每个计算实体(Source/Sink/Flow/RunnableGraph)等映射成一个actor的时候,这些actor名称需要一个前缀,withNamePrefix就是用来设置这个前缀的。

      materialize方法有两个实现。这个方法用来解释给定的Flow定义,并创建运行的流的。创建的结果会高度依赖具体的实现,可能是本地的actor链也可能是远程部署的加工过程。

      executionContext用来提供异步执行环境。

      scheduleOnce/schedulePeriodically用来提供定时调度的功能,毕竟还是有一些操作需要时间服务的。

      Materializer貌似很简单,就只有上面几个接口,但其核心的接口就是materialize方法,毕竟这是用来编译akka流的。如果你使用过storm的话,就一定知道Storm有一个拓扑编译器。其功能跟这个差不多。

      我们再来看ActorMaterializer。

      ActorMaterializer还提供了三个比较重要的接口:actorOf/system/supervisor。其中actorOf接收MaterializationContext、Props作为参数,创建一个Actor;system返回与ActorMaterializer关联的ActorSystem;supervisor功能后面再研究。MaterializationContext这个参数还是比较有意思的,可以理解成物理化(编译流)时的上下文。

    /**
     * Context parameter to the `create` methods of sources and sinks.
     *
     * INTERNAL API
     */
    @InternalApi
    private[akka] case class MaterializationContext(
      materializer:        Materializer,
      effectiveAttributes: Attributes,
      islandName:          String)
    

       MaterializationContext一共三个变量,materializer不再说了,就是当前上下文关联的Materializer。effectiveAttributes就是用来提供参数的,只不过又封装成了Attributes。islandName比较有意思,单纯从命名上来翻译,它是“岛名称”。那什么是“岛”(island)呢?后面分析时,也会遇到这个“island”,大家这里稍微留意一下就行了。

      另外在Materializer这个抽象类中materialize方法有一个参数需要我们研究下:runnable: Graph[ClosedShape, Mat]。

    /**
     * Not intended to be directly extended by user classes
     *
     * @see [[akka.stream.stage.GraphStage]]
     */
    trait Graph[+S <: Shape, +M]
    

       Graph是一个trait,它有两个类型参数:S/M。能不能起个好点的名字,SM,哈哈。其中S是Shape的子类,从名称来看,好像是graph的形状。简单分析一下Graph的主体代码,就会发现它有两个方法特别重要:shape、traversalBuilder。其中shape返回图的形状,虽然我们还不知道形状是什么,但ClosedShape是它的一个形式;traversalBuilder

    /**
       * INTERNAL API.
       *
       * Every materializable element must be backed by a stream layout module
       */
      private[stream] def traversalBuilder: TraversalBuilder
    

       traversalBuilder返回一个TraversalBuilder,注释说每一个可物理化的元素都必须被流布局模块支持。TraversalBuilder从名称来看,是一个可遍历的编译器。估计就是一个拓扑排序。

      下面我们来看Shape究竟是什么。

    /**
     * A Shape describes the inlets and outlets of a [[Graph]]. In keeping with the
     * philosophy that a Graph is a freely reusable blueprint, everything that
     * matters from the outside are the connections that can be made with it,
     * otherwise it is just a black box.
     */
    abstract class Shape 
    

       一个Shape描述了Graph的入口和出口,按照Graph是自由重用的蓝本这样的哲学,从外部来看Graph是可被链接的就非常重要了,否则它就只是一个黑盒子。我们姑且认为它只是用来定义Graph的输入输出的吧。而且这个trait最重要的两个方法也就是inlets/outlets。

      /**
       * Scala API: get a list of all input ports
       */
      def inlets: immutable.Seq[Inlet[_]]
    
      /**
       * Scala API: get a list of all output ports
       */
      def outlets: immutable.Seq[Outlet[_]]
    

      那Inlet和Outlet又是什么呢?

    final class Inlet[T] private (val s: String) extends InPort {
      def carbonCopy(): Inlet[T] = {
        val in = Inlet[T](s)
        in.mappedTo = this
        in
      }
      /**
       * INTERNAL API.
       */
      def as[U]: Inlet[U] = this.asInstanceOf[Inlet[U]]
    
      override def toString: String = s + "(" + this.hashCode + s")" +
        (if (mappedTo eq this) ""
        else s" mapped to $mappedTo")
    }
    
    /**
     * An input port of a StreamLayout.Module. This type logically belongs
     * into the impl package but must live here due to how `sealed` works.
     * It is also used in the Java DSL for “untyped Inlets” as a work-around
     * for otherwise unreasonable existential types.
     */
    sealed abstract class InPort { self: Inlet[_] ⇒
      final override def hashCode: Int = super.hashCode
      final override def equals(that: Any): Boolean = this eq that.asInstanceOf[AnyRef]
    
      /**
       * INTERNAL API
       */
      @volatile private[stream] var id: Int = -1
    
      /**
       * INTERNAL API
       */
      @volatile private[stream] var mappedTo: InPort = this
    
      /**
       * INTERNAL API
       */
      private[stream] def inlet: Inlet[_] = this
    }
    

       Inlet好像也没有提供什么方法啊,貌似目前来看比较重要的就只有一个id字段,其他的都是变量返回、复制的。

      好像分析到这里,Materializer、Shape、Graph都还比较抽象,还得来看Materializer的具体实现,毕竟它只是一个trait。

     def apply(materializerSettings: ActorMaterializerSettings, namePrefix: String)(implicit context: ActorRefFactory): ActorMaterializer = {
        val haveShutDown = new AtomicBoolean(false)
        val system = actorSystemOf(context)
    
        new PhasedFusingActorMaterializer(
          system,
          materializerSettings,
          system.dispatchers,
          actorOfStreamSupervisor(materializerSettings, context, haveShutDown),
          haveShutDown,
          FlowNames(system).name.copy(namePrefix))
      }
    

       通过分析ActorMaterializer的apply方法,我们发现最终调用了上面这个版本的apply,可以看到,最终创建了PhasedFusingActorMaterializer,而且是new出来的,所以这个类一定是一个具体类,也就是说所有的方法和字段都有对应的实现和定义。PhasedFusingActorMaterializer的名称其实还挺有意思的,直译的话就是分段熔断actor物化器。分段好理解,熔断就不知道怎么理解了,或者翻译错了?哈哈,我也不知道。

    @InternalApi private[akka] case class PhasedFusingActorMaterializer(
      system:                ActorSystem,
      override val settings: ActorMaterializerSettings,
      dispatchers:           Dispatchers,
      supervisor:            ActorRef,
      haveShutDown:          AtomicBoolean,
      flowNames:             SeqActorName) extends ExtendedActorMaterializer 
    

       PhasedFusingActorMaterializer居然是一个case class,那还new干啥。它继承了ExtendedActorMaterializer。ExtendedActorMaterializer源码不再贴出来,它就是重新覆盖了ActorMaterializer的几个方法,并实现了一个方法actorOf。我们就来看这个actorOf

    @InternalApi private[akka] override def actorOf(context: MaterializationContext, props: Props): ActorRef = {
        val effectiveProps = props.dispatcher match {
          case Dispatchers.DefaultDispatcherId ⇒
            props.withDispatcher(context.effectiveAttributes.mandatoryAttribute[ActorAttributes.Dispatcher].dispatcher)
          case ActorAttributes.IODispatcher.dispatcher ⇒
            // this one is actually not a dispatcher but a relative config key pointing containing the actual dispatcher name
            props.withDispatcher(settings.blockingIoDispatcher)
          case _ ⇒ props
        }
    
        actorOf(effectiveProps, context.islandName)
      }
    

       其实它就是用特定的dispatcher替换了Props中的值,然后调用另一个actorOf创建actor。

    @InternalApi private[akka] def actorOf(props: Props, name: String): ActorRef = {
        supervisor match {
          case ref: LocalActorRef ⇒
            ref.underlying.attachChild(props, name, systemService = false)
          case ref: RepointableActorRef ⇒
            if (ref.isStarted)
              ref.underlying.asInstanceOf[ActorCell].attachChild(props, name, systemService = false)
            else {
              implicit val timeout = ref.system.settings.CreationTimeout
              val f = (supervisor ? StreamSupervisor.Materialize(props, name)).mapTo[ActorRef]
              Await.result(f, timeout.duration)
            }
          case unknown ⇒
            throw new IllegalStateException(s"Stream supervisor must be a local actor, was [${unknown.getClass.getName}]")
        }
      }
    

       这个actorOf我们也不深入分析了,反正就是在创建actor。下面我们来看materialize在PhasedFusingActorMaterializer中的实现。

    override def materialize[Mat](
        graph:             Graph[ClosedShape, Mat],
        defaultAttributes: Attributes,
        defaultPhase:      Phase[Any],
        phases:            Map[IslandTag, Phase[Any]]): Mat = {
        if (isShutdown) throw new IllegalStateException("Trying to materialize stream after materializer has been shutdown")
        val islandTracking = new IslandTracking(phases, settings, defaultAttributes, defaultPhase, this, islandNamePrefix = createFlowName() + "-")
    
        var current: Traversal = graph.traversalBuilder.traversal
    
        val attributesStack = new java.util.ArrayDeque[Attributes](8)
        attributesStack.addLast(defaultAttributes and graph.traversalBuilder.attributes)
    
        val traversalStack = new java.util.ArrayDeque[Traversal](16)
        traversalStack.addLast(current)
    
        val matValueStack = new java.util.ArrayDeque[Any](8)
    
        if (Debug) {
          println(s"--- Materializing layout:")
          TraversalBuilder.printTraversal(current)
          println(s"--- Start materialization")
        }
    
        // Due to how Concat works, we need a stack. This probably can be optimized for the most common cases.
        while (!traversalStack.isEmpty) {
          current = traversalStack.removeLast()
    
          while (current ne EmptyTraversal) {
            var nextStep: Traversal = EmptyTraversal
            current match {
              case MaterializeAtomic(mod, outToSlot) ⇒
                if (Debug) println(s"materializing module: $mod")
                val matAndStage = islandTracking.getCurrentPhase.materializeAtomic(mod, attributesStack.getLast)
                val logic = matAndStage._1
                val matValue = matAndStage._2
                if (Debug) println(s"  materialized value is $matValue")
                matValueStack.addLast(matValue)
    
                val stageGlobalOffset = islandTracking.getCurrentOffset
    
                wireInlets(islandTracking, mod, logic)
                wireOutlets(islandTracking, mod, logic, stageGlobalOffset, outToSlot)
    
                if (Debug) println(s"PUSH: $matValue => $matValueStack")
    
              case Concat(first, next) ⇒
                if (next ne EmptyTraversal) traversalStack.add(next)
                nextStep = first
              case Pop ⇒
                val popped = matValueStack.removeLast()
                if (Debug) println(s"POP: $popped => $matValueStack")
              case PushNotUsed ⇒
                matValueStack.addLast(NotUsed)
                if (Debug) println(s"PUSH: NotUsed => $matValueStack")
              case transform: Transform ⇒
                val prev = matValueStack.removeLast()
                val result = transform(prev)
                matValueStack.addLast(result)
                if (Debug) println(s"TRFM: $matValueStack")
              case compose: Compose ⇒
                val second = matValueStack.removeLast()
                val first = matValueStack.removeLast()
                val result = compose(first, second)
                matValueStack.addLast(result)
                if (Debug) println(s"COMP: $matValueStack")
              case PushAttributes(attr) ⇒
                attributesStack.addLast(attributesStack.getLast and attr)
                if (Debug) println(s"ATTR PUSH: $attr")
              case PopAttributes ⇒
                attributesStack.removeLast()
                if (Debug) println(s"ATTR POP")
              case EnterIsland(tag) ⇒
                islandTracking.enterIsland(tag, attributesStack.getLast)
              case ExitIsland ⇒
                islandTracking.exitIsland()
              case _ ⇒
            }
            current = nextStep
          }
        }
    
        def shutdownWhileMaterializingFailure =
          new IllegalStateException("Materializer shutdown while materializing stream")
        try {
          islandTracking.getCurrentPhase.onIslandReady()
          islandTracking.allNestedIslandsReady()
    
          if (Debug) println("--- Finished materialization")
          matValueStack.peekLast().asInstanceOf[Mat]
    
        } finally {
          if (isShutdown) throw shutdownWhileMaterializingFailure
        }
    
      }
    

       由于这个方法过于重要,所以我还是贴出了整段代码。我们发现它有两个参数没有见过:defaultPhase: Phase[Any]、phases: Map[IslandTag, Phase[Any]]。这涉及到两个类型IslandTag、Phase。Phase比较好理解就是阶段,那IslandTag是啥呢?“岛”的标签?“岛”是什么?!

    @DoNotInherit private[akka] trait Phase[M] {
      def apply(
        settings:            ActorMaterializerSettings,
        effectiveAttributes: Attributes,
        materializer:        PhasedFusingActorMaterializer,
        islandName:          String): PhaseIsland[M]
    }
    

       Phase这个trait就只有一个apply,他就是根据各个参数,返回了PhaseIsland,阶段岛?

    @DoNotInherit private[akka] trait PhaseIsland[M] {
    
      def name: String
    
      def materializeAtomic(mod: AtomicModule[Shape, Any], attributes: Attributes): (M, Any)
    
      def assignPort(in: InPort, slot: Int, logic: M): Unit
    
      def assignPort(out: OutPort, slot: Int, logic: M): Unit
    
      def createPublisher(out: OutPort, logic: M): Publisher[Any]
    
      def takePublisher(slot: Int, publisher: Publisher[Any]): Unit
    
      def onIslandReady(): Unit
    
    }
    

       PhaseIsland有两个方法非常重要:createPublisher、takePublisher。为啥重要?因为它好像在创建Publisher啊,Publisher是啥?翻翻我上一篇博客喽?这是reactivestreams里面的接口,之所以重要因为它涉及到了底层。

      还有一个IslandTag,这又是啥呢?

    @DoNotInherit private[akka] trait IslandTag
    

       啥元素也没有,难道仅仅是用来做类型匹配的?鬼知道。估计是给Island打标签的。

      materialize方法代码很多,逻辑很复杂,但有几个重要的点需要注意。它创建了一个IslandTracking,并通过一个while循环访问了graph.traversalBuilder,对IslandTracking进行了某些操作,最后调用了islandTracking的两个方法。

    islandTracking.getCurrentPhase.onIslandReady()
    islandTracking.allNestedIslandsReady()
    

       所以IslandTracking和这两个方法就非常重要了,因为这个类的这两个方法调用完之后,流就可以正常运行了啊。

    @InternalApi private[akka] class IslandTracking(
      val phases:       Map[IslandTag, Phase[Any]],
      val settings:     ActorMaterializerSettings,
      attributes:       Attributes,
      defaultPhase:     Phase[Any],
      val materializer: PhasedFusingActorMaterializer,
      islandNamePrefix: String) 
    

       IslandTracking岛跟踪,居然一行注释都没有,这让我怎么分析啊,麻蛋!

      由于时间关系,今天就先分析到这里吧,很显然Akka Streams的代码很复杂,看源码非常有难度,希望我还能看下去。哈哈。

      

  • 相关阅读:
    Servlet编程寄语
    filter常用功能
    Javascript的自动、定时执行和取消
    CentOS 5安装GIT的基本命令
    EF调用执行Oracle中序列
    WCF使用IIS发布服务的配置
    linux 自学系列:debian更新软件列表、更改源
    shell编程笔记五:select
    linux 自学系列: 改IP地址,主机名及DNS
    shell编程笔记四:case in
  • 原文地址:https://www.cnblogs.com/gabry/p/9524201.html
Copyright © 2020-2023  润新知