• How to Use the TVM Pass Infra



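    As the number of optimization passes in Relay/tir grows, executing them and maintaining their dependencies by hand becomes intractable. TVM therefore introduced an infrastructure that manages the optimization passes and applies to the different layers of IR in the TVM stack.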

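    Optimizations of a Relay/tir program can be applied at different granularities: at the function level with tvm.relay.transform.FunctionPass or tvm.tir.transform.PrimFuncPass, and at the module level with tvm.transform.ModulePass. Alternatively, users can rely on tvm.transform.Sequential to apply a sequence of passes to a Relay/tir program, in which case the dependencies between passes are resolved by the pass infra.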

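    This article mainly shows how developers can use the pass infra to perform a specific optimization and to build an optimization pipeline for a Relay program. The same approach also works for tir.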

    import numpy as np

    import tvm

    from tvm import te

    import tvm.relay as relay

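    Create an Example Relay Program

    First, create a simple Relay program. It is used by the various optimizations in the examples of this article. Similarly, users can write a tir primitive function and apply tir passes to it.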

    def example():

        shape = (1, 64, 54, 54)

        c_data = np.empty(shape).astype("float32")

        c = relay.const(c_data)

        weight = relay.var("weight", shape=(64, 64, 3, 3))

        x = relay.var("x", relay.TensorType((1, 64, 56, 56), "float32"))

        conv = relay.nn.conv2d(x, weight)

        y = relay.add(c, c)

        y = relay.multiply(y, relay.const(2, "float32"))

        y = relay.add(conv, y)

        z = relay.add(y, c)

        z1 = relay.add(y, c)

        z2 = relay.add(z, z1)

        return relay.Function([x, weight], z2)
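The shapes in example() are chosen so that the constant c has the same shape as the convolution result and can be added to it without broadcasting. As a quick sanity check, the sketch below computes the spatial output size of a convolution along one axis. The helper name conv2d_out_size is hypothetical, and a stride of 1 with zero padding is assumed, which matches the padding=[0, 0, 0, 0] attribute shown in the printed IR.

```python
def conv2d_out_size(in_size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a 2-D convolution along one axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


# x is (1, 64, 56, 56) and weight is (64, 64, 3, 3) in example().
out_h = conv2d_out_size(56, 3)
out_w = conv2d_out_size(56, 3)
out_shape = (1, 64, out_h, out_w)
print(out_shape)  # (1, 64, 54, 54), the shape used for c_data
```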

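    Next, register a layout alteration for the conv2d op so that the layout alteration pass can be applied to the example. How the alter layout pass works is beyond the scope of this article.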

    @relay.op.register_alter_op_layout("nn.conv2d", level=101)

    def alter_conv2d(attrs, inputs, tinfos, out_type):

        data, weight = inputs

        new_attrs = dict(attrs)

        new_attrs["data_layout"] = "NCHW16c"

        return relay.nn.conv2d(data, weight, **new_attrs)

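    Optimize the Program

    Now we want to optimize the program. Relay offers many optimizations; a few of them are selected and applied to the example program.

    There are several ways to optimize a Relay program. An example of each is given below.

    Manually Apply Optimization Passes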

    # Let's first create a relay Module which contains one or multiple Relay

    # functions for optimization.

    f = example()

    mod = tvm.IRModule.from_expr(f)

     

    # Now we can apply constant folding on the module.

    # fold_const here is a callback that doesn't take any parameters.

    fold_const = relay.transform.FoldConstant()

    # Then, we can invoke the pass on the given module. Note that the constant

    # folding pass works at the function-level. That being said, each function in

    # the module will be applied with the optimization. Users don't need to iterate

    # through individual functions manually to apply this pass.

    mod = fold_const(mod)

    # We can see from the updated program that the constants are folded.

    print(mod)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }
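To make the effect above concrete: in example(), the subexpression (c + c) * 2 depends only on constants, so it can be evaluated once at compile time and stored as a single constant. The sketch below is a toy model of that idea on nested tuples; it is not TVM's FoldConstant implementation, and the tuple encoding and the name fold are assumptions made purely for illustration.

```python
def fold(expr):
    """Recursively replace operator nodes whose operands are all constants."""
    kind = expr[0]
    if kind in ("const", "var"):
        return expr
    lhs, rhs = fold(expr[1]), fold(expr[2])
    if lhs[0] == "const" and rhs[0] == "const":
        ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
        return ("const", ops[kind](lhs[1], rhs[1]))
    return (kind, lhs, rhs)


c = ("const", 1.5)
conv = ("var", "conv")                      # stands in for nn.conv2d(x, weight)
y = ("mul", ("add", c, c), ("const", 2.0))  # (c + c) * 2 uses constants only
expr = ("add", conv, y)
folded = fold(expr)
print(folded)  # ('add', ('var', 'conv'), ('const', 6.0))
```

The addition involving conv is left alone because one of its operands is only known at run time.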

    More optimizations can be applied in the same way. For example, we can eliminate the common subexpression shared by z and z1.

    mod = relay.transform.EliminateCommonSubexpr()(mod)

    print(mod)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }
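In the output above, z and z1 were two separately built but structurally identical additions, and only one of them survives. A minimal sketch of that idea follows: structurally equal subexpressions are looked up in a table and replaced by a single shared node. This is a toy model on nested tuples, not how EliminateCommonSubexpr is implemented; the name cse and the tuple encoding are hypothetical.

```python
def cse(expr, seen=None):
    """Return expr with structurally equal subexpressions shared."""
    seen = {} if seen is None else seen
    if expr[0] in ("const", "var"):
        return seen.setdefault(expr, expr)
    rebuilt = (expr[0],) + tuple(cse(arg, seen) for arg in expr[1:])
    # setdefault returns the first node stored for this structure.
    return seen.setdefault(rebuilt, rebuilt)


y = ("var", "y")
c = ("const", 1.0)
z = ("add", y, c)
z1 = ("add", ("var", "y"), ("const", 1.0))  # same structure, built separately
z2 = cse(("add", z, z1))
print(z2[1] is z2[2])  # True: both operands are now the same node
```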

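    Some optimizations, such as fusion, are also parameterized. For example, fusion level 0 does not allow operators to be fused together. Users can pass fuse_opt_level to control this.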

    mod = relay.transform.FuseOps(fuse_opt_level=0)(mod)

     

    # We can observe that the optimized module contains functions that only have

    # a single primitive op.

    print(mod)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %1 = %0(%x, %weight) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = fn (%p01: Tensor[(1, 64, 54, 54), float32], %p11: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        add(%p01, %p11) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %3 = %2(%1, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %4 = fn (%p02: Tensor[(1, 64, 54, 54), float32], %p12: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        add(%p02, %p12) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %5 = %4(%3, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %6 = fn (%p03: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        add(%p03, %p03) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %6(%5) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

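    Use Sequential to Apply a Sequence of Passes

    Applying passes one by one as above is tedious, and it requires users to understand the dependencies between them. For example, fusion currently does not work well on let bindings. If relay.transform.ToANormalForm() is applied before fusion, operators that could otherwise be fused are not fused, because that pass generates a let binding for each expression to canonicalize the Relay program.

    Relay therefore provides tvm.transform.Sequential, which relieves developers from handling these issues explicitly: the passes required by each pass are specified, and everything is packed together and executed as a whole. For example, the same passes can now be applied in sequential style as shown below. tvm.transform.Sequential is similar to torch.nn.Sequential and mxnet.gluon.Block. torch.nn.Sequential, for instance, holds a sequence of PyTorch modules that are added to build a network, so it focuses on network layers. tvm.transform.Sequential in the pass infra, by contrast, works on optimization passes.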

    # Now let's execute some passes through :py:class:`tvm.transform.Sequential`

    f = example()

    mod = tvm.IRModule.from_expr(f)

    # Glob the interested passes.

    seq = tvm.transform.Sequential(

        [

            relay.transform.FoldConstant(),

            relay.transform.EliminateCommonSubexpr(),

            relay.transform.FuseOps(fuse_opt_level=2),

        ]

    )

    mod1 = seq(mod)

    print(mod1)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

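    In the transformed Relay program there are still two identical addition operations. This is because EliminateCommonSubexpr was not actually executed: by default, tvm.transform.Sequential only runs passes whose optimization level is less than or equal to 2. The pass infra, however, provides a configuration interface that lets users choose the optimization level to run at.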

    with tvm.transform.PassContext(opt_level=3):

        mod2 = seq(mod)

    print(mod2)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

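    Now only one of the two identical additions is kept.

    Users can also selectively disable passes with the disabled_pass config, which is similar to the -fno-xxx options of general-purpose compilers such as Clang and GCC. For example, EliminateCommonSubexpr can be disabled as shown below. The printed module again contains two identical addition operations.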

    with tvm.transform.PassContext(opt_level=3, disabled_pass=["EliminateCommonSubexpr"]):

        mod3 = seq(mod)

    print(mod3)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }
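The behaviour of the last three runs (default level, opt_level=3, and opt_level=3 with a disabled pass) can be modelled in a few lines. The sketch below is a toy model and not TVM's implementation: ToyPass, ToyContext and run_sequential are hypothetical names, dependency resolution and required passes are ignored, and the per-pass levels (FoldConstant 2, EliminateCommonSubexpr 3, FuseOps 1) are taken from the pass metadata printed in the tracing output of this article.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPass:
    name: str
    opt_level: int


@dataclass
class ToyContext:
    opt_level: int = 2  # default level stated in the text
    disabled_pass: list = field(default_factory=list)


def run_sequential(passes, ctx):
    """Return the names of the passes that would run under ctx."""
    executed = []
    for p in passes:
        if p.opt_level > ctx.opt_level:  # above the configured level
            continue
        if p.name in ctx.disabled_pass:  # explicitly switched off
            continue
        executed.append(p.name)
    return executed


seq = [
    ToyPass("FoldConstant", 2),
    ToyPass("EliminateCommonSubexpr", 3),
    ToyPass("FuseOps", 1),
]
default_run = run_sequential(seq, ToyContext())
level3_run = run_sequential(seq, ToyContext(opt_level=3))
disabled_run = run_sequential(
    seq, ToyContext(opt_level=3, disabled_pass=["EliminateCommonSubexpr"])
)
print(default_run)   # ['FoldConstant', 'FuseOps']
print(level3_run)    # ['FoldConstant', 'EliminateCommonSubexpr', 'FuseOps']
print(disabled_run)  # ['FoldConstant', 'FuseOps']
```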

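    The passes applied so far are target independent. The pass infra also provides a way to make a pass target-aware. The layout alteration pass, for example, falls into this category.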

    with tvm.transform.PassContext(opt_level=3):

        mod4 = seq(mod)

    print(mod4)

     

    seq1 = tvm.transform.Sequential([relay.transform.AlterOpLayout()])

    with tvm.transform.PassContext(opt_level=3):

        with tvm.target.Target("llvm"):

            mod5 = seq1(mod)

    print(mod5)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = layout_transform(%x, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;

      %1 = nn.conv2d(%0, %weight, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      %2 = add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %3 = multiply(%2, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %4 = layout_transform(%3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      %5 = add(%1, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      %6 = layout_transform(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      %7 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      %8 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      %9 = add(%7, %8) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

      layout_transform(%9, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */

    }
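The tensor types in the second module above follow directly from the blocked layout: NCHW16c splits the channel axis into blocks of 16 and appends the block as an innermost axis. A minimal sketch of that shape arithmetic is given below; the helper name nchw_to_blocked_shape is hypothetical and only models shapes, not the data movement performed by layout_transform.

```python
def nchw_to_blocked_shape(shape, block):
    """Shape of an NCHW tensor after conversion to NCHW{block}c."""
    n, c, h, w = shape
    if c % block != 0:
        raise ValueError("channel count must be divisible by the block size")
    return (n, c // block, h, w, block)


in_blocked = nchw_to_blocked_shape((1, 64, 56, 56), 16)
out_blocked = nchw_to_blocked_shape((1, 64, 54, 54), 16)
print(in_blocked)   # (1, 4, 56, 56, 16)
print(out_blocked)  # (1, 4, 54, 54, 16)
```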

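    Implement a Pass Using a Python Decorator

    The next example shows how to orchestrate a customized optimization pipeline through the pass infra using Python decorators, which greatly simplifies implementing passes. For example, users can define a decorated class to perform function-level optimizations, as shown below. The class defines transform_function, which replaces every constant c with the product multiplier * c. When the custom pass is invoked, each function in the given module is visited and every constant in it is replaced.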

    @relay.transform.function_pass(opt_level=1)

    class CustomPipeline:

        """Simple pass that replaces each constant with a multiple of it."""

     

        def __init__(self, multiplier):

            self.multiplier = multiplier

     

        # This function can define a pass.

        def transform_function(self, func, mod, ctx):

            obj = self

     

            class ReplaceConstant(tvm.relay.ExprMutator):

                def visit_constant(self, c):

                    return relay.multiply(obj.multiplier, c)

     

            return ReplaceConstant().visit(func)

     

     

    f = example()

    mod = tvm.IRModule.from_expr(f)

    custom_pass = CustomPipeline(multiplier=relay.const(3, "float32"))

    assert custom_pass.info.name == "CustomPipeline"

    mod3 = custom_pass(mod)

    print(mod3)

    Output:

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = multiply(3f /* ty=float32 */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %3 = multiply(3f /* ty=float32 */, 2f /* ty=float32 */) /* ty=float32 */;

      %4 = multiply(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %5 = add(%0, %4) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %6 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %7 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%6, %7) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }
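The decorator pattern used above can be pictured with a small pure-Python model: a class decorator wraps the user's class so that calling an instance applies transform_function to every function in a module. This is only a sketch of the pattern, not how relay.transform.function_pass is implemented; toy_function_pass, the dict-based "module" and the list-of-constants "functions" are all assumptions made for illustration.

```python
def toy_function_pass(opt_level):
    """Class decorator that turns a class with transform_function into a pass."""

    def decorate(cls):
        class Wrapped:
            def __init__(self, *args, **kwargs):
                self._inst = cls(*args, **kwargs)
                self.name = cls.__name__  # pass name defaults to the class name
                self.opt_level = opt_level

            def __call__(self, module):
                # Visit every function in the module and transform it.
                return {
                    fname: self._inst.transform_function(func, module)
                    for fname, func in module.items()
                }

        return Wrapped

    return decorate


@toy_function_pass(opt_level=1)
class ScaleConstants:
    def __init__(self, multiplier):
        self.multiplier = multiplier

    def transform_function(self, func, mod):
        # A "function" here is just a list of constants.
        return [self.multiplier * value for value in func]


toy_mod = {"main": [1.0, 2.0], "helper": [3.0]}
custom = ScaleConstants(multiplier=3.0)
result = custom(toy_mod)
print(custom.name)  # ScaleConstants
print(result)       # {'main': [3.0, 6.0], 'helper': [9.0]}
```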

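    Debug a Pass

    TVM provides a plug-and-play debugging pass: a special pass, PrintIR, dumps the IR of the whole module, so inserting it after a given pass prints the IR once that pass has finished. A slightly modified version of the sequential example, shown below, enables IR dumping after the FoldConstant optimization.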

    f = example()

    mod = tvm.IRModule.from_expr(f)

    seq = tvm.transform.Sequential(

        [

            relay.transform.FoldConstant(),

            tvm.transform.PrintIR(),

            relay.transform.EliminateCommonSubexpr(),

            relay.transform.FuseOps(),

            relay.transform.AlterOpLayout(),

        ]

    )

     

    # By inserting the ``PrintIR`` pass after ``FoldConstant``, the pass infra will

    # dump out the module IR when ``FoldConstant`` is done. Users can plug in this

    # pass after any pass they want to debug for viewing the optimization effect.

    #

    # There is a more flexible debugging mechanism also exposed by the build configuration

    # object. One can pass a tracing function which can be used to execute arbitrary code

    # before and/or after each pass. A tracing function will receive a :py:class:`tvm.IRModule`,

    # a :py:class:`tvm.transform.PassInfo` object,

    # and a boolean indicating whether you are executing before, or after a pass.

    # An example is below.

     

     

    def print_ir(mod, info, is_before):

        """Print the name of the pass, the IR, only before passes execute."""

        if is_before:

            print("Running pass: {}", info)

            print(mod)

     

     

    with tvm.transform.PassContext(opt_level=3, trace=print_ir):

        with tvm.target.Target("llvm"):

            # Perform the optimizations.

            mod = seq(mod)

    print(mod)

     

    print("done")

    Output:

    Running pass: {} The meta data of the pass: pass name: FoldConstantopt_level: 2required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);

      %1 = add(meta[relay.Constant][0], meta[relay.Constant][0]);

      %2 = multiply(%1, 2f);

      %3 = add(%0, %2);

      %4 = add(%3, meta[relay.Constant][0]);

      %5 = add(%3, meta[relay.Constant][0]);

      add(%4, %5)

    }

     

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main() {

      add(meta[relay.Constant][0], meta[relay.Constant][0])

    }

     

    Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

    InferType, ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        add(%p0, %p0)

      };

      %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */)

    }

     

     

    Running pass: {} The meta data of the pass: pass name: ToANormalFormopt_level: 1required passes: [

    ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;

      let %x1 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      let %x2 = %x1(%x);

      %x2

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main() {

      multiply(meta[relay.Constant][0], 2f)

    }

     

    Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

    InferType, ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      multiply(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        multiply(%p0, %p1)

      };

      %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */)

    }

     

    Running pass: {} The meta data of the pass: pass name: ToANormalFormopt_level: 1required passes: [

    ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      %0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main() -> Tensor[(1, 64, 54, 54), float32] {

      let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;

      let %x1 = 2f /* ty=float32 */;

      let %x2 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      let %x3 = %x2(%x, %x1);

      %x3

    }

     

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);

      %1 = add(%0, meta[relay.Constant][0]);

      %2 = add(%1, meta[relay.Constant][1]);

      %3 = add(%1, meta[relay.Constant][1]);

      add(%2, %3)

    }

     

    Running pass: {} The meta data of the pass: pass name: PrintIRopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: EliminateCommonSubexpropt_level: 3required passes: [

    InferType, ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %2)

    }

     

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

    InferType, ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

      add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]);

        %1 = add(%0, %p2);

        %2 = add(%1, %p3);

        add(%2, %2)

      };

      %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: AlterOpLayoutopt_level: 3required passes: [

    InferType, ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

        add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

    ]

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c");

        %1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c");

        %2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c");

        %3 = add(%1, %2);

        %4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c");

        %5 = add(%3, %4);

        %6 = add(%5, %5);

        layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW")

      };

      %7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)

    }

     

    def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

      %7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

        %0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;

        %1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

        %2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

        %3 = add(%1, %2) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

        %4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

        %5 = add(%3, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

        %6 = add(%5, %5) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

        layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */

      };

      %7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

    }

     

    done
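Every trace entry in the dump above starts with the literal text "Running pass: {}". That is because print_ir passes the template string and info to print as two separate arguments, so the {} placeholder is never filled in. The sketch below reproduces the effect and shows a formatted variant; FakePassInfo is a hypothetical stand-in used only so the snippet runs without TVM.

```python
import io
from contextlib import redirect_stdout


class FakePassInfo:
    """Hypothetical stand-in for a pass-info object; only __str__ matters."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return "pass name: " + self.name


def trace_as_written(info):
    print("Running pass: {}", info)  # two arguments, no formatting applied


def trace_formatted(info):
    print("Running pass: {}".format(info))  # placeholder is filled in


buf = io.StringIO()
with redirect_stdout(buf):
    trace_as_written(FakePassInfo("FoldConstant"))
    trace_formatted(FakePassInfo("FoldConstant"))
lines = buf.getvalue().splitlines()
print(lines[0])  # Running pass: {} pass name: FoldConstant
print(lines[1])  # Running pass: pass name: FoldConstant
```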

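    Summary

    This article showed how the pass infra makes it more convenient to write and invoke passes in TVM, and discussed the different ways of invoking a pass. tvm.transform.Sequential greatly simplifies handling multiple optimization passes and their dependencies. An example also illustrated how to debug a pass using PrintIR and tracing.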

  • Original article: https://www.cnblogs.com/wujianming-110117/p/14532537.html