• Troubleshooting Apache Flink with Byteman


    Introduction

    What would you do if you need to see more details of some Apache Flink application logic at runtime, but there's no logging in that code path? An option is modifying the Flink source code or the application code, recompiling and redeploying it, which is time-consuming and error-prone. A quicker and more straightforward approach is to use Byteman. It can inject Java code into JVM and retrieve the runtime details you need.

    What is Byteman

    byteman, apache flink, debugging, troubleshooting

    Byteman is a tool that makes it easy to trace, monitor, and test Java applications and JDK runtime code behavior. It can inject Java code into the application methods or into Java runtime methods without the need for you to recompile, repackage or even redeploy your application. The injected code can access any of your data and call any application methods, including the ones that are private.

    To fully unleash the power of Byteman, you can use a simple scripting language based on a formalism called Event Condition Action (ECA) rule to specify where, when, and how the original Java code should be transformed. A rule specifies a trigger point and a location where you want code to be injected. When the execution reaches the trigger point, the rule's condition, a Java boolean expression, is evaluated. The Java expression (or sequence of expressions) in the rule's action is executed only when the condition is true.

    In the next section, I will use an example to show how to leverage Byteman to retrieve more details of the underlying logic within a Flink application.


    Apache Flink Troubleshooting Case Study

    Purpose

    Checkpointing is the fault tolerance mechanism in the Apache Flink framework. When using S3 as the checkpointing destination, Flink usually leverages the Hadoop or Presto libraries for any underlying communication. However, it can sometimes be challenging to troubleshoot issues in that code path because the 3rd-party libraries don’t always contain sufficient logging. In this example, I’ll demonstrate some of Byteman’s capabilities on the logging code injection in a Flink application running on Ververica Platform.


    Preparation

    Download the latest version (which is 4.0.15 at the time of this writing) of Byteman from the official website. After decompression, you can find the required byteman.jar in the byteman-download-4.0.15/lib directory.

    important-clipart-general-information-1 Note that the bytemand-install.jar and byteman-submit.jar in the same directory are not sufficient for this use case.


    Rules

    To achieve the goal, let’s write some rules for Byteman in a plain text file (the lines start with ‘#’ are comments):

    
    # File name: rules_v1.btm
    # Start of a rule (naming the rule)
    RULE rule_example_1
    
    # Target class
    CLASS ^org.apache.flink.fs.s3.common.FlinkS3FileSystem
    
    # Target method (i.e. the constructor method)
    METHOD 
    
    # Injection position in the method (e.g. ENTRY/EXIT/LINE number/...)
    AT ENTRY
    
    # Bind a parameter for logging (same as a local variable)
    BIND myLOG:org.slf4j.Logger = org.slf4j.LoggerFactory.getLogger($0.getClass());
    
    # Trigger condition (no need in this case, so I put 'true')
    IF true
    
    # Actions to take when the rule got triggered (print the total number of parameters for the constructor method together with the values for the first and fifth parameters)
    DO myLOG.info("Hello, FlinkS3FileSystem! -- Byteman");
       myLOG.info("Total number of parameters: " + $#);
       myLOG.info("#1 hadoopS3FileSystem: " + $1);
       myLOG.info("#5 S3AccessHelper: " + $5);
    
    # End of rule
    ENDRULE
    
    # ----------------------------------
    # Another rule. (A single byteman rule file can contain multiple rule definitions.)
    
    RULE rule_example_2
    
    CLASS ^com.facebook.presto.hive.s3.PrestoS3FileSystem
    
    METHOD initialize
    
    AT EXIT
    
    BIND myLOG:org.slf4j.Logger = org.slf4j.LoggerFactory.getLogger($0.getClass());
    
    IF true
    
    DO myLOG.info("Hello, PrestoS3FileSystem! -- Byteman");
       myLOG.info("AmazonS3: [" + $0.s3 + "]");
       myLOG.info("TransferManagerConfiguration: [" + $0.transferConfig + "]");
    
    ENDRULE
    # End of file rules_v1.btm
    
    Java

    The full explanation of the rule language can be found here

    Ververica Platform Configuration

    The version of Veverica Platform in this demo is 2.5.0 and the corresponding Flink version is Flink 1.13.1. The Flink application in the demo is Top Speed Windowing from the Ververica Platform documentation.

    Firstly, we need to upload both the byteman.jar and the rules file to Ververica Platform as two artifacts:

    artifacts, ververica platform, apache flink, flink, stream processing


    Then, the next step is to add both files to the Additional Dependencies and configure env.java.opts for the Apache Flink application:

    top speeding window, flink, apache flink, ververica platform

    top speeding window, flink, apache flink, ververica platform 

    important-clipart-general-information-1 Note that the references to byteman files in env.java.opts must be absolute paths. Otherwise, it will fail. In VVP Deployment, the default local path for additional dependencies is /flink/usrlib, so the complete configuration looks like below:

    
    env.java.opts: >-
    -javaagent:/flink/usrlib/byteman.jar=script:/flink/usrlib/rules_v1.btm,boot:/flink/usrlib/byteman.jar
    
    Java

    Results

    After implementing the above changes, the Deployment will automatically restart. To check the results, click the Flink UI button at the top of the Deployment page and then click the Task Manager tab on the left of the Flink page. 

    Flink UI, top speeding window, Flink, byteman, Flink troubleshooting, Flink debugging

    flink task manager, apache flink, flink debugging, flink troubleshooting, flink UI


    We should be able to see the TM logging after going into the TM page and clicking the Logs tab. Now the customized logging information will be shown after searching the ‘Byteman’ in the logging area:

    flink task manager, apache flink, flink debugging, flink troubleshooting, flink UI,n2


    Summary

    Byteman is a very powerful tool that can diagnose most Java-related application issues. If you are willing to learn quickly how to best utilize Byteman, please refer to this quick tutorial. It can save you many hours when troubleshooting complex issues in your application, especially in cases when other troubleshooting methods might have failed. This article provided only a brief introduction to Byteman, and the example above only reveals a tiny fraction of its capabilities. To get more information, please check the Byteman website. For additional troubleshooting and debugging tips make sure to check our Ververica Troubleshooting & Operations training and sign up for the next available training date on our website. 

  • 相关阅读:
    省市联级(DataReader绑定)
    中国六大最忙和六大最懒城市
    JavaScript极品小日历
    人生最重要的十个健康伴侣
    JavaScript 中的replace方法
    在VBScript中使用类
    使用嵌套触发器
    MM上街前的折腾(有趣)
    浅谈ASP中Web页面间的数据传递
    图片容错处理
  • 原文地址:https://www.cnblogs.com/felixzh/p/16174584.html
Copyright © 2020-2023  润新知