• spark-submit submission mode test demo


    A small demo to walk through the workflow of submitting a Spark application.

    The Maven pom file:

    <properties>
            <maven.compiler.source>1.7</maven.compiler.source>
            <maven.compiler.target>1.7</maven.compiler.target>
            <encoding>UTF-8</encoding>
            <spark.version>1.6.1</spark.version>
      </properties>
    
      <dependencies>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.10</artifactId>
                <version>${spark.version}</version>
            </dependency>
    
            <dependency>
                <groupId>redis.clients</groupId>
                <artifactId>jedis</artifactId>
                <version>2.7.1</version>
            </dependency>
    
      </dependencies>
      
       <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <configuration>
                        <source>1.7</source>
                        <target>1.7</target>
                    </configuration>
                </plugin>
            
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-shade-plugin</artifactId>
                    <version>2.4.3</version>
                    <executions>
                        <execution>
                            <phase>package</phase>
                            <goals>
                                <goal>shade</goal>
                            </goals>
                            <configuration>
                                <filters>
                                    <filter>
                                        <artifact>*:*</artifact>
                                        <excludes>
                                            <exclude>META-INF/*.SF</exclude>
                                            <exclude>META-INF/*.DSA</exclude>
                                            <exclude>META-INF/*.RSA</exclude>
                                        </excludes>
                                    </filter>
                                </filters>
                            </configuration>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
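
    With the maven-shade-plugin bound to the package phase as above, a plain Maven build produces the submittable fat jar (standard Maven usage; the jar ends up under target/):

    mvn clean package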

    A Monte Carlo Pi estimator: sample points uniformly in the square [-1, 1] × [-1, 1]; the fraction that lands inside the unit circle tends to π/4, so π ≈ 4 · count / n.

    import java.util.ArrayList;
    import java.util.List;
    
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.api.java.function.Function2;
    
    import redis.clients.jedis.Jedis;
    
    /** 
     * Computes an approximation to pi
     * Usage: JavaSparkPi [slices]
     */
    public final class JavaSparkPi {
    
      public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("JavaSparkPi")/*.setMaster("local[2]")*/;
        JavaSparkContext jsc = new JavaSparkContext(sparkConf);
        
        // connect to Redis, where the final Pi estimate will be stored
        Jedis jedis = new Jedis("192.168.49.151", 19000);
        int slices = (args.length == 1) ? Integer.parseInt(args[0]) : 2;
        int n = 100000 * slices;
        List<Integer> l = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
          l.add(i);
        }
    
        JavaRDD<Integer> dataSet = jsc.parallelize(l, slices);
    
        // map each element to 1 if a random point lands inside the unit circle, else 0
        int count = dataSet.map(new Function<Integer, Integer>() {
          @Override
          public Integer call(Integer integer) {
            double x = Math.random() * 2 - 1;
            double y = Math.random() * 2 - 1;
            return (x * x + y * y < 1) ? 1 : 0;
          }
        }).reduce(new Function2<Integer, Integer, Integer>() {
          @Override
          public Integer call(Integer integer, Integer integer2) {
            return integer + integer2;
          }
        });
    
        jedis.set("Pi", String.valueOf(4.0 * count / n));
        System.out.println("Pi is roughly " + 4.0 * count / n);

        // release the external connection before shutting down the context
        jedis.close();
        jsc.stop();
      }
    }
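
    To verify that the job wrote its result, the value can be read back from Redis. A minimal sketch (assuming the same Redis host and port as the job; the class name is illustrative):

    import redis.clients.jedis.Jedis;

    // Reads back the "Pi" key written by JavaSparkPi above.
    public final class ReadPi {
      public static void main(String[] args) {
        Jedis jedis = new Jedis("192.168.49.151", 19000);
        try {
          System.out.println("Pi from Redis: " + jedis.get("Pi"));
        } finally {
          jedis.close();
        }
      }
    }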

    Precondition: setMaster("local[2]") is NOT hard-coded in the program (note the commented-out call above), so the master comes from the --master flag passed to spark-submit.
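
    To keep the same class runnable from an IDE as well, a common pattern is to fall back to a local master only when spark-submit supplied none. A minimal sketch (the helper class and the fallback check are illustrative, not part of the original demo):

    import org.apache.spark.SparkConf;

    // Falls back to local[2] only when no master was set via --master,
    // so a submitted jar still honors the spark-submit flag.
    public final class ConfBuilder {
      public static SparkConf build() {
        SparkConf conf = new SparkConf().setAppName("JavaSparkPi");
        if (!conf.contains("spark.master")) { // nothing set by spark-submit
          conf.setMaster("local[2]");         // IDE/local debugging only
        }
        return conf;
      }
    }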


    Local mode test: # Run application locally on 8 cores

    spark-submit \
    --master local[8] \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 4g \
    --executor-cores 4 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Result: the application runs locally with up to 8 tasks in parallel; nothing appears in the standalone master Web UI on port 8080, since no cluster is involved.

    -------------------------------------

    spark-submit \
    --master local[8] \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 8G \
    --total-executor-cores 8 \
    hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar 100

    Error: java.lang.ClassNotFoundException: com.spark.test.JavaSparkPi. In local mode the driver loads the application jar from the local filesystem; an hdfs:// jar path is not fetched for the driver, so the class cannot be found.

    ------------------------------------

    spark-submit \
    --master local[8] \
    --deploy-mode cluster \
    --supervise \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 8G \
    --total-executor-cores 8 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Error: Cluster deploy mode is not compatible with master "local". With a local master there is no cluster that could host the driver.


    ====================================================================


    Standalone cluster, client deploy mode # Run on a Spark standalone cluster in client deploy mode

    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 8g \
    --total-executor-cores 8 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Result: the job runs successfully.

    -------------------------------------------
    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 4g \
    --executor-cores 4 \
    hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar 100

    Error: java.lang.ClassNotFoundException: com.spark.test.JavaSparkPi. Same cause as in local mode: in client deploy mode the driver runs on the submitting machine and does not fetch an hdfs:// jar.

    =======================================================================

    Standalone cluster, cluster deploy mode # Run on a Spark standalone cluster in cluster deploy mode with supervise

    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --deploy-mode cluster \
    --supervise \
    --executor-memory 4g \
    --executor-cores 4 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Error: java.io.FileNotFoundException: /home/dinpay/test/Spark-SubmitTest.jar (No such file or directory). In cluster deploy mode the driver is launched on an arbitrary worker, so a local jar path would have to exist on every worker node; the jar should live somewhere globally visible, such as HDFS.
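
    One way to make the jar globally visible is to upload it to HDFS first (an illustrative command; the paths mirror the ones used in this cluster):

    hadoop fs -put /home/dinpay/test/Spark-SubmitTest.jar hdfs://192.168.46.163:9000/home/test/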

    -------------------------------------------

    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --deploy-mode cluster \
    --supervise \
    --driver-memory 4g \
    --driver-cores 4 \
    --executor-memory 2g \
    --total-executor-cores 4 \
    hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar 100

    Result: the job runs successfully.
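
    A driver submitted in cluster mode can also be checked or killed afterwards (a sketch, assuming the standalone master's REST port, default 6066; <submission-id> is a placeholder for the ID printed at submit time):

    spark-submit --master spark://hadoop-namenode-02:6066 --status <submission-id>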

    =============================================

    If .setMaster("local[2]") is hard-coded in the program, a cluster-mode submit still launches a driver, but no corresponding application runs in parallel on the cluster. (Port 6066 below is the standalone master's REST submission gateway, used for cluster deploy mode.)

    spark-submit --deploy-mode cluster \
    --master spark://hadoop-namenode-02:6066 \
    --class com.dinpay.bdp.rcp.service.Window12HzStat \
    --driver-memory 2g \
    --driver-cores 2 \
    --executor-memory 1g \
    --total-executor-cores 2 \
    hdfs://192.168.46.163:9000/home/dinpay/RCP-HZ-TASK-0.0.1-SNAPSHOT.jar

    Because .setMaster("local[2]") is fixed in the code, the submission effectively degrades to local mode: one worker is picked and the whole job runs locally on it.

  • Original post: https://www.cnblogs.com/atomicbomb/p/6999594.html