• Installing Spark on a single Linux machine


    Before installing Spark, install the JDK and Scala first.

    1. Create the installation directory

    > mkdir  /opt/spark

    > cd  /opt/spark

    2. Unpack the archive and create a symbolic link

    > tar  zxvf  spark-2.3.0-bin-hadoop2.7.tgz

    > ln -s spark-2.3.0-bin-hadoop2.7  spark
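    A quick optional check (my own addition, not part of the original steps) confirms the symlink resolves to the unpacked release:

```shell
# Print the symlink and its target if it exists, otherwise a hint.
ls -ld /opt/spark/spark 2>/dev/null || echo "symlink not created yet"
```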

    3. Edit /etc/profile

    > vi /etc/profile

    Add the following lines:

    export SPARK_HOME=/opt/spark/spark
    export PATH=$PATH:$SPARK_HOME/bin

    > source  /etc/profile
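    To verify the new variables took effect in the current shell, a small sanity check (a sketch, not from the original article) is:

```shell
# Print SPARK_HOME and check whether spark-submit is now on PATH.
echo "SPARK_HOME=$SPARK_HOME"
command -v spark-submit || echo "spark-submit not on PATH yet; re-check /etc/profile"
```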

    4. Enter the configuration directory

    > cd /opt/spark/spark/conf

    5. Configure spark-env.sh

    > cp spark-env.sh.template spark-env.sh

    Add the following to spark-env.sh:

    export SCALA_HOME=/opt/scala/scala
    export JAVA_HOME=/opt/java/jdk
    export SPARK_HOME=/opt/spark/spark
    export SPARK_MASTER_IP=hserver1
    export SPARK_EXECUTOR_MEMORY=1G

    Note: adjust the paths above to match your own installation.
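    A wrong path here is a common cause of startup failures, so it helps to confirm that every directory referenced in spark-env.sh actually exists. A minimal sketch, assuming the example paths above:

```shell
# Report which of the configured directories exist on this machine.
for dir in /opt/scala/scala /opt/java/jdk /opt/spark/spark; do
  if [ -d "$dir" ]; then
    echo "ok: $dir"
  else
    echo "missing: $dir"
  fi
done
```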

    6. Configure slaves

    > cp slaves.template  slaves

    Add the following to slaves:

    localhost
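    With slaves pointing at localhost, the standalone master and a worker can be started together using the start-all.sh script that ships with Spark. A guarded sketch (the SPARK_HOME default is an assumption matching this guide's layout):

```shell
# Start the standalone master and workers if Spark is installed where expected.
SPARK_HOME=${SPARK_HOME:-/opt/spark/spark}
if [ -x "$SPARK_HOME/sbin/start-all.sh" ]; then
  "$SPARK_HOME/sbin/start-all.sh"
else
  echo "Spark not found at $SPARK_HOME; adjust SPARK_HOME first"
fi
```

    After a successful start, the master's web UI is normally served on port 8080.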

    7. Run a Spark example

    > cd /opt/spark/spark

    > ./bin/run-example SparkPi 10

    You should see output similar to the following:

    [aston@localhost spark]$ ./bin/run-example SparkPi 10
    2018-06-04 22:37:25 WARN  Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.199.150 instead (on interface wlp8s0b1)
    2018-06-04 22:37:25 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
    2018-06-04 22:37:25 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2018-06-04 22:37:25 INFO  SparkContext:54 - Running Spark version 2.3.0
    2018-06-04 22:37:25 INFO  SparkContext:54 - Submitted application: Spark Pi
    2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing view acls to: aston
    2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing modify acls to: aston
    2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing view acls groups to: 
    2018-06-04 22:37:26 INFO  SecurityManager:54 - Changing modify acls groups to: 
    2018-06-04 22:37:26 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(aston); groups with view permissions: Set(); users  with modify permissions: Set(aston); groups with modify permissions: Set()
    2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 34729.
    2018-06-04 22:37:26 INFO  SparkEnv:54 - Registering MapOutputTracker
    2018-06-04 22:37:26 INFO  SparkEnv:54 - Registering BlockManagerMaster
    2018-06-04 22:37:26 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
    2018-06-04 22:37:26 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
    2018-06-04 22:37:26 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-4d51d515-85db-4a8c-bb45-219fd96be3c6
    2018-06-04 22:37:26 INFO  MemoryStore:54 - MemoryStore started with capacity 366.3 MB
    2018-06-04 22:37:26 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
    2018-06-04 22:37:26 INFO  log:192 - Logging initialized @2296ms
    2018-06-04 22:37:26 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
    2018-06-04 22:37:26 INFO  Server:414 - Started @2382ms
    2018-06-04 22:37:26 INFO  AbstractConnector:278 - Started ServerConnector@779dfe55{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
    2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f212d84{/jobs,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27ead29e{/jobs/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4c060c8f{/jobs/job,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@383f3558{/jobs/job/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49b07ee3{/stages,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@352e612e{/stages/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65f00478{/stages/stage,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@28486680{/stages/stage/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4d7e7435{/stages/pool,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4a1e3ac1{/stages/pool/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e78fcf5{/storage,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@56febdc{/storage/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b8ee898{/storage/rdd,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7d151a{/storage/rdd/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@294bdeb4{/environment,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5300f14a{/environment/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1f86099a{/executors,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@77bb0ab5{/executors/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@f2c488{/executors/threadDump,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54acff7d{/executors/threadDump/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7bc9e6ab{/static,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37d00a23{/,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@433e536f{/api,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@988246e{/jobs/job/kill,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62515a47{/stages/stage/kill,null,AVAILABLE,@Spark}
    2018-06-04 22:37:26 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://192.168.199.150:4040
    2018-06-04 22:37:26 INFO  SparkContext:54 - Added JAR file:///opt/spark/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar with timestamp 1528123046779
    2018-06-04 22:37:26 INFO  SparkContext:54 - Added JAR file:///opt/spark/spark/examples/jars/scopt_2.11-3.7.0.jar at spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar with timestamp 1528123046780
    2018-06-04 22:37:26 INFO  Executor:54 - Starting executor ID driver on host localhost
    2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45436.
    2018-06-04 22:37:26 INFO  NettyBlockTransferService:54 - Server created on 192.168.199.150:45436
    2018-06-04 22:37:26 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
    2018-06-04 22:37:26 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, 192.168.199.150, 45436, None)
    2018-06-04 22:37:26 INFO  BlockManagerMasterEndpoint:54 - Registering block manager 192.168.199.150:45436 with 366.3 MB RAM, BlockManagerId(driver, 192.168.199.150, 45436, None)
    2018-06-04 22:37:26 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, 192.168.199.150, 45436, None)
    2018-06-04 22:37:26 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, 192.168.199.150, 45436, None)
    2018-06-04 22:37:27 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65bcf7c2{/metrics/json,null,AVAILABLE,@Spark}
    2018-06-04 22:37:27 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
    2018-06-04 22:37:27 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
    2018-06-04 22:37:27 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
    2018-06-04 22:37:27 INFO  DAGScheduler:54 - Parents of final stage: List()
    2018-06-04 22:37:27 INFO  DAGScheduler:54 - Missing parents: List()
    2018-06-04 22:37:27 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
    2018-06-04 22:37:27 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
    2018-06-04 22:37:28 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1181.0 B, free 366.3 MB)
    2018-06-04 22:37:28 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 192.168.199.150:45436 (size: 1181.0 B, free: 366.3 MB)
    2018-06-04 22:37:28 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
    2018-06-04 22:37:28 INFO  DAGScheduler:54 - Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
    2018-06-04 22:37:28 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 10 tasks
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 2.0 in stage 0.0 (TID 2)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 3.0 in stage 0.0 (TID 3)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
    2018-06-04 22:37:28 INFO  Executor:54 - Fetching spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar with timestamp 1528123046780
    2018-06-04 22:37:28 INFO  TransportClientFactory:267 - Successfully created connection to /192.168.199.150:34729 after 34 ms (0 ms spent in bootstraps)
    2018-06-04 22:37:28 INFO  Utils:54 - Fetching spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar to /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/fetchFileTemp8606784681518533462.tmp
    2018-06-04 22:37:28 INFO  Executor:54 - Adding file:/tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/scopt_2.11-3.7.0.jar to class loader
    2018-06-04 22:37:28 INFO  Executor:54 - Fetching spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar with timestamp 1528123046779
    2018-06-04 22:37:28 INFO  Utils:54 - Fetching spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar to /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/fetchFileTemp8435156876449095794.tmp
    2018-06-04 22:37:28 INFO  Executor:54 - Adding file:/tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/spark-examples_2.11-2.3.0.jar to class loader
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 2.0 in stage 0.0 (TID 2). 867 bytes result sent to driver
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 4.0 in stage 0.0 (TID 4)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 5.0 in stage 0.0 (TID 5)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 3.0 in stage 0.0 (TID 3). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 7.0 in stage 0.0 (TID 7)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 6.0 in stage 0.0 (TID 6)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 362 ms on localhost (executor driver) (1/10)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 3.0 in stage 0.0 (TID 3) in 385 ms on localhost (executor driver) (2/10)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 418 ms on localhost (executor driver) (3/10)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 2.0 in stage 0.0 (TID 2) in 388 ms on localhost (executor driver) (4/10)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 5.0 in stage 0.0 (TID 5). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 5.0 in stage 0.0 (TID 5) in 79 ms on localhost (executor driver) (5/10)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 8.0 in stage 0.0 (TID 8)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 4.0 in stage 0.0 (TID 4). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 7853 bytes)
    2018-06-04 22:37:28 INFO  Executor:54 - Running task 9.0 in stage 0.0 (TID 9)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 4.0 in stage 0.0 (TID 4) in 99 ms on localhost (executor driver) (6/10)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 7.0 in stage 0.0 (TID 7). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 7.0 in stage 0.0 (TID 7) in 98 ms on localhost (executor driver) (7/10)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 6.0 in stage 0.0 (TID 6). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 6.0 in stage 0.0 (TID 6) in 107 ms on localhost (executor driver) (8/10)
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 9.0 in stage 0.0 (TID 9). 824 bytes result sent to driver
    2018-06-04 22:37:28 INFO  Executor:54 - Finished task 8.0 in stage 0.0 (TID 8). 867 bytes result sent to driver
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 9.0 in stage 0.0 (TID 9) in 39 ms on localhost (executor driver) (9/10)
    2018-06-04 22:37:28 INFO  TaskSetManager:54 - Finished task 8.0 in stage 0.0 (TID 8) in 57 ms on localhost (executor driver) (10/10)
    2018-06-04 22:37:28 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
    2018-06-04 22:37:28 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 0.800 s
    2018-06-04 22:37:28 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 0.945853 s
    Pi is roughly 3.14023914023914
    2018-06-04 22:37:28 INFO  AbstractConnector:318 - Stopped Spark@779dfe55{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
    2018-06-04 22:37:28 INFO  SparkUI:54 - Stopped Spark web UI at http://192.168.199.150:4040
    2018-06-04 22:37:28 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
    2018-06-04 22:37:28 INFO  MemoryStore:54 - MemoryStore cleared
    2018-06-04 22:37:28 INFO  BlockManager:54 - BlockManager stopped
    2018-06-04 22:37:28 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
    2018-06-04 22:37:28 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
    2018-06-04 22:37:28 INFO  SparkContext:54 - Successfully stopped SparkContext
    2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Shutdown hook called
    2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e
    2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-16300765-9872-4542-91ed-1a7a0f8285d9

    8. Run the Spark shell

    > cd /opt/spark/spark

    > ./bin/spark-shell
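    Once the shell starts, a quick non-interactive smoke test (my sketch; assumes spark-shell is on PATH) is to pipe a one-line job into it:

```shell
# Sum the integers 1..100 on an RDD; spark-shell exits after reading stdin.
if command -v spark-shell >/dev/null 2>&1; then
  echo 'println(sc.parallelize(1 to 100).sum)' | spark-shell 2>/dev/null
else
  echo "spark-shell not on PATH"
fi
```

    The job's result (5050.0) should appear among the shell's startup output.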

  • Original article: https://www.cnblogs.com/aston/p/9136379.html