• Spark 1.3.0 Standalone Installation (Single Node)


    1. Test Environment

    CentOS 6.6 minimal install; hostname spark-test, IP 10.10.10.26

    An OpenStack virtual cloud instance.

    Note: the installation flow is: log in to Linux -> install the JDK -> install Scala -> install Spark.


    2. Install the JDK

    Download the JDK:

    Version jdk-6u45-linux-x64.bin, available from the Oracle website.

    Create a /data directory to hold the downloaded packages:

    # mkdir /data
    
    [root@spark-test data]# ls 
    jdk-6u45-linux-x64.bin  scala-2.11.6.tgz  spark-1.3.0-bin-hadoop2.4.tgz
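
    If you prefer to fetch the tarballs from the command line, something like the following should work (the mirror URLs are assumptions -- check them against the download pages referenced below; the Oracle JDK usually has to be downloaded through a browser because of the license click-through):

    # assumed archive URLs for the Scala and Spark tarballs
    [root@spark-test data]# wget http://downloads.typesafe.com/scala/2.11.6/scala-2.11.6.tgz 
    [root@spark-test data]# wget https://archive.apache.org/dist/spark/spark-1.3.0/spark-1.3.0-bin-hadoop2.4.tgz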

    Install the JDK (the self-extracting .bin unpacks into the current directory, here /data):

    [root@spark-test data]# chmod u+x jdk-6u45-linux-x64.bin      # add execute permission 
    [root@spark-test data]# ./jdk-6u45-linux-x64.bin

    Append the Java environment variables to /etc/profile:

    [root@spark-test data]# vim /etc/profile
    
    #JAVA VARIABLES START 
    export JAVA_HOME=/data/jdk1.6.0_45 
    export PATH=$PATH:$JAVA_HOME/bin 
    #JAVA VARIABLES END
    
    [root@spark-test data]# source /etc/profile 
    [root@spark-test data]# java -version 
    java version "1.6.0_45" 
    Java(TM) SE Runtime Environment (build 1.6.0_45-b06) 
    Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

    3. Install Scala

    Download Scala 2.11.6 from http://www.scala-lang.org/download/2.11.6.html (Note: the pre-built Spark 1.3.0 package bundles its own Scala 2.10 runtime, so this system-wide Scala install is mainly for compiling and running your own code.)

    [Image: Scala 2.11.6 download page]

    Unpack Scala:

    [root@spark-test data]# tar -zxvf  scala-2.11.6.tgz

    Append the Scala environment variables to /etc/profile:

    [root@spark-test data]# vim /etc/profile
    
    #SCALA VARIABLES START 
    export SCALA_HOME=/data/scala-2.11.6 
    export PATH=$PATH:$SCALA_HOME/bin 
    #SCALA VARIABLES END
    
    [root@spark-test data]# source /etc/profile 
    [root@spark-test data]# scala -version 
    Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL

    Scala is configured successfully.
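
    As an extra smoke test, the scala runner can evaluate an expression directly (a minimal sketch):

    [root@spark-test data]# scala -e 'println("2 + 2 = " + (2 + 2))' 
    2 + 2 = 4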

    4. Install Spark

    Download Spark from the official site: http://spark.apache.org/downloads.html

    [Image: Spark download page]

    Choose the pre-built binary package (here spark-1.3.0-bin-hadoop2.4, built against Hadoop 2.4).

    Unpack it:

    [root@spark-test data]# tar -zxvf spark-1.3.0-bin-hadoop2.4.tgz

    Append the Spark environment variables to /etc/profile:

    [root@spark-test data]# vim /etc/profile
    
    #SPARK VARIABLES START 
    export SPARK_HOME=/data/spark-1.3.0-bin-hadoop2.4 
    export PATH=$PATH:$SPARK_HOME/bin 
    #SPARK VARIABLES END
    
    [root@spark-test data]# source /etc/profile
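
    For reference, the three profile additions above are equivalent to this one-shot append (a sketch; adjust the paths if your layout differs):

    # append all three variable blocks at once (same effect as the edits above)
    cat >> /etc/profile <<'EOF'
    export JAVA_HOME=/data/jdk1.6.0_45
    export SCALA_HOME=/data/scala-2.11.6
    export SPARK_HOME=/data/spark-1.3.0-bin-hadoop2.4
    export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin
    EOF
    source /etc/profile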

    Switch to the conf directory and create spark-env.sh from its template:

    [root@spark-test conf]# ls 
    fairscheduler.xml.template   slaves.template 
    log4j.properties.template    spark-defaults.conf.template 
    metrics.properties.template  spark-env.sh.template 
    [root@spark-test conf]# mv spark-env.sh.template spark-env.sh
    
    [root@spark-test conf]# vim spark-env.sh 
    
    export SCALA_HOME=/data/scala-2.11.6 
    export JAVA_HOME=/data/jdk1.6.0_45 
    export SPARK_MASTER_IP=10.10.10.26 
    export SPARK_WORKER_MEMORY=1024m 
    # the standalone master listens on port 7077 by default; the original
    # walkthrough had "export master=spark://10.10.10.26:7070" here, which is
    # why the first sample run at the end of this post fails to connect
    export MASTER=spark://10.10.10.26:7077
    
    [root@spark-test conf]# vim slaves 
    
    spark-test
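
    A practical note before starting the daemons: start-all.sh reaches every host listed in slaves over SSH, including this single node, so key-based login to spark-test itself avoids a password prompt (a sketch, assuming no key pair exists yet):

    [root@spark-test ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa 
    [root@spark-test ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 
    [root@spark-test ~]# chmod 600 ~/.ssh/authorized_keys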

    Start the Spark cluster:

    [root@spark-test sbin]# pwd 
    /data/spark-1.3.0-bin-hadoop2.4/sbin 
    [root@spark-test sbin]# ./start-all.sh

    Verify that both daemons are running:

    [root@spark-test sbin]# jps 
    22974 Worker 
    23395 Jps 
    22830 Master
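
    The master's web UI offers a second check; it listens on port 8080 by default (assumption: the default UI port is unchanged), so the page title can be fetched from the node itself:

    [root@spark-test sbin]# curl -s http://10.10.10.26:8080 | grep -o '<title>[^<]*</title>'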

    Test with the bundled SparkPi example:

    Switch to the bin directory:

    [root@spark-test bin]# pwd 
    /data/spark-1.3.0-bin-hadoop2.4/bin

    Run the sample job:

    [root@spark-test spark-1.3.0-bin-hadoop2.4]# ./bin/run-example org.apache.spark.examples.SparkPi 
    Spark assembly has been built with Hive, including Datanucleus jars on classpath 
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
    15/04/01 11:40:48 INFO SparkContext: Running Spark version 1.3.0 
    15/04/01 11:40:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
    15/04/01 11:40:49 INFO SecurityManager: Changing view acls to: root 
    15/04/01 11:40:49 INFO SecurityManager: Changing modify acls to: root 
    15/04/01 11:40:49 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root) 
    15/04/01 11:40:49 INFO Slf4jLogger: Slf4jLogger started 
    15/04/01 11:40:49 INFO Remoting: Starting remoting 
    15/04/01 11:40:50 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@spark-test.novalocal:58680] 
    15/04/01 11:40:50 INFO Utils: Successfully started service 'sparkDriver' on port 58680. 
    15/04/01 11:40:50 INFO SparkEnv: Registering MapOutputTracker 
    15/04/01 11:40:50 INFO SparkEnv: Registering BlockManagerMaster 
    15/04/01 11:40:50 INFO DiskBlockManager: Created local directory at /tmp/spark-53cdf980-4803-480f-8936-2b3bb7e2bbfc/blockmgr-c15cfa29-3bfb-4ee8-a0d3-b9735bfe9dea 
    15/04/01 11:40:50 INFO MemoryStore: MemoryStore started with capacity 265.0 MB 
    15/04/01 11:40:50 INFO HttpFileServer: HTTP File server directory is /tmp/spark-22f9b0df-bfdb-435d-b504-ab1c52b73556/httpd-244e5d7f-9c1d-48d8-bd95-2ed985ecb3a0 
    15/04/01 11:40:50 INFO HttpServer: Starting HTTP Server 
    15/04/01 11:40:50 INFO Server: jetty-8.y.z-SNAPSHOT 
    15/04/01 11:40:50 INFO AbstractConnector: Started SocketConnector@0.0.0.0:59040 
    15/04/01 11:40:50 INFO Utils: Successfully started service 'HTTP file server' on port 59040. 
    15/04/01 11:40:50 INFO SparkEnv: Registering OutputCommitCoordinator 
    15/04/01 11:40:50 INFO Server: jetty-8.y.z-SNAPSHOT 
    15/04/01 11:40:50 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040 
    15/04/01 11:40:50 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
    15/04/01 11:40:50 INFO SparkUI: Started SparkUI at http://spark-test.novalocal:4040 
    15/04/01 11:40:51 INFO SparkContext: Added JAR file:/data/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar at http://10.10.10.26:59040/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427859651127 
    15/04/01 11:40:51 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@10.10.10.26:7070/user/Master... 
    15/04/01 11:40:51 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.10.10.26:7070: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.10.10.26:7070 
    15/04/01 11:40:51 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.10.10.26:7070]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.10.10.26:7070 
    15/04/01 11:41:11 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@10.10.10.26:7070/user/Master... 
    15/04/01 11:41:11 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.10.10.26:7070: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.10.10.26:7070 
    15/04/01 11:41:11 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.10.10.26:7070]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.10.10.26:7070 
    15/04/01 11:41:31 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@10.10.10.26:7070/user/Master... 
    15/04/01 11:41:31 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.10.10.26:7070: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.10.10.26:7070 
    15/04/01 11:41:31 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.10.10.26:7070]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.10.10.26:7070 
    15/04/01 11:41:51 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. 
    15/04/01 11:41:51 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up. 
    15/04/01 11:41:51 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet. 
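
    The first attempt fails: the job was pointed at spark://10.10.10.26:7070, while the standalone master listens on port 7077 by default, so every connection is refused until the application gives up. With the corrected MASTER setting in spark-env.sh, the example can also be submitted to the standalone master explicitly (a sketch; run-example in Spark 1.x reads the MASTER environment variable):

    # assumes the master is up on the default port 7077
    [root@spark-test spark-1.3.0-bin-hadoop2.4]# MASTER=spark://10.10.10.26:7077 ./bin/run-example org.apache.spark.examples.SparkPi

    The retry below was made without a master URL, so run-example falls back to local mode (note the "Starting executor ID <driver> on host localhost" line) and completes:
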
    [root@spark-test spark-1.3.0-bin-hadoop2.4]# ./bin/run-example org.apache.spark.examples.SparkPi 
    Spark assembly has been built with Hive, including Datanucleus jars on classpath 
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
    15/04/01 11:53:22 INFO SparkContext: Running Spark version 1.3.0 
    15/04/01 11:53:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
    15/04/01 11:53:22 INFO SecurityManager: Changing view acls to: root 
    15/04/01 11:53:22 INFO SecurityManager: Changing modify acls to: root 
    15/04/01 11:53:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root) 
    15/04/01 11:53:23 INFO Slf4jLogger: Slf4jLogger started 
    15/04/01 11:53:23 INFO Remoting: Starting remoting 
    15/04/01 11:53:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@spark-test.novalocal:55722] 
    15/04/01 11:53:23 INFO Utils: Successfully started service 'sparkDriver' on port 55722. 
    15/04/01 11:53:23 INFO SparkEnv: Registering MapOutputTracker 
    15/04/01 11:53:23 INFO SparkEnv: Registering BlockManagerMaster 
    15/04/01 11:53:23 INFO DiskBlockManager: Created local directory at /tmp/spark-d70142c7-effd-40c0-b050-f39d727d6e33/blockmgr-6d5699cc-acf8-4ab9-8b39-dfb5385209e5 
    15/04/01 11:53:23 INFO MemoryStore: MemoryStore started with capacity 265.0 MB 
    15/04/01 11:53:23 INFO HttpFileServer: HTTP File server directory is /tmp/spark-5821f748-ecf7-4e24-a593-ff2c2b040b43/httpd-90a05ad6-f73b-4a52-9a61-0ff135f449a9 
    15/04/01 11:53:23 INFO HttpServer: Starting HTTP Server 
    15/04/01 11:53:23 INFO Server: jetty-8.y.z-SNAPSHOT 
    15/04/01 11:53:23 INFO AbstractConnector: Started SocketConnector@0.0.0.0:43969 
    15/04/01 11:53:23 INFO Utils: Successfully started service 'HTTP file server' on port 43969. 
    15/04/01 11:53:23 INFO SparkEnv: Registering OutputCommitCoordinator 
    15/04/01 11:53:23 INFO Server: jetty-8.y.z-SNAPSHOT 
    15/04/01 11:53:23 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040 
    15/04/01 11:53:23 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
    15/04/01 11:53:23 INFO SparkUI: Started SparkUI at http://spark-test.novalocal:4040 
    15/04/01 11:53:23 INFO SparkContext: Added JAR file:/data/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar at http://10.10.10.26:43969/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427860403997 
    15/04/01 11:53:24 INFO Executor: Starting executor ID <driver> on host localhost 
    15/04/01 11:53:24 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@spark-test.novalocal:55722/user/HeartbeatReceiver 
    15/04/01 11:53:24 INFO NettyBlockTransferService: Server created on 39015 
    15/04/01 11:53:24 INFO BlockManagerMaster: Trying to register BlockManager 
    15/04/01 11:53:24 INFO BlockManagerMasterActor: Registering block manager localhost:39015 with 265.0 MB RAM, BlockManagerId(<driver>, localhost, 39015) 
    15/04/01 11:53:24 INFO BlockManagerMaster: Registered BlockManager 
    15/04/01 11:53:24 INFO SparkContext: Starting job: reduce at SparkPi.scala:35 
    15/04/01 11:53:24 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:35) with 2 output partitions (allowLocal=false) 
    15/04/01 11:53:24 INFO DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35) 
    15/04/01 11:53:24 INFO DAGScheduler: Parents of final stage: List() 
    15/04/01 11:53:24 INFO DAGScheduler: Missing parents: List() 
    15/04/01 11:53:24 INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:31), which has no missing parents 
    15/04/01 11:53:24 INFO MemoryStore: ensureFreeSpace(1848) called with curMem=0, maxMem=277842493 
    15/04/01 11:53:24 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1848.0 B, free 265.0 MB) 
    15/04/01 11:53:24 INFO MemoryStore: ensureFreeSpace(1296) called with curMem=1848, maxMem=277842493 
    15/04/01 11:53:24 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1296.0 B, free 265.0 MB) 
    15/04/01 11:53:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:39015 (size: 1296.0 B, free: 265.0 MB) 
    15/04/01 11:53:24 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0 
    15/04/01 11:53:24 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839 
    15/04/01 11:53:24 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:31) 
    15/04/01 11:53:24 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks 
    15/04/01 11:53:24 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1336 bytes) 
    15/04/01 11:53:24 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1336 bytes) 
    15/04/01 11:53:24 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 
    15/04/01 11:53:24 INFO Executor: Running task 1.0 in stage 0.0 (TID 1) 
    15/04/01 11:53:24 INFO Executor: Fetching http://10.10.10.26:43969/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427860403997 
    15/04/01 11:53:24 INFO Utils: Fetching http://10.10.10.26:43969/jars/spark-examples-1.3.0-hadoop2.4.0.jar to /tmp/spark-7cb47603-adb9-45ea-ad91-e5ddc3c6da41/userFiles-86c97d54-c082-4bb8-bcb3-34b97a432674/fetchFileTemp3928503400699723858.tmp 
    15/04/01 11:53:25 INFO Executor: Adding file:/tmp/spark-7cb47603-adb9-45ea-ad91-e5ddc3c6da41/userFiles-86c97d54-c082-4bb8-bcb3-34b97a432674/spark-examples-1.3.0-hadoop2.4.0.jar to class loader 
    15/04/01 11:53:25 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 736 bytes result sent to driver 
    15/04/01 11:53:25 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 736 bytes result sent to driver 
    15/04/01 11:53:25 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1068 ms on localhost (1/2) 
    15/04/01 11:53:25 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1028 ms on localhost (2/2) 
    15/04/01 11:53:25 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
    15/04/01 11:53:25 INFO DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) finished in 1.107 s 
    15/04/01 11:53:25 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:35, took 1.326417 s 
    Pi is roughly 3.13518 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null} 
    15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null} 
    15/04/01 11:53:25 INFO SparkUI: Stopped Spark web UI at http://spark-test.novalocal:4040 
    15/04/01 11:53:25 INFO DAGScheduler: Stopping DAGScheduler 
    15/04/01 11:53:25 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped! 
    15/04/01 11:53:25 INFO MemoryStore: MemoryStore cleared 
    15/04/01 11:53:25 INFO BlockManager: BlockManager stopped 
    15/04/01 11:53:25 INFO BlockManagerMaster: BlockManagerMaster stopped 
    15/04/01 11:53:25 INFO OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped! 
    15/04/01 11:53:25 INFO SparkContext: Successfully stopped SparkContext 
    15/04/01 11:53:25 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon. 
    15/04/01 11:53:25 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports. 
    15/04/01 11:53:25 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
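
    For a quick interactive follow-up, the same Monte Carlo estimate can be reproduced in spark-shell (a minimal sketch of the computation SparkPi performs; sc is the SparkContext the shell creates for you, and the exact result varies from run to run):

    [root@spark-test spark-1.3.0-bin-hadoop2.4]# ./bin/spark-shell 
    scala> val n = 200000 
    scala> val count = sc.parallelize(1 to n, 2).map { _ => 
         |   val x = math.random * 2 - 1    // random point in the unit square 
         |   val y = math.random * 2 - 1 
         |   if (x * x + y * y < 1) 1 else 0    // 1 if it falls inside the unit circle 
         | }.reduce(_ + _) 
    scala> println("Pi is roughly " + 4.0 * count / n)
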
    If being excellent is not enough, make yourself irreplaceable.
  • Original post: https://www.cnblogs.com/icloud/p/4381470.html