• Setting up Spark with Docker using the sequenceiq/spark:1.6 image


    Use sequenceiq/spark:1.6.0, the top-ranked Spark image on Docker Hub.

    Steps:

    Pull the image:

    [root@localhost home]# docker pull sequenceiq/spark:1.6.0
    Trying to pull repository docker.io/sequenceiq/spark ... 

    Start the container:

    [root@localhost home]# docker image ls
    REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
    docker.io/sequenceiq/spark           1.6.0               40a687b3cdcc        2 years ago         2.88 GB
    docker.io/sequenceiq/hadoop-docker   2.6.0               140b265bd62a        3 years ago         1.62 GB
    [root@localhost home]# docker run -dit -p 8088:8088 -p 8042:8042 -p 4040:4040 -h sandbox sequenceiq/spark:1.6.0 bash 

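    The three published ports map to the web UIs of the services inside the container. A sketch of the same command with the mappings annotated; adding 50070 for the HDFS NameNode UI is an optional extra that was not part of the original command:

```shell
# 8088 -> YARN ResourceManager web UI
# 8042 -> YARN NodeManager web UI
# 4040 -> Spark application web UI (only live while an application runs)
# 50070 -> HDFS NameNode web UI (optional addition)
docker run -dit -p 8088:8088 -p 8042:8042 -p 4040:4040 -p 50070:50070 \
    -h sandbox sequenceiq/spark:1.6.0 bash
```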
    Enter the container:

    [root@localhost home]# docker ps -a
    CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS              PORTS                                                                                                                                                                       NAMES
    75e3d67806bc        sequenceiq/spark:1.6.0   "/etc/bootstrap.sh..."   4 seconds ago       Up 3 seconds        22/tcp, 8030-8033/tcp, 0.0.0.0:4040->4040/tcp, 0.0.0.0:8042->8042/tcp, 8040/tcp, 49707/tcp, 50010/tcp, 50020/tcp, 50070/tcp, 50075/tcp, 50090/tcp, 0.0.0.0:8088->8088/tcp   thirsty_gates
    [root@localhost home]# docker exec -it 75 /bin/bash

    Spark:

    YARN-client (single-node) mode

    In YARN-client mode, the driver program runs in the client process, and the ApplicationMaster is used only to request resources from YARN.
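    The client/cluster distinction can be sketched with two submission commands (the example jar path below is the one shipped with Spark 1.6 pre-built distributions and is an assumption about this image):

```shell
# yarn-client: the driver runs in this local shell; the ApplicationMaster
# only negotiates executor containers from YARN. Required for the REPL.
spark-shell --master yarn-client --driver-memory 1g --executor-memory 1g

# yarn-cluster: the driver itself runs inside a YARN container, so the
# submitting shell may disconnect. spark-shell cannot use this mode.
spark-submit --master yarn-cluster --class org.apache.spark.examples.SparkPi \
    $SPARK_HOME/lib/spark-examples-1.6.0-hadoop2.6.0.jar 10
```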

    bash-4.1# spark-shell --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1
    18/08/14 02:37:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    18/08/14 02:37:30 INFO spark.SecurityManager: Changing view acls to: root
    18/08/14 02:37:30 INFO spark.SecurityManager: Changing modify acls to: root
    18/08/14 02:37:30 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    18/08/14 02:37:30 INFO spark.HttpServer: Starting HTTP Server
    18/08/14 02:37:30 INFO server.Server: jetty-8.y.z-SNAPSHOT
    18/08/14 02:37:30 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:42047
    18/08/14 02:37:30 INFO util.Utils: Successfully started service 'HTTP class server' on port 42047.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
          /_/
    
    Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
    Type in expressions to have them evaluated.
    Type :help for more information.
    18/08/14 02:37:38 INFO spark.SparkContext: Running Spark version 1.6.0
    18/08/14 02:37:38 INFO spark.SecurityManager: Changing view acls to: root
    18/08/14 02:37:38 INFO spark.SecurityManager: Changing modify acls to: root
    18/08/14 02:37:38 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    18/08/14 02:37:39 INFO util.Utils: Successfully started service 'sparkDriver' on port 36773.
    18/08/14 02:37:40 INFO slf4j.Slf4jLogger: Slf4jLogger started
    18/08/14 02:37:40 INFO Remoting: Starting remoting
    18/08/14 02:37:40 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@172.17.0.2:32811]
    18/08/14 02:37:40 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 32811.
    18/08/14 02:37:40 INFO spark.SparkEnv: Registering MapOutputTracker
    18/08/14 02:37:40 INFO spark.SparkEnv: Registering BlockManagerMaster
    18/08/14 02:37:40 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-8c30cc1c-dfea-4ebf-94b9-c45ff3a1b849
    18/08/14 02:37:40 INFO storage.MemoryStore: MemoryStore started with capacity 517.4 MB
    18/08/14 02:37:41 INFO spark.SparkEnv: Registering OutputCommitCoordinator
    18/08/14 02:37:42 INFO server.Server: jetty-8.y.z-SNAPSHOT
    18/08/14 02:37:42 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
    18/08/14 02:37:42 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
    18/08/14 02:37:42 INFO ui.SparkUI: Started SparkUI at http://172.17.0.2:4040
    18/08/14 02:37:42 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    18/08/14 02:37:43 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
    18/08/14 02:37:43 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
    18/08/14 02:37:43 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
    18/08/14 02:37:43 INFO yarn.Client: Setting up container launch context for our AM
    18/08/14 02:37:43 INFO yarn.Client: Setting up the launch environment for our AM container
    18/08/14 02:37:43 INFO yarn.Client: Preparing resources for our AM container
    18/08/14 02:37:44 WARN yarn.Client: Failed to cleanup staging dir .sparkStaging/application_1534228565880_0001
    java.net.ConnectException: Call From sandbox/172.17.0.2 to sandbox:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy21.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1988)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
        at org.apache.spark.deploy.yarn.Client.cleanupStagingDir(Client.scala:167)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:152)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
        at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
        at $line3.$read$$iwC$$iwC.<init>(<console>:15)
        at $line3.$read$$iwC.<init>(<console>:24)
        at $line3.$read.<init>(<console>:26)
        at $line3.$read$.<init>(<console>:30)
        at $line3.$read$.<clinit>(<console>)
        at $line3.$eval$.<init>(<console>:7)
        at $line3.$eval$.<clinit>(<console>)
        at $line3.$eval.$print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
        at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
        at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
        at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
        at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
        at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
        at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 70 more
    18/08/14 22:07:17 ERROR spark.SparkContext: Error initializing SparkContext.

    Damn it, it failed. I tried several times without success, and I have no idea how other people online pulled this off with exactly the same steps.

    I pulled the Dockerfile repo from GitHub and rebuilt the image with docker build, but it still fails with:

    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.
    
    safemode: Call From 70b4a57bb473/172.17.0.2 to 70b4a57bb473:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

    This looks like Hadoop's safe mode was never turned off, yet the Dockerfile already runs a command to disable it, so it is unclear where things went wrong.
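    Another possible reading: "Connection refused" on port 9000 means nothing is listening on the NameNode RPC port at all, i.e. the NameNode never started, and the safe-mode command then fails as a side effect. Some checks worth running inside the container (a sketch, assuming Hadoop lives under $HADOOP_PREFIX as it does in the sequenceiq images):

```shell
# Is the NameNode process actually up?
jps                                       # should list NameNode, DataNode

# If not, (re)start the HDFS daemons by hand:
$HADOOP_PREFIX/sbin/start-dfs.sh

# Once the NameNode answers, inspect and leave safe mode explicitly:
$HADOOP_PREFIX/bin/hdfs dfsadmin -safemode get
$HADOOP_PREFIX/bin/hdfs dfsadmin -safemode leave
```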

    References:

    https://www.jianshu.com/p/4801bb7ab9e0

    https://www.cnblogs.com/ybst/p/9050660.html

    https://github.com/sequenceiq/docker-spark

    https://blog.csdn.net/farawayzheng_necas/article/details/54341036

    https://blog.csdn.net/yeasy/article/details/48654965

    https://blog.csdn.net/hanss2/article/details/78505446

    http://wgliang.github.io/pages/spark-on-docker.html

  • Original source: https://www.cnblogs.com/hongdada/p/9471007.html