• Running the AAS code - Chapter 11, Part 1


    Start PySpark

    export IPYTHON=1 # PySpark can also run inside an IPython shell
    pyspark --master yarn --num-executors 3
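As an aside (not in the original post): newer Spark releases deprecate the IPYTHON flag in favor of the PYSPARK_DRIVER_PYTHON variable, so on those versions the equivalent setup would be:

```shell
# On newer Spark releases the IPYTHON=1 flag is replaced by:
export PYSPARK_DRIVER_PYTHON=ipython
# then launch exactly as before:
# pyspark --master yarn --num-executors 3
```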

    The following error occurs:

    /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/bin/../lib/spark/bin/pyspark: line 135: exec: ipython: not found
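Before installing anything, the root cause can be confirmed with a quick check (a generic diagnostic, not from the original post): with IPYTHON=1 the pyspark launcher simply exec's whatever `ipython` resolves to on PATH, so the error just means no such binary was found.

```shell
# The pyspark launcher exec's `ipython` when IPYTHON=1 is set,
# so verify whether an ipython binary resolves on PATH at all.
if command -v ipython >/dev/null 2>&1; then
    echo "ipython found at: $(command -v ipython)"
else
    echo "ipython missing from PATH"
fi
```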

    The cause is that ipython is not installed. A quick Google search turns up the fix: download the 64-bit Linux build of Anaconda from http://continuum.io/downloads#all:

    wget https://3230d63b5fc54e62148e-c95ac804525aac4b6dba79b00b39d1d3.ssl.cf1.rackcdn.com/Anaconda-2.3.0-Linux-x86_64.sh

    Once the download finishes, run the following command to install it:

    bash Anaconda-2.3.0-Linux-x86_64.sh

    The command above starts a shell-like interactive installer; answering yes to every prompt is fine. After the installation completes, install ipython by typing in the shell:

    pip install ipython
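Once pip finishes, the install can be sanity-checked (a generic check, not from the original post) by confirming the interpreter can actually import the IPython package:

```shell
# Verify that the python now on PATH can import the freshly
# installed IPython package.
if python -c "import IPython" >/dev/null 2>&1; then
    echo "IPython importable"
else
    echo "IPython missing"
fi
```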

    This pulls in a long list of packages. When it completes, the installer asks whether to modify the environment variables; answer yes and the script automatically appends the following line to the end of ~/.bashrc:

    export PATH=/root/anaconda/bin:$PATH

    Run source ~/.bashrc to make the new environment variable take effect:

    source ~/.bashrc
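The reason prepending /root/anaconda/bin works is PATH precedence: the first matching directory wins. This can be demonstrated with a throwaway directory and a dummy `ipython` script (an illustrative sketch, not from the original post):

```shell
# Prepending a directory to PATH makes its binaries shadow any
# system copies. Demonstrate with a dummy `ipython` in a temp dir:
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho demo-ipython\n' > "$demo_dir/ipython"
chmod +x "$demo_dir/ipython"
PATH="$demo_dir:$PATH"
command -v ipython   # resolves to $demo_dir/ipython first
```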

    Now run the pyspark command again:

    pyspark --master yarn --num-executors 3

    This produces the following output:

    Type "copyright", "credits" or "license" for more information.
    
    IPython 3.2.0 -- An enhanced Interactive Python.
    Anaconda is brought to you by Continuum Analytics.
    Please check out: http://continuum.io/thanks and https://anaconda.org
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.
    15/07/04 12:21:02 INFO SecurityManager: Changing view acls to: root
    15/07/04 12:21:02 INFO SecurityManager: Changing modify acls to: root
    15/07/04 12:21:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    15/07/04 12:21:03 INFO Slf4jLogger: Slf4jLogger started
    15/07/04 12:21:03 INFO Remoting: Starting remoting
    15/07/04 12:21:03 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@ip-172-31-25-243.us-west-2.compute.internal:50324]
    15/07/04 12:21:03 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@ip-172-31-25-243.us-west-2.compute.internal:50324]
    15/07/04 12:21:03 INFO Utils: Successfully started service 'sparkDriver' on port 50324.
    15/07/04 12:21:03 INFO SparkEnv: Registering MapOutputTracker
    15/07/04 12:21:03 INFO SparkEnv: Registering BlockManagerMaster
    15/07/04 12:21:03 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150704122103-7afc
    15/07/04 12:21:03 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
    15/07/04 12:21:05 INFO HttpFileServer: HTTP File server directory is /tmp/spark-e2f4c4e3-dc9b-4db0-8fd4-1dcbb8819b05
    15/07/04 12:21:05 INFO HttpServer: Starting HTTP Server
    15/07/04 12:21:05 INFO Utils: Successfully started service 'HTTP file server' on port 48934.
    15/07/04 12:21:05 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    15/07/04 12:21:05 INFO SparkUI: Started SparkUI at http://ip-172-31-25-243.us-west-2.compute.internal:4040
    15/07/04 12:21:05 INFO RMProxy: Connecting to ResourceManager at ip-172-31-25-243.us-west-2.compute.internal/172.31.25.243:8032
    15/07/04 12:21:06 INFO Client: Requesting a new application from cluster with 3 NodeManagers
    15/07/04 12:21:06 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4720 MB per container)
    15/07/04 12:21:06 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
    15/07/04 12:21:06 INFO Client: Setting up container launch context for our AM
    15/07/04 12:21:06 INFO Client: Preparing resources for our AM container
    15/07/04 12:21:07 INFO Client: Setting up the launch environment for our AM container
    15/07/04 12:21:07 INFO SecurityManager: Changing view acls to: root
    15/07/04 12:21:07 INFO SecurityManager: Changing modify acls to: root
    15/07/04 12:21:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    15/07/04 12:21:07 INFO Client: Submitting application 1 to ResourceManager
    15/07/04 12:21:07 INFO YarnClientImpl: Submitted application application_1436008024626_0001
    15/07/04 12:21:08 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:08 INFO Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: root.root
         start time: 1436012467313
         final status: UNDEFINED
         tracking URL: http://ip-172-31-25-243.us-west-2.compute.internal:8088/proxy/application_1436008024626_0001/
         user: root
    15/07/04 12:21:09 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:10 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:11 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:12 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:13 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:14 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:15 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:16 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:17 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:18 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:19 INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
    15/07/04 12:21:19 INFO YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@ip-172-31-25-246.us-west-2.compute.internal:45245/user/YarnAM#-1546613765]
    15/07/04 12:21:19 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ip-172-31-25-243.us-west-2.compute.internal, PROXY_URI_BASES -> http://ip-172-31-25-243.us-west-2.compute.internal:8088/proxy/application_1436008024626_0001), /proxy/application_1436008024626_0001
    15/07/04 12:21:19 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
    15/07/04 12:21:20 INFO Client: Application report for application_1436008024626_0001 (state: RUNNING)
    15/07/04 12:21:20 INFO Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: ip-172-31-25-246.us-west-2.compute.internal
         ApplicationMaster RPC port: 0
         queue: root.root
         start time: 1436012467313
         final status: UNDEFINED
         tracking URL: http://ip-172-31-25-243.us-west-2.compute.internal:8088/proxy/application_1436008024626_0001/
         user: root
    15/07/04 12:21:20 INFO YarnClientSchedulerBackend: Application application_1436008024626_0001 has started running.
    15/07/04 12:21:20 INFO NettyBlockTransferService: Server created on 52594
    15/07/04 12:21:20 INFO BlockManagerMaster: Trying to register BlockManager
    15/07/04 12:21:20 INFO BlockManagerMasterActor: Registering block manager ip-172-31-25-243.us-west-2.compute.internal:52594 with 265.4 MB RAM, BlockManagerId(<driver>, ip-172-31-25-243.us-west-2.compute.internal, 52594)
    15/07/04 12:21:20 INFO BlockManagerMaster: Registered BlockManager
    15/07/04 12:21:21 INFO EventLoggingListener: Logging events to hdfs://ns-ha/user/spark/applicationHistory/application_1436008024626_0001
    15/07/04 12:21:30 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@ip-172-31-25-245.us-west-2.compute.internal:43360/user/Executor#1565208330] with ID 1
    15/07/04 12:21:30 INFO RackResolver: Resolved ip-172-31-25-245.us-west-2.compute.internal to /default
    15/07/04 12:21:30 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@ip-172-31-25-244.us-west-2.compute.internal:39613/user/Executor#-786460899] with ID 2
    15/07/04 12:21:30 INFO RackResolver: Resolved ip-172-31-25-244.us-west-2.compute.internal to /default
    15/07/04 12:21:30 INFO BlockManagerMasterActor: Registering block manager ip-172-31-25-245.us-west-2.compute.internal:44702 with 530.3 MB RAM, BlockManagerId(1, ip-172-31-25-245.us-west-2.compute.internal, 44702)
    15/07/04 12:21:31 INFO BlockManagerMasterActor: Registering block manager ip-172-31-25-244.us-west-2.compute.internal:45974 with 530.3 MB RAM, BlockManagerId(2, ip-172-31-25-244.us-west-2.compute.internal, 45974)
    15/07/04 12:21:35 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /__ / .__/\_,_/_/ /_/\_\   version 1.2.0-SNAPSHOT
          /_/
    
    Using Python version 2.7.10 (default, May 28 2015 17:02:03)
    SparkContext available as sc.
    
    In [1]: 
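With the shell up and SparkContext available as sc, a tiny job makes a handy smoke test of the executors (a minimal sketch, not from the original post; the local `_LocalSC` stand-in below is purely illustrative so the same lines also run outside the PySpark shell):

```python
# Inside the PySpark IPython shell, `sc` is already defined by the
# startup scripts; outside it, a tiny local stand-in lets the same
# logic run anywhere (the stand-in is illustrative, not Spark).
try:
    sc  # provided by the pyspark shell
except NameError:
    class _LocalRDD:
        def __init__(self, data):
            self.data = list(data)
        def map(self, f):
            return _LocalRDD(map(f, self.data))
        def sum(self):
            return sum(self.data)
    class _LocalSC:
        def parallelize(self, data, numSlices=None):
            return _LocalRDD(data)
    sc = _LocalSC()

nums = sc.parallelize(range(100), 3)   # 3 slices, one per executor
total = nums.map(lambda x: x * x).sum()
print(total)  # 328350 == sum of squares 0..99
```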
• Original source: https://www.cnblogs.com/littlesuccess/p/4621210.html