• Spark installation and debugging


    I----
    1---JDK and Scala install
    ****zyp@ubuntu:~/Desktop/software$ tar xvf jdk-7u67-linux-i586.tar.gz

    ****vim ~/.bashrc (editing /etc/profile did not work)
    # JAVA_HOME 2015.12.18: pick the x64 or i386 binary to match the OS (check with uname -a)
    export JAVA_HOME=/usr/lib/jvm/jdk1.7_586
    export JRE_HOME=$JAVA_HOME/jre
    export PATH=$JAVA_HOME/bin:$PATH
    #export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

    # SCALA_HOME 2015.12.18
    export SCALA_HOME=/usr/lib/jvm/scala-2.10.4
    export PATH=$PATH:$SCALA_HOME/bin

    ****source ~/.bashrc
    ****java -version
    ****scala -version

    --Scala .tgz download:  http://www.scala-lang.org/files/archive/    or  http://www.scala-lang.org/files/archive/scala-2.10.4.tgz

    2---spark install

    using spark-1.1.0-bin-hadoop1.tgz
    https://spark.apache.org/downloads.html
    https://spark.apache.org/examples.html


    ****/usr/lib/jvm/spark-1.1.0-bin-hadoop1$ ./bin/spark-shell   --starts the Spark shell; web UI at http://localhost:4040
    ****Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
          /_/

    16/01/07 01:20:08 INFO Utils: Successfully started service 'HTTP file server' on port 38690.
    16/01/07 01:20:14 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    16/01/07 01:20:14 INFO SparkUI: Started SparkUI at http://ubuntu.local:4040 or http://192.168.174.129:4040/stages/
    16/01/07 01:20:14 INFO Executor: Using REPL class URI: http://192.168.174.129:43766
    16/01/07 01:20:14 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@ubuntu.local:59425/user/HeartbeatReceiver
    16/01/07 01:20:14 INFO SparkILoop: Created spark context..
    Spark context available as sc
    ****scala> sc
    res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@118c6de
    ****scala> val inFile = sc.textFile("README.md")
    16/01/07 01:42:25 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
    16/01/07 01:42:25 INFO MemoryStore: ensureFreeSpace(31447) called with curMem=0, maxMem=280248975
    16/01/07 01:42:25 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 30.7 KB, free 267.2 MB)
    inFile: org.apache.spark.rdd.RDD[String] = README.md MappedRDD[1] at textFile at <console>:12
    ****scala> val sparks = inFile.filter(line=>line.contains("Spark"))
    sparks: org.apache.spark.rdd.RDD[String] = FilteredRDD[2] at filter at <console>:14
    ****scala> sparks.count
    ****scala> exit(1)
    end 
    ****awk '/Spark/ {print}' README.md | wc -l
    ****grep -n "Spark" README.md
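    The filter-and-count from the spark-shell session above can also be sketched in plain Python (no Spark involved; the file name is just an example):

    ```python
    # Plain-Python equivalent of the spark-shell session:
    #   sc.textFile(path).filter(line => line.contains("Spark")).count
    def count_matching_lines(path, needle):
        with open(path) as f:
            return sum(1 for line in f if needle in line)

    # count_matching_lines("README.md", "Spark")  # same number sparks.count reports
    ```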
    3---run error***  zyp@ubuntu:/usr/lib/jvm/scala-2.10.4/bin$ scalac
    /usr/lib/jvm/scala-2.10.4/bin/scalac: line 23: java: command not found
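    scalac is a shell script that ultimately invokes `java`, so this error only means java is not on PATH in that shell. A quick check, sketched in plain Python (the paths in the comments are the ones from the install step above):

    ```python
    import shutil

    # "scalac: line 23: java: command not found" means the shell cannot find java.
    def java_on_path():
        # Full path to the java binary, or None if java is not on PATH
        return shutil.which("java")

    if java_on_path() is None:
        # Typical fix: re-run `source ~/.bashrc`, or re-check the exports, e.g.
        #   export JAVA_HOME=/usr/lib/jvm/jdk1.7_586
        #   export PATH=$JAVA_HOME/bin:$PATH
        print("java not found on PATH")
    else:
        print("java found at", java_on_path())
    ```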
    ---HelloWorld for Scala
    ****$  scalac Demo.scala
    zyp@ubuntu:/usr/lib/jvm/code/demo_scala$ scalac -encoding gbk Demo.scala
    ****$  scalac SampleDemo.scala
    zyp@ubuntu:/usr/lib/jvm/code/demo_scala$ scalac -encoding gbk SampleDemo.scala
    ****$  scala SampleDemo
    zyp@ubuntu:/usr/lib/jvm/code/demo_scala$ scala SampleDemo

    4---demo_ssc
    import org.apache.spark._
    import org.apache.spark.streaming._
    import org.apache.spark.streaming.StreamingContext._
    // Run locally with 2 threads: one to receive data, one to process it
    val conf = new SparkConf().setAppName("NetworkWordCount").setMaster("local[2]")
    // Create the streaming context with a 20-second batch interval
    val ssc = new StreamingContext(conf, Seconds(20))
    // textFileStream watches a directory for new files (a single file such as README.md will not work)
    val lines = ssc.textFileStream("streaming-input/")
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.print()
    // DStream output uses saveAsTextFiles (plural); each batch is written as zyp-<timestamp>.txt
    wordCounts.saveAsTextFiles("zyp", "txt")
    // count alone is a transformation; it needs an output op such as .print() to actually run
    words.count.print()
    println("****Hello Scala! Welcome to my Zoon")
    ssc.start()
    ssc.awaitTermination()
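    What one micro-batch of the `map(x => (x, 1)).reduceByKey(_ + _)` pipeline above computes can be sketched in plain Python (no Spark; just the per-batch semantics):

    ```python
    # Plain-Python sketch of one micro-batch of the streaming word count:
    # flatMap(_.split(" ")), then map(x => (x, 1)), then reduceByKey(_ + _)
    def word_counts(lines):
        counts = {}
        for line in lines:                              # the lines in one batch
            for word in line.split(" "):                # flatMap(_.split(" "))
                counts[word] = counts.get(word, 0) + 1  # (word, 1) pairs summed per key
        return counts

    # word_counts(["a b a"]) -> {"a": 2, "b": 1}
    ```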

    II---- Changing a file's owner and group on Linux

    Use the chown command to change the user that owns a file or directory:
    Command: chown user file-or-directory-name
    Example: chown qq /home/qq  (makes user qq the owner of the qq directory under /home)

    Use the chgrp command to change the group a file or directory belongs to:
    Command: chgrp group file-or-directory-name
    Example: chgrp qq /home/qq  (makes group qq the owning group of the qq directory under /home)
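    The same two changes can be made from Python with `shutil.chown` (a sketch; like chown/chgrp on the command line, changing ownership to another user normally requires root, and the qq names are just the example above):

    ```python
    import shutil

    # One call covers both `chown qq /home/qq` and `chgrp qq /home/qq`.
    def change_owner_and_group(path, user=None, group=None):
        # Passing only user= mirrors chown; passing only group= mirrors chgrp
        shutil.chown(path, user=user, group=group)

    # change_owner_and_group("/home/qq", user="qq")    # like: chown qq /home/qq
    # change_owner_and_group("/home/qq", group="qq")   # like: chgrp qq /home/qq
    ```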

    III---- Reading a zip archive in Python

    #!/usr/bin/python
    #coding=utf-8

    import zipfile

    z = zipfile.ZipFile("test.zip", "r")  # for .tar archives use tarfile instead
    # Print the list of files inside the zip
    for filename in z.namelist():
        print 'File:', filename

    # Read the first file in the zip (namelist() is 0-indexed)
    first_file_name = z.namelist()[0]
    content = z.read(first_file_name)
    print first_file_name

    print content
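    A self-contained round trip of the same API (create a small archive first, then read it back the way the script above does; the file names are made up for the demo):

    ```python
    import zipfile

    # Build a small archive so the read side has something to work with
    with zipfile.ZipFile("demo.zip", "w") as z:
        z.writestr("a.txt", "hello")
        z.writestr("b.txt", "world")

    # Read it back: list the members, then read the first one
    with zipfile.ZipFile("demo.zip", "r") as z:
        names = z.namelist()            # ['a.txt', 'b.txt']
        first = z.read(names[0])        # contents of the first file, as bytes
        print(names)
        print(first.decode("utf-8"))    # hello
    ```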

    IV---- Related links

    1-- https://spark.apache.org/examples.html

    2-- http://spark.apache.org/docs/latest/  --- Spark API ****http://spark.apache.org/docs/latest/streaming-programming-guide.html#initializing-streamingcontext

    3-- http://www.scala-lang.org/  ---- Scala API ***

    4-- Using Spark Streaming

    5-- http://www.sxt.cn/info-2730-u-756.html

    6-- Spark execution and configuration

    7-- Spark RDD API explained (part 1): Map and Reduce ****

    8-- Spark hands-on tutorial series -- 7. Spark Streaming (part 2): real-time stream processing with Spark Streaming in practice **

    10-- http://maven.apache.org/guides/getting-started/  Maven Getting Started Guide
    http://maven.apache.org/plugins/    ./plugins/maven-compiler-plugin/  ./plugins/maven-deploy-plugin/



  • Original post: https://www.cnblogs.com/yjbjingcha/p/7041409.html