• Tab completion works in spark-shell, but Backspace doesn't


    After setting up the Spark cluster, I first wrote two small examples in pyspark, but found that the Tab key gave no completion, so I switched over to the Scala shell. spark-shell does have completion, but Backspace doesn't work, and completions are appended to the line instead of overwriting it, which makes it impossible to write any program at all.

    Solution (these are session settings in the SSH client used to connect, e.g. SecureCRT):

    1. Open Session Options.

    2. Terminal → Emulation: select "Linux" as the terminal type.

    3. Mapped Keys: check both mapping options (so that Backspace/Delete send the codes a Linux terminal expects).

    4. At this point the shell works. One remaining annoyance: if the remote session sits idle for a long time the connection is dropped, and the next operation has to wait for a reconnect, which also hurts usability; an optional fix for that is attached here as well.
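
    The optional idle-disconnect fix itself did not survive in the text (it was likely a screenshot). As a sketch: on the client side, terminal emulators such as SecureCRT offer an anti-idle option (Terminal → Anti-idle); on the server side, sshd can send periodic keepalive probes so idle sessions are not torn down. A minimal server-side fragment (values are illustrative):

    ```shell
    # /etc/ssh/sshd_config — the server sends a keepalive probe every 60 s
    # and only drops the session after 3 unanswered probes (~3 minutes).
    ClientAliveInterval 60
    ClientAliveCountMax 3
    ```

    After editing the file, restart the sshd service for the change to take effect.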


    val lines = sc.textFile("hdfs://alamps:9000/wordcount/input/test.txt")

    lines.count()

    -----
    scala> val lines =sc.textFile("hdfs://alamps:9000/wordcount/input/test.txt")
    17/10/13 23:09:24 INFO MemoryStore: ensureFreeSpace(77922) called with curMem=179665, maxMem=280248975
    17/10/13 23:09:24 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 76.1 KB, free 267.0 MB)
    17/10/13 23:09:24 INFO MemoryStore: ensureFreeSpace(31262) called with curMem=257587, maxMem=280248975
    17/10/13 23:09:24 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 30.5 KB, free 267.0 MB)
    17/10/13 23:09:24 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:41619 (size: 30.5 KB, free: 267.2 MB)
    17/10/13 23:09:24 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
    17/10/13 23:09:24 INFO SparkContext: Created broadcast 1 from textFile at <console>:12
    lines: org.apache.spark.rdd.RDD[String] = hdfs://alamps:9000/wordcount/input/test.txt MappedRDD[3] at textFile at <console>:12

    scala> lines.count()
    17/10/13 23:09:45 INFO FileInputFormat: Total input paths to process : 1
    17/10/13 23:09:48 INFO SparkContext: Starting job: count at <console>:15
    17/10/13 23:09:48 INFO DAGScheduler: Got job 0 (count at <console>:15) with 1 output partitions (allowLocal=false)
    17/10/13 23:09:48 INFO DAGScheduler: Final stage: Stage 0(count at <console>:15)
    17/10/13 23:09:48 INFO DAGScheduler: Parents of final stage: List()
    17/10/13 23:09:48 INFO DAGScheduler: Missing parents: List()
    17/10/13 23:09:48 INFO DAGScheduler: Submitting Stage 0 (hdfs://alamps:9000/wordcount/input/test.txt MappedRDD[3] at textFile at <console>:12), which has no missing parents
    17/10/13 23:09:48 INFO MemoryStore: ensureFreeSpace(2544) called with curMem=288849, maxMem=280248975
    17/10/13 23:09:48 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.5 KB, free 267.0 MB)
    17/10/13 23:09:48 INFO MemoryStore: ensureFreeSpace(1898) called with curMem=291393, maxMem=280248975
    17/10/13 23:09:48 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1898.0 B, free 267.0 MB)
    17/10/13 23:09:48 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:41619 (size: 1898.0 B, free: 267.2 MB)
    17/10/13 23:09:48 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
    17/10/13 23:09:48 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:838
    17/10/13 23:09:48 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (hdfs://alamps:9000/wordcount/input/test.txt MappedRDD[3] at textFile at <console>:12)
    17/10/13 23:09:48 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
    17/10/13 23:09:48 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, ANY, 1307 bytes)
    17/10/13 23:09:48 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    17/10/13 23:09:49 INFO HadoopRDD: Input split: hdfs://alamps:9000/wordcount/input/test.txt:0+88
    17/10/13 23:09:49 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
    17/10/13 23:09:49 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
    17/10/13 23:09:49 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
    17/10/13 23:09:49 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
    17/10/13 23:09:49 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
    17/10/13 23:09:53 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1920 bytes result sent to driver
    17/10/13 23:09:53 INFO DAGScheduler: Stage 0 (count at <console>:15) finished in 4.875 s
    17/10/13 23:09:53 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 4812 ms on localhost (1/1)
    17/10/13 23:09:53 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
    17/10/13 23:09:53 INFO DAGScheduler: Job 0 finished: count at <console>:15, took 5.480197 s
    res2: Long = 8
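
    The result `res2: Long = 8` matches the eight lines of test.txt shown in the `hadoop fs -cat` output further down. Since the directory is named wordcount, the natural next step is counting words; the same pipeline can be sketched in plain Scala collections (no Spark needed — `flatMap`/`groupBy` here stand in for the RDD operations `flatMap`/`map`/`reduceByKey`, and the input lines are copied from the file contents below):

    ```scala
    // Word count over the same eight lines, using plain Scala collections.
    // In spark-shell the equivalent pipeline would be:
    //   lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect()
    val lines = Seq(
      "hello tom", "hello java", "hello c", "hello python",
      "hello scala", "hello spark", "hello baby", "hello java"
    )
    val counts: Map[String, Int] = lines
      .flatMap(_.split(" "))                 // split every line into words
      .groupBy(identity)                     // group identical words together
      .map { case (w, ws) => (w, ws.size) }  // count each group
    println(counts("hello"))  // 8
    println(counts("java"))   // 2
    ```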



    [hadoop@alamps sbin]$ jps
    3596 Master
    3733 Worker
    2558 DataNode
    2748 SecondaryNameNode
    3814 Jps
    2884 ResourceManager
    2986 NodeManager
    2467 NameNode
    [hadoop@alamps sbin]$ hadoop fs -ls /
    Found 11 items
    drwxr-xr-x   - hadoop supergroup          0 2017-10-02 06:29 /aaa
    drwxr-xr-x   - hadoop supergroup          0 2017-10-06 04:04 /external
    drwxr-xr-x   - hadoop supergroup          0 2017-10-04 09:14 /flowsum
    -rw-r--r--   1 hadoop supergroup         43 2017-10-02 02:52 /hello.txt
    drwxr-xr-x   - hadoop supergroup          0 2017-10-04 21:10 /index
    -rw-r--r--   1 hadoop supergroup  143588167 2017-10-01 08:38 /jdk-7u65-linux-i586.tar.gz
    drwx------   - hadoop supergroup          0 2017-10-05 22:43 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2017-10-02 06:18 /upload
    drwxr-xr-x   - hadoop supergroup          0 2017-10-05 22:44 /user
    drwxr-xr-x   - hadoop supergroup          0 2017-10-03 06:20 /wc
    drwxr-xr-x   - hadoop supergroup          0 2017-10-01 09:07 /wordcount
    [hadoop@alamps sbin]$ hadoop fs -cat /wordcount
    cat: `/wordcount': Is a directory
    [hadoop@alamps sbin]$ hadoop fs -ls /wordcount
    Found 2 items
    drwxr-xr-x   - hadoop supergroup          0 2017-10-01 09:00 /wordcount/input
    drwxr-xr-x   - hadoop supergroup          0 2017-10-01 09:07 /wordcount/out
    [hadoop@alamps sbin]$ hadoop fs -ls /wordcount/input
    Found 1 items
    -rw-r--r--   1 hadoop supergroup         88 2017-10-01 09:00 /wordcount/input/test.txt
    [hadoop@alamps sbin]$ hadoop fs -cat /wordcount/input/test.txt
    hello tom
    hello java
    hello c
    hello python
    hello scala
    hello spark
    hello baby
    hello java
    [hadoop@alamps sbin]$

  • Original (Chinese): https://www.cnblogs.com/alamps/p/7667262.html