Spark Notes


    Launch spark-shell on YARN in client mode (Spark 1.6 accepts the yarn-client master URL; Spark 2.x spells this --master yarn --deploy-mode client):

    [root@master bin]# ./spark-shell --master yarn-client
    20/03/21 02:48:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/03/21 02:48:40 INFO spark.SecurityManager: Changing view acls to: root
    20/03/21 02:48:40 INFO spark.SecurityManager: Changing modify acls to: root
    20/03/21 02:48:40 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    20/03/21 02:48:40 INFO spark.HttpServer: Starting HTTP Server
    20/03/21 02:48:41 INFO server.Server: jetty-8.y.z-SNAPSHOT
    20/03/21 02:48:41 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:50659
    20/03/21 02:48:41 INFO util.Utils: Successfully started service 'HTTP class server' on port 50659.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
          /_/
    
    Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
    Type in expressions to have them evaluated.
    Type :help for more information.
    20/03/21 02:48:52 INFO spark.SparkContext: Running Spark version 1.6.0
    20/03/21 02:48:52 INFO spark.SecurityManager: Changing view acls to: root
    20/03/21 02:48:52 INFO spark.SecurityManager: Changing modify acls to: root
    20/03/21 02:48:52 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    20/03/21 02:48:53 INFO util.Utils: Successfully started service 'sparkDriver' on port 38244.
    20/03/21 02:48:54 INFO slf4j.Slf4jLogger: Slf4jLogger started
    20/03/21 02:48:54 INFO Remoting: Starting remoting
    20/03/21 02:48:55 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.0.90:44009]
    20/03/21 02:48:55 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 44009.
    20/03/21 02:48:55 INFO spark.SparkEnv: Registering MapOutputTracker
    20/03/21 02:48:55 INFO spark.SparkEnv: Registering BlockManagerMaster
    20/03/21 02:48:55 INFO storage.DiskBlockManager: Created local directory at /usr/local/src/spark-1.6.0-bin-hadoop2.6/blockmgr-bae586ec-9c91-4d9b-bbec-682c94e5fe14
    20/03/21 02:48:55 INFO storage.MemoryStore: MemoryStore started with capacity 511.5 MB
    20/03/21 02:48:56 INFO spark.SparkEnv: Registering OutputCommitCoordinator
    20/03/21 02:48:56 INFO server.Server: jetty-8.y.z-SNAPSHOT
    20/03/21 02:48:56 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
    20/03/21 02:48:56 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
    20/03/21 02:48:56 INFO ui.SparkUI: Started SparkUI at http://192.168.0.90:4040
    20/03/21 02:48:57 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.0.90:8032
    20/03/21 02:48:58 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
    20/03/21 02:48:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
    20/03/21 02:48:58 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
    20/03/21 02:48:58 INFO yarn.Client: Setting up container launch context for our AM
    20/03/21 02:48:58 INFO yarn.Client: Setting up the launch environment for our AM container
    20/03/21 02:48:58 INFO yarn.Client: Preparing resources for our AM container
    20/03/21 02:49:00 INFO yarn.Client: Uploading resource file:/usr/local/src/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar -> hdfs://master:9000/user/root/.sparkStaging/application_1584680108277_0036/spark-assembly-1.6.0-hadoop2.6.0.jar
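
    (The remainder of the startup log, and the commands defining `lines`, were not captured.) Since `res25` below is a plain Scala `List[String]` rather than an RDD, `lines` was most likely read from a local copy of The Forsyte Saga; a minimal sketch, with a hypothetical file path:

        // Hypothetical: read the book into an in-memory List[String], one line per element.
        // The actual path used in the session is not shown in the log.
        import scala.io.Source
        val lines = Source.fromFile("/root/forsyte_saga.txt").getLines().toList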
    scala> lines.flatMap(_.split(" "))
    res25: List[String] = List(Preface, “The, Forsyte, Saga”, was, the, title, originally, destined, for, that, part, of, it, which, is, called, “The, Man, of, Property”;, and, to, adopt, it, for, the, collected, chronicles, of, the, Forsyte, family, has, indulged, the, Forsytean, tenacity, that, is, in, all, of, us., The, word, Saga, might, be, objected, to, on, the, ground, that, it, connotes, the, heroic, and, that, there, is, little, heroism, in, these, pages., But, it, is, used, with, a, suitable, irony;, and,, after, all,, this, long, tale,, though, it, may, deal, with, folk, in, frock, coats,, furbelows,, and, a, gilt-edged, period,, is, not, devoid, of, the, essential, heat, of, conflict., Discounting, for, the, gigantic, stature, and, blood-thirstiness, of, old, days,, as, they, ha...
    scala> lines.flatMap(_.split(" ")).map((_,1)).groupBy(_._1).mapValues((_.size)).toArray.sortWith(_._2>_._2).slice(0,10) 
    res26: Array[(String, Int)] = Array((the,5144), (of,3407), (to,2782), (and,2573), (a,2543), (he,2139), (his,1912), (was,1702), (in,1694), (had,1526))
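
    The same top-10 word count can also run distributed on an RDD instead of a local collection; a sketch assuming the file has been put into HDFS (the path is hypothetical):

        // sc is the SparkContext that spark-shell provides.
        val top10 = sc.textFile("hdfs://master:9000/data/forsyte_saga.txt")
          .flatMap(_.split(" "))
          .map((_, 1))
          .reduceByKey(_ + _) // sum the 1s per word across partitions
          .sortBy(-_._2)      // order by count, descending
          .take(10)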
    A second session, this time on Spark 2.x (Scala 2.11), where the REPL provides a SparkSession named `spark`, so `import spark.sql` makes `sql(...)` directly callable:

    Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> import spark.sql
    import spark.sql
    
    scala> val orders = sql("select * from badou.orders")
    20/03/22 10:23:16 ERROR metastore.ObjectStore: Version information found in metastore differs 0.13.0 from expected schema version 1.2.0. Schema verififcation is disabled hive.metastore.schema.verification so setting version.
    20/03/22 10:23:20 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
            at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:891)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
            at com.sun.proxy.$Proxy17.create_database(Unknown Source)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:644)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
            at com.sun.proxy.$Proxy18.createDatabase(Unknown Source)
            at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:306)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply$mcV$sp(HiveClientImpl.scala:309)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:309)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:309)
            at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:280)
            at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227)
            at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226)
            at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:269)
            at org.apache.spark.sql.hive.client.HiveClientImpl.createDatabase(HiveClientImpl.scala:308)
            at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply$mcV$sp(HiveExternalCatalog.scala:99)
            at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
            at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
            at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:72)
            at org.apache.spark.sql.hive.HiveExternalCatalog.createDatabase(HiveExternalCatalog.scala:98)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:147)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
            at org.apache.spark.sql.hive.HiveSessionCatalog.<init>(HiveSessionCatalog.scala:51)
            at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:49)
            at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
            at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
            at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
            at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
            at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
            at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
            at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:24)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:29)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
            at $line18.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
            at $line18.$read$$iw$$iw$$iw$$iw.<init>(<console>:35)
            at $line18.$read$$iw$$iw$$iw.<init>(<console>:37)
            at $line18.$read$$iw$$iw.<init>(<console>:39)
            at $line18.$read$$iw.<init>(<console>:41)
            at $line18.$read.<init>(<console>:43)
            at $line18.$read$.<init>(<console>:47)
            at $line18.$read$.<clinit>(<console>)
            at $line18.$eval$.$print$lzycompute(<console>:7)
            at $line18.$eval$.$print(<console>:6)
            at $line18.$eval.$print(<console>)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
            at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
            at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
            at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
            at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
            at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
            at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
            at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
            at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
            at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
            at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
            at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
            at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
            at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
            at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
            at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
            at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
            at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
            at org.apache.spark.repl.Main$.doMain(Main.scala:68)
            at org.apache.spark.repl.Main$.main(Main.scala:51)
            at org.apache.spark.repl.Main.main(Main.scala)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
            at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
            at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
            at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
            at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    
    orders: org.apache.spark.sql.DataFrame = [order_id: string, user_id: string ... 5 more fields]
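
    The ERROR entries above are benign in this run: Spark's Hive client tries to create the `default` database, which already exists in the metastore, and the metastore schema version (0.13.0) differs from the 1.2.0 that Spark's bundled Hive client expects. The query itself still succeeds and returns a DataFrame. In a standalone application, the Hive-enabled session that spark-shell provides for free would be built roughly like this (a sketch, not taken from the log):

        import org.apache.spark.sql.SparkSession

        // Hive-enabled session; disabling schema verification mirrors the
        // behaviour the log reports for this metastore.
        val spark = SparkSession.builder()
          .appName("orders-notes") // hypothetical app name
          .config("hive.metastore.schema.verification", "false")
          .enableHiveSupport()
          .getOrCreate()

        import spark.sql
        val orders = sql("select * from badou.orders")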
    
    scala> orders.show(10)
    +--------+-------+--------+------------+---------+-----------------+----------------------+
    |order_id|user_id|eval_set|order_number|order_dow|order_hour_of_day|days_since_prior_order|
    +--------+-------+--------+------------+---------+-----------------+----------------------+
    |order_id|user_id|eval_set|order_number|order_dow|order_hour_of_day|  days_since_prior_...|
    | 2539329|      1|   prior|           1|        2|               08|                      |
    | 2398795|      1|   prior|           2|        3|               07|                  15.0|
    |  473747|      1|   prior|           3|        3|               12|                  21.0|
    | 2254736|      1|   prior|           4|        4|               07|                  29.0|
    |  431534|      1|   prior|           5|        4|               15|                  28.0|
    | 3367565|      1|   prior|           6|        2|               07|                  19.0|
    |  550135|      1|   prior|           7|        1|               09|                  20.0|
    | 3108588|      1|   prior|           8|        1|               14|                  14.0|
    | 2295261|      1|   prior|           9|        1|               16|                   0.0|
    +--------+-------+--------+------------+---------+-----------------+----------------------+
    only showing top 10 rows
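
    Note that the first data row repeats the column names: the CSV header line was loaded into the Hive table as an ordinary row (the table was presumably created without skipping the header). One way to drop it on the Spark side, as a sketch:

        import spark.implicits._

        // Filter out the stray header row that was loaded as data.
        val cleaned = orders.filter($"order_id" =!= "order_id")
        cleaned.show(10)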