Introduction
- Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports **a rich set of higher-level tools**, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.
- Download the code directly from the official website.
- A JDK is required; make sure java is already on the system PATH, or set JAVA_HOME.
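As a sketch, on a typical Linux machine this might look like the following (the JDK path here is only an illustrative assumption; adjust it to your actual install):

```shell
# Illustrative only: point JAVA_HOME at your real JDK location.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
java -version   # verify that java now resolves
```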
Usage
- Run an example: use bin/run-example [params] in the top-level Spark directory, e.g.:
./bin/run-example SparkPi 10
- Explore Spark interactively, using Scala:
./bin/spark-shell --master local[2]
Run the Spark shell with the --help option to see the available command-line options.
- Spark also provides a Python API. To run Spark interactively in a Python interpreter, use bin/pyspark:
./bin/pyspark --master local[2]
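Once the pyspark shell is up, a minimal interactive session might look like the following sketch (the file name and counts are only illustrative; `spark` is the SparkSession the shell creates for you):

```python
>>> textFile = spark.read.text("README.md")   # DataFrame with one row per line
>>> textFile.count()                          # number of lines in the file
>>> textFile.filter(textFile.value.contains("Spark")).count()  # lines mentioning "Spark"
```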
You can also import the package into your Python environment; one way to do this is described here:
http://blog.csdn.net/sinat_26599509/article/details/51895999 Example applications are also provided in Python; reading them is a good way to see how things are implemented. For example, run:
./bin/spark-submit examples/src/main/python/pi.py 10
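For reference, pi.py estimates π by random sampling. Stripped of the Spark plumbing, the core idea is just the following Monte Carlo sketch (plain Python, no Spark required; the function name and sample count are my own choices for illustration):

```python
import random

def estimate_pi(num_samples: int, seed: int = 42) -> float:
    """Estimate pi by sampling points in the unit square and
    counting how many fall inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of square = pi/4
    return 4.0 * inside / num_samples

print(estimate_pi(100_000))  # roughly 3.14
```

In the Spark version, the sampling loop is distributed across the cluster with a map over a parallelized range, followed by a sum.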
- For R, the same applies: there is an interactive environment, sparkR, and R files can likewise be run with spark-submit as above.