Note: I previously wrote a post on installing and deploying SparkR, "SparkR安装部署及数据分析实例" (SparkR installation, deployment, and a data-analysis example). At that time the SparkR project had not yet been merged into Spark, so a separate SparkR package had to be downloaded. Spark now ships with an R interface, so this is the updated version of that post.
1. Hadoop Installation
References:
http://www.linuxidc.com/Linux/2015-11/124800.htm
http://blog.csdn.net/sa14023053/article/details/51952534
yarn-site.xml
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>57344</value>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>2048</value>
</property>

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>57344</value>
</property>

<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>2048</value>
</property>
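These four values are not independent: YARN requires the minimum allocation ≤ the AM container size ≤ the maximum allocation, and the maximum allocation cannot exceed what a NodeManager offers. A quick shell sanity check, with the values restated from the file above (all in MB):

```shell
# Values copied from yarn-site.xml above, in MB.
min_alloc=2048   # yarn.scheduler.minimum-allocation-mb
am_mem=2048      # yarn.app.mapreduce.am.resource.mb
max_alloc=57344  # yarn.scheduler.maximum-allocation-mb
nm_mem=57344     # yarn.nodemanager.resource.memory-mb

# Check: minimum-allocation <= AM memory <= maximum-allocation <= NodeManager memory.
check_yarn_mem() {
  [ "$min_alloc" -le "$am_mem" ] &&
  [ "$am_mem" -le "$max_alloc" ] &&
  [ "$max_alloc" -le "$nm_mem" ] &&
  echo "yarn memory settings consistent"
}
check_yarn_mem
```

If any inequality fails, containers are rejected at submission time, so it is worth re-running this check whenever the values change.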
2. Spark Installation
References:
http://blog.csdn.net/sa14023053/article/details/51953836
vim spark-defaults.conf
spark.master                      spark://hadoopmaster:7077
spark.eventLog.enabled            true
spark.eventLog.dir                hdfs://hadoopmaster:9000/directory
spark.yarn.historyServer.address  hadoopmaster:18080
spark.serializer                  org.apache.spark.serializer.KryoSerializer
spark.driver.memory               30g
spark.executor.extraJavaOptions   -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
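`spark.eventLog.dir` points at an HDFS path that must exist before the first application is submitted; the history server reads from the same path. A setup sketch, assuming `SPARK_HOME` points at the Spark installation directory:

```shell
# Create the event-log directory referenced by spark.eventLog.dir;
# application submission fails if it is missing.
hdfs dfs -mkdir -p hdfs://hadoopmaster:9000/directory

# Start the history server that serves the UI on hadoopmaster:18080
# (SPARK_HOME is assumed to be set to the Spark installation directory).
$SPARK_HOME/sbin/start-history-server.sh
```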
vim spark-env.sh
export SCALA_HOME=/xxx/scala-2.11.8
export JAVA_HOME=/usr/java/jdk1.8.0_101
export HADOOP_HOME=/xxx/hadoop-2.7.3
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_MASTER_IP=XXXX
export SPARK_HOME=/xxx/spark-2.0.0-bin-hadoop2.7
export SPARK_WORKER_MEMORY=110g
export SPARK_LOCAL_DIRS=/data/sparkdata/local
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://hadoopmaster:9000/directory"
3. MySQL Installation
References:
http://blog.csdn.net/wendi_0506/article/details/39478369
4. Hive Installation
http://blog.csdn.net/lnho2015/article/details/51355511
http://blog.csdn.net/blue_jjw/article/details/50479263
hive-site.xml is configured as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/user/hive/log</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
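With `javax.jdo.option.ConnectionUserName`/`ConnectionPassword` set to hive/hive, MySQL needs a matching account before the metastore can connect; `createDatabaseIfNotExist=true` then creates the `hive` database on first use. The JDBC driver named in `ConnectionDriverName` must also be on Hive's classpath. A sketch, assuming MySQL runs on localhost, `HIVE_HOME` points at the Hive install directory, and the connector jar has been downloaded separately (the jar file name is illustrative):

```shell
# Create the metastore account matching hive-site.xml (user/password: hive/hive).
mysql -u root -p <<'SQL'
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;
SQL

# Put the MySQL JDBC driver (com.mysql.jdbc.Driver) on Hive's classpath;
# the exact jar name depends on the connector version you downloaded.
cp mysql-connector-java-*.jar $HIVE_HOME/lib/
```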
Hive client configuration (scp the Hive installation directory from hadoopmaster to slaver1, then edit hive-site.xml):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoopmaster:9083</value>
  </property>
</configuration>
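The client's `hive.metastore.uris` points at `thrift://hadoopmaster:9083`, so the metastore service must be running on hadoopmaster (9083 is its default port) before the client on slaver1 can connect. A minimal way to start it in the background:

```shell
# On hadoopmaster: start the Hive metastore service (default port 9083).
# nohup keeps it running after the shell exits; logs go to metastore.log.
nohup hive --service metastore > metastore.log 2>&1 &
```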
5. R Installation
5.1 Install prerequisite packages
yum -y groupinstall 'Development Tools'
yum -y install gfortran
yum -y install cmake
yum -y groupinstall "X Window System"
yum install gcc
yum install gcc-c++
yum install gcc-gfortran
yum install bzip2-libs
yum install bzip2-devel
yum -y install xz-devel.x86_64
yum install libcurl-devel.x86_64
yum -y install readline-devel tcl tk libX11-devel libXtst-devel xorg-x11-xtrans-devel libpng-devel libXt-devel
5.2 Build and install
./configure --prefix /usr/R --enable-R-shlib --with-readline=yes --with-x=yes --with-tcltk=yes --with-cairo=yes --with-libpng=yes --with-jpeglib=yes --with-libtiff=yes --with-aqua=yes --with-ICU=yes --with-libcurl=yes --enable-utf8
make
make install
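After `make install`, a quick sanity check that the build works and picked up the optional features requested via the `--with-*` configure flags (paths follow the `--prefix /usr/R` used above; `Rscript` is installed alongside `R`):

```shell
# Confirm the freshly built R runs.
/usr/R/bin/R --version

# List whether the capabilities requested at configure time were compiled in.
/usr/R/bin/Rscript -e 'capabilities(c("X11", "tcltk", "cairo", "png", "jpeg", "libcurl"))'
```

Any capability reported as FALSE usually means the corresponding -devel package was missing when configure ran; install it and rebuild.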
vim /etc/profile:
export R_HOME=/usr/R/lib64/R
source /etc/profile
R CMD javareconf
source /usr/R/lib64/R/etc/ldpaths
vim ~/.bash_profile
Add: export PATH=/usr/R/bin:$PATH
Then run source ~/.bash_profile.
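With everything sourced, a quick end-to-end smoke test: launch the SparkR shell on YARN and turn R's built-in `faithful` dataset into a Spark DataFrame (assumes `SPARK_HOME` points at the Spark installation directory and `HADOOP_CONF_DIR` is set as in spark-env.sh):

```shell
# Start SparkR on YARN and feed it two SparkR API calls non-interactively.
$SPARK_HOME/bin/sparkR --master yarn <<'EOF'
df <- as.DataFrame(faithful)  # Spark DataFrame from a local R data.frame
head(df)
EOF
```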
Done: the SparkR cluster installation is complete!