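For work I have recently started doing big data development. Following the configuration steps below, you can deploy Hadoop + HBase on a Windows 10 machine for single-node testing, debugging, and development.
What you need:
1. hadoop-2.7.2: https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable/
2. hadoop-common-2.2.0-bin-master: https://github.com/srccodes/hadoop-common-2.2.0-bin/archive/master.zip
3. hbase-1.2.3: http://apache.fayea.com/hbase/stable/
4. jdk1.8: http://dl-t1.wmzhe.com/30/30118/jdk_1.8.0.0_64.exe
If WinRAR fails to extract any of the archives above, download and install Cygwin, then run the following in its command prompt: tar -zxvf hadoop-2.7.2.tar.gz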
Extract the three downloaded archives to:
D:\HBase\hadoop-2.7.2
D:\HBase\hadoop-common-2.2.0-bin-master
D:\HBase\hbase-1.2.3
Copy the following 7 files from D:\HBase\hadoop-common-2.2.0-bin-master\bin into D:\HBase\hadoop-2.7.2\bin (note: copy only these 7):
hadoop.dll, hadoop.exp, hadoop.lib, hadoop.pdb, libwinutils.lib, winutils.exe, winutils.pdb
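If you would rather not pick the seven files by hand, the copy step can be scripted. The following is only a minimal sketch: the function name copy_winutils is made up for this example, and the two directories you pass in are assumed to be the bin folders named above.

```python
import shutil
from pathlib import Path

# The seven native helper files listed in the copy step above.
WINUTILS_FILES = [
    "hadoop.dll", "hadoop.exp", "hadoop.lib", "hadoop.pdb",
    "libwinutils.lib", "winutils.exe", "winutils.pdb",
]

def copy_winutils(src_bin, dst_bin, names=WINUTILS_FILES):
    """Copy only the listed files from src_bin to dst_bin.

    Returns the names that were not found in src_bin, so an
    incomplete download is easy to spot.
    """
    src, dst = Path(src_bin), Path(dst_bin)
    dst.mkdir(parents=True, exist_ok=True)
    missing = []
    for name in names:
        source_file = src / name
        if source_file.is_file():
            shutil.copy2(source_file, dst / name)
        else:
            missing.append(name)
    return missing
```

On the layout used in this article the call would be copy_winutils(r"D:\HBase\hadoop-common-2.2.0-bin-master\bin", r"D:\HBase\hadoop-2.7.2\bin"); an empty list as the result means all seven files were copied.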
Configure Hadoop:
The configuration files are in D:\HBase\hadoop-2.7.2\etc\hadoop
hadoop-env.cmd contents are as follows:
@echo off
@rem Licensed to the Apache Software Foundation (ASF) under one or more
@rem contributor license agreements. See the NOTICE file distributed with
@rem this work for additional information regarding copyright ownership.
@rem The ASF licenses this file to You under the Apache License, Version 2.0
@rem (the "License"); you may not use this file except in compliance with
@rem the License. You may obtain a copy of the License at
@rem
@rem http://www.apache.org/licenses/LICENSE-2.0
@rem
@rem Unless required by applicable law or agreed to in writing, software
@rem distributed under the License is distributed on an "AS IS" BASIS,
@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem See the License for the specific language governing permissions and
@rem limitations under the License.

@rem Set Hadoop-specific environment variables here.

@rem The only required environment variable is JAVA_HOME. All others are
@rem optional. When running a distributed configuration it is best to
@rem set JAVA_HOME in this file, so that it is correctly defined on
@rem remote nodes.

@rem The java implementation to use. Required.
set JAVA_HOME=%JAVA_HOME%

@rem The jsvc implementation to use. Jsvc is required to run secure datanodes.
@rem set JSVC_HOME=%JSVC_HOME%

@rem set HADOOP_CONF_DIR=

@rem Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
if exist %HADOOP_HOME%\contrib\capacity-scheduler (
  if not defined HADOOP_CLASSPATH (
    set HADOOP_CLASSPATH=%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  ) else (
    set HADOOP_CLASSPATH=%HADOOP_CLASSPATH%;%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  )
)

@rem The maximum amount of heap to use, in MB. Default is 1000.
@rem set HADOOP_HEAPSIZE=
@rem set HADOOP_NAMENODE_INIT_HEAPSIZE=""

@rem Extra Java runtime options. Empty by default.
@rem set HADOOP_OPTS=%HADOOP_OPTS% -Djava.net.preferIPv4Stack=true

@rem Command specific options appended to HADOOP_OPTS when specified
if not defined HADOOP_SECURITY_LOGGER (
  set HADOOP_SECURITY_LOGGER=INFO,RFAS
)
if not defined HDFS_AUDIT_LOGGER (
  set HDFS_AUDIT_LOGGER=INFO,NullAppender
)

set HADOOP_NAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_NAMENODE_OPTS%
set HADOOP_DATANODE_OPTS=-Dhadoop.security.logger=ERROR,RFAS %HADOOP_DATANODE_OPTS%
set HADOOP_SECONDARYNAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_SECONDARYNAMENODE_OPTS%

@rem The following applies to multiple commands (fs, dfs, fsck, distcp etc)
set HADOOP_CLIENT_OPTS=-Xmx512m %HADOOP_CLIENT_OPTS%
@rem set HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData %HADOOP_JAVA_PLATFORM_OPTS%"

@rem On secure datanodes, user to run the datanode as after dropping privileges
set HADOOP_SECURE_DN_USER=%HADOOP_SECURE_DN_USER%

@rem Where log files are stored. %HADOOP_HOME%/logs by default.
@rem set HADOOP_LOG_DIR=%HADOOP_LOG_DIR%\%USERNAME%

@rem Where log files are stored in the secure data environment.
set HADOOP_SECURE_DN_LOG_DIR=%HADOOP_LOG_DIR%\%HADOOP_HDFS_USER%

@rem The directory where pid files are stored. /tmp by default.
@rem NOTE: this should be set to a directory that can only be written to by
@rem the user that will run the hadoop daemons. Otherwise there is the
@rem potential for a symlink attack.
set HADOOP_PID_DIR=%HADOOP_PID_DIR%
set HADOOP_SECURE_DN_PID_DIR=%HADOOP_PID_DIR%

@rem A string representing this instance of hadoop. %USERNAME% by default.
set HADOOP_IDENT_STRING=%USERNAME%

set JAVA_HOME=D:\Java\jdk1.8.0_31
set HADOOP_HOME=D:\HBase\hadoop-2.7.2
set HADOOP_PREFIX=D:\HBase\hadoop-2.7.2
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
core-site.xml contents are as follows:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>
hdfs-site.xml contents are as follows:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>
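A stray or unclosed tag in one of these files is easy to miss, so it can help to read the properties back before starting anything. The following is a rough sketch using only Python's standard library; read_site_properties and SAMPLE are names invented for this example, and the sample text simply repeats the single property configured above.

```python
import xml.etree.ElementTree as ET

def read_site_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props

# Same property as in core-site.xml / hdfs-site.xml above.
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>"""

print(read_site_properties(SAMPLE))
# prints {'fs.default.name': 'hdfs://0.0.0.0:19000'}
```

For a file on disk, ET.parse(path).getroot() can replace the ET.fromstring call; the rest of the loop stays the same.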
Create mapred-site.xml with the following contents, replacing the Chinese name with the name of the Windows account you are currently logged in as:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.job.user.name</name>
    <value>冯明刚</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.apps.stagingDir</name>
    <value>/user/冯明刚/staging</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
  </property>
</configuration>
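Since the account name appears twice in mapred-site.xml, the file can also be generated instead of edited by hand. The following is a sketch under assumptions: render_mapred_site and MAPRED_TEMPLATE are names made up here, and getpass.getuser() is assumed to return the same account name that Hadoop will see (on Windows it reads the USERNAME environment variable).

```python
import getpass
from xml.sax.saxutils import escape

# Same four properties as the mapred-site.xml above, with the account name as a placeholder.
MAPRED_TEMPLATE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.job.user.name</name>
    <value>{user}</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.apps.stagingDir</name>
    <value>/user/{user}/staging</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
  </property>
</configuration>
"""

def render_mapred_site(user=None):
    """Fill the template with the given account name, or the current login name if omitted."""
    if user is None:
        user = getpass.getuser()
    # escape() protects against characters such as & or < in the account name.
    return MAPRED_TEMPLATE.format(user=escape(user))
```

When saving the result, write it with encoding="utf-8" so that a non-ASCII account name matches the default encoding an XML parser assumes for a file without an explicit encoding declaration.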
yarn-site.xml contents are as follows:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.server.resourcemanager.address</name>
    <value>0.0.0.0:8020</value>
  </property>
  <property>
    <name>yarn.server.resourcemanager.application.expiry.interval</name>
    <value>60000</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/dep/logs/userlogs</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.attempt-listener.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.client-service.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>%HADOOP_CONF_DIR%,%HADOOP_COMMON_HOME%/share/hadoop/common/*,%HADOOP_COMMON_HOME%/share/hadoop/common/lib/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
Initialize Hadoop:
:> cd /d D:\HBase\hadoop-2.7.2\etc\hadoop
:> hadoop-env.cmd
:> %HADOOP_PREFIX%\bin\hdfs namenode -format
Start Hadoop:
:> %HADOOP_PREFIX%\sbin\start-dfs.cmd
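One quick way to see whether the NameNode actually came up is to try a TCP connection to the port configured in fs.default.name (19000 in this article). This is a small sketch, not an official health check: port_is_open is a helper name invented here, and it only tells you that something is listening on the port, not that HDFS is healthy.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

# 19000 is the port from fs.default.name above; adjust it if you changed that value.
print("NameNode port reachable:", port_is_open("127.0.0.1", 19000))
```

If this keeps printing False a little while after start-dfs.cmd, the console windows it opened are the first place to look for the error.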
Stop Hadoop:
:> %HADOOP_PREFIX%\sbin\stop-all.cmd
Configure HBase:
Edit hbase-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///D:/HBase/hbase-1.2.3/root</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>D:/HBase/hbase-1.2.3/tmp</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>127.0.0.1</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>D:/HBase/hbase-1.2.3/zoo</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>
Edit hbase-env.cmd as follows:
@rem/**
@rem * Licensed to the Apache Software Foundation (ASF) under one
@rem * or more contributor license agreements. See the NOTICE file
@rem * distributed with this work for additional information
@rem * regarding copyright ownership. The ASF licenses this file
@rem * to you under the Apache License, Version 2.0 (the
@rem * "License"); you may not use this file except in compliance
@rem * with the License. You may obtain a copy of the License at
@rem *
@rem * http://www.apache.org/licenses/LICENSE-2.0
@rem *
@rem * Unless required by applicable law or agreed to in writing, software
@rem * distributed under the License is distributed on an "AS IS" BASIS,
@rem * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem * See the License for the specific language governing permissions and
@rem * limitations under the License.
@rem */

@rem Set environment variables here.

@rem The java implementation to use. Java 1.7+ required.
@rem set JAVA_HOME=c:\apps\java

@rem Extra Java CLASSPATH elements. Optional.
@rem set HBASE_CLASSPATH=

@rem The maximum amount of heap to use. Default is left to JVM default.
@rem set HBASE_HEAPSIZE=1000

@rem Uncomment below if you intend to use off heap cache. For example, to allocate 8G of
@rem offheap, set the value to "8G".
@rem set HBASE_OFFHEAPSIZE=1000

@rem For example, to allocate 8G of offheap, to 8G:
@rem set HBASE_OFFHEAPSIZE=8G

@rem Extra Java runtime options.
@rem Below are what we set by default. May only work with SUN JVM.
@rem For more on why as well as other possible settings,
@rem see http://wiki.apache.org/hadoop/PerformanceTuning
@rem JDK6 on Windows has a known bug for IPv6, use preferIPv4Stack unless JDK7.
@rem See TestIPv6NIOServerSocketChannel.
set HBASE_OPTS="-XX:+UseConcMarkSweepGC" "-Djava.net.preferIPv4Stack=true"

@rem Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
set HBASE_MASTER_OPTS=%HBASE_MASTER_OPTS% "-XX:PermSize=128m" "-XX:MaxPermSize=128m"
set HBASE_REGIONSERVER_OPTS=%HBASE_REGIONSERVER_OPTS% "-XX:PermSize=128m" "-XX:MaxPermSize=128m"

@rem Uncomment below to enable java garbage collection logging for the server-side processes
@rem this enables basic gc logging for the server processes to the .out file
@rem set SERVER_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" %HBASE_GC_OPTS%

@rem this enables gc logging using automatic GC log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. Either use this set of options or the one above
@rem set SERVER_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" "-XX:+UseGCLogFileRotation" "-XX:NumberOfGCLogFiles=1" "-XX:GCLogFileSize=512M" %HBASE_GC_OPTS%

@rem Uncomment below to enable java garbage collection logging for the client processes in the .out file.
@rem set CLIENT_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" %HBASE_GC_OPTS%

@rem Uncomment below (along with above GC logging) to put GC information in its own logfile (will set HBASE_GC_OPTS)
@rem set HBASE_USE_GC_LOGFILE=true

@rem Uncomment and adjust to enable JMX exporting
@rem See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
@rem More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
@rem
@rem set HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false" "-Dcom.sun.management.jmxremote.authenticate=false"
@rem set HBASE_MASTER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10101"
@rem set HBASE_REGIONSERVER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10102"
@rem set HBASE_THRIFT_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10103"
@rem set HBASE_ZOOKEEPER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10104"

@rem File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
@rem set HBASE_REGIONSERVERS=%HBASE_HOME%\conf\regionservers

@rem Where log files are stored. $HBASE_HOME/logs by default.
@rem set HBASE_LOG_DIR=%HBASE_HOME%\logs

@rem A string representing this instance of hbase. $USER by default.
@rem set HBASE_IDENT_STRING=%USERNAME%

@rem Seconds to sleep between slave commands. Unset by default. This
@rem can be useful in large clusters, where, e.g., slave rsyncs can
@rem otherwise arrive faster than the master can service them.
@rem set HBASE_SLAVE_SLEEP=0.1

@rem Tell HBase whether it should manage its own instance of Zookeeper or not.
@rem set HBASE_MANAGES_ZK=true

set JAVA_HOME=D:\Java\jdk1.8.0_31
set HADOOP_HOME=D:\HBase\hadoop-2.7.2
set HADOOP_PREFIX=D:\HBase\hadoop-2.7.2
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
Start HBase:
:> cd /d D:\HBase\hbase-1.2.3\conf
:> hbase-env.cmd
:> cd /d D:\HBase\hbase-1.2.3\bin
:> start-hbase.cmd
That completes the configuration. Use the HBase shell to check that you can work with the database:
:> cd /d D:\HBase\hbase-1.2.3\bin
:> hbase shell
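Create a table
Create a table named test with a single column family, cf. You can list all tables to verify that it was created, then insert some values.
> create 'test','cf'
Insert a record
> put 'test','row1','cf:a','value1'
Query
> scan 'test'
That's all. If you run into problems, please leave a comment.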