• Deploying a single-node Hadoop environment on Windows


    Notes on deploying single-node Hadoop locally on Windows. The modified configuration files and scripts are listed below; only the key settings and steps are recorded, for reference only.

    • hadoop-2.6.5
    • spark-2.3.3

    1. Configuration file core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
        <property>
            <name>hadoop.data.dir</name>
            <value>file:/D:/02_bigdata/hadoop-2.6.5/data</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>${hadoop.data.dir}</value>
        </property>
        <property>
            <name>hadoop.http.staticuser.user</name>
            <value>${user.name}</value>
        </property>
    </configuration>
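
    A quick sanity check for this file, assuming HADOOP_HOME is set and %HADOOP_HOME%\bin is on the PATH: hdfs getconf prints resolved values, so the ${hadoop.data.dir} substitution can be verified before any daemon is started.

    @rem should print hdfs://localhost:9000
    hdfs getconf -confKey fs.defaultFS
    @rem should print the expanded ${hadoop.data.dir} value
    hdfs getconf -confKey hadoop.tmp.dir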
    
    

    2. Configuration file hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.namenode.http-address</name>
            <value>0.0.0.0:50070</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>${hadoop.data.dir}/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>${hadoop.data.dir}/dfs/data</value>
        </property>
    </configuration>
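
    Before the very first start, the NameNode metadata directory (dfs.namenode.name.dir above) has to be initialized once; a sketch, run from a console with Hadoop on the PATH:

    @rem one-time format; only run on a fresh deployment, it wipes existing NameNode metadata
    hdfs namenode -format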
    
    

    3. Configuration file mapred-site.xml

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    

    4. Configuration file yarn-site.xml

    <configuration>
        <!-- Site specific YARN configuration properties -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>512</value>
        </property>
        <property>
            <name>yarn.log-aggregation-enable</name>
            <value>true</value>
        </property>
        <property>
            <name>yarn.log-aggregation.retain-seconds</name>
            <value>2592000</value>
        </property>
        <property>
            <name>yarn.log.server.url</name>
            <value>http://localhost:19888/jobhistory/logs</value>
        </property>
        <property>
            <name>yarn.nodemanager.remote-app-log-dir</name>
            <value>hdfs://localhost:9000/user/merit/yarn-logs/</value>
        </property>
        <property>
            <name>yarn.nodemanager.address</name>
            <value>localhost:8041</value>
        </property>
    </configuration>
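
    Log aggregation uploads finished container logs to yarn.nodemanager.remote-app-log-dir; if the NodeManager cannot create that path itself, it can be created up front once HDFS is running (a sketch using the path configured above):

    hadoop fs -mkdir -p /user/merit/yarn-logs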
    

    5. Environment script hadoop-env.cmd

    @rem set JAVA_HOME
    set JAVA_HOME=D:\06_devptools\jdk1_8_0_73
    

    6. Environment script yarn-env.cmd

    @rem set the log file names for the YARN daemons
    set YARN_RESOURCEMANAGER_OPTS=-Dyarn.log.file=YARN-RESOURCEMANAGER.log -Dhadoop.log.file=YARN-RESOURCEMANAGER.log
    set HADOOP_NODEMANAGER_OPTS=-Dyarn.log.file=YARN-NODEMANAGER.log -Dhadoop.log.file=YARN-NODEMANAGER.log
    set HADOOP_HISTORYSERVER_OPTS=-Dyarn.log.file=YARN-HISTORYSERVER.log -Dhadoop.log.file=YARN-HISTORYSERVER.log
    

    7. Startup script start-dfs.cmd

    @rem add %HADOOP_HOME%\bin to PATH
    set PATH=%HADOOP_HOME%\bin;%PATH%
    
    start "Apache Hadoop Distribution" hadoop namenode
    start "Apache Hadoop Distribution" hadoop datanode
    
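
    After start-dfs.cmd, a quick check that both daemons came up (assuming the JDK's jps is on the PATH):

    @rem NameNode and DataNode should both appear
    jps
    @rem a simple HDFS round-trip; the NameNode web UI is at http://localhost:50070
    hadoop fs -ls /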
    

    8. Startup script start-yarn.cmd

    @rem add %HADOOP_HOME%\bin to PATH
    set PATH=%HADOOP_HOME%\bin;%PATH%
    @rem start resourceManager
    start "Apache Hadoop Distribution" yarn resourcemanager
    @rem start nodeManager
    start "Apache Hadoop Distribution" yarn nodemanager
    
    @rem changed from the default script: launch with mapred historyserver instead of yarn historyserver
    @rem start historyserver
    start "Apache Hadoop Distribution" mapred historyserver
    
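
    After start-yarn.cmd, the NodeManager registration can be verified:

    @rem should report one active NodeManager at localhost:8041 as configured
    yarn node -list

    The ResourceManager web UI is at http://localhost:8088 and the JobHistory UI at http://localhost:19888.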

    9. Test: run a MapReduce job on YARN

    C:\Windows\system32>D:\02_bigdata\hadoop-2.6.5\bin\hadoop jar D:\02_bigdata\hadoop-2.6.5\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.6.5.jar pi 12 3
    Number of Maps  = 12
    Samples per Map = 3
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Wrote input for Map #5
    Wrote input for Map #6
    Wrote input for Map #7
    Wrote input for Map #8
    Wrote input for Map #9
    Wrote input for Map #10
    Wrote input for Map #11
    Starting Job
    22/06/18 14:09:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    22/06/18 14:09:30 INFO input.FileInputFormat: Total input paths to process : 12
    22/06/18 14:09:30 INFO mapreduce.JobSubmitter: number of splits:12
    22/06/18 14:09:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1655532552208_0001
    22/06/18 14:09:30 INFO impl.YarnClientImpl: Submitted application application_1655532552208_0001
    22/06/18 14:09:30 INFO mapreduce.Job: The url to track the job: http://LAPTOP-TC4A0SCV:8088/proxy/application_1655532552208_0001/
    22/06/18 14:09:30 INFO mapreduce.Job: Running job: job_1655532552208_0001
    22/06/18 14:09:43 INFO mapreduce.Job: Job job_1655532552208_0001 running in uber mode : false
    22/06/18 14:09:43 INFO mapreduce.Job:  map 0% reduce 0%
    22/06/18 14:09:55 INFO mapreduce.Job:  map 8% reduce 0%
    22/06/18 14:09:58 INFO mapreduce.Job:  map 17% reduce 0%
    22/06/18 14:10:02 INFO mapreduce.Job:  map 25% reduce 0%
    22/06/18 14:10:05 INFO mapreduce.Job:  map 42% reduce 0%
    22/06/18 14:10:06 INFO mapreduce.Job:  map 50% reduce 0%
    22/06/18 14:10:08 INFO mapreduce.Job:  map 58% reduce 0%
    22/06/18 14:10:16 INFO mapreduce.Job:  map 58% reduce 19%
    22/06/18 14:10:21 INFO mapreduce.Job:  map 67% reduce 19%
    22/06/18 14:10:23 INFO mapreduce.Job:  map 75% reduce 19%
    22/06/18 14:10:24 INFO mapreduce.Job:  map 75% reduce 25%
    22/06/18 14:10:25 INFO mapreduce.Job:  map 100% reduce 25%
    22/06/18 14:10:27 INFO mapreduce.Job:  map 100% reduce 100%
    22/06/18 14:10:27 INFO mapreduce.Job: Job job_1655532552208_0001 completed successfully
    22/06/18 14:10:27 INFO mapreduce.Job: Counters: 49
            File System Counters
                    FILE: Number of bytes read=270
                    FILE: Number of bytes written=1413432
                    FILE: Number of read operations=0
                    FILE: Number of large read operations=0
                    FILE: Number of write operations=0
                    HDFS: Number of bytes read=3182
                    HDFS: Number of bytes written=215
                    HDFS: Number of read operations=51
                    HDFS: Number of large read operations=0
                    HDFS: Number of write operations=3
            Job Counters
                    Launched map tasks=12
                    Launched reduce tasks=1
                    Rack-local map tasks=12
                    Total time spent by all maps in occupied slots (ms)=321884
                    Total time spent by all reduces in occupied slots (ms)=53392
                    Total time spent by all map tasks (ms)=160942
                    Total time spent by all reduce tasks (ms)=26696
                    Total vcore-milliseconds taken by all map tasks=160942
                    Total vcore-milliseconds taken by all reduce tasks=26696
                    Total megabyte-milliseconds taken by all map tasks=164804608
                    Total megabyte-milliseconds taken by all reduce tasks=27336704
            Map-Reduce Framework
                    Map input records=12
                    Map output records=24
                    Map output bytes=216
                    Map output materialized bytes=336
                    Input split bytes=1766
                    Combine input records=0
                    Combine output records=0
                    Reduce input groups=2
                    Reduce shuffle bytes=336
                    Reduce input records=24
                    Reduce output records=0
                    Spilled Records=48
                    Shuffled Maps =12
                    Failed Shuffles=0
                    Merged Map outputs=12
                    GC time elapsed (ms)=877
                    CPU time spent (ms)=8426
                    Physical memory (bytes) snapshot=3624378368
                    Virtual memory (bytes) snapshot=4172316672
                    Total committed heap usage (bytes)=2562195456
            Shuffle Errors
                    BAD_ID=0
                    CONNECTION=0
                    IO_ERROR=0
                    WRONG_LENGTH=0
                    WRONG_MAP=0
                    WRONG_REDUCE=0
            File Input Format Counters
                    Bytes Read=1416
            File Output Format Counters
                    Bytes Written=97
    Job Finished in 57.683 seconds
    Estimated value of Pi is 3.44444444444444444444
    
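
    With log aggregation enabled, the container logs of a finished job can be pulled back from HDFS, e.g. for the job above:

    yarn logs -applicationId application_1655532552208_0001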

    10. Test: run a Spark job on YARN

    D:\02_bigdata\spark-2.3.3-bin-hadoop2.6>bin\spark-submit.cmd --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi examples\jars\spark-examples_2.11-2.3.3.jar 122
    22/06/18 14:11:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    22/06/18 14:11:15 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    22/06/18 14:11:15 INFO Client: Requesting a new application from cluster with 1 NodeManagers
    22/06/18 14:11:15 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
    22/06/18 14:11:15 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
    22/06/18 14:11:15 INFO Client: Setting up container launch context for our AM
    22/06/18 14:11:15 INFO Client: Setting up the launch environment for our AM container
    22/06/18 14:11:15 INFO Client: Preparing resources for our AM container
    22/06/18 14:11:16 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
    22/06/18 14:11:20 INFO Client: Uploading resource file:/C:/Users/merit/AppData/Local/Temp/spark-c2946869-0b04-4a44-97b6-b8389f691999/__spark_libs__3819455099437137527.zip -> file:/C:/Users/merit/.sparkStaging/application_1655532552208_0002/__spark_libs__3819455099437137527.zip
    22/06/18 14:11:21 INFO Client: Uploading resource file:/D:/02_bigdata/spark-2.3.3-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.3.3.jar -> file:/C:/Users/merit/.sparkStaging/application_1655532552208_0002/spark-examples_2.11-2.3.3.jar
    22/06/18 14:11:22 INFO Client: Uploading resource file:/C:/Users/merit/AppData/Local/Temp/spark-c2946869-0b04-4a44-97b6-b8389f691999/__spark_conf__1079735780404125589.zip -> file:/C:/Users/merit/.sparkStaging/application_1655532552208_0002/__spark_conf__.zip
    22/06/18 14:11:22 INFO SecurityManager: Changing view acls to: merit
    22/06/18 14:11:22 INFO SecurityManager: Changing modify acls to: merit
    22/06/18 14:11:22 INFO SecurityManager: Changing view acls groups to:
    22/06/18 14:11:22 INFO SecurityManager: Changing modify acls groups to:
    22/06/18 14:11:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(merit); groups with view permissions: Set(); users  with modify permissions: Set(merit); groups with modify permissions: Set()
    22/06/18 14:11:22 INFO Client: Submitting application application_1655532552208_0002 to ResourceManager
    22/06/18 14:11:22 INFO YarnClientImpl: Submitted application application_1655532552208_0002
    22/06/18 14:11:23 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:23 INFO Client:
             client token: N/A
             diagnostics: N/A
             ApplicationMaster host: N/A
             ApplicationMaster RPC port: -1
             queue: default
             start time: 1655532682804
             final status: UNDEFINED
             tracking URL: http://LAPTOP-TC4A0SCV:8088/proxy/application_1655532552208_0002/
             user: merit
    22/06/18 14:11:24 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:25 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:26 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:27 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:28 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:29 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:30 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:31 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:32 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:33 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:34 INFO Client: Application report for application_1655532552208_0002 (state: ACCEPTED)
    22/06/18 14:11:35 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:35 INFO Client:
             client token: N/A
             diagnostics: N/A
             ApplicationMaster host: 191.168.2.78
             ApplicationMaster RPC port: 0
             queue: default
             start time: 1655532682804
             final status: UNDEFINED
             tracking URL: http://LAPTOP-TC4A0SCV:8088/proxy/application_1655532552208_0002/
             user: merit
    22/06/18 14:11:36 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:37 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:38 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:39 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:40 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:41 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:42 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:44 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:45 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:46 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:47 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:48 INFO Client: Application report for application_1655532552208_0002 (state: RUNNING)
    22/06/18 14:11:49 INFO Client: Application report for application_1655532552208_0002 (state: FINISHED)
    22/06/18 14:11:49 INFO Client:
             client token: N/A
             diagnostics: N/A
             ApplicationMaster host: 191.168.2.78
             ApplicationMaster RPC port: 0
             queue: default
             start time: 1655532682804
             final status: SUCCEEDED
             tracking URL: http://LAPTOP-TC4A0SCV:8088/proxy/application_1655532552208_0002/
             user: merit
    22/06/18 14:11:49 INFO ShutdownHookManager: Shutdown hook called
    22/06/18 14:11:49 INFO ShutdownHookManager: Deleting directory C:\Users\merit\AppData\Local\Temp\spark-0282c910-523b-495d-bae8-f42d4559dac2
    22/06/18 14:11:49 INFO ShutdownHookManager: Deleting directory C:\Users\merit\AppData\Local\Temp\spark-c2946869-0b04-4a44-97b6-b8389f691999
    
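
    In cluster mode the computed Pi value only appears in the ApplicationMaster's stdout (retrievable with yarn logs -applicationId ...); to see it printed directly in the console, the same example can be submitted in client mode:

    bin\spark-submit.cmd --master yarn --deploy-mode client --class org.apache.spark.examples.SparkPi examples\jars\spark-examples_2.11-2.3.3.jar 122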
  • Original post: https://www.cnblogs.com/flowerbirds/p/16388256.html