After setting up the cluster, running the bundled wordcount example fails with this error: Input path does not exist: hdfs://ns1/user/root/a.txt
[root@slave1 hadoop]# ls
a.txt dfs1 include libexec name sbin test tmp2
bin etc journal LICENSE.txt NOTICE.txt share tmp zookeeper.out
data hdfs lib logs README.txt src tmp1
[root@slave1 hadoop]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar wordcount a.txt /mrout
17/11/10 17:44:39 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1510302622448_0003
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://ns1/user/root/a.txt
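Cause: in standalone mode the job reads input from the local filesystem, but in a distributed environment it reads input from HDFS. The relative path a.txt is therefore resolved against the current user's HDFS home directory, giving hdfs://ns1/user/root/a.txt, which does not exist.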
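The path in the error message can be reproduced with a small sketch of how a path argument gets qualified. This is only an illustration: qualify_path is a hypothetical helper, not a Hadoop command, and it assumes the default filesystem is hdfs://ns1 (as shown in the error) and that the HDFS home directory is /user/<username>, which is the usual default.

```shell
#!/bin/sh
# Illustrative sketch only: qualify_path is a hypothetical helper, not part of Hadoop.
# It mimics how a job's path argument is turned into a fully qualified path.
qualify_path() {
    default_fs=$1   # assumed default filesystem, e.g. hdfs://ns1
    user=$2         # user submitting the job, e.g. root
    path=$3         # path as typed on the command line
    case "$path" in
        *://*) printf '%s\n' "$path" ;;                         # already has scheme://, keep as-is
        /*)    printf '%s\n' "$default_fs$path" ;;              # absolute: prefix the default filesystem
        *)     printf '%s\n' "$default_fs/user/$user/$path" ;;  # relative: under the assumed HDFS home dir
    esac
}

qualify_path hdfs://ns1 root a.txt         # prints hdfs://ns1/user/root/a.txt
qualify_path hdfs://ns1 root /input/a.txt  # prints hdfs://ns1/input/a.txt
```

The first call matches the path in the exception; the second shows why an absolute HDFS path such as /input/a.txt avoids depending on the user's home directory.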
Fix: upload the local file to HDFS, then run wordcount again with the HDFS path as input. The job then runs successfully.
[root@slave1 hadoop]# bin/hdfs dfs -mkdir /input
[root@slave1 hadoop]# bin/hdfs dfs -put test/a.txt /input
[root@slave1 hadoop]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar wordcount /input/a.txt /mrout2
17/11/13 10:22:48 INFO input.FileInputFormat: Total input paths to process : 1
17/11/13 10:22:49 INFO mapreduce.JobSubmitter: number of splits:1
17/11/13 10:22:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1510302622448_0010
17/11/13 10:22:50 INFO impl.YarnClientImpl: Submitted application application_1510302622448_0010
17/11/13 10:22:50 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1510302622448_0010/
17/11/13 10:22:50 INFO mapreduce.Job: Running job: job_1510302622448_0010