1. Start the Hadoop daemons
bin/start-all.sh
2. Create an input folder in the Hadoop installation directory (hadoop-0.20.2)
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ mkdir input
3. Change into the input directory, create two text files there, and write some content into them
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ cd input
JIAS-MacBook-Pro:input jia$ echo "hello excuse me fine thank you">text1.txt
JIAS-MacBook-Pro:input jia$ echo "hello how do you do thank you">text2.txt
4. Go back to the Hadoop bin directory and run the jps command to confirm that Hadoop is up and running
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ cd bin
JIAS-MacBook-Pro:bin jia$ jps
656 SecondaryNameNode
517 NameNode
709 JobTracker
777 TaskTracker
587 DataNode
797 Jps
5. Upload the input folder to HDFS, naming it in
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -put input in
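For comparison, the same upload can also be done from Java through the HDFS FileSystem API. The sketch below is purely illustrative (the class name UploadInput is made up); it assumes the default fs.default.name from conf/core-site.xml and mirrors the -put command above.

// Hypothetical helper mirroring `bin/hadoop dfs -put input in`:
// copies the local input directory into HDFS via the FileSystem API.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadInput {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // picks up fs.default.name from core-site.xml
    FileSystem fs = FileSystem.get(conf);
    // Source is the local "input" folder; destination "in" is relative to /user/<user> on HDFS.
    fs.copyFromLocalFile(new Path("input"), new Path("in"));
    fs.close();
  }
}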
6. List what is now on HDFS
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls ./in/*
-rw-r--r--   1 jia supergroup         31 2014-07-17 20:39 /user/jia/in/text1.txt
-rw-r--r--   1 jia supergroup         30 2014-07-17 20:39 /user/jia/in/text2.txt
7. Run the bundled wordcount example, writing the results to the output folder
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount in output
14/07/17 20:46:56 INFO input.FileInputFormat: Total input paths to process : 2
14/07/17 20:46:56 INFO mapred.JobClient: Running job: job_201407172036_0001
14/07/17 20:46:57 INFO mapred.JobClient:  map 0% reduce 0%
14/07/17 20:47:04 INFO mapred.JobClient:  map 100% reduce 0%
14/07/17 20:47:16 INFO mapred.JobClient:  map 100% reduce 100%
14/07/17 20:47:18 INFO mapred.JobClient: Job complete: job_201407172036_0001
14/07/17 20:47:18 INFO mapred.JobClient: Counters: 17
14/07/17 20:47:18 INFO mapred.JobClient:   Map-Reduce Framework
14/07/17 20:47:18 INFO mapred.JobClient:     Combine output records=11
14/07/17 20:47:18 INFO mapred.JobClient:     Spilled Records=22
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce input records=11
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce output records=8
14/07/17 20:47:18 INFO mapred.JobClient:     Map input records=2
14/07/17 20:47:18 INFO mapred.JobClient:     Map output records=13
14/07/17 20:47:18 INFO mapred.JobClient:     Map output bytes=113
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce shuffle bytes=73
14/07/17 20:47:18 INFO mapred.JobClient:     Combine input records=13
14/07/17 20:47:18 INFO mapred.JobClient:     Reduce input groups=8
14/07/17 20:47:18 INFO mapred.JobClient:   FileSystemCounters
14/07/17 20:47:18 INFO mapred.JobClient:     HDFS_BYTES_READ=61
14/07/17 20:47:18 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=322
14/07/17 20:47:18 INFO mapred.JobClient:     FILE_BYTES_READ=126
14/07/17 20:47:18 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=54
14/07/17 20:47:18 INFO mapred.JobClient:   Job Counters
14/07/17 20:47:18 INFO mapred.JobClient:     Launched map tasks=2
14/07/17 20:47:18 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/17 20:47:18 INFO mapred.JobClient:     Data-local map tasks=2
JIAS-MacBook-Pro:hadoop-0.20.2 jia$
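To see what this command actually runs: the wordcount class inside hadoop-0.20.2-examples.jar is essentially the classic map/reduce pair sketched below, written here from memory against the new org.apache.hadoop.mapreduce API rather than copied verbatim from the shipped source. The mapper emits (word, 1) for every token, and the reducer (also used as the combiner) sums the counts, which is why Map output records=13 collapses to Reduce input records=11 and then to 8 distinct words in the counters above.

// Sketch of the stock WordCount example (mapper + combiner/reducer + driver).
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every whitespace-separated token in a line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts for each word; also registered as the combiner.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. "in"
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. "output"
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}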
8. View the results
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls
Found 2 items
drwxr-xr-x   - jia supergroup          0 2014-07-17 20:39 /user/jia/in
drwxr-xr-x   - jia supergroup          0 2014-07-17 20:47 /user/jia/output
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls ./output
Found 2 items
drwxr-xr-x   - jia supergroup          0 2014-07-17 20:46 /user/jia/output/_logs
-rw-r--r--   1 jia supergroup         54 2014-07-17 20:47 /user/jia/output/part-r-00000
JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -cat ./output/*
do      2
excuse  1
fine    1
hello   2
how     1
me      1
thank   2
you     3
cat: Source must be a file.
The word counts are correct; the trailing "cat: Source must be a file." only means the ./output/* wildcard also matched the _logs directory, which -cat cannot read. Catting the part file directly (bin/hadoop dfs -cat ./output/part-r-00000) avoids the error.
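If you would rather read the result from code than from dfs -cat, a minimal sketch using the HDFS FileSystem API might look like the following; the class name ReadOutput is made up for illustration, and the part file path is taken from the -ls listing above.

// Print the job output file from HDFS line by line.
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadOutput {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();          // picks up core-site.xml
    FileSystem fs = FileSystem.get(conf);
    Path result = new Path("/user/jia/output/part-r-00000");
    BufferedReader reader =
        new BufferedReader(new InputStreamReader(fs.open(result)));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line);                        // e.g. "hello	2"
    }
    reader.close();
    fs.close();
  }
}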