1. After installing Hadoop through the CDH console, hadoop commands fail:
[root@node1 bin]# hadoop fs -ls
/opt/cloudera/parcels/CDH-5.10.2-1.cdh5.10.2.p0.5/bin/../lib/hadoop/bin/hadoop: line 144: /usr/java/jdk1.7.0_67-clouderaexport/bin/java: No such file or directory
/opt/cloudera/parcels/CDH-5.10.2-1.cdh5.10.2.p0.5/bin/../lib/hadoop/bin/hadoop: line 144: exec: /usr/java/jdk1.7.0_67-clouderaexport/bin/java: cannot execute: No such file or directory
Cause: even after HDFS is installed through CDH, the matching configuration files and /etc/profile still need to be updated (adding HADOOP_HOME and so on). This machine had an older Hadoop installation, and HADOOP_HOME had never been changed, hence the error.
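A quick way to confirm is to check what the current shell has exported and which JDKs actually exist on disk:
[root@node1 bin]# echo $JAVA_HOME $HADOOP_HOME
[root@node1 bin]# ls /usr/java/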
Everything worked after editing /etc/profile as follows:
#java
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
#export JAVA_HOME=/usr/local/jdk1.8.0_191
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
#export HADOOP_HOME=/usr/local/hadoop-2.6.0-cdh5.7.0
export HADOOP_HOME=/opt/cloudera/parcels/CDH-5.10.2-1.cdh5.10.2.p0.5
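After saving, reload the profile so the current shell picks up the new variables, then verify (hadoop version is a harmless read-only check):
[root@node1 bin]# source /etc/profile
[root@node1 bin]# echo $JAVA_HOME
/usr/java/jdk1.7.0_67-cloudera
[root@node1 bin]# hadoop version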
2. Counting words with the wordcount example
Upload a file to HDFS. The file contents are as follows:
[root@node1 examples]# more /home/test/test1.sh
#!/bin/bash
#edate=$(chage -l $USER|grep "Password expires" |awk '{print $4,$5,$6,$7}')
edate=$(chage -l test|grep "Password expires" |awk '{print $4,$5,$6,$7}')
date3=$(date -d "+3 day"|awk '{print $2,$3,$6}')
if [[ $edate = "never" ]]; then
echo "never expired"
elif [[ $date3 = $edate ]]; then
echo "3 days"
else
echo "unexpired"
fi
Upload the file:
[root@node1 test]# hadoop fs -put /home/test/test1.sh /tmp
Check that it arrived:
[root@node1 test]# hadoop fs -ls /tmp
Found 2 items
drwxrwxrwx - hdfs supergroup 0 2019-12-24 16:35 /tmp/.cloudera_health_monitoring_canary_files
-rw-r--r-- 3 root supergroup 346 2019-12-24 16:35 /tmp/test1.sh
[root@node1 test]#
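Optionally confirm the contents made it up intact:
[root@node1 test]# hadoop fs -cat /tmp/test1.sh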
The wordcount MapReduce job has to run as the hdfs user; otherwise it errors out when creating the output directory.
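Switch users first (or, if sudo is configured for it, prefix the command with sudo -u hdfs):
[root@node1 test]# su - hdfs
Then submit the job: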
hadoop jar /opt/cloudera/parcels/CDH-5.10.2-1.cdh5.10.2.p0.5/share/doc/hadoop-0.20-mapreduce/examples/hadoop-examples-2.6.0-mr1-cdh5.10.2.jar wordcount /tmp/test1.sh /output1
The job log output is as follows:
19/12/24 16:45:15 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
19/12/24 16:45:15 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
19/12/24 16:45:15 INFO input.FileInputFormat: Total input paths to process : 1
19/12/24 16:45:15 INFO mapreduce.JobSubmitter: number of splits:1
19/12/24 16:45:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local704420616_0001
19/12/24 16:45:16 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
19/12/24 16:45:16 INFO mapreduce.Job: Running job: job_local704420616_0001
19/12/24 16:45:16 INFO mapred.LocalJobRunner: OutputCommitter set in config null
19/12/24 16:45:16 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/12/24 16:45:16 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
19/12/24 16:45:16 INFO mapred.LocalJobRunner: Waiting for map tasks
19/12/24 16:45:16 INFO mapred.LocalJobRunner: Starting task: attempt_local704420616_0001_m_000000_0
19/12/24 16:45:16 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/12/24 16:45:16 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
19/12/24 16:45:16 INFO mapred.MapTask: Processing split: hdfs://node1:8020/tmp/test1.sh:0+346
19/12/24 16:45:16 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
19/12/24 16:45:16 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
19/12/24 16:45:16 INFO mapred.MapTask: soft limit at 83886080
19/12/24 16:45:16 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
19/12/24 16:45:16 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
19/12/24 16:45:16 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
19/12/24 16:45:16 INFO mapred.LocalJobRunner:
19/12/24 16:45:16 INFO mapred.MapTask: Starting flush of map output
19/12/24 16:45:16 INFO mapred.MapTask: Spilling map output
19/12/24 16:45:16 INFO mapred.MapTask: bufstart = 0; bufend = 524; bufvoid = 104857600
19/12/24 16:45:16 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214212(104856848); length = 185/6553600
19/12/24 16:45:16 INFO mapred.MapTask: Finished spill 0
19/12/24 16:45:16 INFO mapred.Task: Task:attempt_local704420616_0001_m_000000_0 is done. And is in the process of committing
19/12/24 16:45:16 INFO mapred.LocalJobRunner: map
19/12/24 16:45:16 INFO mapred.Task: Task 'attempt_local704420616_0001_m_000000_0' done.
19/12/24 16:45:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local704420616_0001_m_000000_0
19/12/24 16:45:16 INFO mapred.LocalJobRunner: map task executor complete.
19/12/24 16:45:16 INFO mapred.LocalJobRunner: Waiting for reduce tasks
19/12/24 16:45:16 INFO mapred.LocalJobRunner: Starting task: attempt_local704420616_0001_r_000000_0
19/12/24 16:45:16 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/12/24 16:45:16 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
19/12/24 16:45:16 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@720cd60d
19/12/24 16:45:16 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=175793760, maxSingleShuffleLimit=43948440, mergeThreshold=116023888, ioSortFactor=10, memToMemMergeOutputsThreshold=10
19/12/24 16:45:16 INFO reduce.EventFetcher: attempt_local704420616_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
19/12/24 16:45:16 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local704420616_0001_m_000000_0 decomp: 447 len: 451 to MEMORY
19/12/24 16:45:16 INFO reduce.InMemoryMapOutput: Read 447 bytes from map-output for attempt_local704420616_0001_m_000000_0
19/12/24 16:45:16 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 447, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->447
19/12/24 16:45:16 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
19/12/24 16:45:16 INFO mapred.LocalJobRunner: 1 / 1 copied.
19/12/24 16:45:16 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
19/12/24 16:45:16 INFO mapred.Merger: Merging 1 sorted segments
19/12/24 16:45:16 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 441 bytes
19/12/24 16:45:16 INFO reduce.MergeManagerImpl: Merged 1 segments, 447 bytes to disk to satisfy reduce memory limit
19/12/24 16:45:16 INFO reduce.MergeManagerImpl: Merging 1 files, 451 bytes from disk
19/12/24 16:45:16 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
19/12/24 16:45:16 INFO mapred.Merger: Merging 1 sorted segments
19/12/24 16:45:16 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 441 bytes
19/12/24 16:45:16 INFO mapred.LocalJobRunner: 1 / 1 copied.
19/12/24 16:45:17 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
19/12/24 16:45:17 INFO mapred.Task: Task:attempt_local704420616_0001_r_000000_0 is done. And is in the process of committing
19/12/24 16:45:17 INFO mapred.LocalJobRunner: 1 / 1 copied.
19/12/24 16:45:17 INFO mapred.Task: Task attempt_local704420616_0001_r_000000_0 is allowed to commit now
19/12/24 16:45:17 INFO output.FileOutputCommitter: Saved output of task 'attempt_local704420616_0001_r_000000_0' to hdfs://node1:8020/output1/_temporary/0/task_local704420616_0001_r_000000
19/12/24 16:45:17 INFO mapred.LocalJobRunner: reduce > reduce
19/12/24 16:45:17 INFO mapred.Task: Task 'attempt_local704420616_0001_r_000000_0' done.
19/12/24 16:45:17 INFO mapred.LocalJobRunner: Finishing task: attempt_local704420616_0001_r_000000_0
19/12/24 16:45:17 INFO mapred.LocalJobRunner: reduce task executor complete.
19/12/24 16:45:17 INFO mapreduce.Job: Job job_local704420616_0001 running in uber mode : false
19/12/24 16:45:17 INFO mapreduce.Job:  map 100% reduce 100%
19/12/24 16:45:17 INFO mapreduce.Job: Job job_local704420616_0001 completed successfully
19/12/24 16:45:17 INFO mapreduce.Job: Counters: 35
	File System Counters
		FILE: Number of bytes read=553634
		FILE: Number of bytes written=1137145
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=692
		HDFS: Number of bytes written=313
		HDFS: Number of read operations=13
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=4
	Map-Reduce Framework
		Map input records=11
		Map output records=47
		Map output bytes=524
		Map output materialized bytes=451
		Input split bytes=95
		Combine input records=47
		Combine output records=33
		Reduce input groups=33
		Reduce shuffle bytes=451
		Reduce input records=33
		Reduce output records=33
		Spilled Records=66
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=16
		Total committed heap usage (bytes)=504365056
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=346
	File Output Format Counters
		Bytes Written=313
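Note that MapReduce refuses to write into an output directory that already exists, so remove the old output before re-running the job:
[hdfs@node1 examples]$ hadoop fs -rm -r /output1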
View the counts under /output1:
[hdfs@node1 examples]$ hadoop fs -cat /output1/*
"+3 1
"3 1
"Password 2
"never 1
"never" 1
"unexpired" 1
#!/bin/bash 1
#edate=$(chage 1
$2,$3,$6}') 1
$4,$5,$6,$7}') 2
$USER|grep 1
$date3 1
$edate 2
'{print 3
-d 1
-l 2
= 2
[[ 2
]]; 2
date3=$(date 1
day"|awk 1
days" 1
echo 3
edate=$(chage 1
elif 1
else 1
expired" 1
expires" 2
fi 1
if 1
test|grep 1
then 2
|awk 2
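As a rough local sanity check: wordcount tokenizes on whitespace, so splitting the source file the same way and counting unique tokens should produce matching numbers (here uniq -c prints the count in the first column, unlike the MapReduce output):
[root@node1 test]# tr -s ' \t' '\n' < /home/test/test1.sh | sort | uniq -c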
Done, everything works.