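Performing HDFS Operations with the dfsadmin Utility
Author: 尹正杰 (Yin Zhengjie)
Copyright notice: This is an original work. Reproduction is prohibited; violators will be held legally liable.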
1. Overview of hdfs dfsadmin
You can use the hdfs dfsadmin command to manage HDFS from the command line. The hdfs dfs command can also manage HDFS files and directories, but the dfsadmin command is the one that performs HDFS-specific administrative tasks.

[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
    [-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]]
    [-safemode <enter | leave | get | wait>]
    [-saveNamespace]
    [-rollEdits]
    [-restoreFailedStorage true|false|check]
    [-refreshNodes]
    [-setQuota <quota> <dirname>...<dirname>]
    [-clrQuota <dirname>...<dirname>]
    [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
    [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
    [-finalizeUpgrade]
    [-rollingUpgrade [<query|prepare|finalize>]]
    [-refreshServiceAcl]
    [-refreshUserToGroupsMappings]
    [-refreshSuperUserGroupsConfiguration]
    [-refreshCallQueue]
    [-refresh <host:ipc_port> <key> [arg1..argn]
    [-reconfig <namenode|datanode> <host:ipc_port> <start|status|properties>]
    [-printTopology]
    [-refreshNamenodes datanode_host:ipc_port]
    [-getVolumeReport datanode_host:ipc_port]
    [-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
    [-setBalancerBandwidth <bandwidth in bytes per second>]
    [-getBalancerBandwidth <datanode_host:ipc_port>]
    [-fetchImage <local directory>]
    [-allowSnapshot <snapshotDir>]
    [-disallowSnapshot <snapshotDir>]
    [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
    [-evictWriters <datanode_host:ipc_port>]
    [-getDatanodeInfo <datanode_host:ipc_port>]
    [-metasave filename]
    [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
    [-listOpenFiles]
    [-help [cmd]]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]

[root@hadoop101.yinzhengjie.com ~]#
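The usage text above ends with the general syntax "command [genericOptions] [commandOptions]". As a minimal sketch of what that ordering means when scripting dfsadmin calls, the hypothetical helper below (build_dfsadmin_command is not part of Hadoop) assembles an argument list with the generic options placed before the command options. The NameNode URI is the one that appears later in this post; everything else is an assumption for illustration.

```python
from typing import Dict, List, Optional


def build_dfsadmin_command(
    command_options: List[str],
    generic_options: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return an argv list shaped like: hdfs dfsadmin [genericOptions] [commandOptions]."""
    argv = ["hdfs", "dfsadmin"]
    # Generic options such as -fs or -conf come first, each followed by its value.
    for flag, value in (generic_options or {}).items():
        argv.extend([flag, value])
    # Command options such as -report or -safemode follow the generic options.
    argv.extend(command_options)
    return argv


if __name__ == "__main__":
    cmd = build_dfsadmin_command(
        ["-report", "-live"],
        generic_options={"-fs": "hdfs://hadoop101.yinzhengjie.com:9000"},
    )
    print(" ".join(cmd))
```

The returned list can be handed to subprocess.run on a host where the hdfs client is installed and the caller is the HDFS superuser; the sketch itself only builds the list and does not contact a cluster.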
2. The dfsadmin -report Command
You can use the dfsadmin utility to check the state of an HDFS cluster. The dfsadmin -report command displays basic cluster statistics, including the status of the DataNodes and the NameNode, the configured disk capacity, and the health of the data blocks.

The dfsadmin -report command displays the following information for the cluster as a whole and for each DataNode (an example run follows the list):
    (1) A summary of HDFS storage allocation, including configured, used, and remaining space;
    (2) If centralized HDFS caching is configured, the percentage of cache used and remaining;
    (3) Missing, corrupt, and under-replicated blocks.

[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -report
Configured Capacity: 16493959577600 (15.00 TB)        #Configured HDFS capacity of this cluster
Present Capacity: 16493959577600 (15.00 TB)        #Capacity currently present in this cluster
DFS Remaining: 16493167906816 (15.00 TB)        #Remaining capacity of this cluster
DFS Used: 791670784 (755.00 MB)        #Storage used by HDFS
DFS Used%: 0.00%        #Same as above, shown as a percentage
Under replicated blocks: 16        #Number of under-replicated blocks
Blocks with corrupt replicas: 0        #Blocks that have corrupt replicas
Missing blocks: 0        #Missing blocks
Missing blocks (with replication factor 1): 0        #Missing blocks whose replication factor is 1
Pending deletion blocks: 0        #Blocks pending deletion

-------------------------------------------------
Live datanodes (2):        #Number of DataNodes that are live and available. This cluster has 3 DataNodes, but only 2 are working normally; the NameNode web UI shows the same.

Name: 172.200.6.102:50010 (hadoop102.yinzhengjie.com)        #IP address and port of the DataNode
Hostname: hadoop102.yinzhengjie.com        #Host name of the DataNode
Rack: /rack001        #Rack of the DataNode
Decommission Status : Normal        #Decommission status of the DataNode
Configured Capacity: 8246979788800 (7.50 TB)        #Configured capacity of the DataNode
DFS Used: 395841536 (377.50 MB)        #Capacity used by HDFS on the DataNode
Non DFS Used: 0 (0 B)        #Capacity used by non-HDFS data
DFS Remaining: 8246583947264 (7.50 TB)        #Remaining capacity
DFS Used%: 0.00%        #Percentage of the DataNode's capacity used by HDFS
DFS Remaining%: 100.00%        #Percentage of the DataNode's capacity remaining
Configured Cache Capacity: 32000000 (30.52 MB)        #Cache usage
Cache Used: 319488 (312 KB)
Cache Remaining: 31680512 (30.21 MB)
Cache Used%: 1.00%
Cache Remaining%: 99.00%
Xceivers: 2
Last contact: Mon Aug 17 05:08:10 CST 2020
Last Block Report: Mon Aug 17 04:18:40 CST 2020

Name: 172.200.6.103:50010 (hadoop103.yinzhengjie.com)
Hostname: hadoop103.yinzhengjie.com
Rack: /rack002
Decommission Status : Normal
Configured Capacity: 8246979788800 (7.50 TB)
DFS Used: 395829248 (377.49 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 8246583959552 (7.50 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 32000000 (30.52 MB)
Cache Used: 0 (0 B)
Cache Remaining: 32000000 (30.52 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Mon Aug 17 05:08:10 CST 2020
Last Block Report: Mon Aug 17 01:43:05 CST 2020

Dead datanodes (1):

Name: 172.200.6.104:50010 (hadoop104.yinzhengjie.com)
Hostname: hadoop104.yinzhengjie.com
Rack: /rack002
Decommission Status : Normal
Configured Capacity: 8246979788800 (7.50 TB)
DFS Used: 395776000 (377.44 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 8246584012800 (7.50 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Mon Aug 17 04:02:57 CST 2020
Last Block Report: Mon Aug 17 01:43:05 CST 2020

[root@hadoop101.yinzhengjie.com ~]#
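For monitoring scripts it can be handy to pull a few counters out of the report text. The sketch below is one possible way to do that, assuming the output layout matches the sample in this section (the layout may differ between Hadoop versions). The function name summarize_report and the shortened SAMPLE_REPORT are illustrative, not part of Hadoop; the sample values are copied from the report above.

```python
import re
from typing import Dict

# Shortened excerpt of the report shown in this section.
SAMPLE_REPORT = """\
Configured Capacity: 16493959577600 (15.00 TB)
Present Capacity: 16493959577600 (15.00 TB)
DFS Remaining: 16493167906816 (15.00 TB)
DFS Used: 791670784 (755.00 MB)
DFS Used%: 0.00%
Under replicated blocks: 16
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 172.200.6.102:50010 (hadoop102.yinzhengjie.com)
DFS Used: 395841536 (377.50 MB)
Dead datanodes (1):
Name: 172.200.6.104:50010 (hadoop104.yinzhengjie.com)
"""

CLUSTER_KEYS = (
    "Configured Capacity",
    "DFS Remaining",
    "DFS Used",
    "Under replicated blocks",
    "Missing blocks",
)


def summarize_report(text: str) -> Dict[str, int]:
    """Extract cluster-wide integer counters and live/dead DataNode counts."""
    summary: Dict[str, int] = {}
    # The cluster-wide section ends at the dashed separator line,
    # so per-DataNode values are not mixed into the totals.
    cluster_part = re.split(r"^-{5,}$", text, maxsplit=1, flags=re.M)[0]
    for key in CLUSTER_KEYS:
        match = re.search(rf"^{re.escape(key)}:\s*(\d+)", cluster_part, flags=re.M)
        if match:
            summary[key] = int(match.group(1))
    for state in ("Live", "Dead"):
        match = re.search(rf"^{state} datanodes \((\d+)\):", text, flags=re.M)
        summary[f"{state} datanodes"] = int(match.group(1)) if match else 0
    return summary


if __name__ == "__main__":
    for name, value in summarize_report(SAMPLE_REPORT).items():
        print(f"{name}: {value}")
```

In practice the text would come from running hdfs dfsadmin -report (for example through subprocess.run with capture_output=True) instead of the embedded sample.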
3. The dfsadmin -refreshNodes Command
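[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -refreshNodes        #Updates the list of DataNodes that are allowed to connect to the NameNode.
Refresh nodes successful
[root@hadoop101.yinzhengjie.com ~]#

Tips:
    dfs.hosts:
        Names a file that contains the list of hosts permitted to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted.
    dfs.hosts.exclude:
        Names a file that contains the list of hosts that are not permitted to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded.
    The NameNode reads the host names from the files that the dfs.hosts and dfs.hosts.exclude configuration parameters in hdfs-site.xml point to.
    The dfs.hosts file lists all hosts that are allowed to register with the NameNode. The dfs.hosts.exclude file lists all DataNodes that need to be decommissioned (a node is decommissioned once all of its replicas have been copied to other DataNodes).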
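The decommissioning flow described above comes down to two steps: add the host to the file named by dfs.hosts.exclude, then run hdfs dfsadmin -refreshNodes so the NameNode re-reads it. A minimal sketch of the first step follows. The helper name add_to_exclude_file and the file name dfs.exclude are hypothetical; the real path is whatever dfs.hosts.exclude is set to in your hdfs-site.xml. The demo writes to a temporary directory and does not touch a cluster.

```python
import tempfile
from pathlib import Path
from typing import List

# Command to run afterwards on a host with the hdfs client (not executed here).
REFRESH_COMMAND = ["hdfs", "dfsadmin", "-refreshNodes"]


def add_to_exclude_file(exclude_path: Path, hostname: str) -> bool:
    """Append hostname to the exclude file unless it is already listed.

    Returns True if the file was changed, False if the host was already present.
    """
    existing: List[str] = []
    if exclude_path.exists():
        existing = [
            line.strip()
            for line in exclude_path.read_text().splitlines()
            if line.strip()
        ]
    if hostname in existing:
        return False
    existing.append(hostname)
    # One host name per line, with a trailing newline.
    exclude_path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "dfs.exclude"  # hypothetical file name
        changed = add_to_exclude_file(path, "hadoop104.yinzhengjie.com")
        print("exclude file changed:", changed)
        print("next step:", " ".join(REFRESH_COMMAND))
```

Skipping hosts that are already listed keeps the helper safe to call repeatedly, which is convenient when it is driven by automation.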
4. The dfsadmin -metasave Command
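The dfsadmin -metasave command provides more information than the dfsadmin -report command. You can use it to obtain a variety of block-related information, for example:
    (1) The total number of blocks
    (2) DataNode heartbeat information (such as the Live Datanodes and Dead Datanodes counts)
    (3) Blocks waiting to be replicated
    (4) Blocks currently being replicated
    (5) Blocks waiting to be deleted

See the example below for how to use it.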
[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -help metasave
-metasave <filename>:     Save Namenode's primary data structures
        to <filename> in the directory specified by hadoop.log.dir property.
        <filename> is overwritten if it exists.
        <filename> will contain one line for each of the following
            1. Datanodes heart beating with Namenode
            2. Blocks waiting to be replicated
            3. Blocks currrently being replicated
            4. Blocks waiting to be deleted

[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# ll /yinzhengjie/softwares/hadoop/logs/
total 2316
-rw-r--r-- 1 root root 2271719 Aug 17 05:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 17 01:42 hadoop-root-namenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 16 12:09 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root     733 Aug 16 11:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.2
-rw-r--r-- 1 root root     733 Aug 14 19:01 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.3
-rw-r--r-- 1 root root     733 Aug 14 02:54 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.4
-rw-r--r-- 1 root root     733 Aug 13 18:40 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.5
-rw-r--r-- 1 root root   64372 Aug 12 15:50 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 12 15:49 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 12 14:57 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root       0 Aug 12 14:57 SecurityAuth-root.audit
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfsadmin -metasave meta.log        #Dumps block-related information. The file is written to the directory set by the hadoop.log.dir property, which here is the logs directory under the Hadoop installation directory.
Created metasave file meta.log in the log directory of namenode hdfs://hadoop101.yinzhengjie.com:9000
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# ll /yinzhengjie/softwares/hadoop/logs/
total 2320
-rw-r--r-- 1 root root 2271719 Aug 17 05:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 17 01:42 hadoop-root-namenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 16 12:09 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root     733 Aug 16 11:44 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.2
-rw-r--r-- 1 root root     733 Aug 14 19:01 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.3
-rw-r--r-- 1 root root     733 Aug 14 02:54 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.4
-rw-r--r-- 1 root root     733 Aug 13 18:40 hadoop-root-namenode-hadoop101.yinzhengjie.com.out.5
-rw-r--r-- 1 root root   64372 Aug 12 15:50 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.log
-rw-r--r-- 1 root root     733 Aug 12 15:49 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out
-rw-r--r-- 1 root root     733 Aug 12 14:57 hadoop-root-secondarynamenode-hadoop101.yinzhengjie.com.out.1
-rw-r--r-- 1 root root    3500 Aug 17 05:57 meta.log        #This is the file name given above; its contents can be viewed with any text viewer or editor.
-rw-r--r-- 1 root root       0 Aug 12 14:57 SecurityAuth-root.audit
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# cat /yinzhengjie/softwares/hadoop/logs/meta.log        #View the metadata dump saved above
49 files and directories, 27 blocks = 76 total
Live Datanodes: 2
Dead Datanodes: 1
Metasave: Blocks waiting for reconstruction: 16
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Debuginfo.repo: blk_1073741855_1031 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/hadoop-2.10.0.tar.gz: blk_1073741850_1026 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200814193733/fstab: blk_1073741835_1011 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/epel-testing.repo: blk_1073741860_1036 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/hostname: blk_1073741851_1027 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200814193733/sysctl.conf: blk_1073741836_1012 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/epel.repo: blk_1073741861_1037 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Media.repo: blk_1073741856_1032 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-CR.repo: blk_1073741854_1030 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-fasttrack.repo: blk_1073741859_1035 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/hosts2020: blk_1073741862_1038 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.102:50010 : 172.200.6.103:50010 :
/user/root/.Trash/200815080000/wc.txt.gz: blk_1073741848_1024 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/wc.txt.gz: blk_1073741852_1028 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.102:50010 : 172.200.6.103:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Sources.repo: blk_1073741857_1033 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Base.repo: blk_1073741853_1029 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.102:50010 : 172.200.6.103:50010 :
/user/root/.Trash/200815080000/yinzhengjie2020/yum.repos.d/CentOS-Vault.repo: blk_1073741858_1034 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.102:50010 : 172.200.6.103:50010 :
Metasave: Blocks currently missing: 0
Mis-replicated blocks that have been postponed:
Metasave: Blocks being replicated: 0
Metasave: Blocks 0 waiting deletion from 0 datanodes.
Corrupt Blocks:
Metasave: Number of datanodes: 3
172.200.6.104:50010 /rack002 IN 8246979788800(7.50 TB) 395776000(377.44 MB) 0.00% 8246584012800(7.50 TB) 0(0 B) 0(0 B) 100.00% 0(0 B) Mon Aug 17 04:02:57 CST 2020
172.200.6.102:50010 /rack001 IN 8246979788800(7.50 TB) 395841536(377.50 MB) 0.00% 8246583947264(7.50 TB) 32000000(30.52 MB) 319488(312 KB) 1.00% 31680512(30.21 MB) Mon Aug 17 05:57:29 CST 2020
172.200.6.103:50010 /rack002 IN 8246979788800(7.50 TB) 395829248(377.49 MB) 0.00% 8246583959552(7.50 TB) 32000000(30.52 MB) 0(0 B) 0.00% 32000000(30.52 MB) Mon Aug 17 05:57:29 CST 2020
[root@hadoop101.yinzhengjie.com ~]#
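The per-block lines in the metasave file follow a regular pattern: file path, block ID, replica counts, then the DataNodes holding a replica. The sketch below shows one way to pull those lines into structured records, assuming the layout matches the sample in this section (other Hadoop versions may format the file differently). The names BlockEntry and parse_metasave_blocks are illustrative, and the "l" counter is read here as the number of live replicas, which is an interpretation based on the sample (two locations listed, l: 2).

```python
import re
from typing import List, NamedTuple


class BlockEntry(NamedTuple):
    path: str             # HDFS file the block belongs to
    block: str            # block ID with generation stamp, e.g. blk_1073741855_1031
    live: int             # value of the "l:" counter (assumed: live replicas)
    locations: List[str]  # DataNode address:port pairs listed for the block


ENTRY_RE = re.compile(
    r"^(?P<path>/.+?): (?P<block>blk_\d+_\d+) "
    r"\(replicas: l: (?P<live>\d+) d: \d+ c: \d+ e: \d+\)(?P<rest>.*)$"
)
LOCATION_RE = re.compile(r"\d+\.\d+\.\d+\.\d+:\d+")

# Two entries copied from the meta.log shown in this section.
SAMPLE_METASAVE = """\
Metasave: Blocks waiting for reconstruction: 16
/user/root/.Trash/200815080000/hostname: blk_1073741851_1027 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.103:50010 : 172.200.6.102:50010 :
/user/root/.Trash/200815080000/hosts2020: blk_1073741862_1038 (replicas: l: 2 d: 0 c: 0 e: 0) 172.200.6.102:50010 : 172.200.6.103:50010 :
Metasave: Blocks currently missing: 0
"""


def parse_metasave_blocks(text: str) -> List[BlockEntry]:
    """Return one BlockEntry per block line; all other lines are ignored."""
    entries: List[BlockEntry] = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            entries.append(
                BlockEntry(
                    path=match.group("path"),
                    block=match.group("block"),
                    live=int(match.group("live")),
                    locations=LOCATION_RE.findall(match.group("rest")),
                )
            )
    return entries


if __name__ == "__main__":
    for entry in parse_metasave_blocks(SAMPLE_METASAVE):
        print(entry.block, entry.live, entry.path)
```

A report built from these records could, for instance, list the files whose blocks have fewer replicas than the configured replication factor.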
5. Managing HDFS Space Quotas
Recommended reading from the author: https://www.cnblogs.com/yinzhengjie2020/p/13334148.html