Managing HDFS with the "hdfs dfs" Utility
Author: Yin Zhengjie
Copyright notice: this is an original work. Reproduction without permission is prohibited and will be pursued under the law.
1. The command line is the most common way to manage HDFS storage
Working with HDFS is one of the most common Hadoop administration tasks. Although HDFS can be accessed in many ways, the command line is the most common way to manage HDFS storage. HDFS can be accessed through any of the following:

(1) the Java API;
(2) the command line, with simple Linux-like file system commands (command-line access to HDFS is essentially the official wrapper around the Java API);
(3) the NameNode's Web UI;
(4) the Web interface known as WebHDFS;
(5) the HttpFS gateway, which provides access to HDFS across a firewall;
(6) Hue's file browser.

Although there are many ways to access HDFS, most of the time you will manage HDFS files and directories from the command line, using the hdfs dfs file system commands.

Tip:
It is important to remember that HDFS is only one of the file system implementations that Hadoop can work with. Several other Java-based file systems also work on Hadoop, including the local file system (file), WebHDFS, HAR (Hadoop archive files), View (viewfs), and S3. For each file system, Hadoop uses a different URI scheme for the file system instance in order to connect to it. For example, use the file URI scheme to list files on the local file system, as shown below (this returns a listing of files stored on the local Linux file system).

[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls file:///
Found 20 items
dr-xr-xr-x   - root root      20480 2020-08-12 10:28 file:///bin
dr-xr-xr-x   - root root       4096 2020-01-20 04:21 file:///boot
drwxr-xr-x   - root root       3260 2020-08-14 00:06 file:///dev
drwxr-xr-x   - root root       8192 2020-08-12 10:28 file:///etc
drwxr-xr-x   - root root          6 2018-04-11 12:59 file:///home
dr-xr-xr-x   - root root       4096 2020-01-20 05:29 file:///lib
dr-xr-xr-x   - root root      24576 2020-08-12 10:28 file:///lib64
drwxr-xr-x   - root root          6 2018-04-11 12:59 file:///media
drwxr-xr-x   - root root          6 2018-04-11 12:59 file:///mnt
drwxr-xr-x   - root root          6 2018-04-11 12:59 file:///opt
dr-xr-xr-x   - root root          0 2020-08-14 00:05 file:///proc
dr-xr-x---   - root root        196 2020-08-14 03:11 file:///root
drwxr-xr-x   - root root        600 2020-08-14 00:06 file:///run
dr-xr-xr-x   - root root      12288 2020-08-06 06:33 file:///sbin
drwxr-xr-x   - root root          6 2018-04-11 12:59 file:///srv
dr-xr-xr-x   - root root          0 2020-08-14 00:05 file:///sys
drwxrwxrwt   - root root       4096 2020-08-14 03:32 file:///tmp
drwxr-xr-x   - root root        167 2020-01-21 01:45 file:///usr
drwxr-xr-x   - root root        267 2020-01-20 04:22 file:///var
drwxr-xr-x   - root root         35 2020-08-11 21:39 file:///yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
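Of the access methods above, WebHDFS is a plain REST API: every HDFS path is exposed over HTTP under the /webhdfs/v1 prefix, with the operation passed as the op query parameter. The sketch below only builds such a URL; the port is an assumption (the NameNode HTTP port is typically 50070 on Hadoop 2 and 9870 on Hadoop 3), and no cluster is contacted.

```python
def webhdfs_url(namenode_host, path, op, port=9870):
    """Build a WebHDFS REST URL for an HDFS path and operation.

    WebHDFS serves HDFS over HTTP under the /webhdfs/v1 prefix; the
    operation (e.g. LISTSTATUS, GETFILESTATUS, OPEN) goes in the 'op'
    query parameter. The port here is an assumption, not a constant.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS paths must be absolute")
    return f"http://{namenode_host}:{port}/webhdfs/v1{path}?op={op}"

# e.g. the equivalent of 'hdfs dfs -ls /hosts' as a REST call:
url = webhdfs_url("hadoop101.yinzhengjie.com", "/hosts", "LISTSTATUS")
```

Fetching that URL (with curl or any HTTP client) returns the file status as JSON, which is what makes WebHDFS usable from outside the Java world.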
2. Overview of the hdfs dfs command
Several Linux file and directory commands have counterparts in HDFS, such as ls, cp, and mv. However, one big difference between Linux file system commands and HDFS file system commands is that HDFS has no commands relating to your position in the directory tree: there is no pwd command or cd command in HDFS. As shown below, you can run HDFS commands in Hadoop with the "hdfs dfs" utility; the following sections demonstrate how to use it.
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs
Usage: hadoop fs [generic options]
        [-appendToFile <localsrc> ... <dst>]
        [-cat [-ignoreCrc] <src> ...]
        [-checksum <src> ...]
        [-chgrp [-R] GROUP PATH...]
        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
        [-chown [-R] [OWNER][:[GROUP]] PATH...]
        [-copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
        [-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] <path> ...]
        [-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
        [-createSnapshot <snapshotDir> [<snapshotName>]]
        [-deleteSnapshot <snapshotDir> <snapshotName>]
        [-df [-h] [<path> ...]]
        [-du [-s] [-h] [-x] <path> ...]
        [-expunge]
        [-find <path> ... <expression> ...]
        [-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-getfacl [-R] <path>]
        [-getfattr [-R] {-n name | -d} [-e en] <path>]
        [-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
        [-help [cmd ...]]
        [-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]]
        [-mkdir [-p] <path> ...]
        [-moveFromLocal <localsrc> ... <dst>]
        [-moveToLocal <src> <localdst>]
        [-mv <src> ... <dst>]
        [-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
        [-renameSnapshot <snapshotDir> <oldName> <newName>]
        [-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
        [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
        [-setfattr {-n name [-v value] | -x name} <path>]
        [-setrep [-R] [-w] <rep> <path> ...]
        [-stat [format] <path> ...]
        [-tail [-f] <file>]
        [-test -[defsz] <path>]
        [-text [-ignoreCrc] <src> ...]
        [-touchz <path> ...]
        [-truncate [-w] <length> <path> ...]
        [-usage [cmd ...]]

Generic options supported are:
-conf <configuration file>           specify an application configuration file
-D <property=value>                  define a value for a given property
-fs <file:///|hdfs://namenode:port>  specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>     specify a ResourceManager
-files <file1,...>                   specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>                  specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>             specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
[root@hadoop101.yinzhengjie.com ~]#
3. Hands-on examples with hdfs dfs
1>. Viewing the help for a given command
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -usage ls        # Show how to use the ls command; this help is terse compared with "-help".
Usage: hadoop fs [generic options] -ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help ls        # Show how to use the ls command; this help is more detailed than "-usage".
-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...] :
  List the contents that match the specified file pattern. If path is not
  specified, the contents of /user/<currentUser> will be listed. For a directory a
  list of its direct children is returned (unless -d option is specified).

  Directory entries are of the form:
        permissions - userId groupId sizeOfDirectory(in bytes)
  modificationDate(yyyy-MM-dd HH:mm) directoryName

  and file entries are of the form:
        permissions numberOfReplicas userId groupId sizeOfFile(in bytes)
  modificationDate(yyyy-MM-dd HH:mm) fileName

    -C  Display the paths of files and directories only.
    -d  Directories are listed as plain files.
    -h  Formats the sizes of files in a human-readable fashion rather than a
        number of bytes.
    -q  Print ? instead of non-printable characters.
    -R  Recursively list the contents of directories.
    -t  Sort files by modification time (most recent first).
    -S  Sort files by size.
    -r  Reverse the order of the sort.
    -u  Use time of last access instead of modification for display and sorting.
[root@hadoop101.yinzhengjie.com ~]#
2>. Listing files and directories
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /        # List only the files and directories under the HDFS root path (note that a newly built cluster has no files or directories by default)
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -R /        # Recursively list the files and directories under the HDFS root path
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie/data
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie/data/hadoop
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie/data/hadoop/hdfs
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie/yum.repos.d
-rw-r--r--   3 root admingroup       1664 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Base.repo
-rw-r--r--   3 root admingroup       1309 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-CR.repo
-rw-r--r--   3 root admingroup        649 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo
-rw-r--r--   3 root admingroup        630 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Media.repo
-rw-r--r--   3 root admingroup       1331 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Sources.repo
-rw-r--r--   3 root admingroup       5701 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Vault.repo
-rw-r--r--   3 root admingroup        314 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-fasttrack.repo
-rw-r--r--   3 root admingroup       1050 2020-08-14 07:13 /yinzhengjie/yum.repos.d/epel-testing.repo
-rw-r--r--   3 root admingroup        951 2020-08-14 07:13 /yinzhengjie/yum.repos.d/epel.repo
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls hdfs://hadoop101.yinzhengjie.com:9000/        # Specify an HDFS URI when listing files
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 hdfs://hadoop101.yinzhengjie.com:9000/bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 hdfs://hadoop101.yinzhengjie.com:9000/hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/data/ /yinzhengjie/yum.repos.d        # Several files or directories can be listed at once
Found 1 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie/data/hadoop
Found 9 items
-rw-r--r--   3 root admingroup       1664 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Base.repo
-rw-r--r--   3 root admingroup       1309 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-CR.repo
-rw-r--r--   3 root admingroup        649 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo
-rw-r--r--   3 root admingroup        630 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Media.repo
-rw-r--r--   3 root admingroup       1331 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Sources.repo
-rw-r--r--   3 root admingroup       5701 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Vault.repo
-rw-r--r--   3 root admingroup        314 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-fasttrack.repo
-rw-r--r--   3 root admingroup       1050 2020-08-14 07:13 /yinzhengjie/yum.repos.d/epel-testing.repo
-rw-r--r--   3 root admingroup        951 2020-08-14 07:13 /yinzhengjie/yum.repos.d/epel.repo
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /        # Note the file details below: when listing files, each file's replication factor is shown; clearly, the default replication factor is 3.
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -d /yinzhengjie/ /bigdata        # Show information about the directories themselves
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls file:///root        # List the files and directories under "/root" on the local file system
Found 14 items
drwx------   - root root         27 2020-08-12 10:47 file:///root/.ansible
-rw-------   1 root root      22708 2020-08-14 21:46 file:///root/.bash_history
-rw-r--r--   1 root root         18 2013-12-29 10:26 file:///root/.bash_logout
-rw-r--r--   1 root root        176 2013-12-29 10:26 file:///root/.bash_profile
-rw-r--r--   1 root root        176 2013-12-29 10:26 file:///root/.bashrc
-rw-r--r--   1 root root        100 2013-12-29 10:26 file:///root/.cshrc
drwxr-----   - root root         19 2020-08-12 10:27 file:///root/.pki
drwx------   - root root         80 2020-08-12 10:39 file:///root/.ssh
-rw-r--r--   1 root root        129 2013-12-29 10:26 file:///root/.tcshrc
-rw-------   1 root root      11632 2020-08-14 23:11 file:///root/.viminfo
-rw-r--r--   1 root root  392115733 2020-08-10 15:42 file:///root/hadoop-2.10.0.tar.gz
-rw-r--r--   1 root root         26 2020-08-14 23:42 file:///root/hostname
-rw-r--r--   1 root root        371 2020-08-14 23:41 file:///root/hosts
-rw-r--r--   1 root root        397 2020-08-14 23:44 file:///root/res.log
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls hdfs:///        # List the files and directories under "/" on the HDFS file system
Found 3 items
--w-------   2 jason yinzhengjie        371 2020-08-14 21:42 hdfs:///hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 hdfs:///user
drwxr-xr-x   - root  admingroup           0 2020-08-14 23:22 hdfs:///yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /        # As you can see, the "hdfs://" scheme is used by default
Found 3 items
--w-------   2 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
drwxr-xr-x   - root  admingroup           0 2020-08-14 23:22 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
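When scripting against these listings, it helps to split each entry into its fields. Below is a minimal sketch that parses one line of "hdfs dfs -ls" output using the field layout documented in the -help text (permissions, replicas, owner, group, size, date, time, name); the function name and dict keys are illustrative, not part of any Hadoop API.

```python
def parse_ls_line(line):
    """Split one 'hdfs dfs -ls' entry into its fields.

    Per the -help ls text, file entries are:
      permissions numberOfReplicas userId groupId sizeOfFile date time name
    Directory entries carry '-' in the replicas column.
    """
    perms, replicas, owner, group, size, date, time, name = line.split(None, 7)
    return {
        "permissions": perms,
        "is_dir": perms.startswith("d"),
        # directories have no replication factor, shown as '-'
        "replication": None if replicas == "-" else int(replicas),
        "owner": owner,
        "group": group,
        "size": int(size),
        "modified": f"{date} {time}",
        "path": name,
    }

entry = parse_ls_line("-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts")
```

This is enough to, say, filter a recursive listing for files whose replication factor differs from the cluster default.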
3>. Getting detailed information about a file
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help stat        # Show the help for the stat command
-stat [format] <path> ... :
  Print statistics about the file/directory at <path>
  in the specified format. Format accepts permissions in
  octal (%a) and symbolic (%A), filesize in
  bytes (%b), type (%F), group name of owner (%g),
  name (%n), block size (%o), replication (%r), user name
  of owner (%u), access date (%x, %X).
  modification date (%y, %Y).
  %x and %y show UTC date as "yyyy-MM-dd HH:mm:ss" and
  %X and %Y show milliseconds since January 1, 1970 UTC.
  If the format is not specified, %y is used by default.
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -stat "%a | %A | %b | %F | %g | %n | %o | %r | %u | %y | %Y" /hosts        # Get detailed information about a file
644 | rw-r--r-- | 371 | regular file | admingroup | hosts | 536870912 | 3 | root | 2020-08-13 23:09:55 | 1597360195058
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -stat "%a | %A | %b | %F | %g | %n | %o | %r | %u | %y | %Y" /yinzhengjie/        # If you run "stat" on a directory, it reports that the path is indeed a directory.
755 | rwxr-xr-x | 0 | directory | admingroup | yinzhengjie | 0 | 0 | root | 2020-08-13 23:13:11 | 1597360391417
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -stat "%a %A %b %F %g %n %o %r %u %y %Y" /bigdata
755 rwxr-xr-x 0 directory admingroup bigdata 0 0 root 2020-08-13 23:08:33 1597360113877
[root@hadoop101.yinzhengjie.com ~]#
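For reference, the format specifiers used above are collected in the table below, taken from the -stat help text; the toy renderer is only illustrative. Note how the %o column in the /hosts output is 536870912 bytes, i.e. a 512 MB block size.

```python
# Format specifiers accepted by 'hdfs dfs -stat', per its help text.
STAT_SPECIFIERS = {
    "%a": "permissions in octal",
    "%A": "permissions in symbolic form",
    "%b": "file size in bytes",
    "%F": "type (regular file / directory)",
    "%g": "group name of owner",
    "%n": "name",
    "%o": "block size",
    "%r": "replication",
    "%u": "user name of owner",
    "%y": "modification date (yyyy-MM-dd HH:mm:ss, UTC)",
    "%Y": "modification time (milliseconds since the epoch, UTC)",
}

def render_stat(fmt, values):
    """Toy renderer: substitute known specifier values into a format string."""
    for spec, value in values.items():
        fmt = fmt.replace(spec, str(value))
    return fmt

# The block size seen in the /hosts example is exactly 512 MB:
assert 536870912 == 512 * 1024 * 1024
```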
4>. Creating directories
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help mkdir
-mkdir [-p] <path> ... :
  Create a directory in specified location.

  -p  Do not fail if the directory already exists
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -mkdir /test        # Create a "test" directory under the root path
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -mkdir -p /test2/sub1/sub2        # Create a directory tree recursively
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 5 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -R /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2/sub1
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2/sub1/sub2
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 5 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -mkdir hdfs://hadoop101.yinzhengjie.com:9000/test3        # A directory can also be created by specifying a full URI
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 6 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:09 /test3
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 6 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:09 /test3
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -mkdir /dir001 /dir002 /dir003        # Multiple directories can be created at once by passing several space-separated paths.
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 9 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
drwxr-xr-x   - root admingroup          0 2020-08-14 19:10 /dir001
drwxr-xr-x   - root admingroup          0 2020-08-14 19:10 /dir002
drwxr-xr-x   - root admingroup          0 2020-08-14 19:10 /dir003
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:09 /test3
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
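The "-p" flag above creates every missing ancestor, outermost first, and never fails on an existing directory. A small sketch of just that path-expansion semantics (pure string handling, no HDFS involved):

```python
def parents_to_create(path):
    """List the directories 'mkdir -p <path>' would ensure exist,
    outermost ancestor first.

    This only models the path expansion; the real command also skips
    any ancestor that already exists instead of failing.
    """
    parts = path.strip("/").split("/")
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]

# Mirrors the '/test2/sub1/sub2' example above.
print(parents_to_create("/test2/sub1/sub2"))
```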
5>. Creating files
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help touchz
-touchz <path> ... :
  Creates a file of zero length at <path> with current time as the timestamp of
  that <path>. An error is returned if the file exists with non-zero length
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - root  admingroup           0 2020-08-14 07:08 /bigdata
--w-------   3 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -touchz /hdfs.log        # Create an empty file
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root  admingroup           0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root  admingroup           0 2020-08-14 22:58 /hdfs.log
--w-------   3 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
[root@hadoop101.yinzhengjie.com ~]#
6>. Deleting files and directories
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help rmr        # rmr has been officially deprecated; use "rm -r" instead
-rmr :
  (DEPRECATED) Same as '-rm -r'
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help rm
-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ... :
  Delete all files that match the specified file pattern. Equivalent to the Unix
  command "rm <src>"

  -f          If the file does not exist, do not display a diagnostic message or
              modify the exit status to reflect an error.
  -[rR]       Recursively deletes directories.
  -skipTrash  option bypasses trash, if enabled, and immediately deletes <src>.
  -safely     option requires safety confirmation, if enabled, requires
              confirmation before deleting large directory with more than
              <hadoop.shell.delete.limit.num.files> files. Delay is expected when
              walking over large directory recursively to count the number of
              files to be deleted before the confirmation.
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 6 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:09 /test3
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rmdir /test3        # Delete a single empty directory
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 5 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 9 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
drwxr-xr-x   - root admingroup          0 2020-08-14 19:10 /dir001
drwxr-xr-x   - root admingroup          0 2020-08-14 19:10 /dir002
drwxr-xr-x   - root admingroup          0 2020-08-14 19:10 /dir003
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:09 /test3
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rmdir /dir001 /dir002 /dir003        # Delete several empty directories at once
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 6 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwxr-xr-x   - root admingroup          0 2020-08-14 19:09 /test3
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie/data
drwxr-xr-x   - root admingroup          0 2020-08-14 07:13 /yinzhengjie/yum.repos.d
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/yum.repos.d
Found 9 items
-rw-r--r--   3 root admingroup       1664 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Base.repo
-rw-r--r--   3 root admingroup       1309 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-CR.repo
-rw-r--r--   3 root admingroup        649 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo
-rw-r--r--   3 root admingroup        630 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Media.repo
-rw-r--r--   3 root admingroup       1331 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Sources.repo
-rw-r--r--   3 root admingroup       5701 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-Vault.repo
-rw-r--r--   3 root admingroup        314 2020-08-14 07:13 /yinzhengjie/yum.repos.d/CentOS-fasttrack.repo
-rw-r--r--   3 root admingroup       1050 2020-08-14 07:13 /yinzhengjie/yum.repos.d/epel-testing.repo
-rw-r--r--   3 root admingroup        951 2020-08-14 07:13 /yinzhengjie/yum.repos.d/epel.repo
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rm -R /yinzhengjie/yum.repos.d        # Recursively delete a non-empty directory; if the trash is enabled, the data is not deleted immediately but moved into the trash
20/08/14 19:19:46 INFO fs.TrashPolicyDefault: Moved: 'hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/yum.repos.d' to trash at: hdfs://hadoop101.yinzhengjie.com:9000/user/root/.Trash/Current/yinzhengjie/yum.repos.d
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 1 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:07 /yinzhengjie/data
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /user/root/.Trash/Current/yinzhengjie/yum.repos.d        # As the message above shows, it is easy to see that the data was moved to the corresponding path in the trash
Found 9 items
-rw-r--r--   3 root admingroup       1664 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-Base.repo
-rw-r--r--   3 root admingroup       1309 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-CR.repo
-rw-r--r--   3 root admingroup        649 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo
-rw-r--r--   3 root admingroup        630 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-Media.repo
-rw-r--r--   3 root admingroup       1331 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-Sources.repo
-rw-r--r--   3 root admingroup       5701 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-Vault.repo
-rw-r--r--   3 root admingroup        314 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/CentOS-fasttrack.repo
-rw-r--r--   3 root admingroup       1050 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/epel-testing.repo
-rw-r--r--   3 root admingroup        951 2020-08-14 07:13 /user/root/.Trash/Current/yinzhengjie/yum.repos.d/epel.repo
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rm -R /user/root/.Trash/Current/yinzhengjie/yum.repos.d        # Only at this point is the directory's data truly deleted
Deleted /user/root/.Trash/Current/yinzhengjie/yum.repos.d
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 6 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /hosts
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-14 19:19 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rm /hosts        # Delete a file; if the trash is enabled, deleting a file simply moves it into the trash
20/08/14 19:26:07 INFO fs.TrashPolicyDefault: Moved: 'hdfs://hadoop101.yinzhengjie.com:9000/hosts' to trash at: hdfs://hadoop101.yinzhengjie.com:9000/user/root/.Trash/Current/hosts
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 5 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
drwxr-xr-x   - root admingroup          0 2020-08-14 19:03 /test
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /test2
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-14 19:19 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /user/root/.Trash/Current/hosts
-rw-r--r--   3 root admingroup        371 2020-08-14 07:09 /user/root/.Trash/Current/hosts
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rm /user/root/.Trash/Current/hosts
Deleted /user/root/.Trash/Current/hosts
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root admingroup        540 2020-08-14 19:33 /limits.conf
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-14 19:19 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -rm -skipTrash /limits.conf        # Even though the trash is enabled, the "-skipTrash" option bypasses the HDFS trash and deletes the specified files or directories immediately.
Deleted /limits.conf
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - root admingroup          0 2020-08-14 07:08 /bigdata
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-14 19:19 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
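As the TrashPolicyDefault log lines above show, a trashed item keeps its original path, relocated under the deleting user's per-user trash directory. That mapping is simple enough to express directly (a sketch of the default trash layout seen in the logs, not a Hadoop API):

```python
def trash_path(deleted_path, user):
    """Where 'hdfs dfs -rm' moves a path when the trash is enabled.

    Per the log messages, the default trash policy relocates the item
    under /user/<user>/.Trash/Current, preserving its original path.
    """
    if not deleted_path.startswith("/"):
        raise ValueError("expected an absolute HDFS path")
    return f"/user/{user}/.Trash/Current{deleted_path}"
```

This is handy when scripting a "restore from trash": the restore is just a -mv from the computed trash path back to the original path.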
7>. Emptying the trash
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help expunge
-expunge :
  Delete files from the trash that are older than the retention threshold
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /user/root/.Trash/Current        # View the current contents of the trash
Found 4 items
-rw-r--r--   3 root admingroup        490 2020-08-14 19:31 /user/root/.Trash/Current/fstab
-rw-r--r--   3 root admingroup      10779 2020-08-14 19:32 /user/root/.Trash/Current/sysctl.conf
drwxr-xr-x   - root admingroup          0 2020-08-14 19:04 /user/root/.Trash/Current/test2
drwx------   - root admingroup          0 2020-08-14 19:21 /user/root/.Trash/Current/yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -expunge        # This command deletes everything older than the configured retention interval, i.e. it empties the trash.
20/08/14 19:37:33 INFO fs.TrashPolicyDefault: TrashPolicyDefault#deleteCheckpoint for trashRoot: hdfs://hadoop101.yinzhengjie.com:9000/user/root/.Trash
20/08/14 19:37:33 INFO fs.TrashPolicyDefault: TrashPolicyDefault#deleteCheckpoint for trashRoot: hdfs://hadoop101.yinzhengjie.com:9000/user/root/.Trash
20/08/14 19:37:33 INFO fs.TrashPolicyDefault: TrashPolicyDefault#createCheckpoint for trashRoot: hdfs://hadoop101.yinzhengjie.com:9000/user/root/.Trash
20/08/14 19:37:33 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/root/.Trash/200814193733
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /user/root/.Trash/Current        # As you can see, this directory has been removed
ls: `/user/root/.Trash/Current': No such file or directory
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /user/root/.Trash/
Found 1 items
drwx------   - root admingroup          0 2020-08-14 19:32 /user/root/.Trash/200814193733
[root@hadoop101.yinzhengjie.com ~]#
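Note what -expunge actually did in the transcript: Current was rolled into a checkpoint directory named after the moment it was created (200814193733 for 2020-08-14 19:37:33); checkpoints older than the retention threshold are the ones deleted. A sketch of that timestamp naming, inferred from the log line above:

```python
from datetime import datetime

def checkpoint_name(when):
    """Name of the trash checkpoint directory created by -expunge,
    as seen in the 'Created trash checkpoint' log line above:
    a yyMMddHHmmss timestamp."""
    return when.strftime("%y%m%d%H%M%S")

# The checkpoint from the transcript, created at 2020-08-14 19:37:33:
print(checkpoint_name(datetime(2020, 8, 14, 19, 37, 33)))
```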
8>. Renaming files or directories
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help mv
-mv <src> ... <dst> :
  Move files that match the specified file pattern <src> to a destination <dst>.
  When moving multiple files, the destination must be a directory.
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root  admingroup           0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root  admingroup           0 2020-08-14 22:58 /hdfs.log
--w-------   3 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -mv /hdfs.log /hdfs2020.log        # Rename the file "hdfs.log" to "hdfs2020.log"
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root  admingroup           0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root  admingroup           0 2020-08-14 22:58 /hdfs2020.log
--w-------   3 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root  admingroup           0 2020-08-14 07:08 /bigdata
-rw-r--r--   3 root  admingroup           0 2020-08-14 22:58 /hdfs2020.log
--w-------   3 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -mv /bigdata /yinzhengjie        # Rename the directory "/bigdata" to "/yinzhengjie"
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
-rw-r--r--   3 root  admingroup           0 2020-08-14 22:58 /hdfs2020.log
--w-------   3 jason yinzhengjie        371 2020-08-14 21:42 /hosts
drwx------   - root  admingroup           0 2020-08-14 19:19 /user
drwxr-xr-x   - root  admingroup           0 2020-08-14 07:08 /yinzhengjie
[root@hadoop101.yinzhengjie.com ~]#
9>. Copying files or directories
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help cp
-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst> :
  Copy files that match the file pattern <src> to a destination. When copying
  multiple files, the destination must be a directory. Passing -p preserves status
  [topax] (timestamps, ownership, permission, ACLs, XAttr). If -p is specified
  with no <arg>, then preserves timestamps, ownership, permission. If -pa is
  specified, then preserves permission also because ACL is a super-set of
  permission. Passing -f overwrites the destination if it already exists. raw
  namespace extended attributes are preserved if (1) they are supported (HDFS
  only) and, (2) all of the source and target pathnames are in the /.reserved/raw
  hierarchy. raw namespace xattr preservation is determined solely by the presence
  (or absence) of the /.reserved/raw prefix and not by the -p option. Passing -d
  will skip creation of temporary file(<dst>._COPYING_).
[root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 6 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz -rw-r--r-- 3 root admingroup 26 2020-08-14 23:42 /hostname --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -cp /yinzhengjie/ /yinzhengjie2020 #在HDFS集群上拷贝目录 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 7 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz -rw-r--r-- 3 root admingroup 26 2020-08-14 23:42 /hostname --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie drwxr-xr-x - root admingroup 0 2020-08-14 23:48 /yinzhengjie2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 7 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz -rw-r--r-- 3 root admingroup 26 2020-08-14 23:42 /hostname --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie drwxr-xr-x - root admingroup 0 2020-08-14 23:48 /yinzhengjie2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -cp /hosts /hosts2020 #拷贝HDFS的文件 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 8 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz -rw-r--r-- 3 root admingroup 26 2020-08-14 23:42 /hostname --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts -rw-r--r-- 3 root admingroup 371 2020-08-14 23:49 /hosts2020 drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie drwxr-xr-x - root admingroup 0 2020-08-14 23:48 /yinzhengjie2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
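在本地文件系统上可以用 Python 的 shutil 模块做一个假设性的类比,帮助理解 cp 默认不保留元数据、加 -p 后保留时间戳等属性的区别(仅为本地示意,并非 HDFS 的实现):

```python
import os
import shutil
import tempfile

# 假设性的本地类比:shutil.copy 只拷贝内容(类似默认的 hdfs dfs -cp),
# shutil.copy2 额外保留时间戳等元数据(类似 hdfs dfs -cp -p)。
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "hosts")
with open(src, "w") as f:
    f.write("127.0.0.1 localhost\n")
os.utime(src, (1000000000, 1000000000))  # 人为设定一个旧的修改时间

dst_plain = os.path.join(tmp, "hosts.copy")
dst_meta = os.path.join(tmp, "hosts.copy2")
shutil.copy(src, dst_plain)   # 不保留时间戳
shutil.copy2(src, dst_meta)   # 保留时间戳(类似 -p)

print(os.path.getmtime(dst_meta) == os.path.getmtime(src))   # True:元数据被保留
```

可以看到,只有 copy2 产生的副本与源文件的修改时间一致,这正对应帮助信息里 -p 保留 timestamps、ownership、permission 的含义。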
10>.将本地文件或目录上传到HDFS集群
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help put -put [-f] [-p] [-l] [-d] <localsrc> ... <dst> : Copy files from the local file system into fs. Copying fails if the file already exists, unless the -f flag is given. Flags: -p Preserves access and modification times, ownership and the mode. -f Overwrites the destination if it already exists. -l Allow DataNode to lazily persist the file to disk. Forces replication factor of 1. This flag will result in reduced durability. Use with care. -d Skip creation of temporary file(<dst>._COPYING_). [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help copyFromLocal -copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst> : Identical to the -put command. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help moveFromLocal -moveFromLocal <localsrc> ... <dst> : Same as -put, except that the source is deleted after it's copied. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -put /etc/yum.repos.d/ /yinzhengjie/ #我们将本地的yum仓库配置文件目录上传到HDFS的"/yinzhengjie"路径下 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 1 items drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/yum.repos.d Found 9 items -rw-r--r-- 3 root admingroup 1664 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Base.repo -rw-r--r-- 3 root admingroup 1309 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-CR.repo -rw-r--r-- 3 root admingroup 649 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo -rw-r--r-- 3 root admingroup 630 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Media.repo -rw-r--r-- 3 root admingroup 1331 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Sources.repo -rw-r--r-- 3 root admingroup 5701 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Vault.repo -rw-r--r-- 3 root admingroup 314 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-fasttrack.repo -rw-r--r-- 3 root admingroup 1050 2020-08-14 23:13 /yinzhengjie/yum.repos.d/epel-testing.repo -rw-r--r-- 3 root admingroup 951 2020-08-14 23:13 /yinzhengjie/yum.repos.d/epel.repo [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382932 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 69 Aug 14 23:11 wc.txt.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -put wc.txt.gz / #将Linux的文件上传到HDFS的"/"路径下 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 1 items drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382932 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 69 Aug 14 23:11 wc.txt.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -moveFromLocal wc.txt.gz /yinzhengjie/ #将文件上传到HDFS集群后,会删除源文件 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382928 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items -rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382928 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -copyFromLocal hadoop-2.10.0.tar.gz / #将Linux本地文件上传到HDFS上 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382928 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 5 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]#
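为帮助理解 put 与 moveFromLocal 的区别,下面给出一个假设性的 Python 本地类比:moveFromLocal 相当于拷贝成功后再删除本地源文件(示例中的路径与文件名均为虚构,仅作示意):

```python
import os
import shutil
import tempfile

# 假设性的本地类比:put 拷贝后源文件仍保留,
# moveFromLocal 相当于"先 put(拷贝)、再删除本地源文件"。
tmp = tempfile.mkdtemp()
local_src = os.path.join(tmp, "wc.txt.gz")
target_dir = os.path.join(tmp, "cluster")   # 用本地目录模拟 HDFS 目标路径
os.mkdir(target_dir)
with open(local_src, "wb") as f:
    f.write(b"demo")

# 类似 put:拷贝后源文件仍在
shutil.copy(local_src, os.path.join(target_dir, "wc.put"))
# 类似 moveFromLocal:拷贝成功后删除源文件
shutil.copy(local_src, os.path.join(target_dir, "wc.moved"))
os.remove(local_src)

print(os.path.exists(local_src))  # False:源文件已被删除
```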
11>.下载文件到本地
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help get -get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst> : Copy files that match the file pattern <src> to the local name. <src> is kept. When copying multiple files, the destination must be a directory. Passing -f overwrites the destination if it already exists and -p preserves access and modification times, ownership and the mode. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help copyToLocal #和"-get"命令类似 -copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst> : Identical to the -get command. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help getmerge #可以同时下载多个文件并合并成一个文件到本地 -getmerge [-nl] [-skip-empty-file] <src> <localdst> : Get all the files in the directories that match the source file pattern and merge and sort them to only one file on local fs. <src> is kept. -nl Add a newline character at the end of each file. -skip-empty-file Do not add new line character for empty file. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 5 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382928 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -get /hosts #下载文件到本地 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382932 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 371 Aug 14 23:41 hosts [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 6 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz -rw-r--r-- 3 root admingroup 26 2020-08-14 23:42 /hostname --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382932 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 371 Aug 14 23:41 hosts [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -copyToLocal /hostname #功能和get命令类似,都是下载文件到本地,推荐使用get命令 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382936 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 26 Aug 14 23:42 hostname -rw-r--r-- 1 root root 371 Aug 14 23:41 hosts [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 6 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz -rw-r--r-- 3 root admingroup 26 2020-08-14 23:42 /hostname --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382936 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 26 Aug 14 23:42 hostname -rw-r--r-- 1 root root 371 Aug 14 23:41 hosts [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -getmerge /hosts /hostname res.log #将"/hosts"和"/hostname"文件的内容下载并合并到res.log文件中 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382940 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 26 Aug 14 23:42 hostname -rw-r--r-- 1 root root 371 Aug 14 23:41 hosts -rw-r--r-- 1 root root 397 Aug 14 23:44 res.log [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
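getmerge 的合并语义可以用下面这个假设性的 Python 草图在本地模拟:按顺序拼接各个源文件,-nl 相当于在每个文件末尾补一个换行符(函数与路径均为示意,并非 HDFS 的真实实现):

```python
import os
import tempfile

# 假设性的本地类比:getmerge 将多个源文件按顺序合并为本地的一个文件,
# nl=True 对应 -nl 选项,在每个文件末尾追加一个换行符。
def getmerge(srcs, dst, nl=False):
    with open(dst, "wb") as out:
        for p in srcs:
            with open(p, "rb") as f:
                out.write(f.read())
            if nl:
                out.write(b"\n")

tmp = tempfile.mkdtemp()
hosts = os.path.join(tmp, "hosts")
hostname = os.path.join(tmp, "hostname")
open(hosts, "wb").write(b"127.0.0.1 localhost")
open(hostname, "wb").write(b"hadoop101")

res = os.path.join(tmp, "res.log")
getmerge([hosts, hostname], res, nl=True)
print(open(res, "rb").read())  # 两个文件的内容依次拼接,各自以换行结尾
```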
12>.查看某个文本文件的内容
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help cat #多用于查看普通文件内容 -cat [-ignoreCrc] <src> ... : Fetch all files that match the file pattern <src> and display their content on stdout. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help text #获取源文件(允许的格式为zip和TextRecordInputStream和Avro。)并以文本格式输出该文件。 -text [-ignoreCrc] <src> ... : Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream and Avro. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 5 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -cat /hosts #查看普通文件文件内容 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 #Hadoop 2.x 172.200.6.101 hadoop101.yinzhengjie.com 172.200.6.102 hadoop102.yinzhengjie.com 172.200.6.103 hadoop103.yinzhengjie.com 172.200.6.104 hadoop104.yinzhengjie.com 172.200.6.105 hadoop105.yinzhengjie.com [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 5 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -text /hosts 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 #Hadoop 2.x 172.200.6.101 hadoop101.yinzhengjie.com 172.200.6.102 hadoop102.yinzhengjie.com 172.200.6.103 hadoop103.yinzhengjie.com 172.200.6.104 hadoop104.yinzhengjie.com 172.200.6.105 hadoop105.yinzhengjie.com [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -text /wc.txt.gz #text不仅可以查看普通文本文件内容,还可以查看Hadoop支持的序列化文件或者压缩文件内容 hadoop spark flink hive imapla clickhouse [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -cat /wc.txt.gz #cat命令无法查看压缩文件内容 ¸ ©6_wc.txtʈLʏ/P(.H,ɖH̉͋狈,KUɌM,lj㋎ȌώƯ-NァE´¢*[root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
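cat 与 text 对压缩文件表现不同的原因,可以用 Python 的 gzip 模块做一个假设性的本地类比:cat 相当于按原始字节输出(压缩文件自然是乱码),text 相当于识别格式后解压再以文本输出:

```python
import gzip
import os
import tempfile

# 假设性的本地类比,演示为什么 cat 看压缩文件是乱码而 text 可以正常显示。
tmp = tempfile.mkdtemp()
path = os.path.join(tmp, "wc.txt.gz")
with gzip.open(path, "wt") as f:
    f.write("hadoop spark flink\n")

raw = open(path, "rb").read()        # 类似 cat:得到 gzip 的原始字节
text = gzip.open(path, "rt").read()  # 类似 text:按格式解压后得到文本

print(raw[:2] == b"\x1f\x8b")  # True:gzip 魔数,直接 cat 只能看到二进制内容
print(text)
```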
13>.更改文件和目录所有者和所属组信息
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help chown -chown [-R] [OWNER][:[GROUP]] PATH... : Changes owner and group of a file. This is similar to the shell's chown command with a few exceptions. -R modifies the files recursively. This is the only option currently supported. If only the owner or group is specified, then only the owner or group is modified. The owner and group names may only consist of digits, alphabet, and any of [-_./@a-zA-Z0-9]. The names are case sensitive. WARNING: Avoid using '.' to separate user name and group though Linux allows it. If user names have dots in them and you are using local file system, you might see surprising results since the shell command 'chown' is used for local files. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw-r--r-- 3 root admingroup 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 19:19 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chown jason:jason /hosts #更改"/hosts"文件的所属者和所属组信息,尽管本地的Linux操作系统没有对应的用户也可以修改成功哟~ [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw-r--r-- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 19:19 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# id jason id: jason: no such user [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382928 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# chown jason:jason hadoop-2.10.0.tar.gz chown: invalid user: ‘jason:jason’ [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw-r--r-- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwxr-xr-x - root admingroup 0 2020-08-14 07:07 /yinzhengjie/data drwxr-xr-x - root admingroup 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chown -R jason:yinzhengjie /yinzhengjie/ #递归更改"/yinzhengjie"目录的所属者和所属组信息 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw-r--r-- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwxr-xr-x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwxr-xr-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]#
14>. 更改文件和目录的权限信息
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help chmod -chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH... : Changes permissions of a file. This works similar to the shell's chmod command with a few exceptions. -R modifies the files recursively. This is the only option currently supported. <MODE> Mode is the same as mode used for the shell's command. The only letters recognized are 'rwxXt', e.g. +t,a+r,g-w,+rwx,o=r. <OCTALMODE> Mode specifed in 3 or 4 digits. If 4 digits, the first may be 1 or 0 to turn the sticky bit on or off, respectively. Unlike the shell command, it is not possible to specify only part of the mode, e.g. 754 is same as u=rwx,g=rx,o=r. If none of 'augo' is specified, 'a' is assumed and unlike the shell command, no umask is applied. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw-r--r-- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chmod 600 /hosts #更改"/hosts"文件的权限,HDFS文件的默认权限和Linux类似,均是644,我这里将其更改为600 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwxr-xr-x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwxr-xr-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chmod -R 700 /yinzhengjie/ #递归更改"/yinzhengjie"目录的权限,HDFS的目录默认权限是755,我更改为700. [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx------ - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx------ - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx------ - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx------ - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx------ - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx------ - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chmod -R a+x /yinzhengjie/ #为所有者、所属组及其他人均添加执行权限 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx--x--x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx--x--x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx--x--x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx--x--x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx--x--x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chmod -R g-x /yinzhengjie/ #将所属组的执行权限去掉 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx-----x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx-----x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx-----x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx-----x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chmod -R o+r /yinzhengjie/ #为其他人添加读取权限 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx---r-x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx---r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx---r-x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx---r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chmod -R g+w /yinzhengjie/ #为所属组添加写入权限 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx-w-r-x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx-w-r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]#
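上面几条命令用到的符号模式(a+x、g-x、o+r)本质上是对八进制权限位做按位运算,下面用一段假设性的 Python 代码演示这种换算(仅为示意,与 HDFS 的内部实现无关):

```python
# 假设性示意:用位运算演示 chmod 符号模式的含义。
# who_bits 限定作用对象(a/g/o),perm_bits 限定权限种类(r/w/x),
# op 为 '+' 表示添加、'-' 表示去掉。
def apply(mode, who_bits, op, perm_bits):
    mask = who_bits & perm_bits
    return mode | mask if op == "+" else mode & ~mask

A = 0o777  # a = 所有者 + 所属组 + 其他人
G = 0o070  # 仅所属组
O = 0o007  # 仅其他人
X = 0o111  # 执行位
W = 0o222  # 写位
R = 0o444  # 读位

mode = 0o700                   # 对应 drwx------
mode = apply(mode, A, "+", X)  # a+x -> 0o711 (drwx--x--x)
mode = apply(mode, G, "-", X)  # g-x -> 0o701
mode = apply(mode, O, "+", R)  # o+r -> 0o705 (drwx---r-x)
print(oct(mode))  # 0o705
```

这条计算链正好复现了上面演示中 /yinzhengjie 目录从 drwx------ 依次变为 drwx--x--x、再到 drwx---r-x 的过程。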
15>.更改文件和目录的组信息
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help chgrp -chgrp [-R] GROUP PATH... : This is equivalent to -chown ... :GROUP ... [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason jason 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chgrp yinzhengjie /hosts #将"/hosts"文件的所属组信息更改为"yinzhengjie"组 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx-w-r-x - jason yinzhengjie 0 2020-08-14 07:07 /yinzhengjie/data drwx-w-r-x - jason yinzhengjie 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -chgrp -R admingroup /yinzhengjie #递归将"/yinzhengjie"目录的所属组信息更改为"admingroup"组 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason admingroup 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/ Found 2 items drwx-w-r-x - jason admingroup 0 2020-08-14 07:07 /yinzhengjie/data drwx-w-r-x - jason admingroup 0 2020-08-14 21:46 /yinzhengjie/softwares [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
16>.查看HDFS的可用空间
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help df -df [-h] [<path> ...] : Shows the capacity, free and used space of the filesystem. If the filesystem has multiple partitions, and no path to a particular partition is specified, then the status of the root partitions will be shown. -h Formats the sizes of files in a human-readable fashion rather than a number of bytes. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -df #查看HDFS中已配置的容量,已用空间,可用空间及已使用空间的百分比。 Filesystem Size Used Available Use% hdfs://hadoop101.yinzhengjie.com:9000 24740939366400 282624 24740939083776 0% [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -df Filesystem Size Used Available Use% hdfs://hadoop101.yinzhengjie.com:9000 24740939366400 282624 24740939083776 0% [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -df -h #使用"-h"选项可以以人性化可读的方式输出 Filesystem Size Used Available Use% hdfs://hadoop101.yinzhengjie.com:9000 22.5 T 276 K 22.5 T 0% [root@hadoop101.yinzhengjie.com ~]#
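df -h 输出中的"22.5 T"是按 1024 进位换算出来的,下面用一个假设性的 Python 函数演示这种人性化换算:

```python
# 假设性示意:按 1024 进位把字节数换算成人性化可读的单位,
# 类似 "-h" 选项的效果(格式细节与 hdfs dfs 的真实输出未必完全一致)。
def human_readable(n):
    for unit in ["B", "K", "M", "G", "T"]:
        if n < 1024:
            return "%.1f %s" % (n, unit)
        n /= 1024.0
    return "%.1f P" % n

print(human_readable(24740939366400))  # 22.5 T,与上面 df -h 的 Size 一致
print(human_readable(282624))          # 276.0 K,对应 Used 一列
```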
17>.查看HDFS的已用空间
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help du -du [-s] [-h] [-x] <path> ... : Show the amount of space, in bytes, used by the files that match the specified file pattern. The following flags are optional: -s Rather than showing the size of each individual file that matches the pattern, shows the total (summary) size. -h Formats the sizes of files in a human-readable fashion rather than a number of bytes. -x Excludes snapshots from being counted. Note that, even without the -s option, this only shows size summaries one level deep into a directory. The output is in the form size name(full path) [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason admingroup 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -du / #查看整个HDFS文件系统中使用的存储 0 /bigdata 371 /hosts 11269 /user 0 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason admingroup 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -du -h / #使用"-h"选项可以以人性化可读的方式显示文件和目录占用的空间大小,默认以字节为单位显示 0 /bigdata 371 /hosts 11.0 K /user 0 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwx-w-r-x - jason admingroup 0 2020-08-14 21:46 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -du -h / 0 /bigdata 371 /hosts 11.0 K /user 0 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -du -s -h / #仅查看"/"目录本身已使用的空间大小 11.4 K / [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
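du 与 du -s 的区别可以用下面这个假设性的 Python 本地类比来理解:前者逐条列出第一层条目的大小,后者把整个目录树汇总成一个总量(示例中的文件大小沿用上面输出里的 371 和 11269 字节,目录结构为虚构):

```python
import os
import tempfile

# 假设性的本地类比:递归累加目录树中所有文件的大小,类似 du -s 的汇总语义。
def du(path):
    total = 0
    for root, _, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

tmp = tempfile.mkdtemp()
os.mkdir(os.path.join(tmp, "user"))
open(os.path.join(tmp, "hosts"), "wb").write(b"x" * 371)
open(os.path.join(tmp, "user", "job.log"), "wb").write(b"x" * 11269)

# 类似 "hdfs dfs -du /":逐条列出第一层每个条目的大小
for entry in sorted(os.listdir(tmp)):
    p = os.path.join(tmp, entry)
    print(du(p) if os.path.isdir(p) else os.path.getsize(p), entry)

# 类似 "hdfs dfs -du -s /":只给出一个总计
print(du(tmp))  # 11640,即 371 + 11269
```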
18>.测试文件或目录
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help test -test -[defsz] <path> : Answer various questions about <path>, with result via exit status. -d return 0 if <path> is a directory. -e return 0 if <path> exists. -f return 0 if <path> is a file. -s return 0 if file <path> is greater than zero bytes in size. -w return 0 if file <path> exists and write permission is granted. -r return 0 if file <path> exists and read permission is granted. -z return 0 if file <path> is zero bytes in size, else return 1. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -e /hosts #如果"/hosts"路径存在则返回"0",不存在则返回"1"。 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 0 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -e /hosts2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -f /hosts #如果"/hosts"是文件则返回"0",若不存在或者是目录均返回"1"。 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 0 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -f /bigdata [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -f /bigdata2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -d /bigdata #如果"/bigdata"是目录则返回"0",若不存在或者是文件均返回"1"。 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 0 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -d /hosts [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -d /bigdata2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 4 items -rw-r--r-- 3 root admingroup 0 2020-08-14 22:47 /a.txt drwxr-xr-x - root admingroup 0 2020-08-14 07:08 /bigdata -rw------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -z /a.txt #如果文件的大小为"0"(即空文件或者目录),则返回"0",若是路径不存在或者文件大小大于0均会返回"1"。 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 0 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -z /hosts [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -z /bigdata [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 0 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -test -z /bigdata2020 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# echo $? 1 [root@hadoop101.yinzhengjie.com ~]#
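上面演示的"-test"命令通过退出状态返回结果,因此非常适合在shell脚本中做条件判断。下面是一个示意性的封装(函数名ensure_hdfs_dir为假设,真正执行需要可用的HDFS集群):

```shell
#!/bin/bash
# 示意函数: 目标目录不存在时先创建, 已存在则直接提示
ensure_hdfs_dir() {
    local path="$1"
    if hdfs dfs -test -d "$path"; then
        echo "directory exists: $path"
    else
        hdfs dfs -mkdir -p "$path"
        echo "created: $path"
    fi
}

# 用法示例(需要HDFS集群):
# ensure_hdfs_dir /bigdata/logs
```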
19>.查看文件的校验和
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help checksum -checksum <src> ... : Dump checksum information for files that match the file pattern <src> to stdout. Note that this requires a round-trip to a datanode storing each block of the file, and thus is not efficient to run on a large number of files. The checksum of a file depends on its content, block size and the checksum algorithm and parameters used for creating the file. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 5 items -rw-r--r-- 3 root admingroup 392115733 2020-08-14 23:25 /hadoop-2.10.0.tar.gz --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user -rw-r--r-- 3 root admingroup 69 2020-08-14 23:14 /wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -checksum /wc.txt.gz #查看HDFS文件的校验和(注意:它并非本地md5sum的结果,而是基于块校验信息计算的复合MD5),校验和取决于文件的内容、块大小以及创建文件时所用的校验和算法与参数。 /wc.txt.gz MD5-of-0MD5-of-512CRC32C 00000200000000000000000081c79e60ede6f33e67d79a84e77eebeb [root@hadoop101.yinzhengjie.com ~]#
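校验和常用来判断两个HDFS文件内容是否一致(前提是两者的块大小与校验算法相同,否则即使内容一致校验和也会不同)。下面是一个示意性的比较函数(same_checksum为假设的函数名,真正执行需要可用的HDFS集群):

```shell
#!/bin/bash
# 示意函数: 比较两个HDFS文件的校验和是否一致
# checksum输出共三列: 路径、算法名、校验和的十六进制值, 这里取第3列
same_checksum() {
    local c1 c2
    c1=$(hdfs dfs -checksum "$1" | awk '{print $3}')
    c2=$(hdfs dfs -checksum "$2" | awk '{print $3}')
    [ -n "$c1" ] && [ "$c1" = "$c2" ]
}

# 用法示例(需要HDFS集群):
# same_checksum /wc.txt.gz /yinzhengjie/wc.txt.gz && echo "内容一致"
```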
20>.设置HDFS中文件的副本数量
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help setrep -setrep [-R] [-w] <rep> <path> ... : Set the replication level of a file. If <path> is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at <path>. -w It requests that the command waits for the replication to complete. This can potentially take a very long time. -R It is accepted for backwards compatibility. It has no effect. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items --w------- 3 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -setrep -R -w 2 /hosts #将"/hosts"的副本因子设置为2,使用"-w"参数时当前终端会一直阻塞,直到副本调整完成;"-R"参数仅为向后兼容保留,并没有实际效果,因此可以不使用。 Replication 2 set: /hosts Waiting for /hosts ... WARNING: the waiting time may be long for DECREASING the number of replications. . done [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items --w------- 2 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls / Found 3 items --w------- 2 jason yinzhengjie 371 2020-08-14 21:42 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -R /yinzhengjie/ -rw-r--r-- 3 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d -rw-r--r-- 3 root admingroup 1664 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Base.repo -rw-r--r-- 3 root admingroup 1309 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-CR.repo -rw-r--r-- 3 root admingroup 649 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo -rw-r--r-- 3 root admingroup 630 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Media.repo -rw-r--r-- 3 root admingroup 1331 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Sources.repo -rw-r--r-- 3 root admingroup 5701 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Vault.repo -rw-r--r-- 3 root admingroup 314 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-fasttrack.repo -rw-r--r-- 3 root admingroup 1050 2020-08-14 23:13 /yinzhengjie/yum.repos.d/epel-testing.repo -rw-r--r-- 3 root admingroup 951 2020-08-14 23:13 /yinzhengjie/yum.repos.d/epel.repo [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -setrep 2 /yinzhengjie #如果指定的路径是目录,则setrep命令将递归更改所指定目录下的所有文件的复制因子(我们可以不使用"-w"参数,这样当前终端就不会阻塞。) Replication 2 set: /yinzhengjie/wc.txt.gz Replication 2 set: /yinzhengjie/yum.repos.d/CentOS-Base.repo Replication 2 set: /yinzhengjie/yum.repos.d/CentOS-CR.repo Replication 2 set: /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo Replication 2 set: /yinzhengjie/yum.repos.d/CentOS-Media.repo Replication 2 set: /yinzhengjie/yum.repos.d/CentOS-Sources.repo Replication 2 set: /yinzhengjie/yum.repos.d/CentOS-Vault.repo Replication 2 set: 
/yinzhengjie/yum.repos.d/CentOS-fasttrack.repo Replication 2 set: /yinzhengjie/yum.repos.d/epel-testing.repo Replication 2 set: /yinzhengjie/yum.repos.d/epel.repo [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -R /yinzhengjie/ -rw-r--r-- 2 root admingroup 69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz drwxr-xr-x - root admingroup 0 2020-08-14 23:13 /yinzhengjie/yum.repos.d -rw-r--r-- 2 root admingroup 1664 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Base.repo -rw-r--r-- 2 root admingroup 1309 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-CR.repo -rw-r--r-- 2 root admingroup 649 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Debuginfo.repo -rw-r--r-- 2 root admingroup 630 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Media.repo -rw-r--r-- 2 root admingroup 1331 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Sources.repo -rw-r--r-- 2 root admingroup 5701 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-Vault.repo -rw-r--r-- 2 root admingroup 314 2020-08-14 23:13 /yinzhengjie/yum.repos.d/CentOS-fasttrack.repo -rw-r--r-- 2 root admingroup 1050 2020-08-14 23:13 /yinzhengjie/yum.repos.d/epel-testing.repo -rw-r--r-- 2 root admingroup 951 2020-08-14 23:13 /yinzhengjie/yum.repos.d/epel.repo [root@hadoop101.yinzhengjie.com ~]#
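修改副本因子后,可以用"hdfs dfs -stat %r"查看文件当前记录的副本数,确认修改是否生效。下面是一个示意性的封装(set_and_check_rep为假设的函数名,真正执行需要可用的HDFS集群):

```shell
#!/bin/bash
# 示意函数: 设置文件的副本因子, 并用 -stat %r 校验结果
set_and_check_rep() {
    local rep="$1" path="$2" actual
    hdfs dfs -setrep "$rep" "$path" >/dev/null || return 1
    actual=$(hdfs dfs -stat %r "$path")
    [ "$actual" = "$rep" ] && echo "replication of $path is now $actual"
}

# 用法示例(需要HDFS集群):
# set_and_check_rep 2 /hosts
```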
21>.将Linux本地文件内容追加到HDFS集群中的文件中(若目标文件不存在则会自动创建)
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help appendToFile -appendToFile <localsrc> ... <dst> : Appends the contents of all the given local files to the given dst file. The dst file will be created if it does not exist. If <localSrc> is -, then the input is read from stdin. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# vim host.txt [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# cat host.txt #扩容节点 172.200.6.106 hadoop106.yinzhengjie.com 172.200.6.107 hadoop107.yinzhengjie.com 172.200.6.108 hadoop108.yinzhengjie.com [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -cat /hosts #未追加文件之前,查看其内容 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 #Hadoop 2.x 172.200.6.101 hadoop101.yinzhengjie.com 172.200.6.102 hadoop102.yinzhengjie.com 172.200.6.103 hadoop103.yinzhengjie.com 172.200.6.104 hadoop104.yinzhengjie.com 172.200.6.105 hadoop105.yinzhengjie.com [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382932 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 134 Aug 15 00:23 host.txt [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -appendToFile host.txt /hosts #将本地文件内容追加到HDFS文件系统中的"/hosts"文件中 [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# ll total 382932 -rw-r--r-- 1 root root 392115733 Aug 10 15:42 hadoop-2.10.0.tar.gz -rw-r--r-- 1 root root 134 Aug 15 00:23 host.txt [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -text /hosts #不难发现,文件被追加成功啦~ 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 #Hadoop 2.x 172.200.6.101 hadoop101.yinzhengjie.com 172.200.6.102 hadoop102.yinzhengjie.com 172.200.6.103 hadoop103.yinzhengjie.com 172.200.6.104 hadoop104.yinzhengjie.com 172.200.6.105 hadoop105.yinzhengjie.com #扩容节点 172.200.6.106 hadoop106.yinzhengjie.com 172.200.6.107 hadoop107.yinzhengjie.com 172.200.6.108 hadoop108.yinzhengjie.com [root@hadoop101.yinzhengjie.com ~]#
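help中提到,当<localsrc>为"-"时,appendToFile会从标准输入读取数据,因此可以不落地本地文件,直接把一段文本追加到HDFS文件中。下面是一个示意性的封装(append_line为假设的函数名,真正执行需要可用的HDFS集群):

```shell
#!/bin/bash
# 示意函数: 把一行文本通过stdin追加到HDFS文件("-"表示从标准输入读取)
append_line() {
    local line="$1" dst="$2"
    printf '%s\n' "$line" | hdfs dfs -appendToFile - "$dst"
}

# 用法示例(需要HDFS集群):
# append_line "#test line" /hosts
```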
22>.显示一个文件的末尾(最后1KB的内容)
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help tail -tail [-f] <file> : Show the last 1KB of the file. -f Shows appended data as the file grows. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -tail /hosts #查看文件的尾部内容(默认只查看最后1KB的内容) 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 #Hadoop 2.x 172.200.6.101 hadoop101.yinzhengjie.com 172.200.6.102 hadoop102.yinzhengjie.com 172.200.6.103 hadoop103.yinzhengjie.com 172.200.6.104 hadoop104.yinzhengjie.com 172.200.6.105 hadoop105.yinzhengjie.com #扩容节点 172.200.6.106 hadoop106.yinzhengjie.com 172.200.6.107 hadoop107.yinzhengjie.com 172.200.6.108 hadoop108.yinzhengjie.com [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -tail -f /hosts #"-f"选项和Linux操作系统中的"tail"类似,当该文件末尾发生变化时,我们在终端上可以实时看到对应的新增数据 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 #Hadoop 2.x 172.200.6.101 hadoop101.yinzhengjie.com 172.200.6.102 hadoop102.yinzhengjie.com 172.200.6.103 hadoop103.yinzhengjie.com 172.200.6.104 hadoop104.yinzhengjie.com 172.200.6.105 hadoop105.yinzhengjie.com #扩容节点 172.200.6.106 hadoop106.yinzhengjie.com 172.200.6.107 hadoop107.yinzhengjie.com 172.200.6.108 hadoop108.yinzhengjie.com
23>.统计与指定文件模式匹配的路径下的目录、文件和字节数
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -help count -count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] <path> ... : Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns are: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME or, with the -q option: QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME The -h option shows file sizes in human readable format. The -v option displays a header line. The -x option excludes snapshots from being calculated. The -t option displays quota by storage types. It should be used with -q or -u option, otherwise it will be ignored. If a comma-separated list of storage types is given after the -t option, it displays the quota and usage for the specified types. Otherwise, it displays the quota and usage for all the storage types that support quota. The list of possible storage types(case insensitive): ram_disk, ssd, disk and archive. It can also pass the value '', 'all' or 'ALL' to specify all the storage types. The -u option shows the quota and the usage against the quota without the detailed content summary. [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -h / Found 3 items --w------- 2 jason yinzhengjie 309.5 K 2020-08-16 11:37 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -count -h /hosts 0 1 309.5 K /hosts [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -count -h / #以人性化可读的方式统计("-h"选项)根("/")路径的信息,输出格式为:目录数量,文件数量,路径的总大小,路径名称。 19 29 374.3 M / [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -ls -h / Found 3 items --w------- 2 jason yinzhengjie 309.5 K 2020-08-16 11:37 /hosts drwx------ - root admingroup 0 2020-08-14 19:19 /user drwxr-xr-x - root admingroup 0 2020-08-14 23:22 /yinzhengjie [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -count -h / 19 29 374.3 M / [root@hadoop101.yinzhengjie.com ~]# [root@hadoop101.yinzhengjie.com ~]# hdfs dfs -count -h -v / #"-v"选项可以显示标题栏 DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME 19 29 374.3 M / [root@hadoop101.yinzhengjie.com ~]#
[root@hadoop101.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /user/root #使用"-q"参数检查配额信息 QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME 88 51 66 G 64.5 G 16 21 748.2 M /user/root [root@hadoop101.yinzhengjie.com ~]# 相关术语解释如下: QUOTA: 名称配额,即该路径下允许创建的文件和目录总数上限。 REM_QUOTA: 剩余名称配额,即还可以创建的文件和目录数。 SPACE_QUOTA: 授予该路径的空间配额。 REM_SPACE_QUOTA: 剩余空间配额。 DIR_COUNT: 目录数。 FILE_COUNT: 文件数。 CONTENT_SIZE: 内容总大小。 PATHNAME: 路径名称。
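由于"-count -q"的输出列顺序固定(参见上面的标题栏),可以在脚本中提取剩余名称配额,用于在配额即将耗尽时告警。下面是一个示意性的提取函数(rem_name_quota为假设的函数名,真正执行需要可用的HDFS集群):

```shell
#!/bin/bash
# 示意函数: 提取指定路径的剩余名称配额(REM_QUOTA, 即第2列)
# 注意: 这里不加"-h", 以保证各列均为纯数字, 便于比较
rem_name_quota() {
    hdfs dfs -count -q "$1" | awk '{print $2}'
}

# 用法示例(需要HDFS集群):
# [ "$(rem_name_quota /user/root)" -lt 10 ] && echo "名称配额即将耗尽"
```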
24>.权限管理
博主推荐阅读: https://www.cnblogs.com/yinzhengjie2020/p/13308791.html
25>.快照管理
博主推荐阅读: https://www.cnblogs.com/yinzhengjie2020/p/13303008.html