Linux commands: split

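1. Main options
2. Splitting a file by lines
3. Splitting a file by size (exact size)
4. Splitting a file by size (whole lines)
5. Output file suffix
6. Length of the output file suffix
7. Output file prefix
8. Help text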

1. Main options

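Option	Description	Example
-l	Split by line count	-l 300: 300 lines per file
-b	Split by size (exact size)	-b 20m: each file is 20M; m means M (1024*1024 bytes), other units include k = 1024 and b = 512
-C	Split by size (whole lines)	-C 20m: each file is at most 20M, and lines are not broken apart
-d	Use numeric output suffixes instead of the default letters	-d: x00, x01, ...
-a	Set the length of the output suffix	-a 3: suffix length of 3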

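2. Splitting a file by lines

Notes:
The output is a series of files named xaa, xab, xac, and so on. With a fixed two-letter suffix the names run out at xzz (676 files); recent GNU versions lengthen the suffix automatically when -a is not given, so the sequence can continue past that.
The default output file prefix is x and the suffix is alphabetic.
Each file has 300000 lines. To be fair, though, not every file is guaranteed to have exactly 300000 lines: the last one holds whatever is left over.

Command: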

$ split -l 300000 file.txt
$ ls
file.txt xaa xab ...  xaz xba xbb ... xbf
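
To see the "last file is shorter" behaviour without a large input, here is a small sketch. The file name sample.txt and the 10-line input are made up for illustration; only the split -l usage comes from the text above.

```shell
# Work in a throwaway directory so the demo leaves no clutter behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: 10 lines, one number per line.
seq 1 10 > sample.txt

# Split into pieces of 4 lines each: xaa, xab, xac.
split -l 4 sample.txt

# Count the lines in each piece: 4, 4, and the remaining 2.
wc -l xaa xab xac
```

The first two pieces get the full 4 lines and xac gets the remaining 2, which is the same effect as the last file in the 300000-line example.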

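3. Splitting a file by size (exact size)

Notes:
After the split, each file is 20m in size.
You might ask whether the 20971520 in the listing below really is 20M. Work it out: 20M = 20 * (2 ** 20) = 20 * 1048576 = 20971520

When specifying the SIZE unit, besides m (1M) you can also use k and b.
k stands for 1k = 2**10 = 1024.
b is neither 1 bit nor 1 byte but 512 bytes, i.e. 0.5k. This is the traditional 512-byte block size.
Note: because the size is exact, the content of a single line can end up split across different files.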
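
The unit suffixes can be checked on a tiny input. This is a minimal sketch assuming GNU split; the file zeros.bin and the prefixes k. and b. are hypothetical names chosen for the demo.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: exactly 2048 zero bytes.
head -c 2048 /dev/zero > zeros.bin

# 1k = 1024 bytes, so 2048 bytes give 2 pieces (k.aa, k.ab).
split -b 1k zeros.bin k.

# 1b = 512 bytes, so 2048 bytes give 4 pieces (b.aa ... b.ad).
split -b 1b zeros.bin b.

# Show the size of every piece in bytes.
wc -c k.* b.*
```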

Command:

$ split -b 20m file.txt
$ ls -l
total 2050204
-rw-r----- 1 gxxx336 wheel 1048575993 Nov 26 15:22 file.txt
-rw-r----- 1 gxxx336 wheel   20971520 Nov 27 11:43 xaa
-rw-r----- 1 gxxx336 wheel   20971520 Nov 27 11:43 xab
-rw-r----- 1 gxxx336 wheel   20971520 Nov 27 11:43 xac
...
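
Because -b cuts at exact byte offsets, the pieces can be joined back into the original with cat. The sketch below uses a made-up sample.txt and the prefix part. to show this; cmp prints nothing and returns success when the two files are byte-for-byte identical.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: the numbers 1 to 1000, one per line.
seq 1 1000 > sample.txt

# Cut into 1024-byte pieces named part.aa, part.ab, ...
split -b 1k sample.txt part.

# The shell expands part.* in sorted order, which matches the split order.
cat part.* > rebuilt.txt

# Compare the rebuilt file with the original.
cmp sample.txt rebuilt.txt && echo "identical"
```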

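4. Splitting a file by size (whole lines)

Notes:
Files split with -b match the specified size exactly.
This has one drawback: a line may be split across two files (in practice this is almost certain to happen).
With -C, lines are not broken when the file is split, unless a single line is itself longer than the size limit. The resulting files are therefore at most the specified size rather than exactly equal to it.

Command: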

$ split -C 20m file.txt
$ ls -l
total 2050212
-rw-r----- 1 gxxx336 wheel 1048575993 Nov 26 15:22 file.txt
-rw-r----- 1 gxxx336 wheel   20971450 Nov 27 11:43 xaa
-rw-r----- 1 gxxx336 wheel   20971257 Nov 27 11:43 xab
-rw-r----- 1 gxxx336 wheel   20971362 Nov 27 11:43 xac
...
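
One way to confirm that -C keeps lines whole is to check that every piece ends with a newline and stays within the limit. This is a sketch assuming GNU split; sample.txt and the prefix line. are hypothetical.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: the numbers 1 to 1000, one per line.
seq 1 1000 > sample.txt

# At most 1024 bytes per piece, cutting only at line boundaries.
split -C 1k sample.txt line.

for f in line.*; do
    # Arithmetic expansion strips any padding that wc may add.
    size=$(( $(wc -c < "$f") ))
    # tail -c 1 prints the last byte; wc -l counts it only if it is a newline.
    ends_nl=$(( $(tail -c 1 "$f" | wc -l) ))
    echo "$f: $size bytes, ends with newline: $ends_nl"
done
```

Every piece should report a size of 1024 or less and a trailing newline count of 1.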

5. Output file suffix

Notes: by default the output file suffix uses the letters a-z; the -d option changes it to digits.

Command:

$ split -d -l 300000 file.txt
$ ls
file.txt x00 x01 x02 ... x31

6. Length of the output file suffix

Notes: the output file suffix is 2 characters long by default; the -a option changes its length.

Command:

$ split -a 3 -d -l 300000 file.txt
$ ls
file.txt  x000 x001 x002 ... x031

That makes 32 files. If the suffix length is set to 1, split reports that the suffixes are exhausted and only produces the first 10 files:

$ split -a 1 -d -l 300000 file.txt
split: Output file suffixes exhausted
$ ls
file.txt  x0 x1 x2 x3  x4 x5 x6 x7 x8 x9
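
To avoid running out of suffixes, the required length can be estimated before splitting. The sketch below is plain shell arithmetic; the total of 9500000 lines is a hypothetical figure chosen so that it yields 32 pieces like the example above, not a value taken from the original file.

```shell
# Hypothetical input size; in practice use: total_lines=$(wc -l < file.txt)
total_lines=9500000
lines_per_file=300000

# Number of pieces, rounded up so the partial last file is counted.
pieces=$(( (total_lines + lines_per_file - 1) / lines_per_file ))

# Smallest number of decimal digits whose capacity (10^digits) covers all pieces.
digits=1
capacity=10
while [ "$capacity" -lt "$pieces" ]; do
    digits=$((digits + 1))
    capacity=$((capacity * 10))
done

echo "$pieces pieces need at least -a $digits with -d"
```

With these numbers the script prints "32 pieces need at least -a 2 with -d", which matches the failure seen with a 1-digit suffix.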

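7. Output file prefix

Notes: the default output file prefix is the letter x. The prefix is not set with an option like -d; it is given as the last argument on the command line.

Command: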

$ split -d -l 300000 file.txt file.txt.
$ ls
file.txt file.txt.00 file.txt.01 file.txt.02 ... file.txt.31
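
Since the prefix is positional, it must come after the input file. When the input comes from a pipe, the help text says FILE can be written as -, so the prefix still goes last. A small sketch with a made-up prefix chunk.:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Read 10 lines from standard input (the "-") and name the pieces chunk.NN.
seq 1 10 | split -d -l 4 - chunk.

# Lists chunk.00, chunk.01 and chunk.02.
ls
```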

8. Help text

$ split --help
Usage: split [OPTION]... [FILE [PREFIX]]
Output pieces of FILE to PREFIXaa, PREFIXab, ...;
default size is 1000 lines, and default PREFIX is 'x'.

With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   generate suffixes of length N (default 2)
      --additional-suffix=SUFFIX  append an additional SUFFIX to file names
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of records per output file
  -d                      use numeric suffixes starting at 0, not alphabetic
      --numeric-suffixes[=FROM]  same as -d, but allow setting the start value
  -x                      use hex suffixes starting at 0, not alphabetic
      --hex-suffixes[=FROM]  same as -x, but allow setting the start value
  -e, --elide-empty-files  do not generate empty output files with '-n'
      --filter=COMMAND    write to shell COMMAND; file name is $FILE
  -l, --lines=NUMBER      put NUMBER lines/records per output file
  -n, --number=CHUNKS     generate CHUNKS output files; see explanation below
  -t, --separator=SEP     use SEP instead of newline as the record separator;
                            '' (zero) specifies the NUL character
  -u, --unbuffered        immediately copy input to output with '-n r/...'
      --verbose           print a diagnostic just before each
                            output file is opened
      --help     display this help and exit
      --version  output version information and exit

The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).

CHUNKS may be:
  N       split into N files based on size of input
  K/N     output Kth of N to stdout
  l/N     split into N files without splitting lines/records
  l/K/N   output Kth of N to stdout without splitting lines/records
  r/N     like 'l' but use round robin distribution
  r/K/N   likewise but only output Kth of N to stdout

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Report split translation bugs to <https://translationproject.org/team/>
Full documentation at: <https://www.gnu.org/software/coreutils/split>
or available locally via: info '(coreutils) split invocation'

Original article: https://www.cnblogs.com/gaiqingfeng/p/13572299.html