一、导出工具mongoexport
Mongodb中的mongoexport工具可以把一个collection导出成JSON格式或CSV格式的文件。可以通过参数指定导出的数据项,也可以根据指定的条件导出数据。mongoexport具体用法如下所示:
- [root@localhost mongodb]# ./bin/mongoexport --help
- Export MongoDB data to CSV, TSV or JSON files.
- options:
- --help produce help message
- -v [ --verbose ] be more verbose (include multiple times for more
- verbosity e.g. -vvvvv)
- --version print the program's version and exit
- -h [ --host ] arg mongo host to connect to ( <set name>/s1,s2 for
- sets)
- --port arg server port. Can also use --host hostname:port
- --ipv6 enable IPv6 support (disabled by default)
- -u [ --username ] arg username
- -p [ --password ] arg password
- --dbpath arg directly access mongod database files in the given
- path, instead of connecting to a mongod server -
- needs to lock the data directory, so cannot be used
- if a mongod is currently accessing the same path
- --directoryperdb if dbpath specified, each db is in a separate
- directory
- --journal enable journaling
- -d [ --db ] arg database to use
- -c [ --collection ] arg collection to use (some commands)
- -f [ --fields ] arg comma separated list of field names e.g. -f
- name,age
- --fieldFile arg file with fields names - 1 per line
- -q [ --query ] arg query filter, as a JSON string
- --csv export to csv instead of json
- -o [ --out ] arg output file; if not specified, stdout is used
- --jsonArray output to a json array rather than one object per
- line
- -k [ --slaveOk ] arg (=1) use secondaries for export if available, default
- true
参数说明:
-h:指明数据库宿主机的IP
-u:指明数据库的用户名
-p:指明数据库的密码
-d:指明数据库的名字
-c:指明collection的名字
-f:指明要导出那些列
-o:指明到要导出的文件名
-q:指明导出数据的过滤条件
实例:test库中存在着一个students集合,集合中数据如下:
- > db.students.find()
- { "_id" : ObjectId("5031143350f2481577ea81e5"), "classid" : 1, "age" : 20, "name" : "kobe" }
- { "_id" : ObjectId("5031144a50f2481577ea81e6"), "classid" : 1, "age" : 23, "name" : "nash" }
- { "_id" : ObjectId("5031145a50f2481577ea81e7"), "classid" : 2, "age" : 18, "name" : "james" }
- { "_id" : ObjectId("5031146a50f2481577ea81e8"), "classid" : 2, "age" : 19, "name" : "wade" }
- { "_id" : ObjectId("5031147450f2481577ea81e9"), "classid" : 2, "age" : 19, "name" : "bosh" }
- { "_id" : ObjectId("5031148650f2481577ea81ea"), "classid" : 2, "age" : 25, "name" : "allen" }
- { "_id" : ObjectId("5031149b50f2481577ea81eb"), "classid" : 1, "age" : 19, "name" : "howard" }
- { "_id" : ObjectId("503114a750f2481577ea81ec"), "classid" : 1, "age" : 22, "name" : "paul" }
- { "_id" : ObjectId("503114cd50f2481577ea81ed"), "classid" : 2, "age" : 24, "name" : "shane" }
由上可以看出文档中存在着3个字段:classid、age、name
1.直接导出数据到文件中
- [root@localhost mongodb]# ./bin/mongoexport -d test -c students -o students.dat
- connected to: 127.0.0.1
- exported 9 records
命令执行完后使用ll命令查看,发现目录下生成了一个students.dat的文件
- -rw-r--r-- 1 root root 869 Aug 21 00:05 students.dat
查看该文件信息,具体信息如下:
- [root@localhost mongodb]# cat students.dat
- { "_id" : { "$oid" : "5031143350f2481577ea81e5" }, "classid" : 1, "age" : 20, "name" : "kobe" }
- { "_id" : { "$oid" : "5031144a50f2481577ea81e6" }, "classid" : 1, "age" : 23, "name" : "nash" }
- { "_id" : { "$oid" : "5031145a50f2481577ea81e7" }, "classid" : 2, "age" : 18, "name" : "james" }
- { "_id" : { "$oid" : "5031146a50f2481577ea81e8" }, "classid" : 2, "age" : 19, "name" : "wade" }
- { "_id" : { "$oid" : "5031147450f2481577ea81e9" }, "classid" : 2, "age" : 19, "name" : "bosh" }
- { "_id" : { "$oid" : "5031148650f2481577ea81ea" }, "classid" : 2, "age" : 25, "name" : "allen" }
- { "_id" : { "$oid" : "5031149b50f2481577ea81eb" }, "classid" : 1, "age" : 19, "name" : "howard" }
- { "_id" : { "$oid" : "503114a750f2481577ea81ec" }, "classid" : 1, "age" : 22, "name" : "paul" }
- { "_id" : { "$oid" : "503114cd50f2481577ea81ed" }, "classid" : 2, "age" : 24, "name" : "shane" }
参数说明:
-d:指明使用的库,本例中为test
-c:指明要导出的集合,本例中为students
-o:指明要导出的文件名,本例中为students.dat
从上面的结果可以看出,我们在导出数据时没有显示指定导出样式 ,默认导出了JSON格式的数据。如果我们需要导出CSV格式的数据,则需要使用--csv参数,具体如下所示:
- [root@localhost mongodb]# ./bin/mongoexport -d test -c students --csv -f classid,name,age -o students_csv.dat
- connected to: 127.0.0.1
- exported 9 records
- [root@localhost mongodb]# cat students_csv.dat
- classid,name,age
- 1.0,"kobe",20.0
- 1.0,"nash",23.0
- 2.0,"james",18.0
- 2.0,"wade",19.0
- 2.0,"bosh",19.0
- 2.0,"allen",25.0
- 1.0,"howard",19.0
- 1.0,"paul",22.0
- 2.0,"shane",24.0
- [root@localhost mongodb]#
参数说明:
-csv:指明要导出为csv格式
-f:指明需要导出classid、name、age这3列的数据
由上面结果可以看出,mongoexport成功地将数据根据csv格式导出到了students_csv.dat文件中。
二、导入工具mongoimport
Mongodb中的mongoimport工具可以把一个特定格式文件中的内容导入到指定的collection中。该工具可以导入JSON格式数据,也可以导入CSV格式数据。具体使用如下所示:
- [root@localhost mongodb]# ./bin/mongoimport --help
- options:
- --help produce help message
- -v [ --verbose ] be more verbose (include multiple times for more
- verbosity e.g. -vvvvv)
- --version print the program's version and exit
- -h [ --host ] arg mongo host to connect to ( <set name>/s1,s2 for sets)
- --port arg server port. Can also use --host hostname:port
- --ipv6 enable IPv6 support (disabled by default)
- -u [ --username ] arg username
- -p [ --password ] arg password
- --dbpath arg directly access mongod database files in the given
- path, instead of connecting to a mongod server -
- needs to lock the data directory, so cannot be used
- if a mongod is currently accessing the same path
- --directoryperdb if dbpath specified, each db is in a separate
- directory
- --journal enable journaling
- -d [ --db ] arg database to use
- -c [ --collection ] arg collection to use (some commands)
- -f [ --fields ] arg comma separated list of field names e.g. -f name,age
- --fieldFile arg file with fields names - 1 per line
- --ignoreBlanks if given, empty fields in csv and tsv will be ignored
- --type arg type of file to import. default: json (json,csv,tsv)
- --file arg file to import from; if not specified stdin is used
- --drop drop collection first
- --headerline CSV,TSV only - use first line as headers
- --upsert insert or update objects that already exist
- --upsertFields arg comma-separated fields for the query part of the
- upsert. You should make sure this is indexed
- --stopOnError stop importing at first error rather than continuing
- --jsonArray load a json array, not one item per line. Currently
- limited to 4MB.
参数说明:
-h:指明数据库宿主机的IP
-u:指明数据库的用户名
-p:指明数据库的密码
-d:指明数据库的名字
-c:指明collection的名字
-f:指明要导入那些列
示例:先删除students中的数据,并验证
- > db.students.remove()
- > db.students.find()
- >
然后再导入上面导出的students.dat文件中的内容
- [root@localhost mongodb]# ./bin/mongoimport -d test -c students students.dat
- connected to: 127.0.0.1
- imported 9 objects
- [root@localhost mongodb]#
参数说明:
-d:指明数据库名,本例中为test
-c:指明collection名,本例中为students
students.dat:导入的文件名
查询students集合中的数据
- > db.students.find()
- { "_id" : ObjectId("5031143350f2481577ea81e5"), "classid" : 1, "age" : 20, "name" : "kobe" }
- { "_id" : ObjectId("5031144a50f2481577ea81e6"), "classid" : 1, "age" : 23, "name" : "nash" }
- { "_id" : ObjectId("5031145a50f2481577ea81e7"), "classid" : 2, "age" : 18, "name" : "james" }
- { "_id" : ObjectId("5031146a50f2481577ea81e8"), "classid" : 2, "age" : 19, "name" : "wade" }
- { "_id" : ObjectId("5031147450f2481577ea81e9"), "classid" : 2, "age" : 19, "name" : "bosh" }
- { "_id" : ObjectId("5031148650f2481577ea81ea"), "classid" : 2, "age" : 25, "name" : "allen" }
- { "_id" : ObjectId("5031149b50f2481577ea81eb"), "classid" : 1, "age" : 19, "name" : "howard" }
- { "_id" : ObjectId("503114a750f2481577ea81ec"), "classid" : 1, "age" : 22, "name" : "paul" }
- { "_id" : ObjectId("503114cd50f2481577ea81ed"), "classid" : 2, "age" : 24, "name" : "shane" }
- >
证明数据导入成功
上面演示的是导入JSON格式的文件中的内容,如果要导入CSV格式文件中的内容,则需要通过--type参数指定导入格式,具体如下所示:
先删除数据
- > db.students.remove()
- > db.students.find()
- >
再导入之前导出的students_csv.dat文件
- [root@localhost mongodb]# ./bin/mongoimport -d test -c students --type csv --headerline --file students_csv.dat
- connected to: 127.0.0.1
- imported 10 objects
- [root@localhost mongodb]#
参数说明:
-type:指明要导入的文件格式
-headerline:指明第一行是列名,不需要导入
-file:指明要导入的文件
查询students集合,验证导入是否成功:
- > db.students.find()
- { "_id" : ObjectId("503266029355c632cd118ad8"), "classid" : 1, "name" : "kobe", "age" : 20 }
- { "_id" : ObjectId("503266029355c632cd118ad9"), "classid" : 1, "name" : "nash", "age" : 23 }
- { "_id" : ObjectId("503266029355c632cd118ada"), "classid" : 2, "name" : "james", "age" : 18 }
- { "_id" : ObjectId("503266029355c632cd118adb"), "classid" : 2, "name" : "wade", "age" : 19 }
- { "_id" : ObjectId("503266029355c632cd118adc"), "classid" : 2, "name" : "bosh", "age" : 19 }
- { "_id" : ObjectId("503266029355c632cd118add"), "classid" : 2, "name" : "allen", "age" : 25 }
- { "_id" : ObjectId("503266029355c632cd118ade"), "classid" : 1, "name" : "howard", "age" : 19 }
- { "_id" : ObjectId("503266029355c632cd118adf"), "classid" : 1, "name" : "paul", "age" : 22 }
- { "_id" : ObjectId("503266029355c632cd118ae0"), "classid" : 2, "name" : "shane", "age" : 24 }
- >
MongoDB数据导入和导出
元数据需要事先创建好。
mongoimport
这个命令可以导入单个JSON/CSV/TSV格式的文件。文件的每一行都要是指定格式的标准格式。
需要指定一个数据库(database)和一个collection(相当于关系数据库中的表)。
×××××××××××××
options:
--help produce help message
-v [ --verbose ] be more verbose (include multiple times for more
verbosity e.g. -vvvvv)
-h [ --host ] arg mongo host to connect to ("left,right" for pairs)
-d [ --db ] arg database to use
-c [ --collection ] arg collection to use (some commands)
-u [ --username ] arg username
-p [ --password ] arg password
--dbpath arg directly access mongod data files in the given path,
instead of connecting to a mongod instance - needs to
lock the data directory, so cannot be used if a
mongod is currently accessing the same path
--directoryperdb if dbpath specified, each db is in a separate
directory
-f [ --fields ] arg comma seperated list of field names e.g. -f name,age
--fieldFile arg file with fields names - 1 per line
--ignoreBlanks if given, empty fields in csv and tsv will be ignored
--type arg type of file to import. default: json (json,csv,tsv)
--file arg file to import from; if not specified stdin is used
--drop drop collection first
--headerline CSV,TSV only - use first line as headers
×××××××××××××
MongoDB提供了比JSON丰富的多的数据格式,例如JSON中没有date数据类型,但是通过如下结构:
{"somefield" : 123456, "created_at" : {"$date" : 1285679232000}}
mongoimport 将create_at转换为Date类型数据了。
注意:像这里的$date这样的实现确定的数据类型必须加双引号,否则无法正确解析。
mongoexport
这个工具可以将一个collection导出为JSON为CSV格式的文件。可以指定导出哪些数据项,也可以根据给
定的条件导出数据。
如果想导出CSV格式的文件,需要指定数据项的次序。
×××××××××××××
options:
--help produce help message
-v [ --verbose ] be more verbose (include multiple times for more
verbosity e.g. -vvvvv)
-h [ --host ] arg mongo host to connect to ("left,right" for pairs)
-d [ --db ] arg database to use
-c [ --collection ] arg where 'arg' is the collection to use
-u [ --username ] arg username
-p [ --password ] arg password
--dbpath arg directly access mongod data files in the given path,
instead of connecting to a mongod instance - needs to
lock the data directory, so cannot be used if a
mongod is currently accessing the same path
--directoryperdb if dbpath specified, each db is in a separate
directory
-q [ --query ] arg query filter, as a JSON string
-f [ --fields ] arg comma seperated list of field names e.g. -f name,age
--csv export to csv instead of json
-o [ --out ] arg output file; if not specified, stdout is used
×××××××××××××
mongodump
这个用于对数据库进行备份。
×××××××××××××
options:
--help produce help message
-v [ --verbose ] be more verbose (include multiple times for more
verbosity e.g. -vvvvv)
-h [ --host ] arg mongo host to connect to ("left,right" for pairs)
-d [ --db ] arg database to use
-c [ --collection ] arg collection to use (some commands)
-u [ --username ] arg username
-p [ --password ] arg password
--dbpath arg directly access mongod data files in the given path,
instead of connecting to a mongod instance - needs
to lock the data directory, so cannot be used if a
mongod is currently accessing the same path
--directoryperdb if dbpath specified, each db is in a separate
directory
-o [ --out ] arg (=dump) output directory
-q [ --query ] arg json query
×××××××××××××
例如:导出所有数据库中的所有collects,使用mongodump加上--host参数就可以了,在本地会产生一个dump命名的文件夹。
×××××××××××××
$ ./mongodump --host prod.example.com
connected to: prod.example.com
all dbs
DATABASE: log to dump/log
log.errors to dump/log/errors.bson
713 objects
log.analytics to dump/log/analytics.bson
234810 objects
DATABASE: blog to dump/blog
blog.posts to dump/log/blog.posts.bson
59 objects
DATABASE: admin to dump/admin
×××××××××××××
例如:导出指定数据库中的一个指定的collection,导出结果是一个.bson文件。
$ ./mongodump --db blog --collection posts
connected to: 127.0.0.1
DATABASE: blog to dump/blog
blog.posts to dump/blog/posts.bson
59 objects
在mongodb 1.7.0版本或更高的版本中,还可以将指定的collect通过标准输出流输出。
这时需要指定 --out 参数
$ ./mongodump --db blog --collection posts --out - > blogposts.bson
一次只能导出一个collection到标准输出流文件中。
mongorestore
这个工具获取mongodump的输出结果然后重新存储。重新存储的过程会加上索引。
×××××××××××××
usage: ./mongorestore [options] [directory or filename to restore from]
options:
--help produce help message
-v [ --verbose ] be more verbose (include multiple times for more
verbosity e.g. -vvvvv)
-h [ --host ] arg mongo host to connect to ("left,right" for pairs)
-d [ --db ] arg database to use
-c [ --collection ] arg collection to use (some commands)
-u [ --username ] arg username
-p [ --password ] arg password
--dbpath arg directly access mongod data files in the given path,
instead of connecting to a mongod instance - needs to
lock the data directory, so cannot be used if a
mongod is currently accessing the same path
--directoryperdb if dbpath specified, each db is in a separate
directory
--drop drop each collection before import
--objcheck validate object before inserting
--filter arg filter to apply before inserting
--drop drop each collection before import
--indexesLast wait to add indexes (faster if data isn't inserted in
index order)
×××××××××××××
注意:如果不想创建索引的话,可以从数据库导出的文件夹(dump文件夹)中删除system.indexses.bson文件。
bsondump
这个是在1.6版本中新增加的。
这个工具将一个bson文件转换为一个json/debug输出文件。
×××××××××××××××
usage: ./bsondump [options] [filename]
options:
--help produce help message
--type arg (=json) typ