• mongoexport


    导出csv格式:默认为json格式

    1,导出csv格式数据,需要同时指定:--type=csv     --fields  column_a,column_b,column_c 

    mongoexport --authenticationDatabase admin   -hzzz  --port=28000 -uxxx -pyyy -d apple   -c users   --type=csv --fields _id,username,domain,name,display_name,department_name  -o  users.csv

    2,导出csv格式数据,需要同时指定:--type=csv   --fieldFile

    [work@hostname tmp]$ cat fields.txt
    _id
    username
    domain
    [work@hostname tmp]$ 
    
    $ mongoexport --authenticationDatabase admin   -hzzz  --port=28000 -uxxx -pyyy -d apple   -c users   --type=csv --fieldFile  fields.txt  -o  users.csv
    
    
    #####################
    
    这里就会按照_id,username,domain三个字段来导出:
    _id,username,domain
    ObjectId(5db805516e1c8355a15fe80a),glc,google.com
    ObjectId(5db805516e1c8355a15fe80b),wjl,apple.com
    ObjectId(5db805516e1c8355a15fe80c),zd,airbnb.com
    ObjectId(5db805516e1c8355a15fe80d),lt,amazon.com

    3,导出csv格式数据文件可以不要列名称 ,加上 --noHeaderLine就能让导出得文件不包含: _id,username,domain ,常用:

    [work@hostname tmp]$ cat fields.txt
    _id
    username
    domain
    [work@hostname tmp]$ 
    
    $ mongoexport --authenticationDatabase admin   -hzzz  --port=28000 -uxxx -pyyy -d apple   -c users   --type=csv --fieldFile  fields.txt --noHeaderLine -o  users.csv
    
    
    #####################
    
    这里就会按照_id,username,domain三个字段来导出:
    
    ObjectId(5db805516e1c8355a15fe80a),glc,google.com
    ObjectId(5db805516e1c8355a15fe80b),wjl,apple.com
    ObjectId(5db805516e1c8355a15fe80c),zd,airbnb.com
    ObjectId(5db805516e1c8355a15fe80d),lt,amazon.com

     按照指定条件导出数据:--query   '{}'

    将名称为“glc”的全部导出:

    [work@hostname tmp]$mongoexport --authenticationDatabase admin -hxxx  --port=28000 -uyyy -pzzz -d apple -c users --type=csv --fieldFile fields.txt --query '{"username":"glc"}' -o users.csv.glc
    2020-10-30T16:57:00.563+0800    connected to: xxx:28000
    2020-10-30T16:57:00.743+0800    exported 10 records
    [work@hostanme tmp]$ less cas_users.csv.glc
    _id,username,domain
    ObjectId(7db805916e5c8355a35fe80a),glc,apple.com
    ObjectId(7e05a1456e5c836c0c0a1240),glc,apple.com
    ObjectId(7e05a3f66e5c836c0c0ac235),glc,apple.cn
    ObjectId(7e05a53a6e5c836c0c0b19c4),glc,google.cn
    ObjectId(7e05a5ee6e5c836c0c0b5d2d),glc,google.cn
    ObjectId(7e05a77d6e5c836c0c0b8fb8),glc,google.cn
    ObjectId(7e3920b6d3dbdb333aa04844),glc,google.cn
    ObjectId(7e4e625ed3dbdb333aa04f7e),glc,amazon.com
    ObjectId(7e96c569e566e174015b7fae),glc,amazon.com
    ObjectId(7f4c80abd3dbdb3d702f7a00),glc,apple.com
    [work@hostname tmp]$ 

     ########################################################

    mongoexport导出某个集合:

    /home/work/mongodb/4.0/bin/mongoexport --authenticationDatabase admin --host 10.10.10.10 --port 28000 --username mongo_backup --password 123456  --db db_name --collection table_name --out /home/work/tmp/table_name

     从上面的结果可以看出,我们在导出数据时没有显示指定导出样式 ,默认导出了JSON格式的数据。

    mongoexport --host hostname --port 28000 --username user --authenticationDatabase admin --collection table_name --db db_name --out table_name.json
    mongoexport --uri 'mongodb://user:password@hostname:28000/db_name?authsource=admin' --collection table_name --out table_name.json
    mongoexport --db db_name --collection table_name --query '{"dept": "ABC", date: { $gte: { "$date": "2018-01-01T00:00:00.000Z" } }}'

    uri规则:

    --uri "mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]"
    [work@hostname tmp]$ mongoimport  --help
    Usage:
      mongoimport <options> <file>
    
    Import CSV, TSV or JSON data into MongoDB. If no file is provided, mongoimport reads from stdin.
    
    See http://docs.mongodb.org/manual/reference/program/mongoimport/ for more information.
    
    general options:
          --help                                      print usage
          --version                                   print the tool version and exit
    
    verbosity options:
      -v, --verbose=<level>                           more detailed log output (include multiple times for more verbosity, e.g. -vvvvv, or specify a
                                                      numeric value, e.g. --verbose=N)
          --quiet                                     hide all log output
    
    connection options:
      -h, --host=<hostname>                           mongodb host to connect to (setname/host1,host2 for replica sets)
          --port=<port>                               server port (can also use --host hostname:port)
    
    kerberos options:
          --gssapiServiceName=<service-name>          service name to use when authenticating using GSSAPI/Kerberos ('mongodb' by default)
          --gssapiHostName=<host-name>                hostname to use when authenticating using GSSAPI/Kerberos (remote server's address by default)
    
    ssl options:
          --ssl                                       connect to a mongod or mongos that has ssl enabled
          --sslCAFile=<filename>                      the .pem file containing the root certificate chain from the certificate authority
          --sslPEMKeyFile=<filename>                  the .pem file containing the certificate and key
          --sslPEMKeyPassword=<password>              the password to decrypt the sslPEMKeyFile, if necessary
          --sslCRLFile=<filename>                     the .pem file containing the certificate revocation list
          --sslAllowInvalidCertificates               bypass the validation for server certificates
          --sslAllowInvalidHostnames                  bypass the validation for server name
          --sslFIPSMode                               use FIPS mode of the installed openssl library
    
    authentication options:
      -u, --username=<username>                       username for authentication
      -p, --password=<password>                       password for authentication
          --authenticationDatabase=<database-name>    database that holds the user's credentials
          --authenticationMechanism=<mechanism>       authentication mechanism to use
    
    namespace options:
      -d, --db=<database-name>                        database to use
      -c, --collection=<collection-name>              collection to use
    
    uri options:
          --uri=mongodb-uri                           mongodb uri connection string
    
    input options:
      -f, --fields=<field>[,<field>]*                 comma separated list of fields, e.g. -f name,age
          --fieldFile=<filename>                      file with field names - 1 per line
          --file=<filename>                           file to import from; if not specified, stdin is used
          --headerline                                use first line in input source as the field list (CSV and TSV only)
          --jsonArray                                 treat input source as a JSON array
          --parseGrace=<grace>                        controls behavior when type coercion fails - one of: autoCast, skipField, skipRow, stop
                                                      (defaults to 'stop') (default: stop)
          --type=<type>                               input format to import: json, csv, or tsv (defaults to 'json') (default: json)
          --columnsHaveTypes                          indicated that the field list (from --fields, --fieldsFile, or --headerline) specifies types;
                                                      They must be in the form of '<colName>.<type>(<arg>)'. The type can be one of: auto, binary,
                                                      bool, date, date_go, date_ms, date_oracle, double, int32, int64, string. For each of the date
                                                      types, the argument is a datetime layout string. For the binary type, the argument can be one
                                                      of: base32, base64, hex. All other types take an empty argument. Only valid for CSV and TSV
                                                      imports. e.g. zipcode.string(), thumbnail.binary(base64)
    
    ingest options:
          --drop                                      drop collection before inserting documents
          --ignoreBlanks                              ignore fields with empty values in CSV and TSV
          --maintainInsertionOrder                    insert documents in the order of their appearance in the input source
      -j, --numInsertionWorkers=<number>              number of insert operations to run concurrently (defaults to 1) (default: 1)
          --stopOnError                               stop importing at first insert/upsert error
          --mode=[insert|upsert|merge]                insert: insert only. upsert: insert or replace existing documents. merge: insert or modify
                                                      existing documents. defaults to insert
          --upsertFields=<field>[,<field>]*           comma-separated fields for the query part when --mode is set to upsert or merge
          --writeConcern=<write-concern-specifier>    write concern options e.g. --writeConcern majority, --writeConcern '{w: 3, wtimeout: 500, fsync:
                                                      true, j: true}'
          --bypassDocumentValidation                  bypass document validation

    #################

    ###############################

  • 相关阅读:
    动物细胞结构模型 | animal cell structure
    课程学习
    (基因功能 & 基因表达调控)研究方案
    PCR | RT-PCR 的原理及应用
    ggplot的boxplot添加显著性 | Add P-values and Significance Levels to ggplots | 方差分析
    常见的医学基因筛查检测 | genetic testing | 相癌症早筛 | 液体活检
    (转载)RNA表观遗传学开创者何川
    生物信息基本工具和数据库
    女士品茶 | The Lady Tasting Tea | 统计学史
    R Shiny app | 交互式网页开发
  • 原文地址:https://www.cnblogs.com/igoodful/p/13902621.html
Copyright © 2020-2023  润新知