ArangoDB Data Import

Contents
- The arangoimp method
  - Options reference
  - Examples
- The Python method
  - Single-document import
  - Bulk import
1. The arangoimp method

Options reference

Global configuration
- --backslash-escape
use backslash as the escape character for quotes, used for csv (default: false)
- --batch-size
size for individual data batches (in bytes) (default: 16777216)
- --collection
collection name (default: "")
- --configuration
the configuration file or 'none' (default: "")
- --convert
convert the strings 'null', 'false', 'true' and strings containing numbers into non-string types (csv and tsv only) (default: true)
- --create-collection
create collection if it does not yet exist (default: false)
- --create-collection-type
type of collection if collection is created (edge or document). possible values: "document", "edge" (default: "document")
- --file
file name ("-" for STDIN) (default: "")
- --from-collection-prefix
_from collection name prefix (will be prepended to all values in '_from') (default: "")
- --ignore-missing
ignore missing columns in csv input (default: false)
- --on-duplicate
action to perform when a unique key constraint violation occurs. possible values: "error", "ignore", "replace", "update" (default: "error")
- --overwrite
overwrite collection if it exists (WARNING: this will remove any data from the collection) (default: false)
- --progress
show progress (default: true)
- --quote
quote character(s), used for csv (default: ")
- --remove-attribute <string...>
remove an attribute before inserting an attribute into a collection (for csv and tsv only) (default: )
- --separator
field separator, used for csv and tsv (default: "")
- --skip-lines
number of lines to skip for formats (csv and tsv only) (default: 0)
- --threads
Number of parallel import threads. Most useful for the rocksdb engine (default: 2)
- --to-collection-prefix
_to collection name prefix (will be prepended to all values in '_to') (default: "")
- --translate <string...>
translate an attribute name (use as --translate "from=to", for csv and tsv only) (default: )
- --type
type of import file. possible values: "auto", "csv", "json", "jsonl", "tsv" (default: "json")
- --version
reports the version and exits (default: false)

Section 'log' (Configure the logging)
- --log.color
use colors for TTY logging (default: true)
- --log.level <string...>
the global or topic-specific log level (default: "info")
- --log.output <string...>
log destination(s) (default: )
- --log.role
log server role (default: false)
- --log.use-local-time
use local timezone instead of UTC (default: false)
- --log.use-microtime
use microtime instead (default: false)

Section 'server' (Configure a connection to the server)
- --server.authentication
require authentication credentials when connecting (does not affect the server-side authentication settings) (default: true)
- --server.connection-timeout
connection timeout in seconds (default: 5)
- --server.database
database name to use when connecting (default: "_system")
- --server.endpoint
endpoint to connect to, use 'none' to start without a server (default: "http+tcp://127.0.0.1:8529")
- --server.password
password to use when connecting. If not specified and authentication is required, the user will be prompted for a password (default: "")
- --server.request-timeout
request timeout in seconds (default: 1200)
- --server.username
username to use when connecting (default: "root")

Section 'ssl' (Configure SSL communication)
- --ssl.protocol
ssl protocol (1 = SSLv2, 2 = SSLv2 or SSLv3 (negotiated), 3 = SSLv3, 4 = TLSv1, 5 = TLSv1.2). possible values: 1, 2, 3, 4, 5 (default: 5)

Section 'temp' (Configure temporary files)
- --temp.path
path for temporary files (default: "")
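
When an import is driven from a script, the options listed above can be assembled into an argument list instead of being typed by hand. The following is a minimal sketch, not part of arangoimp itself: the helper name build_arangoimp_command, the dict layout, and the column names in the --translate values are all assumptions made for illustration. Only the flag names come from the reference above.

```python
import shlex


def build_arangoimp_command(options, executable="arangoimp"):
    """Turn a dict of option names and values into an argument list.

    Hypothetical helper for illustration. The executable name is a
    parameter because newer ArangoDB releases ship the tool under the
    name arangoimport.
    """
    args = [executable]
    for name, value in options.items():
        flag = "--" + name
        if isinstance(value, bool):
            # Boolean options are passed as explicit true/false values.
            args += [flag, "true" if value else "false"]
        elif isinstance(value, (list, tuple)):
            # Repeatable options such as --translate appear once per value.
            for item in value:
                args += [flag, str(item)]
        else:
            args += [flag, str(value)]
    return args


command = build_arangoimp_command({
    "server.endpoint": "tcp://127.0.0.1:8529",
    "server.database": "_system",
    "file": "test.csv",
    "type": "csv",
    "collection": "test",
    "create-collection": True,
    "on-duplicate": "ignore",
    # Assumed column names: rename "id" to "_key" and "name" to "label".
    "translate": ["id=_key", "name=label"],
})

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be handed to subprocess.run(command, check=True) once the file and server settings match your environment. Building a list rather than a single string avoids shell quoting problems with passwords and separators.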

Examples
- Import data into a document (vertex) collection
```
arangoimp --server.endpoint tcp://127.0.0.1:8529 --server.username root --server.password "***" --server.database _system --file test.csv --type csv --create-collection true --create-collection-type document --overwrite true --collection "test"
```
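- Import data into an edge collection. The collection type must be edge, and every row needs _from and _to values; edges.csv and test_edges are placeholder names.
```
arangoimp --server.endpoint tcp://127.0.0.1:8529 --server.username root --server.password "***" --server.database _system --file edges.csv --type csv --create-collection true --create-collection-type edge --overwrite true --collection "test_edges"
```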
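
The commands above expect CSV files whose header row names the document attributes. As a rough sketch of what such input could look like, the snippet below generates one vertex file and one edge file in memory. The collection name people, the attribute names, and the helper to_csv are assumptions for illustration. The one fixed requirement is that edge rows carry _from and _to, each holding a document ID of the form collection/key.

```python
import csv
import io


def to_csv(fieldnames, rows):
    """Serialize a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Vertex documents: _key is optional, but setting it makes edges easy to write.
people = [
    {"_key": "lola", "first": "Lola", "last": "Martin"},
    {"_key": "abby", "first": "Abby", "last": "Smith"},
]

# Edge documents: _from and _to reference vertices as "collection/key".
friendships = [
    {"_from": "people/lola", "_to": "people/abby", "since": 2018},
]

vertex_csv = to_csv(["_key", "first", "last"], people)
edge_csv = to_csv(["_from", "_to", "since"], friendships)

print(vertex_csv)
print(edge_csv)
```

To feed these to arangoimp, write each string to disk, for example with open(path, "w", newline=""). If the file stores bare keys rather than full IDs, the --from-collection-prefix and --to-collection-prefix options from the reference above are the place to supply the collection name.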
2. The Python method

Single-document import
```
from arango import ArangoClient

# Initialize the ArangoDB client.
client = ArangoClient()

# Connect to "test" database as root user.
db = client.db('test', username='root', password='passwd')

# Get the API wrapper for "students" collection.
students = db.collection('students')

# Create a test document to insert.
lola = {'_key': 'lola', 'GPA': 3.5, 'first': 'Lola', 'last': 'Martin'}

# Insert a new document. This returns the document metadata.
metadata = students.insert(lola)
```
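Bulk import

Every call to insert sends its own request to the database, so loading a large dataset one document at a time wastes network resources. For larger volumes, use transactions instead. The code below uses the context-manager transaction API of python-arango 4.x, which was current when this was written; later major versions changed this interface.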
```
from arango import ArangoClient

# Initialize the ArangoDB client.
client = ArangoClient()

# Connect to "test" database as root user.
db = client.db('test', username='root', password='passwd')

# Get the API wrapper for "students" collection.
students = db.collection('students')

# Begin a transaction via context manager. This returns an instance of
# TransactionDatabase, a database-level API wrapper tailored specifically
# for executing transactions. The transaction is automatically committed
# when exiting the context. The TransactionDatabase wrapper cannot be
# reused after commit and may be discarded after.
with db.begin_transaction() as txn_db:

    # Child wrappers are also tailored for transactions.
    txn_col = txn_db.collection('students')

    # API execution context is always set to "transaction".
    assert txn_db.context == 'transaction'
    assert txn_col.context == 'transaction'

    # TransactionJob objects are returned instead of results.
    job1 = txn_col.insert({'_key': 'Abby'})
    job2 = txn_col.insert({'_key': 'John'})
    job3 = txn_col.insert({'_key': 'Mary'})

# Upon exiting context, transaction is automatically committed.
assert 'Abby' in students
assert 'John' in students
assert 'Mary' in students

# Retrieve the status of each transaction job.
for job in txn_db.queued_jobs():
    # Status is set to either "pending" (transaction is not committed yet
    # and result is not available) or "done" (transaction is committed and
    # result is available).
    assert job.status() in {'pending', 'done'}

# Retrieve the job results.
metadata = job1.result()
assert metadata['_id'] == 'students/Abby'

metadata = job2.result()
assert metadata['_id'] == 'students/John'

metadata = job3.result()
assert metadata['_id'] == 'students/Mary'

# Transactions can be initiated without using a context manager.
# If return_result parameter is set to False, no jobs are returned.
txn_db = db.begin_transaction(return_result=False)
txn_db.collection('students').insert({'_key': 'Jake'})
txn_db.collection('students').insert({'_key': 'Jill'})

# The commit must be called explicitly.
txn_db.commit()
assert 'Jake' in students
assert 'Jill' in students
```
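
As an alternative to queuing inserts in a transaction, documents can be sent in fixed-size batches. The sketch below assumes that the installed python-arango version provides insert_many on collection objects. The helper names chunked and insert_in_batches and the default batch size of 1000 are hypothetical choices, not library features.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(collection, documents, batch_size=1000):
    """Send documents to `collection` one batch per request.

    `collection` is any object with an insert_many(list) method, which is
    assumed here to be a python-arango collection wrapper.
    Returns the number of documents sent.
    """
    sent = 0
    for batch in chunked(documents, batch_size):
        collection.insert_many(batch)
        sent += len(batch)
    return sent


# Batch sizes produced for 2500 items with a batch size of 1000.
print([len(batch) for batch in chunked(range(2500), 1000)])
```

With the students wrapper from the examples above, a call would look like insert_in_batches(students, docs). The return value of insert_many is not inspected here, so check the library documentation for how per-document failures are reported before relying on this for real imports.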
References

ArangoDB Documentation v3.3

python-arango documentation

Reposting is welcome; please cite the source: https://www.cnblogs.com/minglex/p/9705481.html