Inverted index
Query
# View index settings
GET /book/_settings
GET /_all/_settings
# Search the data in all indices
GET _search
{
"query": {
"match_all": {}
}
}
# Get a document
GET /lib/user/1
# Get a document (only the specified fields)
GET /lib/user/1?_source=age,about
# View the mapping
GET /lib/user/_mapping
Add
# Create an index
PUT /lib/
{
"settings":{
"index":{
"number_of_shards":3,
"number_of_replicas":0
}
}
}
# Add a document (with an explicit id)
PUT /lib/user/1
{
"first_name":"Jane",
"last_name":"Smith",
"age":32,
"about":"I like to collect rock albums",
"interests":["music","basketball"]
}
# Add a document (no id given; Elasticsearch generates one)
POST /lib/user/
{
"first_name":"Douglas",
"last_name":"Fir",
"age":23,
"about":"I like to build cabinets",
"interests":["forestry"]
}
Update
# Modify (overwrite the whole document)
PUT /lib/user/1
{
"first_name":"Jane",
"last_name":"Smith",
"age":36,
"about":"I like to collect rock albums",
"interests":["music","basketball"]
}
# Modify (only the specified fields)
POST /lib/user/1/_update
{
"doc":{
"age":30
}
}
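The difference between the two ways of modifying a document can be sketched locally in Python. This is only a simulation of the semantics, not a call to Elasticsearch: `full_overwrite` and `partial_update` are hypothetical helper names, and the merge rule (objects merged recursively, scalars and arrays replaced) is a simplified model of what a partial `doc` update does.

```python
from copy import deepcopy


def full_overwrite(stored, new_doc):
    """PUT-style replacement: the stored document is discarded entirely."""
    return deepcopy(new_doc)


def partial_update(stored, partial):
    """_update-style merge: fields in `partial` are merged into a copy of `stored`."""
    merged = deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged recursively.
            merged[key] = partial_update(merged[key], value)
        else:
            # Scalars and arrays replace the old value.
            merged[key] = deepcopy(value)
    return merged


stored = {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]}

# Sending only {"age": 36} with PUT would drop every other field.
after_put = full_overwrite(stored, {"age": 36})

# Sending {"doc": {"age": 30}} to _update keeps the other fields.
after_update = partial_update(stored, {"age": 30})

print(after_put)
print(after_update)
```

This is why the full-overwrite request above repeats every field, while the partial update only sends `age`.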
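Delete
# Delete a document by id
DELETE /lib/user/1
# Delete a type (deleting a type's mapping was removed in Elasticsearch 2.0, so newer versions reject this request)
DELETE /lib/user
# Delete an index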
DELETE lib
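Fetching documents in bulk
- Use the Multi Get API provided by Elasticsearch:
- With the Multi Get API you can fetch a set of documents in one request by index name, type name, and document id. The documents can come from a single index or from several different indices.
- Using curl: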
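# The Content-Type header is required from Elasticsearch 6.0 onward
curl -H 'Content-Type: application/json' 'http://192.168.242.22:9200/_mget' -d '{
"docs":[
{
"_index":"lib",
"_type":"user",
"_id":1
},
{
"_index":"lib",
"_type":"user",
"_id":"AWdQF9axrlJvDlOTtvkF"
}
]
}'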
# Kibana Dev Tools
GET /_mget
{
"docs":[
{
"_index":"lib",
"_type":"user",
"_id":1
},
{
"_index":"lib",
"_type":"user",
"_id":"AWdQF9axrlJvDlOTtvkF"
}
]
}
# Fetch only the specified fields
GET /_mget
{
"docs":[
{
"_index":"lib",
"_type":"user",
"_id":1,
"_source":"interests"
},
{
"_index":"lib",
"_type":"user",
"_id":"AWdQF9axrlJvDlOTtvkF",
"_source":["interests","age"]
},
{
"_index":"book",
"_type":"novel",
"_id":"5",
"_source":["title","word_count"]
}
]
}
# When all documents share the same index and type
GET /lib/user/_mget
{
"docs":[
{
"_id":1
},
{
"_id":"AWdQF9axrlJvDlOTtvkF"
}
]
}
# Shorter still, using ids
GET /lib/user/_mget
{
"ids":["1","AWdQF9axrlJvDlOTtvkF"]
}
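The progression above, from the full `docs` form to the short `ids` form, can be sketched as a small Python helper. `build_mget_request` is a hypothetical name introduced for illustration. It only builds the path and JSON body and sends nothing, and it assumes the typed URLs used throughout these notes.

```python
import json


def build_mget_request(docs):
    """Build (path, body) for a Multi Get request.

    docs is a list of (index, type, id) tuples. When every document shares
    the same index and type, the short "ids" form is used; otherwise each
    entry carries its own _index and _type.
    """
    if not docs:
        raise ValueError("at least one document is required")
    indices = {index for index, _, _ in docs}
    types = {doc_type for _, doc_type, _ in docs}
    if len(indices) == 1 and len(types) == 1:
        index, doc_type, _ = docs[0]
        return f"/{index}/{doc_type}/_mget", {"ids": [str(d[2]) for d in docs]}
    body = {
        "docs": [
            {"_index": index, "_type": doc_type, "_id": str(doc_id)}
            for index, doc_type, doc_id in docs
        ]
    }
    return "/_mget", body


same_path, same_body = build_mget_request(
    [("lib", "user", 1), ("lib", "user", "AWdQF9axrlJvDlOTtvkF")]
)
mixed_path, mixed_body = build_mget_request(
    [("lib", "user", 1), ("book", "novel", "5")]
)
print(same_path, json.dumps(same_body))
print(mixed_path, json.dumps(mixed_body))
```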
Bulk operations with the Bulk API
- The bulk format:
{action: {metadata}}\n
{request body}\n
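# action: the operation to perform
# create: create the document only if it does not already exist (fails if it does)
# update: update a document
# index: create a new document or replace an existing one
# delete: delete a document
# metadata: _index, _type, _id
- Examples
# Delete (a delete action has no request body line)
{"delete":{"_index":"lib","_type":"user","_id":"1"}}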
# Bulk add
POST /lib/books/_bulk
{"index":{"_id":"1"}}
{"title":"Html5","price":45}
{"index":{"_id":"2"}}
{"title":"PHP","price":35}
{"index":{"_id":"3"}}
{"title":"Java","price":55}
{"index":{"_id":"4"}}
{"title":"Python","price":50}
{"index":{"_id":"5"}}
{"title":"Scala","price":48}
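The line-oriented format above is easy to get wrong by hand, so here is a minimal Python sketch that assembles a bulk body from a list of operations. `build_bulk_body` is a hypothetical helper, not part of any client library. It writes one action line per operation, adds a source line for every action except `delete`, and ends every line with a newline.

```python
import json

# Actions that must be followed by a request body line.
ACTIONS_WITH_BODY = {"index", "create", "update"}


def build_bulk_body(operations):
    """Build a newline-delimited bulk body.

    operations is a list of (action, metadata, source) tuples; source is
    None for delete, a document for index/create, and {"doc": {...}} for update.
    """
    lines = []
    for action, metadata, source in operations:
        if action != "delete" and action not in ACTIONS_WITH_BODY:
            raise ValueError(f"unknown bulk action: {action}")
        lines.append(json.dumps({action: metadata}, separators=(",", ":")))
        if action in ACTIONS_WITH_BODY:
            if source is None:
                raise ValueError(f"{action} requires a request body line")
            lines.append(json.dumps(source, separators=(",", ":")))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


bulk_body = build_bulk_body([
    ("index", {"_id": "1"}, {"title": "Html5", "price": 45}),
    ("update", {"_id": "5"}, {"doc": {"price": 58}}),
    ("delete", {"_id": "4"}, None),
])
print(bulk_body, end="")
```

The resulting string would be sent as the body of a `POST /lib/books/_bulk` request.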
# Mixed operations (delete, create, index, update) in one request
POST /lib/books/_bulk
{"delete":{"_index":"lib","_type":"books","_id":"4"}}
{"create":{"_index":"tt","_type":"ttt","_id":"100"}}
{"name":"lisi"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"zhaosi"}
{"update":{"_index":"lib","_type":"books","_id":"5"}}
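{"doc":{"price":58}}
- How much data can a single bulk request handle?
- A bulk request is loaded into memory before it is processed, so its size is limited. There is no single best size: it depends on your hardware, the size and complexity of your documents, and your indexing and search load.
- A common recommendation is 1,000 to 5,000 documents or 5 to 15 MB per request. By default a request cannot exceed 100 MB (the `http.max_content_length` setting); this limit can be changed in the Elasticsearch configuration file.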
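One way to stay inside these limits is to split a long list of operations into several bulk bodies before sending them. The sketch below is an illustration only: `chunk_bulk_bodies` is a hypothetical helper, and the defaults of 1,000 operations and 5 MB are taken from the low end of the recommendation above, not from any Elasticsearch setting.

```python
import json


def chunk_bulk_bodies(operations, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk bodies that each stay within max_docs operations and max_bytes.

    operations is an iterable of (action_line, source) pairs, where source is
    None for delete actions. A single operation larger than max_bytes is
    still emitted, alone in its own batch.
    """
    batch, batch_bytes = [], 0
    for action_line, source in operations:
        piece = json.dumps(action_line) + "\n"
        if source is not None:
            piece += json.dumps(source) + "\n"
        size = len(piece.encode("utf-8"))
        # Flush the current batch before it would exceed either limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield "".join(batch)
            batch, batch_bytes = [], 0
        batch.append(piece)
        batch_bytes += size
    if batch:
        yield "".join(batch)


ops = [
    ({"index": {"_id": str(i)}}, {"title": f"book-{i}", "price": i})
    for i in range(2500)
]
bodies = list(chunk_bulk_bodies(ops, max_docs=1000))
print(len(bodies), [body.count("\n") // 2 for body in bodies])
```

Each yielded string would be sent as a separate `_bulk` request. The right values for `max_docs` and `max_bytes` still have to be found by measuring on your own cluster.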