• Source code analysis of the Lucene update flow



    Update operations are buffered in DocumentsWriterDeleteQueue, and the deletes are processed at flush time.
    DocumentsWriterDeleteQueue stores deletes in a global DeleteSlice and in per-DWPT (DocumentsWriterPerThread) DeleteSlices.
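    DWPT DeleteSlice
    Applies deletes to the docs whose docID is less than docIdUpTo in the unflushed segment bound to the DWPT.
    Buffering
    When IndexWriter.updateDocuments updates docs, the update term is converted into a delete TermNode.
    In DWPT updateDocuments, before consumer.processDocument runs, docsInRamBefore is recorded and later used as the delete's docIdUpTo.
    consumer.processDocument writes the new docs into the buffered segment (Elasticsearch has to pass in the complete docs).
    In finishDocuments, the deleteNode is added to both the DWPT DeleteSlice and the global DeleteSlice.
    applyDeletes
    When the DWPT flushes the segment, the segment's BufferedUpdates are passed in and used to build the FrozenBufferedUpdates of the private segment.
    FreqProxTermsWriter.applyDeletes reads the term postings and applies pendingUpdates to the docs whose docID is less than docIdUpTo, clearing the corresponding bit in liveDocs (setting it to 0).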
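
The role of docIdUpTo in the private segment can be illustrated with a small model. This is only a sketch under simplifying assumptions, not Lucene's code: the class PrivateSegmentSketch and its methods are hypothetical, each doc is reduced to a single id string, and a linear scan stands in for walking a term's postings.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of one in-memory (unflushed) segment; hypothetical, not a Lucene class. */
class PrivateSegmentSketch {
    // Position in the list is the docID; the value is the doc's only indexed term.
    final List<String> docs = new ArrayList<>();
    // Buffered term deletes: term -> docIdUpTo (only docIDs below the limit are affected).
    final Map<String, Integer> pendingTermDeletes = new HashMap<>();

    /** Models an update: record the limit first, then append the new doc. */
    void update(String term, String newDocTerm) {
        int docsInRamBefore = docs.size();   // becomes the delete's docIdUpTo
        docs.add(newDocTerm);                // the new doc gets docID == docsInRamBefore
        // A later update of the same term keeps the larger limit.
        pendingTermDeletes.merge(term, docsInRamBefore, Integer::max);
    }

    /** Models the flush-time step: clear live bits of matching docs below the limit. */
    BitSet flushLiveDocs() {
        BitSet liveDocs = new BitSet(docs.size());
        liveDocs.set(0, docs.size());        // every doc starts out live
        for (Map.Entry<String, Integer> delete : pendingTermDeletes.entrySet()) {
            // The scan below stands in for iterating the term's postings.
            for (int docID = 0; docID < docs.size(); docID++) {
                boolean matches = docs.get(docID).equals(delete.getKey());
                if (matches && docID < delete.getValue()) {
                    liveDocs.clear(docID);
                }
            }
        }
        return liveDocs;
    }
}
```

In this model, updating term "a" twice leaves only the newest "a" doc live, because the second update's limit equals the docID of the doc it adds and the comparison is strict.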

    global DeleteSlice
    Applies deletes and updates to already flushed segments.
    Buffering
    For IndexWriter.deleteDocuments by Terms, deleteDocuments by Queries, and updateDocValues,
    the TermArrayNode, QueryArrayNode, and DocValuesUpdatesNode are added directly to the global DeleteSlice.
    The delete TermNode of IndexWriter.updateDocuments is added to both the DWPT DeleteSlice and the global DeleteSlice.
    The docIdUpTo of a deleteNode in the global DeleteSlice is MAX_INT.
    When the buffer grows too large, or on prepareFlush, globalBufferedUpdates is frozen into a global FrozenBufferedUpdates.
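
The relationship between the shared queue, the global slice, and a DWPT slice can be sketched as windows over one linked list. The code below is a single-threaded toy model under stated assumptions, not Lucene's implementation: DeleteQueueSketch, Node, Slice, addDelete, and addUpdate are hypothetical names, only term deletes are modeled, and buffers are plain maps from term to docIdUpTo.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy, single-threaded model of a delete queue with slices; hypothetical names. */
class DeleteQueueSketch {
    static final int MAX_INT = Integer.MAX_VALUE;

    /** One buffered delete term in a singly linked list. */
    static final class Node {
        final String term;
        Node next;
        Node(String term) { this.term = term; }
    }

    /** A window (head, tail] over the shared list. */
    static final class Slice {
        Node head, tail;
        Slice(Node start) { head = tail = start; }

        /** Copies the window's terms into a buffer with the given limit, then empties the window. */
        void apply(Map<String, Integer> buffer, int docIdUpTo) {
            for (Node n = head; n != tail; ) {
                n = n.next;
                buffer.merge(n.term, docIdUpTo, Integer::max);
            }
            head = tail;
        }
    }

    Node tail = new Node(null);                       // sentinel node, carries no term
    final Slice globalSlice = new Slice(tail);
    final Map<String, Integer> globalBuffer = new HashMap<>();

    /** A new DWPT slice starts empty at the current tail. */
    Slice newSlice() { return new Slice(tail); }

    private void append(Node node) { tail.next = node; tail = node; }

    private void applyGlobal() {
        globalSlice.tail = tail;
        globalSlice.apply(globalBuffer, MAX_INT);     // global deletes are not limited by docID
    }

    /** Models a plain delete by term: the node is appended and picked up by the global slice. */
    void addDelete(String term) {
        append(new Node(term));
        applyGlobal();
    }

    /** Models the delete half of an update: the node reaches the DWPT slice and the global slice. */
    void addUpdate(String term, Slice dwptSlice, Map<String, Integer> dwptBuffer, int docIdUpTo) {
        append(new Node(term));
        dwptSlice.tail = tail;
        dwptSlice.apply(dwptBuffer, docIdUpTo);       // private buffer: limited to docIdUpTo
        applyGlobal();
    }
}
```

In this sketch the same node is never copied: both slices only move their head and tail pointers, and the two buffers differ in the docIdUpTo they record (the DWPT's value versus MAX_INT).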
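    applyDeletes
    After IndexWriter flushes a segment, applyAllDeletesAndUpdates
    processes the global FrozenBufferedUpdates through FrozenBufferedUpdates.applyTermDeletes and FrozenBufferedUpdates.applyQueryDeletes.
    FrozenBufferedUpdates.applyTermDeletes
    Delete by terms: directly deletes the docs that contain the exact terms.
    It iterates over the segments, reads each term's postings directly, and walks the postings to delete the docs.
    FrozenBufferedUpdates.applyQueryDeletes
    Delete by queries: goes through the regular query execution path. Queries such as PhraseQuery typically carry text that was analyzed and normalized when the query was built, whereas delete by terms only matches the exact indexed term.
    For a TermQuery, this is equivalent to applyTermDeletes.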
    FrozenBufferedUpdates.
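
The difference between the two paths can be sketched with a toy model. Everything here is an assumption made for illustration: ApplyDeletesSketch and Segment are hypothetical, postings are a map from exact term to docIDs, and a query is reduced to a predicate over the doc's text rather than Lucene's query machinery.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of applying deletes to flushed segments; hypothetical, not Lucene's classes. */
class ApplyDeletesSketch {
    static final class Segment {
        final List<String> docs;               // original text per docID
        final Map<String, int[]> postings;     // exact indexed term -> docIDs
        final BitSet liveDocs = new BitSet();

        Segment(List<String> docs, Map<String, int[]> postings) {
            this.docs = docs;
            this.postings = postings;
            liveDocs.set(0, docs.size());      // every doc starts out live
        }
    }

    /** Delete by term: look up the exact term's postings in each segment and clear live bits. */
    static int applyTermDelete(List<Segment> segments, String term) {
        int deleted = 0;
        for (Segment seg : segments) {
            int[] docIDs = seg.postings.get(term);
            if (docIDs == null) continue;      // term does not occur in this segment
            for (int docID : docIDs) {
                if (seg.liveDocs.get(docID)) {
                    seg.liveDocs.clear(docID);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    /** Delete by query: evaluate the query against each live doc and clear the matches. */
    static int applyQueryDelete(List<Segment> segments, Predicate<String> query) {
        int deleted = 0;
        for (Segment seg : segments) {
            for (int docID = seg.liveDocs.nextSetBit(0); docID >= 0;
                 docID = seg.liveDocs.nextSetBit(docID + 1)) {
                if (query.test(seg.docs.get(docID))) {
                    seg.liveDocs.clear(docID);
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

In this model a term delete for "Foo" finds nothing when the index only holds the lowercased term "foo", while a predicate that lowercases the text before matching still hits the doc.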

    Reference:
    Lucene 8.7.0
  • Original article: https://www.cnblogs.com/vsop/p/14460278.html