Lucene: How to Write a Lucene Program

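Lucene version: 7.1
Key points for using Lucene
1. Create a document (Document) and add fields (Field) to it; the fields hold the original data;
2. Add the document to an IndexWriter;
3. Build the query with QueryParser.parse();
4. Run the query with the search() method of IndexSearcher.
I. Basic workflow for creating an index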

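// Open a Directory.
// FSDirectory stores the index in a folder on disk; the in-memory RAMDirectory is an alternative.
// indexPath: path of the index folder
Directory dir = FSDirectory.open(Paths.get(indexPath));
// Instantiate an Analyzer, which processes the text.
// StandardAnalyzer tokenizes with the Unicode Text Segmentation algorithm, lowercases the tokens, and filters out common stop words.
// Different languages need different Analyzers, see: https://lucene.apache.org/core/7_1_0/analyzers-common/overview-summary.html
Analyzer analyzer = new StandardAnalyzer();
// Index writer configuration
IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
// OpenMode is one of CREATE, APPEND, CREATE_OR_APPEND; CREATE builds a new index and replaces any existing one.
iwc.setOpenMode(OpenMode.CREATE);
// Instantiate the IndexWriter
IndexWriter writer = new IndexWriter(dir, iwc);
// Instantiate a Document, which represents a file's text content plus metadata such as its modification time and location
Document doc = new Document();
// "path": name of the indexed field. A StringField is indexed as a single token without analysis; Field.Store.YES also stores the value so it can be read back from search results.
doc.add(new StringField("path", file.toString(), Field.Store.YES));
//doc.add(new LongPoint("modified", lastModified));
//doc.add(new TextField("contents", new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))));
// Add the document to the IndexWriter
writer.addDocument(doc);
// Close the writer, which also commits the pending changes
writer.close();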

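Lucene indexing process: convert the original documents to text -> analyze the text into a stream of tokens -> save the analysis results to the index files (one or more inverted-index segments).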
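To make the analysis step more concrete, here is a minimal sketch in plain Java of what "turning text into tokens" can look like. It is a simplified stand-in and not Lucene's StandardAnalyzer: it splits on non-alphanumeric characters instead of using Unicode text segmentation, and the class name `AnalysisSketch` and its small stop-word list are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalysisSketch {
    // Hypothetical stop-word list for illustration; not Lucene's actual set.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to"));

    /** Splits text on non-alphanumeric characters, lowercases each piece, and drops stop words. */
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading piece
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, lazy, dog]
        System.out.println(analyze("The Quick Brown Fox and the Lazy Dog"));
    }
}
```

The resulting terms are what an index would store; the original casing and the stop words are no longer searchable, which is why the same Analyzer should normally be used at index time and at query time.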

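Forward index: maps a document ID to the document, so you can look up which terms a given document contains.

Inverted index: maps a term to the documents that contain it, so you can find documents by keyword.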
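The difference between the two index kinds can be sketched with ordinary Java maps. This is only a conceptual illustration under simple assumptions (whitespace tokenization, documents numbered from 0); the class `IndexKindsSketch` is hypothetical and says nothing about how Lucene actually lays out its segments on disk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class IndexKindsSketch {
    /** Forward index: document ID -> terms contained in that document. */
    public static Map<Integer, List<String>> forwardIndex(List<String> docs) {
        Map<Integer, List<String>> forward = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            String[] terms = docs.get(docId).toLowerCase(Locale.ROOT).split("\\s+");
            forward.put(docId, Arrays.asList(terms));
        }
        return forward;
    }

    /** Inverted index: term -> sorted IDs of the documents that contain it. */
    public static Map<String, SortedSet<Integer>> invertedIndex(Map<Integer, List<String>> forward) {
        Map<String, SortedSet<Integer>> inverted = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> entry : forward.entrySet()) {
            for (String term : entry.getValue()) {
                inverted.computeIfAbsent(term, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene builds an index", "search the index", "lucene search");
        Map<Integer, List<String>> forward = forwardIndex(docs);
        Map<String, SortedSet<Integer>> inverted = invertedIndex(forward);
        System.out.println(forward.get(1));        // prints [search, the, index]
        System.out.println(inverted.get("index")); // prints [0, 1]
    }
}
```

Answering "which documents contain this term?" is a single map lookup in the inverted index, whereas the forward index would require scanning every document. That is why search engines query the inverted form.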

II. Basic search workflow

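// index: path of the index folder; hitsPerPage: page size (both are assumed to be supplied by the caller)
IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(index)));
IndexSearcher searcher = new IndexSearcher(reader);
// Use the same Analyzer that was used at index time
Analyzer analyzer = new StandardAnalyzer();
// "contents": default field searched when the query text names no field
QueryParser parser = new QueryParser("contents", analyzer);
// Parse the query text into a Query object
Query query = parser.parse("123456");
// Run the search and collect at most 5 * hitsPerPage top-scoring hits
TopDocs results = searcher.search(query, 5 * hitsPerPage);
// Each ScoreDoc holds a document ID and its score; searcher.doc(hit.doc) loads the stored fields
ScoreDoc[] hits = results.scoreDocs;
// Close the reader once the results are no longer needed
reader.close();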
Original article: https://www.cnblogs.com/bigshark/p/7899147.html