My Cloud Journey: Storing Lucene Content in Hadoop (136)

First, a quick look at how Lucene is used:

Building the index:

package com.rx;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.SimpleFSDirectory;
import org.apache.lucene.util.Version;

public class C {

    public static void main(String[] args) throws IOException {
        // Lucene 3.5 API: the config carries the compatibility version and the analyzer.
        IndexWriterConfig i = new IndexWriterConfig(Version.LUCENE_35, new StandardAnalyzer(Version.LUCENE_35));
        // The index lives on the local file system under E:/index.
        Directory d = new SimpleFSDirectory(new File("E:/index"));
        IndexWriter writer = new IndexWriter(d, i);

        // One document with two fields, both stored and analyzed.
        Document doc = new Document();
        doc.add(new Field("title", "lucene introduction", Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
        doc.add(new Field("time", "60", Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
        writer.addDocument(doc);

        // Flush the document to disk and release the writer.
        writer.commit();
        writer.close();
    }
}

Querying the index:

package com.rx;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.SimpleFSDirectory;

public class R {

    public static void main(String[] args) throws IOException {
        // Open the same directory that the indexing program wrote to.
        Directory d = new SimpleFSDirectory(new File("E:/index"));
        IndexReader reader = IndexReader.open(d);
        IndexSearcher searcher = new IndexSearcher(reader);

        // Exact-term lookup on the "title" field; return at most 10 hits.
        Query query = new TermQuery(new Term("title", "lucene"));
        TopDocs hits = searcher.search(query, 10);
        System.out.println(hits.totalHits);

        // Print the doc id and the stored fields of every hit.
        for (ScoreDoc scoreDoc : hits.scoreDocs) {
            System.out.println(scoreDoc.doc);
            Document doc = searcher.doc(scoreDoc.doc);
            System.out.println("title /t " + doc.get("title"));
            System.out.println(doc.get("time"));
        }

        searcher.close();
        // A searcher built from an existing reader does not close it, so close it here.
        reader.close();
    }
}
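The query works because the "title" field is indexed with Field.Index.ANALYZED: the analyzer breaks "lucene introduction" into separate lowercase terms, while TermQuery looks up one exact term without analyzing it. The plain-Java sketch below only illustrates that idea. AnalyzerSketch and tokenize are hypothetical names, not part of Lucene, and the real StandardAnalyzer does more than this (stop-word removal, for example).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: a rough stand-in for what an
// analyzer does to a field value at index time.
class AnalyzerSketch {

    // Split on anything that is not a letter or digit, then lowercase each token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> terms = tokenize("lucene introduction");
        System.out.println(terms);                                  // [lucene, introduction]
        System.out.println(terms.contains("lucene"));               // true
        System.out.println(terms.contains("Lucene"));               // false
        System.out.println(terms.contains("lucene introduction"));  // false
    }
}
```

By the same reasoning, a TermQuery for "Lucene" or for the whole string "lucene introduction" would be expected to find nothing in this index, since neither is one of the indexed terms.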

Luke, a tool for inspecting Lucene indexes:

http://code.google.com/p/luke/

Run it with java -jar lukeall-3.5.0.jar.

Query output after running the indexing program above twice:

2

0

title /t lucene introduction

60

1

title /t lucene introduction

60
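Two hits appear because IndexWriterConfig defaults to OpenMode.CREATE_OR_APPEND, so the second run appends another copy of the document instead of replacing the index (calling setOpenMode(IndexWriterConfig.OpenMode.CREATE) on the config should overwrite it instead). The toy below is a plain-Java sketch of an append-only inverted index, just to show where the doc ids 0 and 1 come from. ToyIndex and its methods are hypothetical and far simpler than Lucene's real data structures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy, not Lucene's implementation: an in-memory inverted index
// that only ever appends documents.
class ToyIndex {
    // term -> ids of the documents containing it
    private final Map<String, List<Integer>> postings = new HashMap<String, List<Integer>>();
    // doc id -> stored title
    private final List<String> storedTitles = new ArrayList<String>();

    // Append a document and return its id; ids are handed out in insertion order.
    int addDocument(String title) {
        int docId = storedTitles.size();
        storedTitles.add(title);
        for (String term : title.toLowerCase().split("\\s+")) {
            List<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new ArrayList<Integer>();
                postings.put(term, ids);
            }
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
        return docId;
    }

    // Exact-term lookup, comparable in spirit to a TermQuery.
    List<Integer> search(String term) {
        List<Integer> ids = postings.get(term);
        return ids == null ? Collections.<Integer>emptyList() : ids;
    }

    String title(int docId) {
        return storedTitles.get(docId);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("lucene introduction"); // first run of the indexing program
        index.addDocument("lucene introduction"); // second run appends a duplicate

        List<Integer> hits = index.search("lucene");
        System.out.println(hits.size());
        for (int docId : hits) {
            System.out.println(docId);
            System.out.println("title /t " + index.title(docId));
        }
    }
}
```

Running main prints a hit count of 2 followed by doc ids 0 and 1 with the same title, which mirrors the shape of the Lucene output above.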

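I have also found that some people combine modified Lucene code with Solr and HBase to provide distributed search, and that such a setup can serve 35 million searches per day.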
Original post: https://www.cnblogs.com/hehehaha/p/6332469.html