1 Install the JDK
Run jdk-7u80-windows-x64.exe from the installation folder (note that this is the 64-bit build). For the installation steps, see:
http://jingyan.baidu.com/article/6dad5075d1dc40a123e36ea3.html
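2 Install Tomcat
Run apache-tomcat-7.0.65.exe from the installation folder.
The default installation path is C:/Program Files/Apache Software Foundation/Tomcat 7.0 (you may choose a different path).
2.1 Start the Tomcat server
Open http://localhost:8080 in a browser. If the page loads, Tomcat is installed correctly.
Starting Tomcat once also creates the Catalina/localhost directory under Tomcat/conf; this directory is used in a later step.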
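3 Install Solr
Unzip the archive solr-4.7.2.zip.
Deploy Solr to Tomcat:
a) Copy example/solr from the solr-4.7.2 directory to the Tomcat root directory.
b) Copy dist/solr-4.7.2.war from the solr-4.7.2 directory to Tomcat's webapps folder and rename it to solr.war.
c) Copy all jars under example/lib/ext in the solr-4.7.2 directory to Tomcat's lib folder, and also copy log4j.properties from example/resources to Tomcat's lib folder.
d) Create a solr.xml file under Tomcat/conf/Catalina/localhost with the following content: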
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="webapps/solr.war" debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="C:/Program Files/Apache Software Foundation/Tomcat 7.0/solr" override="true"/>
</Context>
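A malformed path in this context file is an easy mistake to make, so it can help to generate the file and read the value back. The following is a minimal sketch, not part of the original procedure; the function names build_solr_context and read_solr_home are hypothetical, and the path is the default installation path assumed throughout this guide.

```python
import xml.etree.ElementTree as ET


def build_solr_context(solr_home, doc_base="webapps/solr.war"):
    """Return the text of a Tomcat context file that points Solr at solr_home."""
    context = ET.Element("Context", docBase=doc_base, debug="0", crossContext="true")
    ET.SubElement(
        context,
        "Environment",
        {
            "name": "solr/home",
            "type": "java.lang.String",
            "value": solr_home,
            "override": "true",
        },
    )
    body = ET.tostring(context, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body


def read_solr_home(context_xml):
    """Parse a context file and return the configured solr/home value."""
    root = ET.fromstring(context_xml.encode("utf-8"))
    for env in root.iter("Environment"):
        if env.get("name") == "solr/home":
            return env.get("value")
    raise ValueError("no solr/home Environment entry found")


if __name__ == "__main__":
    # Assumed default installation path; adjust to your own Tomcat location.
    home = "C:/Program Files/Apache Software Foundation/Tomcat 7.0/solr"
    print(build_solr_context(home))
```

Writing the returned text to Tomcat/conf/Catalina/localhost/solr.xml gives the same file as the hand-written version above.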
Restart the Tomcat server and open http://localhost:8080/solr in a browser. If the Solr admin page appears, the installation succeeded.
4 Install the IK analyzer
5.1 Extract IKAnalyzer2012FF_hf1.rar
Copy the three files IKAnalyzer.cfg.xml, IKAnalyzer2012FF_u1.jar, and stopword.dic from the extracted IK package
to the Tomcat7\webapps\solr\WEB-INF\lib\ folder.
5.2
Edit schema.xml in the Tomcat7\solr\collection1\conf\ folder and add the following inside <types></types>:
<!-- Simplified Chinese -->
<!-- IK analyzer configuration: "name" is the field type name; the analyzers are selected below -->
<fieldType name="text_ik" class="solr.TextField">
<!-- analyzer used at index time -->
<analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
<!-- analyzer used at query time -->
<analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>
5.3 Configure custom dictionaries
5.3.1 Configure the extension dictionary (ext.dic)
Under C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF, create a folder named classes and create the file ext.dic inside it.
5.3.2 Configure the stop-word dictionary (stopword.dic)
Copy stopword.dic and IKAnalyzer.cfg.xml from the IKAnalyzer_home folder to tomcat_home/webapps/solr/WEB-INF/classes.
5.3.3 Edit IKAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- configure your own extension dictionaries here -->
<entry key="ext_dict">ext.dic;</entry>
<!-- configure your own extension stop-word dictionaries here -->
<entry key="ext_stopwords">stopword.dic;</entry>
</properties>
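To confirm which dictionary files the configuration refers to, the file can be parsed with a few lines of Python. This is only a sketch: the function name dictionary_files is hypothetical, and it assumes that entries are semicolon-separated lists of file names, as the trailing ";" in the sample suggests.

```python
import xml.etree.ElementTree as ET

# Copy of the configuration shown above (comments omitted for brevity).
SAMPLE_CFG = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<entry key="ext_dict">ext.dic;</entry>
<entry key="ext_stopwords">stopword.dic;</entry>
</properties>
"""


def dictionary_files(cfg_bytes):
    """Map each <entry key="..."> to the list of dictionary file names it names."""
    root = ET.fromstring(cfg_bytes)
    result = {}
    for entry in root.findall("entry"):
        # Assumption: several dictionaries may be listed, separated by ';'.
        names = [n.strip() for n in (entry.text or "").split(";") if n.strip()]
        result[entry.get("key")] = names
    return result


if __name__ == "__main__":
    for key, names in dictionary_files(SAMPLE_CFG).items():
        print(key, names)
```

Each file name listed should exist in WEB-INF/classes next to IKAnalyzer.cfg.xml.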
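5.4 Test the analyzer
5.4.1 Enter the test text
Start Tomcat and open http://localhost:8080/solr in a browser. In the left-hand menu, select collection1 as the core and click the Analysis menu (funnel icon). In the Field Value text box on the right, enter the Chinese string you want to test.
5.4.2 View the result
Below Field Value, select text_ik under Analyse FieldName/FieldType, then click the Analyze Value button to see the tokenization result.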
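The same check can also be made without the admin page by calling Solr's field analysis handler over HTTP. The sketch below only builds the request URL. It assumes the /analysis/field handler (FieldAnalysisRequestHandler) is enabled for the core, as in the stock example configuration; the function name analysis_url and the sample text are hypothetical.

```python
from urllib.parse import urlencode


def analysis_url(text, field_type="text_ik",
                 base="http://localhost:8080/solr", core="collection1"):
    """Build a field-analysis request URL for the given text and field type."""
    query = urlencode({
        "analysis.fieldtype": field_type,   # field type to analyse with
        "analysis.fieldvalue": text,        # text to tokenize (UTF-8 encoded)
        "wt": "json",
    })
    return f"{base}/{core}/analysis/field?{query}"


if __name__ == "__main__":
    print(analysis_url("中文分词测试"))
```

Opening the printed URL in a browser should return the token list produced by text_ik for the given text.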
5.5 Result screenshot:
6 Configure Solr
6.1 Configure server.xml
This is Tomcat's configuration file, located in C:\Program Files\Apache Software Foundation\Tomcat 7.0\conf. Check that the file contains the following Connector element:
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443"
URIEncoding="UTF-8"/>
Without URIEncoding="UTF-8", Solr queries that contain non-ASCII characters may arrive garbled, and the search may then return no results.
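The effect is easy to reproduce outside Tomcat. Browsers send query text as percent-encoded UTF-8 bytes, and Tomcat 7 decodes the URI as ISO-8859-1 unless URIEncoding says otherwise. The sketch below, with hypothetical helper names, shows the same bytes decoded both ways.

```python
from urllib.parse import quote, unquote


def encode_query(text):
    """Percent-encode text as UTF-8 bytes, as a browser does for a query string."""
    return quote(text, safe="")


def decode_query(encoded, charset):
    """Decode a percent-encoded string using the given server-side charset."""
    return unquote(encoded, encoding=charset)


if __name__ == "__main__":
    encoded = encode_query("中文")
    print(encoded)
    print(decode_query(encoded, "utf-8"))       # matches the original text
    print(decode_query(encoded, "iso-8859-1"))  # six unrelated characters
```

With the wrong charset the two Chinese characters turn into six Latin-1 characters, so the term Solr receives no longer matches anything in the index.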
6.2 Connect Solr to an Oracle database
6.2.1
Copy the Oracle driver ojdbc6.jar from the installation folder to C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF\lib (where C:\Program Files\Apache Software Foundation\Tomcat 7.0 is the Tomcat installation path).
6.2.2
Create a new file data-config.xml under C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf.
Then, in C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf\solrconfig.xml, configure the path to data-config.xml:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf\data-config.xml</str>
</lst>
</requestHandler>
6.2.3
Copy the dist and contrib folders from the solr-4.7.2 folder to C:\Program Files\Apache Software Foundation\Tomcat 7.0\
6.2.4
In C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf\solrconfig.xml, configure the paths to the dist and contrib folders (solrconfig.xml already contains these entries; if you placed the folders elsewhere, adjust the paths accordingly):
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/extraction/lib" regex=".*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-cell-\d.*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/clustering/lib/" regex=".*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-clustering-\d.*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/langid/lib/" regex=".*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-langid-\d.*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/velocity/lib" regex=".*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-velocity-\d.*\.jar"/>
<lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-dataimporthandler-\d.*\.jar"/>
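Each regex attribute selects jar files by name within its dir. A quick way to see which files a pattern picks up is to test it against a directory listing. The sketch below assumes the pattern must match the whole file name; the listing and the function name matching_jars are hypothetical.

```python
import re


def matching_jars(pattern, filenames):
    """Return the file names that the lib regex matches in full."""
    rx = re.compile(pattern)
    return [name for name in filenames if rx.fullmatch(name)]


# Hypothetical listing of the dist folder, for illustration only.
DIST_LISTING = [
    "solr-cell-4.7.2.jar",
    "solr-core-4.7.2.jar",
    "solr-dataimporthandler-4.7.2.jar",
    "solr-dataimporthandler-extras-4.7.2.jar",
]

if __name__ == "__main__":
    print(matching_jars(r"solr-dataimporthandler-\d.*\.jar", DIST_LISTING))
    print(matching_jars(r"solr-cell-\d.*\.jar", DIST_LISTING))
```

Because the pattern requires a digit immediately after "solr-dataimporthandler-", the extras jar is not selected by the last lib entry and has to be made available separately.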
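6.2.5
Copy the two DataImportHandler jars from the dist folder (presumably solr-dataimporthandler-4.7.2.jar and solr-dataimporthandler-extras-4.7.2.jar) into the same folder as the database driver. The paths configured above may be either absolute or relative.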
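6.2.6 Configure the database connection
First edit data-config.xml, the database connection configuration file created earlier, and paste in the following content. Note that this sample uses the SQL Server JDBC driver; for Oracle with ojdbc6.jar, the driver class is oracle.jdbc.OracleDriver and the URL has the form jdbc:oracle:thin:@host:port:SID.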
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
<dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://192.168.0.8;DatabaseName=test" user="sa" password="123"/>
<document name="Info">
<entity name="zpxx" transformer="ClobTransformer" pk="id"
query="select id, name from table"
deltaImportQuery="select id, name from table where id='${dataimporter.delta.id}'"
deltaQuery="SELECT id FROM table where adddate > '${dataimporter.last_index_time}'">
<field column="id" name="id" />
<field column="name" name="name" />
</entity>
</document>
</dataConfig>
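The three queries play different roles: query is used for a full import, deltaQuery collects the primary keys of rows changed since the last run, and deltaImportQuery is then run once per collected key, which is why it normally filters on ${dataimporter.delta.id}. The sketch below imitates that two-step delta logic with an in-memory SQLite database. It is an illustration of the idea, not DataImportHandler code; the table name items and the sample rows are hypothetical.

```python
import sqlite3


def demo_connection():
    """Create an in-memory table with three sample rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, adddate TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [
            (1, "alpha", "2015-01-01 08:00:00"),
            (2, "beta", "2015-01-02 09:30:00"),
            (3, "gamma", "2015-01-03 10:45:00"),
        ],
    )
    return conn


def delta_import(conn, last_index_time):
    """Return the documents a delta import would re-index."""
    # Step 1 (deltaQuery): primary keys of rows added after the last index time.
    changed_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM items WHERE adddate > ? ORDER BY id", (last_index_time,)
        )
    ]
    # Step 2 (deltaImportQuery): fetch each changed row by its primary key.
    docs = []
    for doc_id in changed_ids:
        row = conn.execute(
            "SELECT id, name FROM items WHERE id = ?", (doc_id,)
        ).fetchone()
        docs.append({"id": row[0], "name": row[1]})
    return docs


if __name__ == "__main__":
    print(delta_import(demo_connection(), "2015-01-01 12:00:00"))
```

Only the rows added after the given timestamp are returned; a deltaImportQuery without the key filter would instead re-read the whole table for every changed key.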
7 Configure automatic incremental (delta) indexing in Solr
1. Copy apache-solr-dataimportscheduler-1.0.jar to C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF\lib (where C:\Program Files\Apache Software Foundation\Tomcat 7.0 is the Tomcat installation path).
2. Edit web.xml under C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF and add the following before the servlet element:
<listener>
<listener-class>
org.apache.solr.handler.dataimport.scheduler.ApplicationListener
</listener-class>
</listener>
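3. Extract dataimport.properties from apache-solr-dataimportscheduler-1.0.jar and place it in C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\conf. Create the conf folder if it does not exist.
8 Restart Tomcat for the changes to take effect
dataimport.properties settings: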
#################################################
# #
# dataimport scheduler properties #
# #
#################################################
# to sync or not to sync
# 1 - active; anything else - inactive
syncEnabled=1
# which cores to schedule
# in a multi-core environment you can decide which cores you want synchronized
# leave empty or comment it out if using single-core deployment
syncCores=collection1
# solr server name or IP address
# [defaults to localhost if empty]
server=localhost
# solr server port
# [defaults to 80 if empty]
port=8080
# application name/context
# [defaults to current ServletContextListener's context (app) name]
webapp=solr
# URL params [mandatory]
# remainder of URL
params=/select?qt=/dataimport&command=delta-import&clean=false&commit=true
# schedule interval
# number of minutes between two runs
# [defaults to 30 if empty]
interval=1
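# interval between full index rebuilds, in minutes; default 7200, i.e. 5 days
# empty, 0, or commented out: never rebuild the index
# reBuildIndexInterval=0
# parameters for the full rebuild
#reBuildIndexParams=/select?qt=/dataimport&command=full-import&clean=true&commit=true
# start time of the rebuild interval timer; the first actual run happens at
# reBuildIndexBeginTime + reBuildIndexInterval*60*1000
# two formats: 2012-04-11 03:10:00 or 03:10:00; the latter is completed with the date on which the service started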
reBuildIndexBeginTime=02:10:00
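To check the settings before restarting Tomcat, it can help to work out which URL the scheduler will request. The sketch below assumes the scheduler concatenates the values as http://server:port/webapp/core followed by params; that composition is an assumption about the scheduler jar, and the function names are hypothetical.

```python
# Subset of the properties shown above.
SAMPLE_PROPERTIES = """
syncEnabled=1
syncCores=collection1
server=localhost
port=8080
webapp=solr
params=/select?qt=/dataimport&command=delta-import&clean=false&commit=true
interval=1
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, because params itself contains '='.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def delta_import_urls(props):
    """Build one request URL per core listed in syncCores (assumed URL layout)."""
    server = props.get("server") or "localhost"
    port = props.get("port") or "80"
    webapp = props.get("webapp") or "solr"
    params = props["params"]
    cores = [c.strip() for c in props.get("syncCores", "").split(",") if c.strip()]
    if not cores:
        # Single-core deployment: no core segment in the path.
        return [f"http://{server}:{port}/{webapp}{params}"]
    return [f"http://{server}:{port}/{webapp}/{core}{params}" for core in cores]


if __name__ == "__main__":
    for url in delta_import_urls(parse_properties(SAMPLE_PROPERTIES)):
        print(url)
```

Opening the printed URL manually in a browser is a simple way to confirm that the delta import works before relying on the scheduler.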
9 Configure suggest (autocomplete)
9.1 Configure solrconfig.xml
9.1.1 Configure the solr.SpellCheckComponent element (modify it if present, add it otherwise)
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<str name="queryAnalyzerFieldType">text_ik</str>
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">suggest</str>
<float name="threshold">0</float>
<str name="buildOnCommit">true</str>
</lst>
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">suggest</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="distanceMeasure">internal</str>
<float name="accuracy">0.2</float>
<int name="maxEdits">2</int>
<int name="minPrefix">1</int>
<int name="maxInspections">50</int>
<int name="minQueryLength">1</int>
<float name="maxQueryFrequency">0.01</float>
</lst>
<lst name="spellchecker">
<str name="name">wordbreak</str>
<str name="classname">solr.WordBreakSolrSpellChecker</str>
<str name="field">suggest</str>
<str name="combineWords">true</str>
<str name="breakWords">true</str>
<int name="maxChanges">10</int>
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler"
name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
<queryConverter name="phraseQueryConverter" class="org.apache.solr.spelling.SpellingQueryConverter"/>
9.1.2 Configure the org.apache.solr.spelling.SuggestQueryConverter element (modify it if present, add it otherwise)
<queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/>
9.2 Configure schema.xml
9.2.1 Open schema.xml and add the following field:
<field name="suggest" type="text_ik" indexed="true" stored="true" multiValued="true" />
Place it alongside the existing field definitions.
9.2.2 Then add the following copyField entries:
<copyField source="cprname" dest="suggest" />
<copyField source="cpcname" dest="suggest" />
<copyField source="cbname" dest="suggest" />
<copyField source="cgoodsname" dest="suggest" />
<copyField source="ccspecname" dest="suggest" />
<copyField source="cattrname" dest="suggest" />
The suggestions are generated from the values of these source fields.
9.2.3 Also add:
<fieldType name="text_spell" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<field name="suggestion" type="text_spell" indexed="true" stored="true" termVectors="true" multiValued="true" />
<copyField source="title" dest="suggestion" />
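Once the handler and fields are in place, a client can query /suggest and read the suggestions from the spellcheck section of the response. The sketch below builds the request URL and extracts suggestions from a JSON response. It assumes the default flat layout in which the suggestions array alternates between a token and its details; the sample response is hypothetical and only illustrates that layout.

```python
from urllib.parse import urlencode


def suggest_url(prefix, base="http://localhost:8080/solr", core="collection1"):
    """Build a request URL for the /suggest handler configured above."""
    return f"{base}/{core}/suggest?" + urlencode({"q": prefix, "wt": "json"})


def extract_suggestions(response):
    """Map each input token to its suggested completions."""
    flat = response.get("spellcheck", {}).get("suggestions", [])
    result = {}
    # Assumed layout: [token, details, token, details, ...]; entries such as
    # "collation" carry a plain string instead of a dict and are skipped.
    for token, details in zip(flat[0::2], flat[1::2]):
        if isinstance(details, dict):
            result[token] = list(details.get("suggestion", []))
    return result


# Hypothetical response, for illustration only.
SAMPLE_RESPONSE = {
    "spellcheck": {
        "suggestions": [
            "sol",
            {"numFound": 2, "startOffset": 0, "endOffset": 3,
             "suggestion": ["solr", "solution"]},
            "collation",
            "solr",
        ]
    }
}

if __name__ == "__main__":
    print(suggest_url("sol"))
    print(extract_suggestions(SAMPLE_RESPONSE))
```

In a real client the JSON body returned for suggest_url(...) would be passed to extract_suggestions after decoding.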