• Solr Installation and Configuration Guide


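    1 Install the JDK

    Run jdk-7u80-windows-x64.exe from the installation folder (note that this is the 64-bit build). For the installation steps, see:

    http://jingyan.baidu.com/article/6dad5075d1dc40a123e36ea3.html

    2 Install Tomcat

    Run apache-tomcat-7.0.65.exe from the installation folder.

    The default path is C:/Program Files/Apache Software Foundation/Tomcat 7.0 (you can change it freely).

    2.1 Start the Tomcat server

    Open http://localhost:8080 in a browser; if the page loads, Tomcat is installed correctly.

    Starting Tomcat once also creates the Catalina/localhost directory under Tomcat/conf, which is needed in a later step.

    3 Install Solr

    Extract the archive solr-4.7.2.zip.

    Deploy Solr to Tomcat:

    a) Copy the example/solr directory from solr-4.7.2 into the Tomcat root directory.

    b) Copy dist/solr-4.7.2.war from solr-4.7.2 into Tomcat's webapps directory, renaming it to solr.war.

    c) Copy all jars under example/lib/ext in solr-4.7.2 into Tomcat's lib directory, and copy log4j.properties from example/resources into Tomcat's lib directory as well.

    d) Create a solr.xml file under Tomcat/conf/Catalina/localhost with the following content: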

    <?xml version="1.0" encoding="utf-8"?>

    <Context docBase="webapps/solr.war" debug="0" crossContext="true">

    <Environment name="solr/home" type="java.lang.String" value="C:/Program Files/Apache Software Foundation/Tomcat 7.0/solr" override="true"/>

    </Context>

     

    Restart the Tomcat server and open http://localhost:8080/solr in a browser. If the Solr admin interface appears, the installation succeeded.
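
    Steps a) to d) can also be scripted. The following is a minimal Python sketch, not part of the original procedure: the function name deploy_solr and its two arguments are made up for illustration, and it assumes that solr-4.7.2.zip is already extracted and that the target solr folder does not exist yet (shutil.copytree refuses to overwrite an existing folder).

```python
import shutil
from pathlib import Path


def deploy_solr(solr_dir, tomcat_home):
    """Copy the Solr 4.7.2 files into Tomcat, following steps a) to d)."""
    solr_dir, tomcat_home = Path(solr_dir), Path(tomcat_home)

    # a) Solr home: example/solr -> <tomcat>/solr
    solr_home = tomcat_home / "solr"
    shutil.copytree(solr_dir / "example" / "solr", solr_home)

    # b) WAR: dist/solr-4.7.2.war -> <tomcat>/webapps/solr.war
    webapps = tomcat_home / "webapps"
    webapps.mkdir(parents=True, exist_ok=True)
    shutil.copy2(solr_dir / "dist" / "solr-4.7.2.war", webapps / "solr.war")

    # c) Logging jars and log4j.properties -> <tomcat>/lib
    lib = tomcat_home / "lib"
    lib.mkdir(parents=True, exist_ok=True)
    for jar in (solr_dir / "example" / "lib" / "ext").glob("*.jar"):
        shutil.copy2(jar, lib / jar.name)
    shutil.copy2(solr_dir / "example" / "resources" / "log4j.properties",
                 lib / "log4j.properties")

    # d) Context descriptor that points Tomcat at the WAR and the Solr home.
    #    The path is written with forward slashes and is not XML-escaped,
    #    so this sketch only suits paths without quotes or ampersands.
    ctx_dir = tomcat_home / "conf" / "Catalina" / "localhost"
    ctx_dir.mkdir(parents=True, exist_ok=True)
    context = (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<Context docBase="webapps/solr.war" debug="0" crossContext="true">\n'
        '  <Environment name="solr/home" type="java.lang.String" '
        f'value="{solr_home.as_posix()}" override="true"/>\n'
        '</Context>\n'
    )
    (ctx_dir / "solr.xml").write_text(context, encoding="utf-8")
    return solr_home
```

    A call such as deploy_solr(r"C:\solr-4.7.2", r"C:\Program Files\Apache Software Foundation\Tomcat 7.0") would lay the files out as described above; the first path is a hypothetical extraction location.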

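    4 Install the IK analyzer

    5.1 Extract IKAnalyzer2012FF_hf1.rar

    Copy the three files IKAnalyzer.cfg.xml, IKAnalyzer2012FF_u1.jar, and stopword.dic from the extracted IK package into the Tomcat7\webapps\solr\WEB-INF\lib\ folder.

    5.2 Edit schema.xml

    Open schema.xml in the Tomcat7\solr\collection1\conf\ folder and add the following inside <types></types>: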

    <!-- Simplified Chinese -->

    <!-- IK analyzer field type: name is the field type name; the analyzers are selected below -->

    <fieldType name="text_ik" class="solr.TextField">

    <!-- analyzer used at index time -->

    <analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>

    <!-- analyzer used at query time -->

    <analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>

    </fieldType>

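    5.3 Configure custom dictionaries

    5.3.1 Configure the extension dictionary (ext.dic)

    First, create a new folder named classes under D:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF, then create the file ext.dic inside it.

    5.3.2 Configure the stop-word dictionary (stopword.dic)

    Copy stopword.dic and IKAnalyzer.cfg.xml from the IKAnalyzer_home folder to tomcat_home/webapps/solr/WEB-INF/classes.

    5.3.3 Edit IKAnalyzer.cfg.xml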

    <?xml version="1.0" encoding="UTF-8"?>

    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">

    <properties>

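    <comment>IK Analyzer extension configuration</comment>

    <!-- Users can configure their own extension dictionaries here -->

    <entry key="ext_dict">ext.dic;</entry>

     

    <!-- Users can configure their own extension stop-word dictionaries here -->

    <entry key="ext_stopwords">stopword.dic;</entry>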

     

    </properties>
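
    The ext.dic file itself is plain text. As a hedged sketch of how it could be generated: the helper below (write_ik_dictionary is an illustrative name, and the sample terms in the usage note are placeholders) writes one term per line. It assumes that IK expects UTF-8 without a byte-order mark and tolerates Windows line endings, which is how the IK documentation is usually summarized; verify this against the manual shipped in the IK package.

```python
from pathlib import Path


def write_ik_dictionary(path, words):
    """Write one term per line as UTF-8 without a BOM; return the terms written."""
    cleaned = []
    for word in words:
        word = word.strip()
        if word and word not in cleaned:  # drop blank entries and duplicates
            cleaned.append(word)

    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Encode explicitly with "utf-8" (not "utf-8-sig") so no BOM is emitted,
    # and write bytes so the CRLF line endings are not altered by the platform.
    data = "".join(term + "\r\n" for term in cleaned).encode("utf-8")
    path.write_bytes(data)
    return cleaned
```

    For instance, write_ik_dictionary("webapps/solr/WEB-INF/classes/ext.dic", ["云计算", "大数据"]) would create the classes folder if needed and write two entries. Tomcat has to be restarted before the analyzer picks up the new terms.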

     

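    5.4 Test the analyzer

    5.4.1 Run the analysis

    Start Tomcat and open http://localhost:8080/solr in a browser. In the left-hand menu select collection1 as the core, click the Analysis item (funnel icon), and enter the Chinese string to test in the Field Value text box on the right.

    5.4.2 Result

    Below Field Value, select text_ik under Analyse Fieldname / FieldType, then click the Analyze Value button to see the tokenization result.

    5.6 [Screenshot: tokenization result for text_ik]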
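
    The same check can be made without the admin page. This is a sketch only: it assumes the /analysis/field handler from the stock Solr 4.x example solrconfig.xml is still registered for collection1, and the function names are illustrative. The shape of the returned JSON is not interpreted here; the caller inspects it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def analysis_url(text, field_type="text_ik",
                 core_url="http://localhost:8080/solr/collection1"):
    """Build a field-analysis request for the given text and field type."""
    params = {
        "analysis.fieldtype": field_type,
        "analysis.fieldvalue": text,  # urlencode percent-encodes this as UTF-8
        "wt": "json",
    }
    return f"{core_url}/analysis/field?{urlencode(params)}"


def analyze(text, **kwargs):
    """Send the request to a running Solr and return the parsed JSON reply."""
    with urlopen(analysis_url(text, **kwargs), timeout=10) as resp:
        return json.load(resp)
```

    With Tomcat running, analyze("中文分词测试") returns the analysis for text_ik as a dictionary; analysis_url alone is enough to obtain a link that can be pasted into a browser.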

    6 Configure Solr

    6.1 Configure server.xml

    This is Tomcat's own configuration file, located in C:\Program Files\Apache Software Foundation\Tomcat 7.0\conf. Check that it contains the following Connector element:

    <Connector port="8080" protocol="HTTP/1.1"
                   connectionTimeout="20000"
                   redirectPort="8443"
                   URIEncoding="UTF-8"/>

         Without URIEncoding="UTF-8", Solr may receive garbled query strings, which can cause searches to return nothing.
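
    The reason is that Tomcat 7 decodes the request URI as ISO-8859-1 unless URIEncoding says otherwise, while browsers send the query as percent-encoded UTF-8. The short Python illustration below reproduces the effect outside Tomcat; decode_query_value is a made-up helper that only mimics the connector's decoding step.

```python
from urllib.parse import quote, unquote


def decode_query_value(raw, uri_encoding):
    """Decode a percent-encoded value as a connector with this URIEncoding would."""
    return unquote(raw, encoding=uri_encoding)


keyword = "中文"
on_the_wire = quote(keyword)  # UTF-8 bytes, percent-encoded: %E4%B8%AD%E6%96%87

garbled = decode_query_value(on_the_wire, "iso-8859-1")  # Tomcat 7 default
correct = decode_query_value(on_the_wire, "utf-8")       # with URIEncoding="UTF-8"

print(on_the_wire)
print(repr(garbled))
print(repr(correct))
```

    The ISO-8859-1 result is a six-character string that matches nothing in the index, whereas the UTF-8 result is the original keyword.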

    6.2 Configure Solr to connect to an Oracle database

    6.2.1

    Copy the Oracle driver package ojdbc6.jar from the installation folder to C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF\lib (C:\Program Files\Apache Software Foundation\Tomcat 7.0 being the Tomcat installation path).

    6.2.2

    Create a new file named data-config.xml under C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf.

    Then register the path to data-config.xml in C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf\solrconfig.xml:

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
    <str name="config">C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf\data-config.xml</str>
    </lst>
    </requestHandler>

    6.2.3

    Copy the dist and contrib folders from the solr-4.7.2 folder to C:\Program Files\Apache Software Foundation\Tomcat 7.0\

    6.2.4

    In C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection1\conf\solrconfig.xml, set the paths to the dist and contrib folders (solrconfig.xml already contains these entries; if your folders are located somewhere else, just adjust the paths):

    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/extraction/lib" regex=".*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-cell-\d.*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/clustering/lib/" regex=".*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-clustering-\d.*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/langid/lib/" regex=".*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-langid-\d.*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/contrib/velocity/lib" regex=".*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-velocity-\d.*\.jar"/>
    <lib dir="C:/Program Files/Apache Software Foundation/Tomcat 7.0/dist/" regex="solr-dataimporthandler-\d.*\.jar"/>

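    6.2.5

    Copy the two files from the dist folder (in this setup most likely the solr-dataimporthandler jars) into the same folder as the database driver. The paths configured above can be given either as absolute or as relative paths.

    6.2.6 Configure the database connection

    Start with data-config.xml, the database connection configuration file created earlier, and paste the following into it. Note that this sample uses the Microsoft SQL Server driver; for Oracle with ojdbc6.jar, change the driver to oracle.jdbc.OracleDriver and use a matching jdbc:oracle:thin URL.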

     

    <?xml version="1.0" encoding="UTF-8"?>

    <dataConfig>

    <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://192.168.0.8;DatabaseName=test" user="sa" password="123"/>

    <document name="Info">

     

    <entity name="zpxx"  transformer="ClobTransformer" pk="id"

                     query="select id, name from table"

                     deltaImportQuery="select id, name from table where id='${dataimporter.delta.id}'"

                     deltaQuery="SELECT id FROM table where adddate > '${dataimporter.last_index_time}'">

    <field column="id" name="id"/>

    <field column="name" name="name"/>

    </entity>

    </document>

    </dataConfig>
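
    Once data-config.xml is in place, an import is started by calling the /dataimport handler registered in 6.2.2. The sketch below is one way to do that from Python; the helper names are illustrative, and the core URL assumes the localhost:8080 and collection1 setup used throughout this guide.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed from the setup above; adjust host, port, and core name as needed.
CORE_URL = "http://localhost:8080/solr/collection1"


def dataimport_url(command, core_url=CORE_URL, **params):
    """Build a URL for the /dataimport handler, e.g. full-import or status."""
    query = {"command": command, "wt": "json"}
    query.update(params)  # extra DataImportHandler parameters such as clean/commit
    return f"{core_url}/dataimport?{urlencode(query)}"


def run_dataimport(command, **params):
    """Send the command to a running Solr and return the JSON reply as a dict."""
    with urlopen(dataimport_url(command, **params), timeout=30) as resp:
        return json.load(resp)
```

    A first load would be run_dataimport("full-import", clean="true", commit="true"), followed by run_dataimport("status") to see whether the handler is still busy. The import runs in the background, so the first call returns before indexing has finished.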

     

    7 Configure automatic incremental (delta) indexing

    1. Copy apache-solr-dataimportscheduler-1.0.jar to C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF\lib (C:\Program Files\Apache Software Foundation\Tomcat 7.0 being the Tomcat installation path).

    2. Edit web.xml under C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\solr\WEB-INF and add the following before the servlet node:

    <listener>

    <listener-class>

                      org.apache.solr.handler.dataimport.scheduler.ApplicationListener

    </listener-class>

    </listener>

    3. Extract dataimport.properties from apache-solr-dataimportscheduler-.jar and place it in C:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\conf; create the conf folder if it does not exist.

    8 Restart Tomcat

    The settings in dataimport.properties are described below:

     

    #################################################

    #                                               #

    #       dataimport scheduler properties         #

    #                                               #

    #################################################

     

    #  to sync or not to sync

    #  1 - active; anything else - inactive

    syncEnabled=1

     

    #  which cores to schedule

    #  in a multi-core environment you can decide which cores you want synchronized

    #  leave empty or comment it out if using single-core deployment

    syncCores=collection1

     

    #  solr server name or IP address

    #  [defaults to localhost if empty]

    server=localhost

     

    #  solr server port

    #  [defaults to 80 if empty]

    port=8080

     

    #  application name/context

    #  [defaults to current ServletContextListener's context (app) name]

    webapp=solr

     

    #  URL params [mandatory]

    #  remainder of URL

    params=/dataimport?command=delta-import&clean=false&commit=true

     

    #  schedule interval

    #  number of minutes between two runs

    #  [defaults to 30 if empty]

    interval=1

     

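    #  interval for rebuilding the index, in minutes; default 7200 (note: 7200 minutes is 5 days, 1 day would be 1440)

    #  empty, 0, or commented out: never rebuild the index

    # reBuildIndexInterval=0

     

    #  parameters for rebuilding the index

    #reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true

     

    #  reference start time for the rebuild interval; the first actual run happens at
    #  reBuildIndexBeginTime + reBuildIndexInterval*60*1000 (milliseconds)

    #  two formats: 2012-04-11 03:10:00 or 03:10:00; the latter is completed with the date on which the service started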

    reBuildIndexBeginTime=02:10:00
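
    The rebuild timing rule in the comments can be worked through with a small Python sketch. It only restates the documented formula (begin time plus the interval in minutes) and the two accepted time formats; it is not the scheduler's own code, and the function names and the service_start parameter are illustrative.

```python
from datetime import datetime, timedelta


def parse_begin_time(value, service_start):
    """Accept 'YYYY-MM-DD HH:MM:SS' or 'HH:MM:SS' (date taken from service start)."""
    value = value.strip()
    try:
        return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        clock = datetime.strptime(value, "%H:%M:%S").time()
        return datetime.combine(service_start.date(), clock)


def first_rebuild(begin_time, interval_minutes, service_start):
    """Return begin time + interval, or None when rebuilding is switched off."""
    # Empty, missing, or 0 means "never rebuild", as described in the comments.
    if not interval_minutes or int(interval_minutes) <= 0:
        return None
    begin = parse_begin_time(begin_time, service_start)
    return begin + timedelta(minutes=int(interval_minutes))
```

    With reBuildIndexBeginTime=02:10:00, an interval of 1440 minutes, and a service started on 2012-04-11, the formula gives 2012-04-12 02:10:00 as the first rebuild.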

     

    9 Configure suggest (search-as-you-type hints)

    9.1 Configure solrconfig.xml

    9.1.1 Configure the solr.SpellCheckComponent node (modify it if it exists, add it otherwise)

             <searchComponent class="solr.SpellCheckComponent" name="suggest">

    <str name="queryAnalyzerFieldType">text_ik</str>

    <lst name="spellchecker">

    <str name="name">suggest</str>

    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>

    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>

    <str name="field">suggest</str>

    <float name="threshold">0</float>

    <str name="buildOnCommit">true</str>

    </lst>

    <lst name="spellchecker">

    <str name="name">default</str>

    <str name="field">suggest</str>

    <str name="classname">solr.DirectSolrSpellChecker</str>

    <str name="distanceMeasure">internal</str>

    <float name="accuracy">0.2</float>

    <int name="maxEdits">2</int>

    <int name="minPrefix">1</int>

    <int name="maxInspections">50</int>

    <int name="minQueryLength">1</int>

    <float name="maxQueryFrequency">0.01</float>

    </lst>

    <lst name="spellchecker">

    <str name="name">wordbreak</str>

    <str name="classname">solr.WordBreakSolrSpellChecker</str>

    <str name="field">suggest</str>

    <str name="combineWords">true</str>

    <str name="breakWords">true</str>

    <int name="maxChanges">10</int>

    </lst>

    </searchComponent>

    <requestHandler class="org.apache.solr.handler.component.SearchHandler" 

                        name="/suggest">

    <lst name="defaults">

    <str name="spellcheck">true</str>

    <str name="spellcheck.dictionary">suggest</str>

    <str name="spellcheck.onlyMorePopular">true</str>

    <str name="spellcheck.extendedResults">false</str>

    <str name="spellcheck.count">10</str>

    <str name="spellcheck.collate">true</str>

    </lst>

    <arr name="components">

    <str>suggest</str>

    </arr>

    </requestHandler>

    <queryConverter name="phraseQueryConverter" class="org.apache.solr.spelling.SpellingQueryConverter"/>

    9.1.2 Configure the org.apache.solr.spelling.SuggestQueryConverter node (modify it if it exists, add it otherwise)

    <queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/>

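    9.2 Configure schema.xml

    9.2.1 Open schema.xml and add

    <field name="suggest" type="text_ik" indexed="true" stored="true" multiValued="true" />

    alongside the other <field> definitions.

    9.2.2 Then add the following alongside the other <copyField> entries: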

    <copyField source="cprname" dest="suggest" />

    <copyField source="cpcname" dest="suggest" />

    <copyField source="cbname" dest="suggest" />

    <copyField source="cgoodsname" dest="suggest" />

    <copyField source="ccspecname" dest="suggest" />

    <copyField source="cattrname" dest="suggest" />

    These entries copy the source fields into suggest, so the hints are generated from the values of those fields.

    9.2.3 Also add:

    <fieldType name="text_spell" class="solr.TextField">

    <analyzer type="index">

    <tokenizer class="solr.StandardTokenizerFactory"/>

    <filter class="solr.LowerCaseFilterFactory"/>

    </analyzer>

    <analyzer type="query">

    <tokenizer class="solr.StandardTokenizerFactory"/>

    <filter class="solr.LowerCaseFilterFactory"/>

    </analyzer>

    </fieldType>

    <field name="suggestion" type="text_spell" indexed="true" stored="true" termVectors="true" multiValued="true" />

    <copyField source="title" dest="suggestion" />
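
    After reindexing, the /suggest handler from 9.1.1 can be queried over HTTP. The Python sketch below is illustrative only: the function names are made up, and extract_suggestions assumes the usual Solr 4.x JSON layout in which spellcheck.suggestions is a flat list alternating between the typed term and an object that carries a "suggestion" list. Check the raw response of your own instance before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def suggest_url(prefix, core_url="http://localhost:8080/solr/collection1"):
    """Build a request for the /suggest handler configured in solrconfig.xml."""
    return f"{core_url}/suggest?{urlencode({'q': prefix, 'wt': 'json'})}"


def extract_suggestions(response):
    """Collect the suggestion strings from a parsed spellcheck response."""
    items = response.get("spellcheck", {}).get("suggestions", [])
    if isinstance(items, dict):  # layout used when json.nl=map is requested
        items = list(items.values())
    found = []
    for item in items:
        if isinstance(item, dict):  # skip the term names and collation strings
            found.extend(item.get("suggestion", []))
    return found


def suggest(prefix, **kwargs):
    """Query a running Solr and return the suggestions for the given prefix."""
    with urlopen(suggest_url(prefix, **kwargs), timeout=10) as resp:
        return extract_suggestions(json.load(resp))
```

    Calling suggest("so") against the running instance returns a list of completions drawn from the suggest field, or an empty list if nothing matches.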

  • Original article: https://www.cnblogs.com/chuizilong/p/10179491.html