• Solr 5.3.1 installation and configuration


    1. Download Solr 5.3.1

    http://mirror.bit.edu.cn/apache/lucene/solr/5.3.1/

    wget http://mirror.bit.edu.cn/apache/lucene/solr/5.3.1/solr-5.3.1.tgz

    2. Unpack the archive

    tar zxf solr-5.3.1.tgz
    or
    unzip solr-5.3.1.zip
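Because the version number appears in the directory on the mirror, in the archive name, and in every later path, it is easy to mix up 5.3.0 and 5.3.1. A minimal sketch of deriving all of them from one variable follows. The variable names are hypothetical; only the mirror URL comes from step 1.

```shell
#!/bin/sh
# Hypothetical helper variables: derive every version-dependent name from one place.
SOLR_VERSION="5.3.1"
MIRROR="http://mirror.bit.edu.cn/apache/lucene/solr"   # mirror used in step 1
SOLR_TGZ="solr-${SOLR_VERSION}.tgz"
SOLR_URL="${MIRROR}/${SOLR_VERSION}/${SOLR_TGZ}"
SOLR_DIR="/data/solr-${SOLR_VERSION}"                  # where later steps expect the unpacked tree

echo "$SOLR_URL"
echo "$SOLR_DIR"

# To actually download and unpack (not executed in this sketch):
#   cd /data && wget "$SOLR_URL" && tar zxf "$SOLR_TGZ"
```

With this in place, the later cp commands can use "$SOLR_DIR" instead of a hard-coded path.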

    3. Configure Solr

    1. Copy the Solr web application files

    mkdir -p /data/web/solr/solr_app/
    cp -r /data/solr-5.3.1/server/solr-webapp/webapp/* /data/web/solr/solr_app/

    2. Copy the JAR files (the logging libraries in lib/ext)

    cp /data/solr-5.3.1/server/lib/ext/* /data/web/solr/solr_app/WEB-INF/lib/

    3. Copy the logging configuration file

    mkdir -p /data/web/solr/solr_app/WEB-INF/classes
    cp /data/solr-5.3.1/server/resources/log4j.properties /data/web/solr/solr_app/WEB-INF/classes/

    4. Change where solr.log is written. In this setup the default was /root/logs/solr.log

    vim /data/web/solr/solr_app/WEB-INF/classes/log4j.properties

    Change it to your own log path.
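The edit can also be scripted instead of done in vim. The sketch below assumes the file sets the log directory through a `solr.log=` line, which is how the copy shipped with Solr 5.x usually looks; check your own file first. The target directory /data/logs/solr is a made-up example, and the script works on a temporary stand-in file rather than the real one.

```shell
#!/bin/sh
LOG_DIR="/data/logs/solr"     # hypothetical target directory, pick your own
PROPS="$(mktemp)"             # stand-in for WEB-INF/classes/log4j.properties

# Two representative lines (assumed layout of the shipped file).
cat > "$PROPS" <<'EOF'
solr.log=logs
log4j.appender.file.File=${solr.log}/solr.log
EOF

# Rewrite only the solr.log= line; '|' is the sed delimiter because the path contains '/'.
sed -i.bak "s|^solr\.log=.*|solr.log=${LOG_DIR}|" "$PROPS"

grep '^solr.log=' "$PROPS"
```

A relative value such as `logs` is resolved against the directory Tomcat was started from, which would explain a log file ending up under /root.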

    5. Copy solr.xml into the directory that <env-entry-value> in web.xml will point to

    mkdir -p /data/web/solr/solr_app/WEB-INF/solr_home
    cp /data/solr-5.3.1/example/example-DIH/solr/solr.xml /data/web/solr/solr_app/WEB-INF/solr_home/

    6. Configure solr_home

    vim /data/web/solr/solr_app/WEB-INF/web.xml   # set env-entry-value to /data/web/solr/solr_app/WEB-INF/solr_home
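For reference, here is a sketch of what the env-entry block looks like and how its value could be set without opening an editor. It runs against a temporary stand-in file, not the real web.xml. In the stock web.xml this block usually ships inside an XML comment, so it also has to be uncommented by hand; the sketch does not do that.

```shell
#!/bin/sh
SOLR_HOME_DIR="/data/web/solr/solr_app/WEB-INF/solr_home"
WEB_XML="$(mktemp)"           # stand-in for WEB-INF/web.xml

# The JNDI entry Solr reads its home directory from (placeholder value as shipped).
cat > "$WEB_XML" <<'EOF'
<env-entry>
   <env-entry-name>solr/home</env-entry-name>
   <env-entry-value>/put/your/solr/home/here</env-entry-value>
   <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
EOF

# Replace whatever value is there with the solr_home path from this guide.
sed -i.bak "s|<env-entry-value>.*</env-entry-value>|<env-entry-value>${SOLR_HOME_DIR}</env-entry-value>|" "$WEB_XML"

grep 'env-entry-value' "$WEB_XML"
```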

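    Tomcat configuration: in server.xml, set connectionTimeout="20000" on the Connector. I do not know why, but with a larger value the Solr page failed to load after Tomcat started.

    Start Tomcat. At this point no cores exist yet.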

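    4. Configure a Solr core

    1. Go into solr_home to configure the index, the analyzer, the data source, and the scheduled import:

    cd /data/web/solr/solr_app/WEB-INF/solr_home/

    2. Each language gets its own Solr configuration, which first needs a directory for that language. For English, for example: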

    mkdir pc_EN
    cd pc_EN
    touch core.properties
    mkdir conf
    mkdir data

     

    3. Edit core.properties to set the core name and the location of the index:

    vim core.properties

    The solr_index directory that will hold the index files can be created now: mkdir -p /data/web/solr/solr_app/WEB-INF/solr_index

    File contents:

    name=pc_EN
    dataDir=/data/web/solr/solr_app/WEB-INF/solr_index/master/pc_EN/data
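Since every additional language repeats steps 2 and 3, they can be wrapped in a small function. This is only a sketch: `create_core` is a hypothetical helper, and the demo call writes into a temporary directory instead of the real solr_home. Unlike step 2, it creates the data directory at the dataDir location rather than inside the core directory.

```shell
#!/bin/sh
# create_core HOME INDEX_ROOT CORE
#   HOME        solr_home directory
#   INDEX_ROOT  directory that holds the index files (solr_index in this guide)
#   CORE        core name, e.g. pc_EN
create_core() {
    home="$1"; index_root="$2"; core="$3"
    mkdir -p "$home/$core/conf" "$index_root/master/$core/data"
    cat > "$home/$core/core.properties" <<EOF
name=$core
dataDir=$index_root/master/$core/data
EOF
}

# Demo against a throwaway directory.
DEMO="$(mktemp -d)"
create_core "$DEMO/solr_home" "$DEMO/solr_index" pc_EN
cat "$DEMO/solr_home/pc_EN/core.properties"
```

For the real layout the call would be `create_core /data/web/solr/solr_app/WEB-INF/solr_home /data/web/solr/solr_app/WEB-INF/solr_index pc_EN`.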

     

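    4. Go into the conf directory to set up the index schema and the data source

    cd conf
    find /data -name solrconfig.xml

    Copy the solrconfig.xml from the rss example directory into pc_EN/conf:

    cp /data/solr-5.3.1/example/example-DIH/solr/rss/conf/solrconfig.xml solrconfig.xml

    Point solrconfig.xml at website-data-config.xml:

    vim solrconfig.xml   # search for name="/dataimport"

    Set the default response format for search results to xml (the wt default of the /select handler).

    Add the replication handler, which lists schema.xml among the configuration files to replicate: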

    <requestHandler name="/replication" class="solr.ReplicationHandler" >
        <lst name="master">
            <str name="replicateAfter">commit</str>
            <str name="replicateAfter">startup</str>
            <str name="confFiles">schema.xml</str>
        </lst>
    </requestHandler>

    The complete solrconfig.xml:

    <?xml version="1.0" encoding="UTF-8" ?>
    <!--
     Licensed to the Apache Software Foundation (ASF) under one or more
     contributor license agreements.  See the NOTICE file distributed with
     this work for additional information regarding copyright ownership.
     The ASF licenses this file to You under the Apache License, Version 2.0
     (the "License"); you may not use this file except in compliance with
     the License.  You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
    -->

    <!--
     This is a stripped down config file used for a simple example...
     It is *not* a good example to work from.
    -->
    <config>
      <luceneMatchVersion>5.3.1</luceneMatchVersion>
      <!--  The DirectoryFactory to use for indexes.
            solr.StandardDirectoryFactory, the default, is filesystem based.
            solr.RAMDirectoryFactory is memory based, not persistent, and doesn't work with replication. -->
      <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

      <dataDir>${solr.data.dir:}</dataDir>

      <!-- To enable dynamic schema REST APIs, use the following for <schemaFactory>:

           <schemaFactory class="ManagedIndexSchemaFactory">
             <bool name="mutable">true</bool>
             <str name="managedSchemaResourceName">managed-schema</str>
           </schemaFactory>

           When ManagedIndexSchemaFactory is specified, Solr will load the schema from
           he resource named in 'managedSchemaResourceName', rather than from schema.xml.
           Note that the managed schema resource CANNOT be named schema.xml.  If the managed
           schema does not exist, Solr will create it after reading schema.xml, then rename
           'schema.xml' to 'schema.xml.bak'.

           Do NOT hand edit the managed schema - external modifications will be ignored and
           overwritten as a result of schema modification REST API calls.

           When ManagedIndexSchemaFactory is specified with mutable = true, schema
           modification REST API calls will be allowed; otherwise, error responses will be
           sent back for these requests.
      -->
      <codecFactory class="solr.SchemaCodecFactory"/>
      <schemaFactory class="ClassicIndexSchemaFactory"/>

      <updateHandler class="solr.DirectUpdateHandler2">
        <updateLog>
          <str name="dir">${solr.data.dir:}</str>
          <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:65536}</int>
        </updateLog>
      </updateHandler>

      <query>
        <!-- Max Boolean Clauses

             Maximum number of clauses in each BooleanQuery,  an exception
             is thrown if exceeded.

             ** WARNING **

             This option actually modifies a global Lucene property that
             will affect all SolrCores.  If multiple solrconfig.xml files
             disagree on this property, the value at any given moment will
             be based on the last SolrCore to be initialized.

          -->
        <maxBooleanClauses>1024</maxBooleanClauses>


        <!-- Solr Internal Query Caches

             There are two implementations of cache available for Solr,
             LRUCache, based on a synchronized LinkedHashMap, and
             FastLRUCache, based on a ConcurrentHashMap.

             FastLRUCache has faster gets and slower puts in single
             threaded operation and thus is generally faster than LRUCache
             when the hit ratio of the cache is high (> 75%), and may be
             faster under other scenarios on multi-cpu systems.
        -->

        <!-- Filter Cache

             Cache used by SolrIndexSearcher for filters (DocSets),
             unordered sets of *all* documents that match a query.  When a
             new searcher is opened, its caches may be prepopulated or
             "autowarmed" using data from caches in the old searcher.
             autowarmCount is the number of items to prepopulate.  For
             LRUCache, the autowarmed items will be the most recently
             accessed items.

             Parameters:
               class - the SolrCache implementation LRUCache or
                   (LRUCache or FastLRUCache)
               size - the maximum number of entries in the cache
               initialSize - the initial capacity (number of entries) of
                   the cache.  (see java.util.HashMap)
               autowarmCount - the number of entries to prepopulate from
                   and old cache.
          -->
        <filterCache class="solr.FastLRUCache"
                     size="512"
                     initialSize="512"
                     autowarmCount="0"/>

        <!-- Query Result Cache

            Caches results of searches - ordered lists of document ids
            (DocList) based on a query, a sort, and the range of documents requested.
            Additional supported parameter by LRUCache:
               maxRamMB - the maximum amount of RAM (in MB) that this cache is allowed
                          to occupy
         -->
        <queryResultCache class="solr.LRUCache"
                         size="512"
                         initialSize="512"
                         autowarmCount="0"/>

        <!-- Document Cache

             Caches Lucene Document objects (the stored fields for each
             document).  Since Lucene internal document ids are transient,
             this cache will not be autowarmed.
          -->
        <documentCache class="solr.LRUCache"
                       size="512"
                       initialSize="512"
                       autowarmCount="0"/>

        <!-- custom cache currently used by block join -->
        <cache name="perSegFilter"
          class="solr.search.LRUCache"
          size="30"
          initialSize="0"
          autowarmCount="30"
          regenerator="solr.NoOpRegenerator" />

        <!-- Lazy Field Loading

             If true, stored fields that are not requested will be loaded
             lazily.  This can result in a significant speed improvement
             if the usual case is to not load all stored fields,
             especially if the skipped fields are large compressed text
             fields.
        -->
        <enableLazyFieldLoading>true</enableLazyFieldLoading>

       <!-- Result Window Size

            An optimization for use with the queryResultCache.  When a search
            is requested, a superset of the requested number of document ids
            are collected.  For example, if a search for a particular query
            requests matching documents 10 through 19, and queryWindowSize is 50,
            then documents 0 through 49 will be collected and cached.  Any further
            requests in that range can be satisfied via the cache.
         -->
       <queryResultWindowSize>20</queryResultWindowSize>

       <!-- Maximum number of documents to cache for any entry in the
            queryResultCache.
         -->
       <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

        <!-- Use Cold Searcher

             If a search request comes in and there is no current
             registered searcher, then immediately register the still
             warming searcher and use it.  If "false" then all requests
             will block until the first searcher is done warming.
          -->
        <useColdSearcher>false</useColdSearcher>

        <!-- Max Warming Searchers

             Maximum number of searchers that may be warming in the
             background concurrently.  An error is returned if this limit
             is exceeded.

             Recommend values of 1-2 for read-only slaves, higher for
             masters w/o cache warming.
          -->
        <maxWarmingSearchers>2</maxWarmingSearchers>

      </query>

      <requestDispatcher handleSelect="true" >
        <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048" formdataUploadLimitInKB="2048" />
      </requestDispatcher>

      <requestHandler name="/select" class="solr.SearchHandler">
        <lst name="defaults">
          <str name="echoParams">explicit</str>
          <str name="wt">xml</str>
          <str name="indent">true</str>
          <int name="rows">10</int>
        </lst>
      </requestHandler>

      <requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" />

      <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
        <lst name="invariants">
          <str name="q">*:*</str>
        </lst>
        <lst name="defaults">
          <str name="echoParams">all</str>
        </lst>
      </requestHandler>

      <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
        <lst name="defaults">
          <str name="config">website-data-config.xml</str>
        </lst>
      </requestHandler>

      <requestHandler name="/replication" class="solr.ReplicationHandler" >
        <lst name="master">
          <str name="replicateAfter">commit</str>
          <str name="replicateAfter">startup</str>
          <str name="confFiles">schema.xml</str>
        </lst>
      </requestHandler>

      <!-- config for the admin interface -->
      <admin>
        <defaultQuery>*:*</defaultQuery>
      </admin>

    </config>

    schema.xml defines the fields that Solr indexes.

    The complete schema.xml:

    <?xml version="1.0" ?>

    <schema name="website" version="1.5">
        <types>
            <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
            <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true" />
            <fieldType name="booleans" class="solr.BoolField" sortMissingLast="true" multiValued="true"/>
            <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0" />
            <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="sfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0" />
            <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0" />
            <fieldType name="tdates" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0" multiValued="true"/>
            <fieldType name="tints" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0" multiValued="true"/>
            <fieldType name="tfloats" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0" multiValued="true"/>
            <fieldType name="tlongs" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0" multiValued="true"/>
            <fieldType name="tdoubles" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0" multiValued="true"/>
            <fieldType name="text" class="solr.TextField">
                <analyzer type="index" class="org.apache.lucene.analysis.en.EnglishAnalyzer"/>
                <analyzer type="query" class="org.apache.lucene.analysis.en.EnglishAnalyzer"/>
            </fieldType>
        </types>
        <!-- general -->
        <fields>
            <field name="_version_"           type="long"   indexed="true"  stored="true" />
            <field name="CultureID"           type="int"    indexed="false" stored="true" />
            <field name="DescriptionFull"     type="text"   indexed="true"  stored="false" />
            <field name="DescriptionShort"    type="text"   indexed="true"  stored="false" />
            <field name="ImageJSON"           type="text"   indexed="false" stored="true" />
            <field name="IsHot"               type="int"    indexed="false" stored="true" />
            <field name="IsMutilColor"        type="int"    indexed="false" stored="true" default="" />
            <field name="LeiMuNameJSON"       type="text"   indexed="true"  stored="true" />
            <field name="PID"                 type="string" indexed="true"  stored="true" />
            <field name="PropertyText"        type="text"   indexed="true"  stored="true" />
            <field name="RequiredText"        type="text"   indexed="true"  stored="true" />
            <field name="SPUID"               type="long"   indexed="true"  stored="true" />
            <field name="Sort"                type="int"    indexed="true"  stored="true" />
            <field name="Status"              type="int"    indexed="true"  stored="true" />
            <field name="Title"               type="text"   indexed="true"  stored="true" />
            <field name="UpTime"              type="date"   indexed="true"  stored="true" />
            <field name="Price"               type="double" indexed="true"  stored="true" />
            <field name="SaleCount"           type="long"   indexed="true"  stored="true" />
            <field name="CustomerRatingCount" type="long"   indexed="false" stored="true" />
            <field name="DisCount"            type="double" indexed="true"  stored="true" />
            <field name="Basic_search"        type="text"   indexed="true"  stored="false" multiValued="true"/>
        </fields>

        <!-- field to use to determine and enforce document uniqueness. -->
        <uniqueKey>SPUID</uniqueKey>
        <!-- field for the QueryParser to use when an explicit fieldname is absent -->
        <defaultSearchField>Basic_search</defaultSearchField>
        <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
        <solrQueryParser defaultOperator="OR"/>
        <copyField source="PID"              dest="Basic_search" />
        <copyField source="DescriptionFull"  dest="Basic_search" />
        <copyField source="DescriptionShort" dest="Basic_search" />
        <copyField source="LeiMuNameJSON"    dest="Basic_search" />
        <copyField source="PropertyText"     dest="Basic_search" />
        <copyField source="RequiredText"     dest="Basic_search" />
        <copyField source="Title"            dest="Basic_search" />
    </schema>

    website-data-config.xml defines the data source and maps the fields of the source data to the fields in schema.xml.

    The complete website-data-config.xml:

    <?xml version="1.0" encoding="UTF-8" ?>
    <dataConfig>
        <dataSource type="URLDataSource" encoding="UTF-8" />
        <document>
            <entity name="website"
                    processor="XPathEntityProcessor"
                    forEach="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel |/LuceneSpuXmlModel"
                    url="http://url/product?cultureId=1&amp;pageSize=100&amp;pageIndex=1&amp;siteId=6&amp;platform=1"
                    transformer="RegexTransformer,DateFormatTransformer"
                    connectionTimeout="120000"
                    readTimeout="300000"
                    stream="true">
                <field column="SPUID"               xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/SPUID" />
                <field column="PID"                 xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/PID" />
                <field column="Title"               xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/Title" />
                <field column="Status"              xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/Status" />
                <field column="CultureID"           xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/CultureID" commonField="true" />
                <field column="LeiMuNameJSON"       xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/LeiMuNameJSON" />
                <field column="DescriptionShort"    xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/DescriptionShort" commonField="true" />
                <field column="DescriptionFull"     xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/DescriptionFull" commonField="true" />
                <field column="Sort"                xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/Sort" />
                <field column="ImageJSON"           xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/ImageJSON" />
                <field column="PropertyText"        xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/PropertyText" />
                <field column="RequiredText"        xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/RequiredText" />
                <field column="IsHot"               xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/IsHot" />
                <field column="IsMutilColor"        xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/IsMutilColor" />
                <field column="UpTime"              xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/UpTime" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss"/>
                <field column="Price"               xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/Price" />
                <field column="SaleCount"           xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/SaleCount" />
                <field column="CustomerRatingCount" xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/CustomerRatingCount" />
                <field column="DisCount"            xpath="/LuceneSpuXmlModel/LuceneSpuModelList/LuceneSpuModel/DisCount" />

                <field column="$hasMore"            xpath="/LuceneSpuXmlModel/HasMore" />
                <field column="$nextUrl"            xpath="/LuceneSpuXmlModel/NextPageUrl" />
            </entity>
        </document>
    </dataConfig>

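    Starting Tomcat at this point makes Solr fail with an error, because the data import classes referenced by the /dataimport handler are not yet in WEB-INF/lib.

    Copy the data import JARs:

    cp /data/solr-5.3.1/dist/solr-dataimporthandler-* /data/web/solr/solr_app/WEB-INF/lib/

    Restart Tomcat. Solr now starts successfully and the admin page shows the new core.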
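Once the core loads, an import can be started by calling the /dataimport handler over HTTP. The sketch below only builds the request URLs. `dih_url` is a hypothetical helper, and the host, port 8080, and the `solr` context path are assumptions about the Tomcat deployment; the `full-import` and `status` commands are standard DataImportHandler commands.

```shell
#!/bin/sh
# dih_url HOST PORT WEBAPP CORE COMMAND
# Builds the URL of a DataImportHandler command for one core.
dih_url() {
    printf 'http://%s:%s/%s/%s/dataimport?command=%s\n' "$1" "$2" "$3" "$4" "$5"
}

FULL_IMPORT_URL="$(dih_url localhost 8080 solr pc_EN full-import)"
STATUS_URL="$(dih_url localhost 8080 solr pc_EN status)"

echo "$FULL_IMPORT_URL"
echo "$STATUS_URL"

# With Tomcat running, the import would then be triggered and monitored with:
#   curl "$FULL_IMPORT_URL"
#   curl "$STATUS_URL"
```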

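    5. Set up scheduled imports

    1. Copy the data import JARs (skip this if they were already copied)

    cp /data/solr-5.3.1/dist/solr-dataimporthandler-* /data/web/solr/solr_app/WEB-INF/lib/

    2. One more JAR must also be copied into /data/web/solr/solr_app/WEB-INF/lib/:

    apache-solr-dataimportscheduler-1.0.jar

    3. Edit web.xml and add the following listener element: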

    <listener>
        <listener-class>
             org.apache.solr.handler.dataimport.scheduler.ApplicationListener
        </listener-class>
    </listener>

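    4. Go back to the solr_home directory, create a conf directory in it, and create the scheduler file dataimport.properties there:

    5. Edit dataimport.properties:

    a. Set syncCores, server, and port

    b. Set the interval and the start time: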
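A sketch of steps 4 and 5 follows. The key names are the ones commonly circulated for the dataimportscheduler JAR and are assumptions here, as are the server, port, and webapp values; verify them against the JAR you actually deploy. The key that sets the start time differs between builds of the scheduler, so it is left out. The demo writes to a temporary directory standing in for solr_home.

```shell
#!/bin/sh
DEMO_HOME="$(mktemp -d)"      # stand-in for /data/web/solr/solr_app/WEB-INF/solr_home
mkdir -p "$DEMO_HOME/conf"

# Assumed key names; check them against your scheduler JAR before relying on them.
cat > "$DEMO_HOME/conf/dataimport.properties" <<'EOF'
# 1 turns the scheduler on
syncEnabled=1
# comma-separated list of cores to import
syncCores=pc_EN
# where Tomcat serves Solr (assumed values)
server=localhost
port=8080
webapp=solr
# request sent to each core on every run
params=/dataimport?command=full-import&clean=false&commit=true
# minutes between two runs
interval=1
EOF

grep -v '^#' "$DEMO_HOME/conf/dataimport.properties"
```

The request uses full-import because the entity in website-data-config.xml is an XPathEntityProcessor over a URL, and delta imports are a feature of SQL-based entities.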

    http://my.oschina.net/lsf930709/blog/620738 (reference article)

  • Original post: https://www.cnblogs.com/qiyebao/p/5432201.html