Solr schema.xml: a translated and annotated walkthrough
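Translator: Zhang Chunling (張春玲)
Original article: http://blog.csdn.net/zcl_love_wx/article/details/51907488
Translation in progress.
Disclaimer: I have never been sure what "faceting" actually means in Solr, so wherever the term appears below I could not do much with it.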
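Background
At the time of writing Solr has already reached version 6, but this translation is still sufficient for getting started.
1. schema.xml is Solr's schema file. It lives in the conf directory under solr_home.
2. English documentation for schema.xml: http://wiki.apache.org/solr/SchemaXml
3. Related concept: convention over configuration (https://zh.wikipedia.org/wiki/%E7%BA%A6%E5%AE%9A%E4%BC%98%E4%BA%8E%E9%85%8D%E7%BD%AE)
The schema.xml file
Field names consist of letters, digits and underscores, and must not start with a digit. Names that both begin and end with an underscore are reserved, for example _version_.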
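The naming rule above can be sketched as a small check. This is only an illustration, not something Solr ships: the helper names `is_conventional_field_name` and `is_reserved_field_name` are hypothetical, and the pattern assumes ASCII letters, which the text does not state explicitly.

```python
import re

# Assumed reading of the convention: ASCII letters, digits and underscores,
# with a first character that is not a digit.
FIELD_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_conventional_field_name(name: str) -> bool:
    """Return True if the name follows the naming convention described above."""
    return FIELD_NAME_RE.fullmatch(name) is not None


def is_reserved_field_name(name: str) -> bool:
    """Return True for names with both a leading and a trailing underscore."""
    return (
        len(name) > 2
        and is_conventional_field_name(name)
        and name.startswith("_")
        and name.endswith("_")
    )


if __name__ == "__main__":
    for candidate in ["id", "manu_exact", "_version_", "1price", "my-field"]:
        print(
            candidate,
            is_conventional_field_name(candidate),
            is_reserved_field_name(candidate),
        )
```

A check like this could run before documents are sent for indexing, so that a name such as `1price` is caught early instead of surfacing later as a confusing query problem.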
<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
<!--
 This is the Solr schema file. This file should be named "schema.xml" and
 should be in the conf directory under the solr home
 (i.e. ./solr/conf/schema.xml by default)
 or located where the classloader for the Solr webapp can find it.

 This example schema is the recommended starting point for users.
 It should be kept correct and concise, usable out-of-the-box.

 For more information, on how to customize this file, please see
 http://wiki.apache.org/solr/SchemaXml

 Performance notes: this file includes many optional features that a real
 application does not need. To improve performance, you can:
  - set stored="false" wherever possible on fields that are only searched
    and never need to be returned;
  - set indexed="false" wherever possible on fields that are only returned
    and never searched;
  - remove all unneeded copyField statements;
  - for the best index size and search performance, set indexed="false" on
    all text fields, use copyField to copy them into the catch-all field
    named "text", and search on that field;
  - run the JVM in server mode and raise the log level so that not every
    request is logged.
-->
<schema name="example-DIH-solr" version="1.5">
<!-- The "name" attribute is the name of this schema and is used for display purposes only.
 version="x.y" is Solr's version number for the schema syntax and semantics.
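 It should not normally be changed by applications.
 1.0: multiValued attribute did not exist, all fields are multiValued by nature
 1.1: multiValued attribute introduced, false by default
 1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
 1.3: removed optional field compress feature
 1.4: autoGeneratePhraseQueries attribute introduced to drive QueryParser
      behavior when a single string produces multiple tokens. Defaults to off for version >= 1.4
 1.5: omitNorms defaults to true for primitive field types (int, float, boolean, string...)
-->
<!-- Valid attributes for fields:
 name: the field name (mandatory)
 type: the field type, as defined in the <types> section (mandatory)
 indexed: true if the field should be indexed (searchable or sortable)
 stored: true if the field content should be retrievable. A field that has
   no data in a given document is simply not returned for that document.
 docValues: true if this field should have doc values. Doc values are
   useful for faceting, grouping, sorting and function queries. They are
   not required, and they make the index larger and indexing slower, but
   they make the index faster to load, more NRT-friendly and more
   memory-efficient. There are some limitations: they are currently only
   supported by StrField, UUIDField and all Trie*Fields and, depending on
   the field type, they may require the field to be single-valued, and to
   be required or have a default value.
 multiValued: true if this field may contain multiple values per document
 termVectors: [false] set to true to store the term vector for the field.
   When using MoreLikeThis, fields used for similarity should be stored
   for best performance.
 termPositions: store position information with the term vector.
   This increases storage costs.
 termOffsets: store offset information with the term vector.
   This increases storage costs.
 required: the field must have a value, otherwise an error is thrown
 default: a value to use for the field when a document is added without one
-->
<!-- If you remove this field, you must also disable the update log in
 solrconfig.xml, or Solr will not start. _version_ and the update log are
 required for SolrCloud. -->
<field name="_version_" type="long" indexed="true" stored="true"/>
<!-- Points to the root document of a block of nested documents.
 Required for nested document support; may be removed otherwise. -->
<field name="_root_" type="string" indexed="true" stored="false"/>
<!-- Only remove the "id" field if you have a very good reason to. While
 not strictly required, it is highly recommended. A <uniqueKey> is present
 in almost all Solr installations. See the <uniqueKey> declaration further
 on, where it is set to "id". -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="sku" type="text_en_splitting_tight" indexed="true" stored="true" omitNorms="true"/>
<field name="name" type="text_general" indexed="true" stored="true"/>
<field name="manu" type="text_general" indexed="true" stored="true" omitNorms="true"/>
<field name="cat" 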
type="string" indexed="true" stored="true" multiValued="true"/>
<field name="features" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="includes" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />
<field name="weight" type="float" indexed="true" stored="true"/>
<field name="price" type="float" indexed="true" stored="true"/>
<field name="popularity" type="int" indexed="true" stored="true" />
<field name="inStock" type="boolean" indexed="true" stored="true" />
<field name="store" type="location" indexed="true" stored="true"/>
<!-- Common metadata fields, named specifically to match up with SolrCell
 metadata when parsing rich documents such as Word or PDF. Some fields are
 multiValued only because Tika may return multiple values for them. Some
 metadata is parsed from the documents, but some comes from the client
 context:
   "content_type": from the HTTP headers of the incoming stream
   "resourcename": from the SolrCell request parameter resource.name
-->
<field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="subject" type="text_general" indexed="true" stored="true"/>
<field name="description" type="text_general" indexed="true" stored="true"/>
<field name="comments" type="text_general" indexed="true" stored="true"/>
<field name="author" type="text_general" indexed="true" stored="true"/>
<field name="keywords" type="text_general" indexed="true" stored="true"/>
<field name="category" type="text_general" indexed="true" stored="true"/>
<field name="resourcename" type="text_general" indexed="true" stored="true"/>
<field name="url" type="text_general" indexed="true" stored="true"/>
<field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="last_modified" type="date" indexed="true" stored="true"/>
<field name="links" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Main body of the document, extracted by SolrCell.
 NOTE: this field is not indexed by default, because it is also copied to
 the field named "text" using copyField. This is to save space.
 Use this field for returning and highlighting document content.
 Use the "text" field to search the content. -->
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
<!-- 
 Catch-all field containing all other searchable text fields (populated via copyField). -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<!-- Catch-all text field that indexes tokens both normally and in reverse,
 for efficient leading wildcard queries. -->
<field name="text_rev" type="text_general_rev" indexed="true" stored="false" multiValued="true"/>
<!-- Non-tokenized version of the manufacturer, to make it easier to sort or
 group results by manufacturer. Copied from "manu" via copyField. -->
<field name="manu_exact" type="string" indexed="true" stored="false"/>
<field name="payloads" type="payloads" indexed="true" stored="true"/>
<!-- Some fields such as popularity and manu_exact could be modified to
 leverage doc values:
 <field name="popularity" type="int" indexed="true" stored="true" docValues="true" />
 <field name="manu_exact" type="string" indexed="false" stored="false" docValues="true" />
 <field name="cat" type="string" indexed="true" stored="true" docValues="true" multiValued="true"/>
 Although this makes indexing slower and the index larger, it also makes the
 index faster to load, more memory-efficient and more NRT-friendly. -->
// Translator's note: a dynamic field does not name one specific field; it only defines a rule for field names.
// Dynamic fields let Solr index fields that are not explicitly defined in the schema.
// If you forgot to define a field, it can still be indexed as long as its name matches a dynamic field rule.
// For example, if the schema defines a dynamic field named *_i, a field called myField_i that was never defined in the schema can still be indexed.
<!-- Dynamic field definitions allow convention over configuration for
 fields, by specifying patterns that field names are matched against.
 EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i)
 RESTRICTION: the glob-like pattern in the name attribute must have
 a "*" only at the start or the end. -->
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
<dynamicField name="*_is" type="int" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true" />
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_l" type="long" indexed="true" stored="true"/>
<dynamicField name="*_ls" type="long" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>
<dynamicField name="*_txt" type="text_general" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_en" 
type="text_en" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
<dynamicField name="*_bs" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_f" type="float" indexed="true" stored="true"/>
<dynamicField name="*_fs" type="float" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_d" type="double" indexed="true" stored="true"/>
<dynamicField name="*_ds" type="double" indexed="true" stored="true" multiValued="true"/>
<!-- Type used to index the components of the "location" FieldType. -->
<dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
<dynamicField name="*_dts" type="date" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_p" type="location" indexed="true" stored="true"/>
<!-- Some trie-coded dynamic fields for faster range queries. -->
<dynamicField name="*_ti" type="tint" indexed="true" stored="true"/>
<dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/>
<dynamicField name="*_tf" type="tfloat" indexed="true" stored="true"/>
<dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/>
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/>
<dynamicField name="*_c" type="currency" indexed="true" stored="true"/>
<dynamicField name="ignored_*" type="ignored" multiValued="true"/>
<dynamicField name="attr_*" type="text_general" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="random_*" type="random" />
<!-- Uncomment the following to ignore any field that does not match an
 existing field or dynamic field, rather than reporting it as an error.
 Alternatively, change type="ignored" to some other type, e.g. "text", if
 you want unknown fields to be indexed and/or stored by default. -->
<!--dynamicField name="*" type="ignored" multiValued="true" /-->
<!-- Field used to determine and enforce document uniqueness.
 Unless this field is marked with required="false", it is a required field. -->
<uniqueKey>id</uniqueKey>
<!-- DEPRECATED: the defaultSearchField is consulted by various query
 parsers when parsing a query string that is not explicit about the field.
 Machine-generated (non-user) queries are best made explicit, or they can
 use the "df" request parameter, which takes precedence over this.
 Note: un-commenting defaultSearchField is not enough if your request
 handlers in solrconfig.xml define "df", which takes precedence. Those
 definitions would need to be removed.
 <defaultSearchField>text</defaultSearchField>
-->
<!-- DEPRECATED: the translator left this comment untranslated because the element is deprecated.
 <solrQueryParser 
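defaultOperator="OR"/> -->
// Translator's note: copyField adds the content of several fields to a single field.
// source: the field to copy from
// dest: the field to copy into
// both source and dest support wildcards
// maxChars: the maximum number of characters to copy
<!-- copyField commands copy one field to another at the time a document is
 added to the index. They are used either to index the same field
 differently, or to add multiple fields to the same field for easier and
 faster searching. -->
<copyField source="cat" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="manu" dest="text"/>
<copyField source="features" dest="text"/>
<copyField source="includes" dest="text"/>
<copyField source="manu" dest="manu_exact"/>
<!-- Copy the price into a currency-enabled field (default USD). -->
<copyField source="price" dest="price_c"/>
<!-- Text fields from SolrCell, searched by default through the catch-all field. -->
<copyField source="title" dest="text"/>
<copyField source="author" dest="text"/>
<copyField source="description" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="content" dest="text"/>
<copyField source="content_type" dest="text"/>
<copyField source="resourcename" dest="text"/>
<copyField source="url" dest="text"/>
<!-- Create a string version of the author field. -->
<copyField source="author" dest="author_s"/>
<!-- Above, multiple source fields are copied to the text field. Another way
 to map multiple source fields to the same destination field is to use the
 dynamic field syntax. copyField also supports a maxChars setting that
 limits how much is copied. -->
<!-- <copyField source="*_t" dest="text" maxChars="3000"/> -->
<!-- Copy name to alphaNameSort, a field designed for sorting by name. -->
<!-- <copyField source="name" dest="alphaNameSort"/> -->
// Translator's note: a fieldType tells Solr how to process the data in a field, and how to process queries against that field.
// class is a reference to the Java class that implements the type.
// If the fieldType is a TextField, it also carries an analysis configuration (analyzer elements).
<!-- Field type definitions.
 The "name" attribute is just a label to be used by field definitions.
 The "class" attribute and any other attributes determine the real behavior
 of the fieldType. Class names starting with "solr" refer to Java classes
 in a standard package such as org.apache.solr.analysis. -->
<!-- The StrField type is not analyzed, but indexed and stored verbatim.
 It supports doc values, but in that case the field needs to be
 single-valued and either required or have a default value. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
<!-- sortMissingLast and sortMissingFirst are optional attributes, currently
 supported on types that are sorted internally as strings and on numeric
 types. This includes "string", "boolean", and, as of 3.5 (and 4.x),
 int, float, long, date, double, including the "Trie" variants.
 - If sortMissingLast="true", then a sort on this field will cause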
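   documents without the field to come after documents with the field,
   regardless of the requested sort order (asc or desc).
 - If sortMissingFirst="true", then a sort on this field will cause
   documents without the field to come before documents with the field,
   regardless of the requested sort order.
 - If sortMissingLast="false" and sortMissingFirst="false" (the default),
   then default Lucene sorting is used, which places documents without the
   field first in an ascending sort and last in a descending sort.
-->
// Translator's note:
// Solr has at least five different field types for holding an integer, six if you count string!
// float, double, long and date likewise have at least five types each. The types whose class names start with "Trie" cover more than 90% of needs. Taking Integer as an example:
//   TrieIntField (with precisionStep=0), usually called "int". It suits most use cases.
//   TrieIntField (with precisionStep>0), usually called "tint". If you run numeric range queries (including range faceting) over a field with many distinct values, this type gives excellent query performance, at the cost of some extra index space and indexing time.
//   In Solr's example configuration precisionStep is 8 for numeric types and 6 for date; these defaults are recommended. Choosing a smaller precisionStep (still >0) makes Solr spend more index space and indexing time in exchange for faster range queries.
//   SortableIntField, usually called "sint". Similar to Trie with precisionStep=0; this type also supports the sortMissingFirst and sortMissingLast attributes.
//   IntField, usually called "pint". Obsolete.
//   BCDIntField (Binary Coded Decimal).
// All numeric types sort by numeric value rather than lexicographically.
// ExternalFileField is for fields whose values change quickly and are used for sorting or to influence scoring (for example sorting by votes or click-through rate); with this type the documents do not have to be re-indexed.
<!-- Default numeric field types (TrieIntField and friends). For faster
 range queries, consider the tint/tfloat/tlong/tdouble types. These fields
 support doc values, but then the field must be single-valued and either
 required or have a default value. -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
<!-- Numeric field types that index each value at various levels of
 precision, to accelerate range queries when the number of values between
 the range endpoints is large. See the javadoc for NumericRangeQuery for
 internal implementation details.
 Smaller precisionStep values (specified in bits) lead to more tokens
 indexed per value, more index space, and faster range queries.
 A precisionStep of 0 disables indexing at different precision levels. -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" 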
precisionStep="8" positionIncrementGap="0"/>
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and
 is a more restricted form of the canonical representation of dateTime
 http://www.w3.org/TR/xmlschema-2/#dateTime
 The trailing "Z" designates UTC time and is mandatory.
 Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
 All other components are mandatory.

 Expressions can also be used to denote calculations that should be
 performed relative to "NOW" to determine the value, ie...

   NOW/HOUR
      ... Round to the start of the current hour
   NOW-1DAY
      ... Exactly 1 day prior to now
   NOW/DAY+6MONTHS+3DAYS
      ... 6 months and 3 days in the future from the start of
          the current day

 Consult the TrieDateField javadocs for more information.

 Note: For faster range queries, consider the tdate type
-->
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<!-- A Trie-based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
<!-- Binary data type. The data should be sent and retrieved as Base64-encoded strings. -->
<fieldType name="binary" class="solr.BinaryField"/>
<!-- The "RandomSortField" is not used to store or search any data. You can
 declare fields of this type in your schema to generate pseudo-random
 orderings of your documents for sorting or function purposes. The
 ordering is generated from the field name and the version of the index.
 As long as the index version remains unchanged and the same field name is
 reused, the ordering of the documents is consistent. If you want
 different pseudo-random orderings for the same version of the index, use
 a dynamicField and change the field name in the request. -->
<fieldType name="random" class="solr.RandomSortField" indexed="true" />
<!-- solr.TextField allows custom text analyzers to be specified as a
 tokenizer plus a list of token filters. Different analyzers may be
 specified for indexing and querying.
 The optional positionIncrementGap puts space between multiple fields of
 this type on the same document, to prevent false phrase matches across
 fields.
 For more information on customizing your analyzer chain, please see
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters -->
<!-- One can also specify an existing Analyzer class that has a default
 constructor, via the class attribute on the analyzer element. Example:
 <fieldType name="text_greek" class="solr.TextField">
   <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
 </fieldType>
-->
<!-- A text field that only splits on whitespace, for exact matching of words. -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<!-- 
 A general text field with reasonable, generic cross-language defaults: it
 tokenizes with StandardTokenizer, removes stop words from the
 case-insensitive "stopwords.txt" (empty by default), and lowercases.
 At query time only, it also applies synonyms. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- In this example, we will only use synonyms at query time.
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<!-- A text field with defaults appropriate for English: it
 tokenizes with StandardTokenizer, removes English stop words
 (lang/stopwords_en.txt), down cases, protects words from protwords.txt, and
 finally applies Porter's stemming. The query time analyzer
 also applies synonyms from synonyms.txt.
-->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- In this example, we will only use synonyms at query time.
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case-insensitive stop word removal. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<!-- A text field with defaults appropriate for English, plus aggressive
 word-splitting and autophrase features enabled. This field is just like
 text_en, except that it adds WordDelimiterFilter to enable splitting and
 matching of words on case changes, on letter/digit boundaries, and on
 non-alphanumeric characters. This means certain compound-word cases will
 work; for example the query "wi fi" will match "WiFi" or "wi-fi". -->
<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- In this example, we will only use synonyms at query time.
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case-insensitive stop word removal. -->
    <filter 
class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<!-- Less flexible matching, but fewer false matches. Probably not ideal for
 product names, but may be good for SKUs. Dashes inserted in the wrong
 place will still match. -->
<fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <!-- This filter can remove any duplicate tokens that appear at the same position - sometimes
     possible with WordDelimiterFilter in conjunction with stemming.
-->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!-- Just like text_general, except that it reverses the characters of each
 token, to enable more efficient leading wildcard queries. -->
<fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<!-- charFilter + WhitespaceTokenizer -->
<!--
<fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
-->
<!-- An example of using the KeywordTokenizer together with various
 TokenFilterFactories to produce a sortable field that does not include
 some properties of the source text. -->
<fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer does no actual tokenizing, so the entire input
     string is preserved as a single token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- The LowerCase TokenFilter is what you need when you want your
     sorting to be case-insensitive. -->
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- The TrimFilter removes any leading or trailing whitespace. -->
    <filter class="solr.TrimFilterFactory" />
    <!-- The PatternReplaceFilter gives you the flexibility to use Java
     regular expressions to replace any sequence of characters matching a
     pattern with an arbitrary replacement string, which may include back
     references to portions of the original string matched by the pattern.
     See the Java regular expression documentation for more information on
     pattern and replacement string syntax:
     http://docs.oracle.com/javase/7/docs/api/java/util/regex/package-summary.html
    -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all"/>
  </analyzer>
</fieldType>
<fieldType 
name="phonetic" stored="false" indexed="true" class="solr.TextField" >
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
  </analyzer>
</fieldType>
<fieldType name="payloads" stored="false" indexed="true" class="solr.TextField" >
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- The DelimitedPayloadTokenFilter can put payloads on tokens. For
     example, a token of "foo|1.4" would be indexed as "foo" with a payload
     of 1.4f.
     Attributes of the DelimitedPayloadTokenFilterFactory:
      "delimiter": a one-character delimiter. Default is | (pipe)
      "encoder": how to encode the following value into a payload
         float -> org.apache.lucene.analysis.payloads.FloatEncoder,
         integer -> o.a.l.a.p.IntegerEncoder
         identity -> o.a.l.a.p.IdentityEncoder
         or the fully qualified name of a class implementing PayloadEncoder;
         the encoder must have a no-argument constructor.
    -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
  </analyzer>
</fieldType>
<!-- Lowercases the entire field value, keeping it as a single token. -->
<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
<!-- Example of using PathHierarchyTokenizerFactory at index time, so that
 queries for a path match documents at that path or in descendant paths. -->
<fieldType name="descendent_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
</fieldType>
<!-- Example of using PathHierarchyTokenizerFactory at query time, so that
 queries for a path match documents at that path or in ancestor paths. -->
<fieldType name="ancestor_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
  </analyzer>
</fieldType>
<!-- Since fields of this type are by default neither stored nor indexed,
 any data added to them will be ignored. -->
<fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />
<!-- This point type indexes the coordinates as separate fields (subFields)
 If subFieldType is defined, it references a type, and a dynamic field
 definition is 
 created matching *___<typename>. Alternately, if
 subFieldSuffix is defined, that is used to create the subFields.
 Example: if subFieldType="double", then the coordinates would be
   indexed in fields myloc_0___double,myloc_1___double.
 Example: if subFieldSuffix="_d" then the coordinates would be indexed
   in fields myloc_0_d,myloc_1_d
 The subFields are an implementation detail of the fieldType, and end
 users normally should not need to know about them.
-->
<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<!-- An alternative geospatial field type new to Solr 4. It supports multiValued and polygon shapes.
 For more information about this and other Spatial fields new to Solr 4, see:
 http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
-->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
    geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers" />
<!-- Money/currency field type. See http://wiki.apache.org/solr/MoneyFieldType
 Parameters:
   defaultCurrency: Specifies the default currency if none specified.
                    Defaults to "USD"
   precisionStep: Specifies the precisionStep for the TrieLong field used for the amount
   providerClass: Lets you plug in other exchange provider backend:
     solr.FileExchangeRateProvider is the default and takes one parameter:
       currencyConfig: name of an xml file holding exchange rates
     solr.OpenExchangeRatesOrgProvider uses rates from openexchangerates.org:
       ratesFileLocation: URL or path to rates JSON file (default latest.json on the web)
       refreshInterval: Number of minutes between each rates fetch (default: 1440, min: 60)
-->
<fieldType name="currency" class="solr.CurrencyField" precisionStep="8" defaultCurrency="USD" currencyConfig="currency.xml" />
<!-- Some examples for different languages (generally ordered by ISO code). -->
<!-- Arabic -->
<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- for any non-arabic -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt" />
    <!-- normalizes ﻯ to ﻱ, etc -->
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>
<!-- Bulgarian -->
<fieldType name="text_bg" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_bg.txt" />
    <filter class="solr.BulgarianStemFilterFactory"/>
  </analyzer>
</fieldType>
<!-- Catalan -->
<fieldType name="text_ca" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- removes l', etc -->
    <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ca.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ca.txt" />
    <filter class="solr.SnowballPorterFilterFactory" 
language="Catalan"/> </analyzer></fieldType><!-- CJK bigram (see text_ja for a Japanese configuration using morphological analysis) --><fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100"><analyzer><tokenizer class="solr.StandardTokenizerFactory"/><!-- normalize width before bigram, as e.g. half-width dakuten combine --><filter class="solr.CJKWidthFilterFactory"/><!-- for any non-CJK --><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.CJKBigramFilterFactory"/></analyzer></fieldType><!-- Kurdish --><fieldType name="text_ckb" class="solr.TextField" positionIncrementGap="100"><analyzer><tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.SoraniNormalizationFilterFactory"/><!-- for any latin text --><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ckb.txt"/><filter class="solr.SoraniStemFilterFactory"/></analyzer></fieldType><!-- Czech --><fieldType name="text_cz" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_cz.txt" /><filter class="solr.CzechStemFilterFactory"/> </analyzer></fieldType><!-- Danish --><fieldType name="text_da" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_da.txt" format="snowball" /><filter class="solr.SnowballPorterFilterFactory" language="Danish"/> </analyzer></fieldType><!-- German --><fieldType name="text_de" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" 
format="snowball" /><filter class="solr.GermanNormalizationFilterFactory"/><filter class="solr.GermanLightStemFilterFactory"/><!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> --><!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> --></analyzer></fieldType><!-- Greek --><fieldType name="text_el" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><!-- greek specific lowercase for sigma --><filter class="solr.GreekLowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_el.txt" /><filter class="solr.GreekStemFilterFactory"/></analyzer></fieldType><!-- Spanish --><fieldType name="text_es" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt" format="snowball" /><filter class="solr.SpanishLightStemFilterFactory"/><!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/> --></analyzer></fieldType><!-- Basque --><fieldType name="text_eu" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_eu.txt" /><filter class="solr.SnowballPorterFilterFactory" language="Basque"/></analyzer></fieldType><!-- Persian --><fieldType name="text_fa" class="solr.TextField" positionIncrementGap="100"><analyzer><!-- for ZWNJ --><charFilter class="solr.PersianCharFilterFactory"/><tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.ArabicNormalizationFilterFactory"/><filter class="solr.PersianNormalizationFilterFactory"/><filter class="solr.StopFilterFactory" 
ignoreCase="true" words="lang/stopwords_fa.txt" /></analyzer></fieldType><!-- Finnish --><fieldType name="text_fi" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fi.txt" format="snowball" /><filter class="solr.SnowballPorterFilterFactory" language="Finnish"/><!-- less aggressive: <filter class="solr.FinnishLightStemFilterFactory"/> --></analyzer></fieldType><!-- French --><fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><!-- removes l', etc --><filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fr.txt" format="snowball" /><filter class="solr.FrenchLightStemFilterFactory"/><!-- less aggressive: <filter class="solr.FrenchMinimalStemFilterFactory"/> --><!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="French"/> --></analyzer></fieldType><!-- Irish --><fieldType name="text_ga" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><!-- removes d', etc --><filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ga.txt"/><!-- removes n-, etc. position increments is intentionally false! 
--><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/hyphenations_ga.txt"/><filter class="solr.IrishLowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ga.txt"/><filter class="solr.SnowballPorterFilterFactory" language="Irish"/></analyzer></fieldType><!-- Galician --><fieldType name="text_gl" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_gl.txt" /><filter class="solr.GalicianStemFilterFactory"/><!-- less aggressive: <filter class="solr.GalicianMinimalStemFilterFactory"/> --></analyzer></fieldType><!-- Hindi --><fieldType name="text_hi" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><!-- normalizes unicode representation --><filter class="solr.IndicNormalizationFilterFactory"/><!-- normalizes variation in spelling --><filter class="solr.HindiNormalizationFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hi.txt" /><filter class="solr.HindiStemFilterFactory"/></analyzer></fieldType><!-- Hungarian --><fieldType name="text_hu" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hu.txt" format="snowball" /><filter class="solr.SnowballPorterFilterFactory" language="Hungarian"/><!-- less aggressive: <filter class="solr.HungarianLightStemFilterFactory"/> --> </analyzer></fieldType><!-- Armenian --><fieldType name="text_hy" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter 
class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hy.txt" /><filter class="solr.SnowballPorterFilterFactory" language="Armenian"/></analyzer></fieldType><!-- Indonesian --><fieldType name="text_id" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_id.txt" /><!-- for a less aggressive approach (only inflectional suffixes), set stemDerivational to false --><filter class="solr.IndonesianStemFilterFactory" stemDerivational="true"/></analyzer></fieldType><!-- Italian --><fieldType name="text_it" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><!-- removes l', etc --><filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_it.txt"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_it.txt" format="snowball" /><filter class="solr.ItalianLightStemFilterFactory"/><!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Italian"/> --></analyzer></fieldType>
<!-- Japanese using morphological analysis (see text_cjk for a configuration using bigramming)
NOTE: If you want to optimize search for precision, use default operator AND in your queryparser config with <solrQueryParser defaultOperator="AND"/> further down in this file. Use OR if you would like to optimize for recall (default).
-->
<fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false"><analyzer>
<!-- Kuromoji Japanese morphological analyzer/tokenizer (JapaneseTokenizer)
Kuromoji has a search mode (default) that does segmentation useful for search. A heuristic is used to segment compounds into its parts and the compound itself is kept as synonym.
Valid values for attribute mode are:
normal: regular segmentation
search: segmentation useful for search with synonyms compounds (default)
extended: same as search mode, but unigrams unknown words (experimental)
For some applications it might be good to use search mode for indexing and normal mode for queries to reduce recall and prevent parts of compounds from being matched and highlighted. Use <analyzer type="index"> and <analyzer type="query"> for this and mode normal in query.
Kuromoji also has a convenient user dictionary feature that allows overriding the statistical model with your own entries for segmentation, part-of-speech tags and readings without a need to specify weights. Notice that user dictionaries have not been subject to extensive testing.
User dictionary attributes are:
userDictionary: user dictionary filename
userDictionaryEncoding: user dictionary encoding (default is UTF-8)
See lang/userdict_ja.txt for a sample user dictionary file.
Punctuation characters are discarded by default. Use discardPunctuation="false" to keep them.
See http://wiki.apache.org/solr/JapaneseLanguageSupport for more on Japanese language support.
-->
<tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/><!--<tokenizer class="solr.JapaneseTokenizerFactory" mode="search" userDictionary="lang/userdict_ja.txt"/>--><!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) --><filter class="solr.JapaneseBaseFormFilterFactory"/><!-- Removes tokens with certain part-of-speech tags --><filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt" /><!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) --><filter class="solr.CJKWidthFilterFactory"/><!-- Removes common tokens typically not useful for search, but have a negative effect on ranking --><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt" /><!-- Normalizes common katakana spelling variations by removing any last long sound character (U+30FC) --><filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/><!-- Lower-cases romaji characters --><filter class="solr.LowerCaseFilterFactory"/></analyzer></fieldType><!-- Latvian --><fieldType name="text_lv" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_lv.txt" /><filter class="solr.LatvianStemFilterFactory"/></analyzer></fieldType><!-- Dutch --><fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_nl.txt" format="snowball" /><filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_nl.txt" ignoreCase="false"/><filter 
class="solr.SnowballPorterFilterFactory" language="Dutch"/></analyzer></fieldType><!-- Norwegian --><fieldType name="text_no" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_no.txt" format="snowball" /><filter class="solr.SnowballPorterFilterFactory" language="Norwegian"/><!-- less aggressive: <filter class="solr.NorwegianLightStemFilterFactory" variant="nb"/> --><!-- singular/plural: <filter class="solr.NorwegianMinimalStemFilterFactory" variant="nb"/> --><!-- The "light" and "minimal" stemmers support variants: nb=Bokmål, nn=Nynorsk, no=Both --></analyzer></fieldType><!-- Portuguese --><fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" format="snowball" /><filter class="solr.PortugueseLightStemFilterFactory"/><!-- less aggressive: <filter class="solr.PortugueseMinimalStemFilterFactory"/> --><!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/> --><!-- most aggressive: <filter class="solr.PortugueseStemFilterFactory"/> --></analyzer></fieldType><!-- Romanian --><fieldType name="text_ro" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ro.txt" /><filter class="solr.SnowballPorterFilterFactory" language="Romanian"/></analyzer></fieldType><!-- Russian --><fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter 
class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" /><filter class="solr.SnowballPorterFilterFactory" language="Russian"/><!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> --></analyzer></fieldType><!-- Swedish --><fieldType name="text_sv" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_sv.txt" format="snowball" /><filter class="solr.SnowballPorterFilterFactory" language="Swedish"/><!-- less aggressive: <filter class="solr.SwedishLightStemFilterFactory"/> --></analyzer></fieldType><!-- Thai --><fieldType name="text_th" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/><filter class="solr.ThaiWordFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_th.txt" /></analyzer></fieldType><!-- Turkish --><fieldType name="text_tr" class="solr.TextField" positionIncrementGap="100"><analyzer> <tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.ApostropheFilterFactory"/><filter class="solr.TurkishLowerCaseFilterFactory"/><filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_tr.txt" /><filter class="solr.SnowballPorterFilterFactory" language="Turkish"/></analyzer></fieldType><!-- Similarity is the scoring routine for each document vs. a query. A custom Similarity or SimilarityFactory may be specified here, but the default is fine for most applications. For more info: http://wiki.apache.org/solr/SchemaXml#Similarity --><!--<similarity class="com.example.solr.CustomSimilarityFactory"><str name="paramkey">param value</str></similarity>--></schema>
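The comment on the text_ja field type suggests using search mode for indexing and normal mode for queries, configured through separate <analyzer type="index"> and <analyzer type="query"> elements. Below is a minimal sketch of what that might look like. The field type name "text_ja_split" is hypothetical and does not appear in the original schema. The filter chain is copied from text_ja above and repeated in both analyzers, which is an assumption about what a reasonable setup would be. In a real schema.xml this fragment would sit inside the <schema> element, next to the other field types.

```xml
<!-- Hypothetical variant of text_ja: the name "text_ja_split" is an assumption for illustration only -->
<fieldType name="text_ja_split" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <!-- Index side: search mode keeps the compound and also emits its parts -->
  <analyzer type="index">
    <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
    <filter class="solr.JapaneseBaseFormFilterFactory"/>
    <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
    <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query side: normal mode does regular segmentation, so compounds in the query are not decomposed -->
  <analyzer type="query">
    <tokenizer class="solr.JapaneseTokenizerFactory" mode="normal"/>
    <filter class="solr.JapaneseBaseFormFilterFactory"/>
    <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
    <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Keeping the filters identical on both sides means index-time and query-time tokens are normalized the same way, so only the tokenizer mode differs. Whether this trade-off helps depends on the application, as the original comment notes.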