Lucene Series: Facets (Repost)
https://blog.csdn.net/whuqin/article/details/42524825
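1. An intuitive look at facets
Facet: a face, a cross-section, an aspect. I think of a facet as a dimension: among the results that satisfy a query, it shows how they are distributed along each dimension, that is, how many results fall under each sub-category of that dimension.
For example, searching JD for "手機" (mobile phone) returns 4009 products. Brand, network and price are dimensions (facets) of those products; clicking a particular brand or network narrows the results further.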
Clicking the brand Xiaomi (小米) returns the Xiaomi phones: 27 results.
Clicking China Mobile 4G (移動4G) as well returns the Xiaomi phones on China Mobile 4G: 4 results.
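The narrowing described above can be sketched in plain Java without Lucene. This is only an illustration of the idea, not how Lucene computes facets: the class FacetCountSketch, its helper methods and the tiny in-memory data set (three phones and one TV, with English labels) are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FacetCountSketch {

    /** Builds a document as a field -> value map from alternating keys and values. */
    static Map<String, String> doc(String... keyValues) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            fields.put(keyValues[i], keyValues[i + 1]);
        }
        return fields;
    }

    /** Hypothetical sample data: three phones and one TV. */
    static List<Map<String, String>> sampleDocs() {
        return Arrays.asList(
            doc("device", "phone", "brand", "Xiaomi", "network", "China Mobile 4G"),
            doc("device", "phone", "brand", "Xiaomi", "network", "China Unicom 4G"),
            doc("device", "phone", "brand", "Huawei", "network", "China Mobile 4G"),
            doc("device", "tv", "brand", "Xiaomi"));
    }

    /** Keeps only the documents whose field equals the given value. */
    static List<Map<String, String>> filter(List<Map<String, String>> docs, String field, String value) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> d : docs) {
            if (value.equals(d.get(field))) {
                kept.add(d);
            }
        }
        return kept;
    }

    /** Counts, for one dimension, how many of the given documents fall under each value. */
    static Map<String, Integer> countByValue(List<Map<String, String>> docs, String dim) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> d : docs) {
            String value = d.get(dim);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The "query": all phones.
        List<Map<String, String>> phones = filter(sampleDocs(), "device", "phone");
        System.out.println("brand:   " + countByValue(phones, "brand"));
        System.out.println("network: " + countByValue(phones, "network"));

        // Clicking a brand narrows the result set; the facets are then recounted.
        List<Map<String, String>> xiaomiPhones = filter(phones, "brand", "Xiaomi");
        System.out.println("network within Xiaomi: " + countByValue(xiaomiPhones, "network"));
    }
}
```

Each click adds one more filter and the per-dimension counts are recomputed over the smaller result set, which is why the totals shrink at every step.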
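2. Facet features
facet counting: returns the number of results under a given sub-category of a facet. In the example above, 27 results matching the query "手機" fall under the Xiaomi sub-category of the brand dimension.
facet associations: how strongly a document is associated with a sub-category. For example, if a book is 30% about Lucene and 70% about Solr, those percentages are the book's degree of association (match, or confidence) with each category.
multiple facet requests: several facets (dimensions) can be queried at once, for example phones whose brand is Xiaomi and whose network is China Mobile 4G.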
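To make the difference between counting and associations concrete, here is a small plain-Java sketch that sums association weights per category instead of counting documents. It does not use Lucene's association classes; the class FacetAssociationSketch, the integer percentages and the second book (entirely about Lucene) are assumptions added for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FacetAssociationSketch {

    /** One weighted link from a document to a category, in percent. */
    static final class Association {
        final String category;
        final int percent;

        Association(String category, int percent) {
            this.category = category;
            this.percent = percent;
        }
    }

    /** Hypothetical matching documents: the book from the text plus a second, Lucene-only book. */
    static List<List<Association>> sampleBooks() {
        return Arrays.asList(
            Arrays.asList(new Association("lucene", 30), new Association("solr", 70)),
            Arrays.asList(new Association("lucene", 100)));
    }

    /** Sums the association weights per category over all matching documents. */
    static Map<String, Integer> sumByCategory(List<List<Association>> docs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (List<Association> doc : docs) {
            for (Association a : doc) {
                totals.merge(a.category, a.percent, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Plain counting would report lucene=2, solr=1; the weighted sums differ.
        System.out.println(sumByCategory(sampleBooks()));
    }
}
```

With plain counting both books would contribute 1 to "lucene"; with associations the first book contributes only its 30%, so the category totals reflect how much of the matching material is actually about each topic.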
3. Example
A simple facet example that depends on lucene-facet-4.10.0. It walks through browsing downward from a search for phones, to a brand, and then to a network.
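import java.io.IOException;
import java.util.List;

import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.facet.DrillDownQuery;
import org.apache.lucene.facet.DrillSideways;
import org.apache.lucene.facet.DrillSideways.DrillSidewaysResult;
import org.apache.lucene.facet.FacetField;
import org.apache.lucene.facet.FacetResult;
import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.FacetsCollector;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SimpleFacetsExample {
    private final Directory indexDir = new RAMDirectory();
    private final Directory taxoDir = new RAMDirectory();
    private final FacetsConfig config = new FacetsConfig();

    /** Declares "Publish Date" as hierarchical; none of the documents indexed below uses that dimension. */
    public SimpleFacetsExample() {
        config.setHierarchical("Publish Date", true);
    }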

    /** Build the example index: three phones and one TV. */
    private void index() throws IOException {
        IndexWriter indexWriter = new IndexWriter(indexDir,
                new IndexWriterConfig(Version.LUCENE_4_10_0, new WhitespaceAnalyzer()));
        // Writes facet ords to a separate directory from the main index
        DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir);

        // Phone: brand Xiaomi, network China Mobile 4G
        Document doc = new Document();
        doc.add(new TextField("device", "手機", Field.Store.YES));
        doc.add(new TextField("name", "米1", Field.Store.YES));
        doc.add(new FacetField("brand", "小米"));
        doc.add(new FacetField("network", "移動4G"));
        indexWriter.addDocument(config.build(taxoWriter, doc));

        // Phone: brand Xiaomi, network China Unicom 4G
        doc = new Document();
        doc.add(new TextField("device", "手機", Field.Store.YES));
        doc.add(new TextField("name", "米4", Field.Store.YES));
        doc.add(new FacetField("brand", "小米"));
        doc.add(new FacetField("network", "聯通4G"));
        indexWriter.addDocument(config.build(taxoWriter, doc));

        // Phone: brand Huawei, network China Mobile 4G
        doc = new Document();
        doc.add(new TextField("device", "手機", Field.Store.YES));
        doc.add(new TextField("name", "榮耀6", Field.Store.YES));
        doc.add(new FacetField("brand", "華為"));
        doc.add(new FacetField("network", "移動4G"));
        indexWriter.addDocument(config.build(taxoWriter, doc));

        // TV: brand Xiaomi, no network facet
        doc = new Document();
        doc.add(new TextField("device", "電視", Field.Store.YES));
        doc.add(new TextField("name", "小米電視2", Field.Store.YES));
        doc.add(new FacetField("brand", "小米"));
        indexWriter.addDocument(config.build(taxoWriter, doc));

        taxoWriter.close();
        indexWriter.close();
    }

    /** Searches for phones, drills down by brand and network, then drills sideways. */
    private void facetsWithSearch() throws IOException {
        DirectoryReader indexReader = DirectoryReader.open(indexDir);
        IndexSearcher searcher = new IndexSearcher(indexReader);
        TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoDir);
        FacetsCollector fc = new FacetsCollector();

        // 1. Search for phones (device = 手機)
        System.out.println("-----手機-----");
        TermQuery query = new TermQuery(new Term("device", "手機"));
        FacetsCollector.search(searcher, query, 10, fc);
        Facets facets = new FastTaxonomyFacetCounts(taxoReader, config, fc);
        List<FacetResult> results = facets.getAllDims(10);
        // 3 phones in total. Brand: Xiaomi 2, Huawei 1. Network: China Mobile 4G 2, China Unicom 4G 1
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }

        // 2. Drill down: select the brand Xiaomi (小米)
        System.out.println("-----小米手機-----");
        DrillDownQuery drillDownQuery = new DrillDownQuery(config, query);
        drillDownQuery.add("brand", "小米");
        // Use a new collector here; reusing the old one would accumulate counts
        FacetsCollector fc1 = new FacetsCollector();
        FacetsCollector.search(searcher, drillDownQuery, 10, fc1);
        facets = new FastTaxonomyFacetCounts(taxoReader, config, fc1);
        results = facets.getAllDims(10);
        // Distribution of Xiaomi phones: 2 in total. Network: China Mobile 4G 1, China Unicom 4G 1
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }

        // 3. Drill down further: Xiaomi phones on China Mobile 4G (移動4G)
        System.out.println("-----移動4G小米手機-----");
        drillDownQuery.add("network", "移動4G");
        FacetsCollector fc2 = new FacetsCollector();
        FacetsCollector.search(searcher, drillDownQuery, 10, fc2);
        facets = new FastTaxonomyFacetCounts(taxoReader, config, fc2);
        results = facets.getAllDims(10);
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }

        // 4. Drill sideways: browse across sibling values.
        // After drilling into Xiaomi phones, drill sideways still reports
        // the number of phones of the other brands (Huawei).
        System.out.println("-----小米手機drill sideways-----");
        DrillSideways ds = new DrillSideways(searcher, config, taxoReader);
        DrillDownQuery drillDownQuery1 = new DrillDownQuery(config, query);
        drillDownQuery1.add("brand", "小米");
        DrillSidewaysResult result = ds.search(drillDownQuery1, 10);
        results = result.facets.getAllDims(10);
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }

        indexReader.close();
        taxoReader.close();
    }

    /** Runs the search and drill-down examples and prints the results. */
    public static void main(String[] args) throws Exception {
        SimpleFacetsExample example = new SimpleFacetsExample();
        example.index();
        example.facetsWithSearch();
    }
}
Output:
-----手機-----
// 3 in total, 2 sub-categories
dim=brand path=[] value=3 childCount=2
  小米 (2)
  華為 (1)
dim=network path=[] value=3 childCount=2
  移動4G (2)
  聯通4G (1)
-----小米手機-----
// Plain drill-down: the counts for the other sub-categories of the drilled dimension are lost
dim=brand path=[] value=2 childCount=1
  小米 (2)
dim=network path=[] value=2 childCount=2
  移動4G (1)
  聯通4G (1)
-----移動4G小米手機-----
dim=brand path=[] value=1 childCount=1
  小米 (1)
dim=network path=[] value=1 childCount=1
  移動4G (1)
-----小米手機drill sideways-----
// Drill sideways keeps the counts for the other sub-categories of the drilled dimension
dim=brand path=[] value=3 childCount=2
  小米 (2)
  華為 (1)
// Network distribution among the Xiaomi phones
dim=network path=[] value=2 childCount=2
  移動4G (1)
  聯通4G (1)
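The difference between the drill-down and drill-sideways counts in the output can be reproduced with a plain-Java sketch. It is a simplified model of the counting rule only, not Lucene's DrillSideways implementation; the class DrillSidewaysSketch, its methods and the English labels are assumptions made for this illustration. The rule modelled here: when counting a dimension in sideways mode, the selection made on that same dimension is ignored, while selections on other dimensions still apply.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DrillSidewaysSketch {

    /** Builds a document as a field -> value map from alternating keys and values. */
    static Map<String, String> doc(String... keyValues) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            fields.put(keyValues[i], keyValues[i + 1]);
        }
        return fields;
    }

    /** Hypothetical hits of the base query: the three phones from the example. */
    static List<Map<String, String>> phoneHits() {
        return Arrays.asList(
            doc("brand", "Xiaomi", "network", "China Mobile 4G"),
            doc("brand", "Xiaomi", "network", "China Unicom 4G"),
            doc("brand", "Huawei", "network", "China Mobile 4G"));
    }

    /** True if the document satisfies every selection, ignoring the dimension skipDim (may be null). */
    static boolean matches(Map<String, String> d, Map<String, String> selections, String skipDim) {
        for (Map.Entry<String, String> s : selections.entrySet()) {
            if (s.getKey().equals(skipDim)) {
                continue;
            }
            if (!s.getValue().equals(d.get(s.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Counts the values of dim; in sideways mode the selection on dim itself is not applied. */
    static Map<String, Integer> counts(List<Map<String, String>> hits,
                                       Map<String, String> selections,
                                       String dim, boolean sideways) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> d : hits) {
            String value = d.get(dim);
            if (value != null && matches(d, selections, sideways ? dim : null)) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> selected = Collections.singletonMap("brand", "Xiaomi");
        for (String dim : Arrays.asList("brand", "network")) {
            System.out.println(dim + " drill-down: " + counts(phoneHits(), selected, dim, false));
            System.out.println(dim + " sideways:   " + counts(phoneHits(), selected, dim, true));
        }
    }
}
```

In this model the brand dimension shows only Xiaomi under drill-down but both Xiaomi and Huawei under drill sideways, while the network counts are the same in both modes because no network value has been selected.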
Reposted from: https://www.cnblogs.com/davidwang456/p/10001465.html