How do I implement a custom filter with Lucene.net?

Date: 2021-06-16 03:06:53

The code below is from the book Lucene in Action (originally in Java). It builds a list of 'allowed' documents (from a user-permission point of view) to filter search results with. The problem is that the termDocs.Read() method does not take the 'docs' and 'freq' arrays by reference, so they're still empty when it comes to setting the bit in the bit array.

Can anyone help? Examples of using Lucene custom filters (especially in .NET) seem to be thin on the ground. Thanks.

public class LuceneCustomFilter : Lucene.Net.Search.Filter
{
    string[] _luceneIds;

    public LuceneCustomFilter(string[] luceneIds)
    {
        _luceneIds = luceneIds;
    }

    public override BitArray Bits(Lucene.Net.Index.IndexReader indexReader)
    {
        BitArray bitarray = new BitArray(indexReader.MaxDoc());

        int[] docs = new int[1];
        int[] freq = new int[1];

        for (int i = 0; i < _luceneIds.Length; i++)
        {
            if (!string.IsNullOrEmpty(_luceneIds[i]))
            {
                Lucene.Net.Index.TermDocs termDocs = indexReader.TermDocs(
                    new Lucene.Net.Index.Term(@"luceneId", _luceneIds[i]));

                int count = termDocs.Read(docs, freq);

                if (count == 1)
                {
                    bitarray.Set(docs[0], true);
                }
            }
        }

        return bitarray;
    }
}

I'm using Lucene.net 2.0.0.4, but the TermDocs interface still appears to be the same in the latest branch here: https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Index/TermDocs.cs

2 answers

#1


Here's a working Lucene.NET example that uses a custom filter, which you might take a look at:

using System;
using System.Collections;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

class Program
{
    static void Main(string[] args)
    {
        Directory index = new RAMDirectory();
        Analyzer analyzer = new KeywordAnalyzer();
        IndexWriter writer = new IndexWriter(index, analyzer, true);

        Document doc = new Document();
        doc.Add(new Field("title", "t1", Field.Store.YES, 
            Field.Index.TOKENIZED));
        writer.AddDocument(doc);
        doc = new Document();
        doc.Add(new Field("title", "t2", Field.Store.YES, 
            Field.Index.TOKENIZED));
        writer.AddDocument(doc);

        writer.Close();

        Searcher searcher = new IndexSearcher(index);
        Query query = new MatchAllDocsQuery();
        Filter filter = new LuceneCustomFilter();
        Sort sort = new Sort("title", true);
        Hits hits = searcher.Search(query, filter, sort);
        IEnumerator hitsEnumerator = hits.Iterator();

        while (hitsEnumerator.MoveNext())
        {
            Hit hit = (Hit)hitsEnumerator.Current;
            Console.WriteLine(hit.GetDocument().GetField("title").
                StringValue());
        }
    }
}

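public class LuceneCustomFilter : Filter
{
    // Returns one bit per document; only documents whose bit is set pass the filter.
    public override BitArray Bits(IndexReader indexReader)
    {
        BitArray bitarray = new BitArray(indexReader.MaxDoc());

        // Read() fills these caller-supplied buffers; one slot is enough here
        // because only one document has the title "t1".
        int[] docs = new int[1];
        int[] freq = new int[1];

        TermDocs termDocs = indexReader.TermDocs(
                new Term(@"title", "t1"));

        int count = termDocs.Read(docs, freq);
        termDocs.Close(); // release the enumerator once the buffers are filled
        if (count == 1)
        {
            bitarray.Set(docs[0], true);
        }
        return bitarray;
    }
}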
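
The filter above sets at most one bit, which is enough when the term identifies a single document. If a term can match several documents, the same buffer-filling call can be repeated until it returns 0. The sketch below shows that loop in isolation. `FakePostings` and `FilterSketch` are hypothetical names invented for this example and are not part of Lucene.NET; `FakePostings` only imitates the `Read(int[], int[])` shape used above so that the code runs without an index.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for TermDocs: hands out a fixed list of doc ids
// through caller-supplied buffers, the way Read(docs, freq) is used above.
class FakePostings
{
    private readonly int[] _docIds;
    private int _pos;

    public FakePostings(int[] docIds) { _docIds = docIds; }

    // Fills the buffers and returns how many entries were written (0 when exhausted).
    public int Read(int[] docs, int[] freqs)
    {
        int n = Math.Min(docs.Length, _docIds.Length - _pos);
        for (int i = 0; i < n; i++)
        {
            docs[i] = _docIds[_pos++];
            freqs[i] = 1; // the frequency is irrelevant for a filter
        }
        return n;
    }
}

static class FilterSketch
{
    // Sets one bit per matching document, reading in batches until Read returns 0.
    public static BitArray CollectBits(FakePostings postings, int maxDoc)
    {
        BitArray bits = new BitArray(maxDoc);
        int[] docs = new int[2];
        int[] freqs = new int[2];
        int count;
        while ((count = postings.Read(docs, freqs)) > 0)
        {
            for (int i = 0; i < count; i++)
            {
                bits.Set(docs[i], true);
            }
        }
        return bits;
    }
}
```

In a real filter the loop body would stay the same, with the `TermDocs` instance obtained from `indexReader.TermDocs(...)` taking the place of the stand-in.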

#2


I'm a bit confused here, because an array is a reference type: a method that receives it can modify the caller's array. For instance, the following snippet prints 10101010 (four 10s, since Console.Write adds no separators), showing that the array values have been updated.

Am I missing something here?

    public void TestPassing()
    {
        int[] stuff = new int[] {5, 5, 5, 5};

        Add(stuff, 5);
        for (int i = 0; i < stuff.Length; i++)
        {
            Console.Write(stuff[i]);
        }
    }

    public void Add(int[] stuff, int x)
    {
        for(int i = 0; i < stuff.Length; i++)
        {
            stuff[i] = stuff[i] + x;
        }
    }
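
To make the distinction concrete: a method that writes into the elements of an array it was given changes the caller's array, while a method that assigns a new array to its parameter does not, unless the parameter is declared `ref`. The sketch below contrasts the three cases; `ArrayPassingDemo` and its method names are made up for this illustration.

```csharp
using System;

static class ArrayPassingDemo
{
    // Writes into the caller's array: the change is visible to the caller.
    public static void FillInPlace(int[] buffer, int value)
    {
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = value;
        }
    }

    // Rebinds only the local parameter: the caller's array is untouched.
    public static void Reassign(int[] buffer, int value)
    {
        buffer = new int[] { value };
    }

    // With 'ref', the caller's variable itself is rebound to the new array.
    public static void ReassignByRef(ref int[] buffer, int value)
    {
        buffer = new int[] { value };
    }
}
```

A call such as `termDocs.Read(docs, freq)` falls into the first case: it is handed pre-allocated buffers and writes into them, so no `ref` is needed for the caller to see the results.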
