.NET Lucene: Why doesn't a query built with MultiFieldQueryParser and WhitespaceAnalyzer inside a BooleanQuery return the correct results, and how can it be fixed?

Time: 2021-06-18 03:10:05

So, I have a short query that I want to construct. I'm using a BooleanQuery to specify that the "type" field of any Document matched from the index must be "Idea", and I also have a search string given by a user that may be one or more words. I want to restrict the results programmatically so the client only sees docs whose "type" field equals "Idea", but I also want any word in the user's search phrase to be able to match a word in the result. I think my code below explains exactly what I want.


WhitespaceAnalyzer analyzer = new WhitespaceAnalyzer();

MultiFieldQueryParser parser = new MultiFieldQueryParser(
    Version.LUCENE_30, new string[] { "company", "description",
    "name", "posterName"}, analyzer);

parser.AllowLeadingWildcard = true;

Lucene.Net.Search.Query query = parser.Parse(searchParam); 

BooleanQuery bq = new BooleanQuery(); 

TermQuery tQuery = new TermQuery(new Lucene.Net.Index.Term("type", "Idea"));

bq.Add(tQuery, Lucene.Net.Search.Occur.MUST);

bq.Add(query, Lucene.Net.Search.Occur.MUST);

The pertinent part of my indexing code is below:


Document doc = new Document();
doc.Add(new Field("type",
doc.Add(new Field("company",
    (_idea.Company==null ?
      "Company Not Set for Idea" 
      : _idea.Company.Name),
doc.Add(new Field("description",
doc.Add(new Field("name",
if (_idea.Poster != null)
    doc.Add(new Field("posterName",
      _idea.Poster.FirstName + " " + _idea.Poster.LastName,
      Field.Store.YES, Field.Index.ANALYZED));
doc.Add(new Field("ID",
    _idea.ID.ToString(), Field.Store.YES,

What I don't understand is that when I search for a word that I KNOW exists in the index, it returns no results. It's only if I search with a wildcard like "*" that I get any results. Going by the documentation for MultiFieldQueryParser, I would expect it to return a match if any piece of any of the listed fields (company, description, name, etc.) is found in a doc. But it doesn't. For example, I know one of the docs has a name field of "Another Idea". When I search for "Another", "another", "Idea", etc., it should return that particular doc. But it doesn't... it does, however, correctly filter the results by type.


What do I need to do to get this short code snippet to return the matches I want?


2 Solutions



I figured out how to solve this, and it turns out to be a no-brainer (depending on how much you know about Lucene and Visual Studio ASP projects, which I'm not that familiar with). This is my first.


It turns out that you can use a BooleanQuery to combine several queries and specify how they operate together (MUST, SHOULD, and so on). You then pass the combined query to the searcher.
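For anyone who wants to see that combination in isolation, here is a minimal sketch, assuming Lucene.Net 3.0.3. The class name, the in-memory index, and the single sample document are hypothetical and only there to make it self-contained; the "type" field is assumed to be indexed NOT_ANALYZED so the exact term "Idea" exists.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class BooleanQueryDemo
{
    // Indexes one hypothetical document in memory and returns the number
    // of hits for the combined query built from the given word.
    public static int CountHits(string word)
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);
        var directory = new RAMDirectory();

        using (var writer = new IndexWriter(directory, analyzer, true,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            // Assumption: "type" is kept as one untokenized term.
            doc.Add(new Field("type", "Idea", Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.Add(new Field("name", "Another Idea", Field.Store.YES, Field.Index.ANALYZED));
            writer.AddDocument(doc);
        }

        // Inner query: a match on either field is enough (SHOULD).
        var inner = new BooleanQuery();
        inner.Add(new TermQuery(new Term("name", word)), Occur.SHOULD);
        inner.Add(new TermQuery(new Term("description", word)), Occur.SHOULD);

        // Outer query: the type restriction AND the inner query are both required (MUST).
        var outer = new BooleanQuery();
        outer.Add(new TermQuery(new Term("type", "Idea")), Occur.MUST);
        outer.Add(inner, Occur.MUST);

        using (var searcher = new IndexSearcher(directory, true))
        {
            return searcher.Search(outer, 10).TotalHits;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountHits("another"));
        Console.WriteLine(CountHits("Another"));
    }
}
```

Note that TermQuery does no analysis of its own, so the word has to equal the indexed term exactly. With StandardAnalyzer at index time the indexed terms are lowercased, which is why the two calls in Main can give different counts.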


My mistake was that I wasn't splitting the search string into words and creating a query from each one. The sample solution that works for me is below:


    StandardAnalyzer analyzer =
        new StandardAnalyzer(Version.LUCENE_30);
    MultiFieldQueryParser mfqp = new MultiFieldQueryParser(
        Version.LUCENE_30, new string[] {"company", "description",
        "name", "posterName"}, analyzer);
    mfqp.DefaultOperator = MultiFieldQueryParser.OR_OPERATOR;
    mfqp.AllowLeadingWildcard = true;

    // Parse each word of the search string separately and OR the results.
    BooleanQuery innerExpr = new BooleanQuery();
    foreach (string s in searchParam.Split(new char[] {' '})) {
        innerExpr.Add(mfqp.Parse(s), Occur.SHOULD);
    }

    // Also try the whole search string as a wildcard pattern on each field.
    innerExpr.Add(new WildcardQuery(new Term("company", searchParam)), Occur.SHOULD);
    innerExpr.Add(new WildcardQuery(new Term("description", searchParam)), Occur.SHOULD);
    innerExpr.Add(new WildcardQuery(new Term("name", searchParam)), Occur.SHOULD);
    innerExpr.Add(new WildcardQuery(new Term("posterName", searchParam)), Occur.SHOULD);

    // Restrict the results to "Idea" documents with a filter instead of a MUST clause.
    TermQuery tQuery = new TermQuery(new Term("type", "Idea"));

    //bq.Add(mfqp.Parse(searchParam), Lucene.Net.Search.Occur.MUST);
    TopDocs hits = sharedIndex.Search(innerExpr,
        new QueryWrapperFilter(tQuery), 1000,
        new Sort(SortField.FIELD_DOC));

This entire route wasn't clear to me when I started on this.
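One difference between the code in the question and the working version is the analyzer: WhitespaceAnalyzer in the question, StandardAnalyzer here. I have not confirmed that this was the cause of the missing results, but it is easy to check what each analyzer does to a value. The sketch below assumes Lucene.Net 3.0.3 (where the term text is exposed through ITermAttribute); the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Tokenattributes;
using Version = Lucene.Net.Util.Version;

public static class AnalyzerDemo
{
    // Runs the text through the analyzer and collects the terms it emits.
    public static List<string> Tokens(Analyzer analyzer, string text)
    {
        var result = new List<string>();
        // The field name only matters for per-field analyzers; "name" is arbitrary here.
        TokenStream stream = analyzer.TokenStream("name", new StringReader(text));
        ITermAttribute term = stream.AddAttribute<ITermAttribute>();
        while (stream.IncrementToken())
        {
            result.Add(term.Term);
        }
        return result;
    }

    public static void Main()
    {
        // WhitespaceAnalyzer only splits on whitespace and keeps the case.
        Console.WriteLine(string.Join(",", Tokens(new WhitespaceAnalyzer(), "Another Idea")));
        // StandardAnalyzer also lowercases the tokens.
        Console.WriteLine(string.Join(",", Tokens(new StandardAnalyzer(Version.LUCENE_30), "Another Idea")));
    }
}
```

If the index was written with one analyzer and the query is parsed with another, the terms on the two sides may not line up, so it seems safest to use the same analyzer for both.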




One improvement you can make to that solution, in order to accommodate future changes to your index, would be to create a string array variable to hold your field names, e.g.:


string[] allFields = new string[] {"company", "description", 
     "name", "posterName"};

which in turn will give you a value to put into your parser:


MultiFieldQueryParser mfqp = new MultiFieldQueryParser(
     Version.LUCENE_30, allFields, analyzer);

and the ability to iterate through the fields and have a single line to add your wildcard queries:


foreach (string searchField in allFields) {
    innerExpr.Add(new WildcardQuery(new Term(searchField, searchParam)), Occur.SHOULD);
}

Then, in the future, you only need to add, change, or remove field names in the array, rather than manage your list of queries by hand.




