Django and PostgreSQL full-text search: some terms cannot be found with the search lookup

Time: 2022-09-13 08:31:51

I am using Django 2.0 and PostgreSQL 9.6.1.

I have the following model with headline and body_text fields:

from django.db import models


class Entry(models.Model):
    headline = models.CharField(max_length=255)
    body_text = models.TextField()

    def __str__(self):
        return self.headline

Below is my content:

headline: cheese making

body_text:

The simplest way to use full text search is to search a single term against a single column in the database. For example: >>> Entry.objects.filter(body_text__search='Cheese') [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]. This creates a to_tsvector in the database from the body_text field and a plainto_tsquery ...

The following are the search results using the search lookup. I have added 'django.contrib.postgres' to INSTALLED_APPS.

Case 1: Works
In [1]: Entry.objects.filter(body_text__search='Cheese')
Out[1]: <QuerySet [<Entry: cheese making>]>

Case 2: Not working
In [2]: Entry.objects.filter(body_text__search='Pizza')
Out[2]: <QuerySet []>
(the word Pizza is in the body_text, but the search still does not find it)

Case 3: Not working
In [3]: Entry.objects.filter(body_text__search='vector')
Out[3]: <QuerySet []>
(the word vector appears in to_tsvector)

Case 4: Not working
In [9]: Entry.objects.filter(body_text__search='Entry')
Out[9]: <QuerySet []>

Case 5: Not working
In [10]: Entry.objects.filter(body_text__search='data')
Out[10]: <QuerySet []>

How can I search for the terms that are not being found?

1 Solution

#1

We have used PostgreSQL's full-text search module in Django for some projects at work. I think the full-text search is stripping HTML tags from your Entry's body_text: it drops <Entry: Cheese on Toast recipes> and <Entry: Pizza Recipes> because the < and > make them look like tags.

I applied to_tsvector to your example, with and without the < and > characters, and the resulting vectors are different:

SQL Fiddle

SELECT to_tsvector('The simplest way to use full text search is to search a single term against a single column in the database. For example: >>> Entry.objects.filter(body_text__search=''Cheese'') [<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]. This creates a to_tsvector in the database from the body_text field and a plainto_tsquery ...');

'bodi':25,39 'chees':28 'column':18 'creat':30 'databas':21,36 'entry.objects.filter':24 'exampl':23 'field':41 'full':6 'plainto':44 'search':8,11,27 'simplest':2 'singl':13,17 'term':14 'text':7,26,40 'tsqueri':45 'tsvector':33 'use':5 'way':3

SELECT to_tsvector('The simplest way to use full text search is to search a single term against a single column in the database. For example: >>> Entry.objects.filter(body_text__search=''Cheese'') [Entry: Cheese on Toast recipes, Entry: Pizza Recipes]. This creates a to_tsvector in the database from the body_text field and a plainto_tsquery ...');

'bodi':25,47 'chees':28,30 'column':18 'creat':38 'databas':21,44 'entri':29,34 'entry.objects.filter':24 'exampl':23 'field':49 'full':6 'pizza':35 'plainto':52 'recip':33,36 'search':8,11,27 'simplest':2 'singl':13,17 'term':14 'text':7,26,48 'toast':32 'tsqueri':53 'tsvector':41 'use':5 'way':3
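
To make the difference between the two vectors explicit, here is a minimal Python sketch that compares the lexemes in the two outputs above. The two strings are copied from those outputs; the `lexemes` helper and the constant names are hypothetical and exist only for this illustration.

```python
import re

# tsvector text output of the first query (body_text containing < and >).
WITH_BRACKETS = (
    "'bodi':25,39 'chees':28 'column':18 'creat':30 'databas':21,36 "
    "'entry.objects.filter':24 'exampl':23 'field':41 'full':6 'plainto':44 "
    "'search':8,11,27 'simplest':2 'singl':13,17 'term':14 'text':7,26,40 "
    "'tsqueri':45 'tsvector':33 'use':5 'way':3"
)

# tsvector text output of the second query (< and > removed).
WITHOUT_BRACKETS = (
    "'bodi':25,47 'chees':28,30 'column':18 'creat':38 'databas':21,44 "
    "'entri':29,34 'entry.objects.filter':24 'exampl':23 'field':49 'full':6 "
    "'pizza':35 'plainto':52 'recip':33,36 'search':8,11,27 'simplest':2 "
    "'singl':13,17 'term':14 'text':7,26,48 'toast':32 'tsqueri':53 "
    "'tsvector':41 'use':5 'way':3"
)


def lexemes(tsvector_text):
    """Return the set of lexemes in a tsvector's text representation."""
    return set(re.findall(r"'([^']+)':[\d,]+", tsvector_text))


# Lexemes that only appear once the angle brackets are removed.
missing = sorted(lexemes(WITHOUT_BRACKETS) - lexemes(WITH_BRACKETS))
print(missing)  # ['entri', 'pizza', 'recip', 'toast']
```

The lexemes 'entri' and 'pizza' only exist in the second vector, which is consistent with cases 2 and 4 returning nothing. Neither vector contains 'vector' or 'data' as a lexeme (only 'tsvector' and 'databas'), and the search matches whole lexemes rather than substrings, which would be consistent with cases 3 and 5 as well.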

So try your query again after removing < and > from your body_text.
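
One way to do that is sketched below in plain Python. The helper name `strip_angle_brackets` is hypothetical, and replacing the brackets with spaces (rather than deleting them) is an assumption made so that neighbouring words stay separated.

```python
# Map both angle brackets to a space so adjacent words are not glued together.
_ANGLE_BRACKETS = str.maketrans({"<": " ", ">": " "})


def strip_angle_brackets(text):
    """Return text with every < and > replaced by a space."""
    return text.translate(_ANGLE_BRACKETS)


sample = "[<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]"
cleaned = strip_angle_brackets(sample)
print(cleaned)
```

You could apply such a helper to body_text before saving the Entry. Note that this changes the stored text, so it only fits if you do not need the original brackets.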

Note

"to_tsvector" is a PostgreSQL function for converting a document to the tsvector data type.

https://www.postgresql.org/docs/9.6/static/textsearch-controls.html

django.contrib.postgres uses it internally to provide the search lookup (__search).

https://docs.djangoproject.com/en/2.0/ref/contrib/postgres/search/#the-search-lookup
