php / mysql / ajax: Google-style search with suggestions

Time: 2022-10-28 21:55:20

I have an ajax script that searches database tables for expressions, similar to a Google search. The SELECT statement just uses LIKE and finds matches in the relevant fields. It worked fine at first, but as the content has grown, it returns far too many matches for most search strings.

For example, if you search for att, you get att but also attention, attaboy, buratta etc.

Good search engines such as Google seem to have an intermediate table of suggestions that have been vetted by others. Rather than searching the data directly, they seem to search approved phrases such as AT&T, and so succeed in narrowing the number of results. Has anyone coded something like this, and can you suggest the right database schema and query to get relevant results?

Right now I am searching a table of, say, names directly with something like:

$sql = "SELECT lastname from people WHERE lastname LIKE '%$searchstring%'";

I imagine that, besides people, I should create an intermediate table along the lines of:

people

id|firstname|lastname|description

niceterms

id|niceterm|peopleid

Then the query could be:

$sql = "SELECT p.id, p.lastname, n.niceterm, n.peopleid
FROM `people` p
LEFT JOIN `niceterms` n
ON p.id = n.peopleid
WHERE n.niceterm LIKE '%$searchterm%'";

So when you type something in the search box, you get nice search terms that will yield better results.

But how do I populate the niceterms table? Is this the right approach? I'm not trying to create a whole backweb or pagerank. I just want to narrow search results so they are relevant.

Thanks for any suggestions.

1 solution

#1


You might want to take a look at FULLTEXT search in MySQL. It allows you to build powerful queries that rank results by relevance. For example, you can run a BOOLEAN MODE search that adds a score column to your result. The score is based on rules such as: does the text start with a given character combination? (Yes: +2; no, but it does contain the combination: +1.)

The code below is just another column in the SELECT list, and it has 3 rules in it:

  • Does the p1.name field contain Bl or rock? If yes -> add score

  • Does the p1.name field start with either Bl or rock? If yes -> add score

  • Is p1.name equal to Bl rock? If yes -> add score

    MATCH(p1.name) AGAINST('>Bl* >rock* >((+Bl*) (+rock*)) >("Bl rock")' IN BOOLEAN MODE) AS `match`

Note that FULLTEXT works on whole words, so Bl* means "a word beginning with Bl", not an arbitrary substring. The alias is written in backticks because MATCH is a reserved word in MySQL.
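MATCH ... AGAINST only works on columns covered by a FULLTEXT index. A minimal sketch of creating one, assuming a hypothetical table named `products` behind the alias p1 (the table and index names are illustrative; only the `name` column comes from the answer):

```sql
-- Assumption: p1 is an alias for a table called products; the real name is not given.
ALTER TABLE products ADD FULLTEXT INDEX ft_name (name);

-- Words shorter than the minimum token size are not indexed, which matters
-- for short search strings such as "att". Check the current settings:
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';  -- applies to InnoDB tables
SHOW VARIABLES LIKE 'ft_min_word_len';           -- applies to MyISAM tables
```

Changing either setting requires a server restart and a rebuild of the FULLTEXT index.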

Now just order by `match` and the most relevant results come first. You can also combine several ORDER BY criteria, and add a LIMIT to cap the number of rows.

The clause below orders by most recent date, then by highest match score, and finally sorts rows with the same score by the character length of the name:

ORDER BY `date` DESC, `match` DESC, LENGTH(`p1`.`name`) ASC
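Put together, the score column and the ORDER BY clause might be used as follows. This is only a sketch, assuming a hypothetical `products` table with `id`, `name` and `date` columns and a FULLTEXT index on `name`; the boolean expression is shortened to two prefix terms, and the LIMIT of 10 is arbitrary:

```sql
-- Assumption: products is a placeholder table name; p1, name and date come from the answer.
SELECT p1.id,
       p1.name,
       p1.`date`,
       -- Relevance score computed by MySQL for the boolean expression
       MATCH(p1.name) AGAINST('Bl* rock*' IN BOOLEAN MODE) AS `match`
FROM products AS p1
-- Keep only rows in which at least one of the prefix terms matches
WHERE MATCH(p1.name) AGAINST('Bl* rock*' IN BOOLEAN MODE)
ORDER BY p1.`date` DESC, `match` DESC, LENGTH(p1.name) ASC
LIMIT 10;
```

If relevance should win over recency, move `match` DESC in front of the date in the ORDER BY list. When the search string comes from user input, pass it as a bound parameter of a prepared statement rather than interpolating it into the SQL.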

Keep in mind that the above code only approximates relevance based on common cases. Copying Google will be impossible, since their algorithms for optimal results and speed are incredible.

If FULLTEXT search is a step too far, try building a tag system. Tagging content with unique tag combinations will also give more reliable search results.
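One possible shape for such a tag system, sketched against the `people` table from the question. The `tags` and `people_tags` tables, their columns, and the sample values are assumptions for illustration, not something the answer specifies:

```sql
-- Hypothetical tag tables; only people and its id/firstname/lastname columns come from the question.
CREATE TABLE tags (
  id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  tag VARCHAR(64) NOT NULL,
  UNIQUE KEY uq_tag (tag)
);

CREATE TABLE people_tags (
  peopleid INT UNSIGNED NOT NULL,
  tagid    INT UNSIGNED NOT NULL,
  PRIMARY KEY (peopleid, tagid),
  KEY idx_tagid (tagid)
);

-- Step 1: suggest tags by prefix. A trailing-only wildcard can use the index on tag,
-- and it no longer matches "buratta" for the input "att".
SELECT t.tag
FROM tags AS t
WHERE t.tag LIKE CONCAT('att', '%')
ORDER BY t.tag
LIMIT 10;

-- Step 2: once the user picks a suggestion, fetch the people carrying that exact tag.
SELECT p.id, p.firstname, p.lastname
FROM people AS p
JOIN people_tags AS pt ON pt.peopleid = p.id
JOIN tags AS t ON t.id = pt.tagid
WHERE t.tag = 'AT&T';
```

The literals 'att' and 'AT&T' stand in for user input; in PHP they would be bound as parameters of a prepared statement.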
