Extract all words from a text field in MySQL

Date: 2022-09-13 09:49:06

I have a table with text fields, each holding around 20 to 50 sentences depending on the row. I am building an auto-complete widget with HTML and PHP: when I start typing the beginning of a word, I would like the database to return the sentences containing words that start with it (like the navigation pane in Microsoft Office 2007/2010).

I need MySQL to return those words or sentences as separate results, so I can manipulate them further.

Example:

-------------------------------------------------------------------------
| id | title  | content                                                 |
-------------------------------------------------------------------------
| 1  | test 1 | PHP is a very nice language and has nice features.      |
| 2  | test 2 | Spain is a nice country to visit and has nice language. |
| 3  | test 3 | Perl isn't as nice a language as PHP.                   |
-------------------------------------------------------------------------

I need a MySQL query that returns the following as separate results:

1,"nice language"
1,"nice features"
2,"nice country"
2,"nice language"
3,"nice a language"

Here is my SQL query:

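-- Return up to 50 characters starting at the first occurrence of 'nice'.
-- MATCH requires a FULLTEXT index on the listed columns.
SELECT id, SUBSTR(content, POSITION('nice' IN content), 50)
    FROM entries
    WHERE MATCH (title, content) AGAINST ('nice' WITH QUERY EXPANSION)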

1 Answer

#1


New Answer

The OP's question actually has nothing to do with php and javascript; it concerns doing string manipulation directly within MySQL.

String manipulation isn't really the main focus of a DBMS. When dealing with "words" in free-flowing text, a lot of logic is needed to determine where the next word boundary is, and you really don't want your database doing that. Plus, any query written to do this will probably be incredibly difficult to read.

It depends on exactly what you are doing, but a DB-only approach is quite likely to be slower: SQL's string functions are pretty limited, so you end up chaining many more function calls.

And for re-usability and best practice, what if you wanted to change your database in the future to, say, MongoDB? You'd need to rewrite the whole damned awkward query.

No, my suggestion would be to pull the whole value into PHP using a standard MySQL query, throw it at PCRE with a very simple regex, job done. It's also better to show what you're actually doing in your PHP code, as it's more "intention revealing".
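As a rough illustration of that regex step, here is a minimal sketch. It is written in JavaScript rather than PHP; in PHP, preg_match_all would play the role of matchAll. The function name extractPhrases, the shape of the rows and the "matching word plus the next word" rule are all assumptions made for illustration, not something the question specifies.

```javascript
// Hypothetical helper: for each row, find every word that starts with
// `prefix` and return it together with the word that follows it.
function extractPhrases(rows, prefix) {
  // Escape regex metacharacters so the typed text is matched literally.
  const safe = prefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // \b      word boundary, so "nice" does not match inside "venice"
  // \w*     the rest of the word being typed
  // \s+\w+  whitespace plus the following word
  const pattern = new RegExp("\\b" + safe + "\\w*\\s+\\w+", "gi");
  const results = [];
  for (const row of rows) {
    for (const match of row.content.matchAll(pattern)) {
      results.push({ id: row.id, phrase: match[0] });
    }
  }
  return results;
}

// Sample rows standing in for what the MySQL query would return.
const rows = [
  { id: 1, content: "PHP is a very nice language and has nice features." },
  { id: 2, content: "Spain is a nice country to visit and has nice language." },
  { id: 3, content: "Perl isn't as nice a language as PHP." },
];

console.log(extractPhrases(rows, "nice"));
```

With the sample rows this yields "nice language" and "nice features" for row 1, "nice country" and "nice language" for row 2, and "nice a" for row 3. Getting "nice a language" for row 3 would need an extra rule, such as skipping articles, which is exactly the kind of logic that is easier to express in application code than in SQL.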

At least 33% of a developer's work is picking the right tool for the job. PHP is the right tool in this example.

Original Answer

You have included the tags php and javascript, so I'm guessing (although your question needs more clarification on this) that you want this 'autocomplete' running client-side. As a result, you have to get your data from server-side to client-side first.

Twitter Bootstrap has something really cool called Typeahead. It uses JavaScript to do what (I think) you require: the example on that page shows how you can start typing a country and it'll auto-complete it for you. It looks like this:

[Screenshot of the Typeahead example]

How do you get this working? Include the required JavaScript file first, and then write your HTML.

Here's some of the source code from the Bootstrap page so you can see how it works:

<input type="text" data-provide="typeahead" data-items="4" data-source='["Alabama","Alaska","Arizona","Arkansas","California"]'>

Can you see how the data-source attribute is the one that gives the typeahead the information you want? You want to connect to MySQL, grab your data, and shove it into the data-source array for the JavaScript to work with, as above.

So, on page load, you connect to MySQL and pull all the relevant strings you would like to be "auto-complete-able" from the database. You then put these into the data-source attribute of the typeahead input, and that's pretty much it!
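To make the wiring concrete, here is a small sketch of building that data-source attribute from a list of strings. It is shown in JavaScript; the same few lines translate directly to PHP with json_encode and htmlspecialchars. The helper names escapeHtmlAttribute and renderTypeaheadInput are made up for this example, and in a real page the strings would come from your MySQL query rather than a hard-coded array.

```javascript
// Hypothetical helper: escape characters that would break an HTML attribute.
function escapeHtmlAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/'/g, "&#39;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: render the typeahead input with its suggestions
// serialised as a JSON array inside the data-source attribute.
function renderTypeaheadInput(suggestions, maxItems) {
  const json = JSON.stringify(suggestions);
  return (
    '<input type="text" data-provide="typeahead"' +
    ' data-items="' + maxItems + '"' +
    " data-source='" + escapeHtmlAttribute(json) + "'>"
  );
}

// In a real page these strings would be the phrases pulled from MySQL.
console.log(renderTypeaheadInput(["nice language", "nice country"], 4));
```

The escaping matters because the stored text may itself contain quotes or apostrophes; the browser decodes the entities when it reads the attribute, so the JavaScript still sees a valid JSON array.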

--

Edit: There's a fork of Twitter Bootstrap's Typeahead that allows AJAX calls, so you could use it to perform the data retrieval asynchronously (if you can figure it out, I'd recommend this approach).
