Why is Gmail search so fast?

Time: 2021-05-19 16:59:54

What is the most efficient way to search through so many characters? What do you think?

Let's say the website is built with PHP and MySQL.

What should I learn to be able to build this as efficiently as possible? Are there any algorithms I should study?

4 answers

#1


2  

Text indexing algorithm
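
One common text indexing structure is the inverted index, which maps each word to the documents that contain it, so a query only touches the entries for its own words instead of scanning every message. The following is a minimal sketch in Python, not a description of how Gmail actually works; the function names and the sample mailbox are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    # Intersect the posting sets, smallest first, to keep the work small.
    postings = sorted((index.get(w, set()) for w in words), key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return sorted(result)


# Hypothetical sample mailbox, used only for illustration.
docs = {
    1: "Meeting moved to Friday",
    2: "Friday lunch plans",
    3: "Invoice for the meeting room",
}
index = build_index(docs)
print(search(index, "friday meeting"))  # prints [1]
```

The index is built once and updated as messages arrive, so the cost of a search depends on how many documents match the query words rather than on the total amount of text stored.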

#2


1  

Google uses a custom-built database called BigTable (http://en.wikipedia.org/wiki/Big_table), which runs across hundreds of linked servers all over the world. They're fast because they wrote the software specifically to be fast and set up the hardware to squeeze the most out of it.

You can get a decent setup with PHP and MySQL, but once you start dealing with very large data sets, MySQL, like any other general-purpose database, will start to buckle under the load. If you want to learn more about this, a good place to start is to search for concurrency in database design (briefly explained at http://en.wikipedia.org/wiki/Concurrency_control, among others), which is a topic far too large to cover in a * reply =)

#3


0  

Google goes beyond simply optimizing the databases and the code. They also do a lot of distributed programming. While the exact mechanisms they use to power systems such as Gmail are closely guarded secrets, it is known that they run entire farms of networked computers, each working on part of the index at any given time, rather than relying on a single server.
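
The idea of many machines each working on part of the index can be illustrated with a scatter-gather search: the query goes to every shard in parallel and the partial results are merged. This is only a toy sketch under stated assumptions, not Google's actual mechanism. Threads stand in for separate servers, each dict stands in for one server's slice of the data, and the names `search_shard` and `search_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, word):
    """Return the ids of documents in one shard whose text contains the word."""
    word = word.lower()
    return [doc_id for doc_id, text in shard.items()
            if word in text.lower().split()]


def search_all(shards, word):
    """Query every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = pool.map(lambda shard: search_shard(shard, word), shards)
        return sorted(doc_id for part in partials for doc_id in part)


# Hypothetical data: each dict stands in for one server's slice of the mail.
shards = [
    {1: "Meeting moved to Friday", 2: "Lunch plans"},
    {3: "Friday invoice", 4: "Weekly report"},
    {5: "See you Friday"},
]
print(search_all(shards, "friday"))  # prints [1, 3, 5]
```

Because each shard only looks at its own slice, adding more shards keeps the per-machine work roughly constant as the total data grows, at the cost of having to merge the partial answers.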

#4


0  

For MySQL, look at the Full-Text Search Functions (FULLTEXT indexes queried with MATCH ... AGAINST).

This assumes your content is stored in the database (as in a CMS).
