What is the easiest way to implement search on a website?

Time: 2021-06-15 20:01:48

The website is almost entirely D/X/HTML and is hosted on a Linux/Apache server.

While I'm not opposed to using a database, I've been told that I can implement a solution that parses the HTML documents and returns search results without mucking about too much with ASP/PHP/CGI (in which I am most certainly a novice).

Is this possible? Is there a better way? Should I look to a specific third party application?

THANKS!!!

8 answers

#1


Instead of paying for search appliances, you can also pay Google to crawl your site and present customized search results. It's inexpensive, and Google does a good job of indexing everything (including PDFs). If I remember correctly, the ad-supported version is free (i.e. you pay to remove the ads).

#2


There are "spiders" that will crawl your site and generate some form of search index. How reliable these are and how well they perform I really can't say. We recently purchased two Google search appliances here at work and use one for our intranet and one for our external web. They do a very nice job of indexing exactly the content you want as well as setting up specialized "search zones" and even keyword mapping.

I highly recommend them: http://www.google.com/enterprise/mini/

  • Nicholas

#3


Google search is the easiest route. The only thing I would suggest is that you add a Google sitemap to your site. That way you can notify Google of updates or new pages so that the search listing stays as up to date as possible.
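
For anyone wondering what such a sitemap looks like, here is a minimal sketch that builds one following the sitemaps.org XML format. The function names (`escapeXml`, `buildSitemap`) and the example URLs are made up for illustration; only the `urlset`/`url`/`loc`/`lastmod` element names come from the sitemap format itself.

```javascript
// Escape the five characters that are special in XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// pages: array of { loc, lastmod }, where lastmod (YYYY-MM-DD) is optional.
// Returns the sitemap as a string; saving it as sitemap.xml is up to you.
function buildSitemap(pages) {
  const entries = pages.map((page) => {
    const lastmod = page.lastmod
      ? `\n    <lastmod>${escapeXml(page.lastmod)}</lastmod>`
      : "";
    return `  <url>\n    <loc>${escapeXml(page.loc)}</loc>${lastmod}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

// Hypothetical pages, for illustration only.
console.log(
  buildSitemap([
    { loc: "http://www.example.com/", lastmod: "2021-06-15" },
    { loc: "http://www.example.com/about.html" },
  ])
);
```

For a small static site you could just as well write the file by hand; a script like this only pays off once the page list changes often.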

#4


If you can write some code in your favorite programming language, you can also have a look at Apache Solr (url). The concept is simple: you get a separate search server, already implemented and running as a standalone program. You add documents by sending them to the search server with an HTTP POST. You run searches by issuing a GET request and getting back an XML file with the search results.

What you have to write is the code that sends the files to the search server (only a few lines of code) and the parsing of the XML search results (which can be done easily with XSLT).
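
To give a rough idea of how little code that is, here is a sketch of the two requests. Everything specific in it is an assumption: the host and port, a core named `site`, the field names `id`, `title` and `content` (these depend on your Solr schema), and sending the documents as JSON to the `/update` handler, which recent Solr versions accept. The helpers only build the requests, so you can inspect them before sending anything.

```javascript
// Assumed values: adjust host, port and core name to your own Solr install.
const SOLR_BASE = "http://localhost:8983/solr/site";

// Build the HTTP POST that adds (or replaces) documents in the index.
// docs is an array of plain objects whose keys match your schema's fields.
function buildUpdateRequest(docs) {
  return {
    url: `${SOLR_BASE}/update?commit=true`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(docs),
    },
  };
}

// Build the GET URL for a search; wt=xml asks for an XML response.
function buildSelectUrl(query, rows = 10) {
  const params = new URLSearchParams({ q: query, rows: String(rows), wt: "xml" });
  return `${SOLR_BASE}/select?${params}`;
}

// Hypothetical document, for illustration only.
const req = buildUpdateRequest([
  { id: "index.html", title: "Home", content: "Welcome to the site" },
]);
console.log(req.url);
console.log(buildSelectUrl("welcome"));

// With a Solr server running, the requests could be sent like this:
//   await fetch(req.url, req.options);
//   const xml = await (await fetch(buildSelectUrl("welcome"))).text();
```

The XML that comes back from the second request is what you would then run through XSLT or parse in your own code.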

I don't know how many documents you are talking about, but this solution scales very well. I currently use it with 2.5 million pages in the index and get results in under 50 ms.

#5


Add a link to Google that only returns results for your domain (using the site: operator). I don't know how to do this, but it shouldn't be hard.
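
One possible way to build such a link, as a small sketch: put `site:yourdomain` in front of the search terms and pass the result as the `q` parameter. The function name `siteSearchUrl` and the example domain are made up for illustration.

```javascript
// Build a Google search URL restricted to one domain via the site: operator.
function siteSearchUrl(domain, terms) {
  // URLSearchParams takes care of URL-encoding the query string.
  const params = new URLSearchParams({ q: `site:${domain} ${terms}` });
  return `https://www.google.com/search?${params}`;
}

console.log(siteSearchUrl("example.com", "contact form"));
// prints "https://www.google.com/search?q=site%3Aexample.com+contact+form"
```

You can use the resulting URL as the `href` of a plain link or redirect to it from a search box.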

#6


Thanks all! I'm currently looking into a Google custom search engine. The search bars with logos are cumbersome, but if all Google wants for the legwork on this is a watermarked search bar and a couple of ads served, then that's the solution for me!

#7


Here's how I did the search on my blog (using Google)... I don't remember where I got this template from originally, but from the comments I guess it came from javascriptkit.com. :)

<script type="text/javascript">

// Google Internal Site Search script- By JavaScriptKit.com(http://www.javascriptkit.com) 
// For this and over 400+ free scripts, visit JavaScript Kit-http://www.javascriptkit.com/ 
// This notice must stay intact for use

//Enter domain of site to search. 
var domainroot="ericasberry.com"

function Gsitesearch(curobj) 
{ 
    curobj.q.value="site:"+domainroot+" "+curobj.qfront.value 
}

</script>


<form action="http://www.google.com/search" method="get"
    onSubmit="Gsitesearch(this)">

<p>Search ericasberry.com:<br /> 
<input name="q" type="hidden" /> 
<input name="qfront" type="text" style="width: 180px" /> 
<input type="submit" value="Search" /></p>

</form>

#8


Google Ajax Search API
