Can Lucene.NET be used for a tag-based search system?

Date: 2021-07-29 03:04:46

I'm developing an ASP.NET MVC3 app which will have a few hundred videos. I want to create a search system based on tags and other parameters, such as the type of user that uploaded the video, the date of the video, the video category, etc.

I have been looking around and Lucene.NET seems like a really good tool for full-text search, but I don't know if it's the best solution for my project. I have read the tutorials, and they recommend keeping the search index to a minimum, but also say that you should NOT hit your database to retrieve extra data that is not stored in the search index.


How can this be possible?


Let's take an example: I have a video row (as a concept; it is really held in different SQL tables) which has columns for the video id, the video name, the video file name, the full path, user id, user type, tags, creation date, video category, video subcategory, video location, etc. If I want to create a Lucene search index, I think I will have to put all of that information in there so that later on I can query on every parameter, right?


This seems to me like a duplicate of the SQL database, but with the added overhead of adding, editing and removing documents in the Lucene search index. Is this the standard scenario when using Lucene? All the examples I have seen with Lucene are based on a post id, post title and post body.

What do you think? Can you shed some light on this?


1 Answer



Yes, if you want to query multiple fields (including things like tags) from within Lucene, you'll need to make that data available to Lucene. It might sound like duplication, but it is not redundant duplication: it is restructuring the data into a very different layout, one that is indexed for search.
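
To make "a very different layout" a bit more concrete, here is a minimal, language-agnostic sketch in Python of an inverted index. It is not Lucene.NET's API or its actual data structure, only an illustration of the idea. The rows, the field names (`user_type`, `category`, `tags`) and the helpers `build_index` and `search` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical video rows, as they might look after joining the SQL tables.
videos = [
    {"id": 1, "name": "Intro to MVC", "user_type": "admin",
     "category": "tutorial", "tags": ["mvc", "asp.net"]},
    {"id": 2, "name": "Lucene basics", "user_type": "member",
     "category": "tutorial", "tags": ["lucene", "search"]},
    {"id": 3, "name": "Search in MVC", "user_type": "member",
     "category": "demo", "tags": ["mvc", "search"]},
]


def build_index(rows):
    """Map each (field, value) pair to the set of video ids that carry it."""
    index = defaultdict(set)
    for row in rows:
        index[("user_type", row["user_type"])].add(row["id"])
        index[("category", row["category"])].add(row["id"])
        for tag in row["tags"]:
            index[("tag", tag)].add(row["id"])
    return index


def search(index, **criteria):
    """Return the ids that match every criterion (an AND query)."""
    postings = [index.get((field, value), set())
                for field, value in criteria.items()]
    return set.intersection(*postings) if postings else set()


index = build_index(videos)
print(sorted(search(index, tag="mvc", user_type="member")))  # prints [3]
```

The same data is stored twice, but the second copy is keyed by what you search on rather than by row, so each criterion becomes a lookup and the query is an intersection of id sets.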


It should work fine; it is pretty much how search works here on Stack Overflow (which is using Lucene.NET to perform the search).


It should be noted, however, that a few hundred is not a large sample: frankly, you could do it any way you like and it would take about the same amount of time. Writing a complex SQL query should work, as should full-text search in the database (that is how Stack Overflow's search used to work), as should filtering objects in memory (at the few-hundred level, you could trivially cache all the data, excluding the video frames, in memory).
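
As a rough illustration of the in-memory option, here is a small Python sketch. It assumes the video metadata has already been loaded into a list; the `Video` class, its fields and the `find_videos` helper are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Video:
    id: int
    name: str
    user_type: str
    category: str
    created: date
    tags: frozenset


# Hypothetical cache: metadata only, loaded once from the database.
CACHE = [
    Video(1, "Intro to MVC", "admin", "tutorial",
          date(2011, 1, 10), frozenset({"mvc", "asp.net"})),
    Video(2, "Lucene basics", "member", "tutorial",
          date(2011, 3, 5), frozenset({"lucene", "search"})),
    Video(3, "Search in MVC", "member", "demo",
          date(2011, 6, 20), frozenset({"mvc", "search"})),
]


def find_videos(videos, tags=(), user_type=None, category=None, since=None):
    """Linear scan; every criterion that is supplied must match."""
    wanted = set(tags)
    return [
        v for v in videos
        if wanted <= v.tags  # the video must carry all requested tags
        and (user_type is None or v.user_type == user_type)
        and (category is None or v.category == category)
        and (since is None or v.created >= since)
    ]


hits = find_videos(CACHE, tags=["mvc"], since=date(2011, 2, 1))
print([v.id for v in hits])  # prints [3]
```

With only a few hundred items, a full scan like this is cheap, so the choice between this, SQL and Lucene is more about convenience than speed.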




