Cache data in a DB or in a file?

Date: 2020-11-29 03:51:29

I am writing an application that needs to fetch data from a third-party website. Unfortunately, one specific piece of information (a hotel name) can only be obtained by fetching the webpage with cURL and then parsing it (I'm using XPath) to find an <h1> DOM element.

Since I'm going to run this script many times a day, and will probably have to fetch the same hotel names again and again, I thought a caching mechanism would help: check whether the hotel has already been parsed, and only then decide whether to request the webpage.
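
The check described here is essentially a look-aside cache. A minimal Python sketch, assuming a hypothetical `fetch_hotel_name` callable that does the page request and XPath parse (it is not shown in this post), with a plain dict standing in for whichever store ends up being used:

```python
from typing import Callable, Dict


def get_hotel_name(
    hotel_id: str,
    cache: Dict[str, str],
    fetch_hotel_name: Callable[[str], str],  # hypothetical: request + parse
) -> str:
    """Return the hotel name, requesting the page only on a cache miss."""
    if hotel_id in cache:
        # Seen before: skip the network request entirely.
        return cache[hotel_id]
    # Not seen yet: do the slow fetch once and remember the result.
    name = fetch_hotel_name(hotel_id)
    cache[hotel_id] = name
    return name
```

Whether the lookup is backed by a dict, a file, or a table, the control flow stays the same; only the storage behind `cache` changes.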

However, I have two concerns. First, is it better to implement this in a DB (since there will be an ID-to-hotel-name mapping) or in a file? Second, is this "optimization" worth the trouble? Will I gain a significant speed-up?

1 Answer

#1


2  

Go with a DB, because it gives you more flexibility and functionality for data manipulation (filtering, sorting, etc.) out of the box.
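
As a rough illustration of the DB route, here is a sketch using Python's built-in `sqlite3` module. The table name `hotel_names`, its columns, and the `fetch` callable are assumptions made for the example, not something taken from the question; any other database would follow the same pattern.

```python
import sqlite3
from typing import Callable


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the cache database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hotel_names ("
        " hotel_id TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " fetched_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def cached_hotel_name(
    conn: sqlite3.Connection,
    hotel_id: str,
    fetch: Callable[[str], str],  # hypothetical: request + XPath parse
) -> str:
    """Look the ID up in the table; fetch and store it only if missing."""
    row = conn.execute(
        "SELECT name FROM hotel_names WHERE hotel_id = ?", (hotel_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    name = fetch(hotel_id)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO hotel_names (hotel_id, name) VALUES (?, ?)",
            (hotel_id, name),
        )
    return name
```

With the names in a table, the filtering and sorting mentioned above become one-line queries, for example `SELECT name FROM hotel_names ORDER BY name`, and the `fetched_at` column leaves room for expiring stale entries later.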
