
Time: 2021-07-06 13:09:35

I run IMDbAPI.com and have been using Bing's Search API to find IMDb IDs from title searches. Bing is moving its API over to the Azure Marketplace (August 1st), after which it will no longer be available for free. I started testing my API using Freebase to resolve these IDs and hit their 100k limit in the first 8 hours (my site currently gets about 3 million requests a day, but only 200-300k are title searches).

This is exactly why they offer the data dump files.


I downloaded most of the files in the Film folder but cannot find where they are storing the "/authority/imdb/title" imdb id namespace data.



https://www.googleapis.com/freebase/v1/mqlread?query={"type":"/film/film","name":"True%20Grit","imdb_id":null,"initial_release_date>=":"1969-01","limit":1}

This is how I'm currently accessing the ID.


Does anyone know which file contains this information, and how to link back to it from the film title/ID?


2 Answers



That imdb_id property is backed by a key in the /authority/imdb/title namespace, so you're looking for the line:


/m/015gxt       /type/object/key        /authority/imdb/title   tt0065126

in the file http://download.freebase.com/datadumps/latest/freebase-datadump-quadruples.tsv.bz2


That's a 4 GB file, so be prepared to wait a little while for the download. Note that everything is keyed by MID, so you'll need to figure that out first if you don't have it in your database.
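As a rough illustration of scanning the dump for those key lines, here is a minimal Python sketch. It assumes the file is tab-separated with four columns per line (source, property, destination, value), as the sample line above suggests; the function names `parse_quad`, `imdb_ids_by_mid`, and `open_dump` are made up for this example and are not part of any Freebase tooling.

```python
import bz2

# Namespace that backs the imdb_id property (taken from the answer above).
IMDB_NAMESPACE = "/authority/imdb/title"


def parse_quad(line):
    """Split one dump line into (source, property, destination, value).

    Assumes tab-separated columns; short lines are padded with empty strings.
    """
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (4 - len(fields))
    return tuple(fields[:4])


def imdb_ids_by_mid(lines):
    """Build a {MID: IMDb title ID} map from an iterable of dump lines."""
    ids = {}
    for line in lines:
        source, prop, dest, value = parse_quad(line)
        if prop == "/type/object/key" and dest == IMDB_NAMESPACE:
            # Keep the first key seen if a topic carries more than one.
            ids.setdefault(source, value)
    return ids


def open_dump(path):
    """Open the .bz2 dump as text so it can be streamed line by line."""
    return bz2.open(path, "rt", encoding="utf-8")


if __name__ == "__main__":
    sample = [
        "/m/015gxt\t/type/object/key\t/authority/imdb/title\ttt0065126\n",
        "/m/015gxt\t/type/object/key\t/some/other/namespace\tignored\n",
    ]
    print(imdb_ids_by_mid(sample))
```

For the real file, something like `with open_dump(path) as f: ids = imdb_ids_by_mid(f)` would stream it without decompressing the whole 4 GB to disk first. The resulting map is keyed by MID, so it still has to be joined against whatever MIDs you already hold for your titles.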


The equivalent query using MQL instead of the data dumps is https://www.googleapis.com/freebase/v1/mqlread?query=%7B%22type%22%3a%22/film/film%22,%22name%22%3a%22True%20Grit%22,%22imdb_id%22%3anull,%22initial_release_date%3E=%22%3a%221969-01%22,%22mid%22:null,%22key%22:[{%22namespace%22:%22/authority/imdb/title%22}],%22limit%22:1%7D&indent=1
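Hand-encoding that URL is error-prone, so here is a small Python sketch that builds the same kind of request from a dictionary. It only constructs the URL and does not send anything. The helper name `build_mqlread_url` and its parameters are hypothetical; the endpoint and the query fields are copied from the URL above.

```python
import json
from urllib.parse import urlencode

# Endpoint as it appears in the answer above.
MQLREAD_URL = "https://www.googleapis.com/freebase/v1/mqlread"


def build_mqlread_url(name, released_on_or_after=None):
    """Return an mqlread URL asking for the IMDb ID and MID of a film by name."""
    query = {
        "type": "/film/film",
        "name": name,
        "imdb_id": None,  # None serializes to JSON null, i.e. "fill this in"
        "mid": None,
        "key": [{"namespace": "/authority/imdb/title"}],
        "limit": 1,
    }
    if released_on_or_after:
        # The ">=" suffix on the property name is the MQL comparison operator.
        query["initial_release_date>="] = released_on_or_after
    params = {
        "query": json.dumps(query, separators=(",", ":")),
        "indent": 1,
    }
    return MQLREAD_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_mqlread_url("True Grit", "1969-01"))
```

`urlencode` percent-encodes the JSON (spaces become `+`), so the output will not match the hand-written URL character for character, but it decodes to the same query.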


EDIT: p.s. I'm pretty sure the files in the Browse directory are going away, so I wouldn't depend on them even if you could find the info there.




The previous answer works fine; a snappier version of the same query could be:


query = [{
          'type': '/film/film',
          'name': 'prometheus',
          'imdb_id': null,
          // ... rest of the query as in the previous answer
        }]

The rest of the MQL request isn't shown, as it doesn't differ from the one above. Hope that helps.




