How do I get around the Twitter API caching problem?

I'm building a Twitter app that needs to check user data fairly frequently, but I'm having trouble with a cache that, oddly, is on Twitter's side, not mine.

Try the following user:

users/show in XML: http://twitter.com/users/show.xml?screen_name=technolocus

users/show in JSON: http://twitter.com/users/show.json?screen_name=technolocus

normal page: http://twitter.com/technolocus

All these methods of accessing data should return the same values, right? Check the statuses_count for each of them.

XML: 12548

JSON: 12513

normal: 12498
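
A comparison like the one above could be scripted roughly as follows. This is only a sketch: the helper names (`count_from_json`, `count_from_xml`, `drift_from`) are made up for illustration, the sample payloads are hand-written stand-ins rather than real Twitter responses, and the XML layout (a `statuses_count` element directly under the root) is an assumption.

```python
import json
import xml.etree.ElementTree as ET


def count_from_json(payload):
    """Read statuses_count from a users/show JSON body."""
    return int(json.loads(payload)["statuses_count"])


def count_from_xml(payload):
    """Read statuses_count from a users/show XML body.

    Assumes <statuses_count> is a direct child of the root element.
    """
    text = ET.fromstring(payload).findtext("statuses_count")
    if text is None:
        raise ValueError("no statuses_count element found")
    return int(text)


def drift_from(counts, baseline):
    """Return how far each source's count is from the baseline source."""
    reference = counts[baseline]
    return {name: value - reference for name, value in counts.items()}


if __name__ == "__main__":
    # Hand-written sample bodies carrying the numbers quoted in the question.
    counts = {
        "xml": count_from_xml(
            "<user><statuses_count>12548</statuses_count></user>"
        ),
        "json": count_from_json('{"statuses_count": 12513}'),
        "normal": 12498,  # read off the profile page by hand
    }
    print(drift_from(counts, "normal"))
    # → {'xml': 50, 'json': 15, 'normal': 0}
```

Treating the profile page as the baseline matches the observation that it is the only view that updates immediately.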

The normal method (i.e. just visiting the profile page non-programmatically) serves up the most correct value, 12498. If I post or delete tweets on this account, the profile page updates instantly, but the XML and JSON methods still return cached data.

At this point, the values from the XML and JSON methods are about 12 and 18 hours old, respectively.

I first tried to access these methods from my website (hosted on Dreamhost), and I thought Dreamhost was caching the responses. Then I tried accessing the API directly from my browser, and after that I ran cURL from the command line on my own machine. It wasn't Dreamhost. I then thought it was probably my ISP (I think they use NetApp or something like that), so I asked a friend in another corner of India to try it. He's getting the exact same cached responses as I am.

So it isn't Dreamhost's cache; it isn't my ISP or my country's cache. There's only one conclusion - Twitter is caching responses.

How in the heavens do I get around this?!?

Forgot to mention this: The script on the server is in PHP and is using cURL to retrieve the XML and JSON data from Twitter, while the local tests have been just using the browser. Both have the exact same result!

3 Solutions

#1

First, I think you should report this as a bug to Twitter. I see the same discrepancy as you, and no matter what the cause is, it looks like a bug. Even if they're caching, I'd expect a cache on their side to store an abstract form that would then be rendered into HTML, JSON, and XML. I wonder if what's actually going on is that these requests are performing similar but different queries.

Are you sure that the values are "old"? For example, did you actually delete about 50 updates recently (since you say the HTML one is newest but shows a lower count than the other two)? If you create another update, do you see the HTML number increment while the other numbers stay the same, or do they all increment simultaneously?
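
The check suggested here (post one update, then see which counters move) could be recorded with something like the sketch below. The function name `count_deltas` and the "after" numbers are hypothetical; only the "before" numbers come from the question.

```python
def count_deltas(before, after):
    """Return the change in each source's count between two snapshots."""
    return {name: after[name] - before[name] for name in before}


if __name__ == "__main__":
    before = {"xml": 12548, "json": 12513, "normal": 12498}
    # Hypothetical snapshot taken right after posting one new update.
    after = {"xml": 12548, "json": 12513, "normal": 12499}

    deltas = count_deltas(before, after)
    stale = [name for name, change in deltas.items() if change == 0]
    print(deltas)  # → {'xml': 0, 'json': 0, 'normal': 1}
    print(stale)   # → ['xml', 'json']
```

If only the profile-page count moves, that supports the stale-cache reading; if all three move together while staying apart, the endpoints are more likely running different queries.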

#2

If what you are saying is accurate, and it probably is, then generally you can't get around it. Twitter would want to cache its responses, since they are costly to regenerate on every request.

When you use Twitter's APIs, you end up being bound by its conventions, even if that includes caching.

Your best bet is to tweet to @twitterapi and get them to give you a response as to why the two representations are divergent.

#3

Add ?blah=xxxx to all URLs.

I don't develop anything against Twitter, and I occasionally "follow" three tweets manually by going to them in my browser. They always lag behind by half a day. When I add ?asdsadsadsad to the URL (something different every time), it always updates. I don't know what Twitter is doing here; I came here while searching for the problem. But I guess this trick of appending a random value to the URL via GET will probably work for your API requests, too.
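
A minimal sketch of that trick, assuming the cache is keyed on the full URL including the query string (which nothing here confirms). The function name `add_cache_buster` and the parameter name `nocache` are arbitrary choices for illustration, not anything Twitter defines.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="nocache"):
    """Return url with an extra query parameter holding a fresh random value.

    Existing query parameters (e.g. screen_name) are kept in order.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))  # different on every call
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    base = "http://twitter.com/users/show.json?screen_name=technolocus"
    # Prints the base URL followed by &nocache=<32 random hex characters>.
    print(add_cache_buster(base))
```

The resulting URL can then be passed to whatever HTTP client the script already uses (cURL in the PHP script from the question). Whether the server ignores or rejects the unknown parameter is something to verify against the live API.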
