How do I protect my AJAX service?

Time: 2021-07-04 07:26:16

Right now I'm working on a service that handles reviews/recommendations of local restaurants overlaid on Google Maps. Basically Yelp, but restricted to a certain niche. Anyhow, since I don't want to have to load every location and review at once, I'm finally getting into using jQuery and AJAX calls.

The question I have is: How do I prevent other people from 'scraping' data from my ajax scripts on the server?

The main map/location info functionality needs to be public, in that users should not have to log in to use the application, so it may simply boil down to making it difficult to scrape. I'm hoping that one of you AJAX veterans out there can point me in the direction of a better idea, or some 'best practices' docs that I haven't been able to find yet.

So far all I've been able to come up with is:

  • The user-facing scripts open a short-lived session on the server, and the AJAX calls will not run without an active session.

  • Send some sort of access key along with the application code and require it in all of the AJAX calls. I'm not sure how best to implement this in a way that isn't trivially easy to get around, though.
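
One way the two ideas above could be combined is a signed, short-lived token that the server embeds in the page and every AJAX call must send back. The following is only a minimal sketch for Node.js: the names `SECRET`, `TOKEN_LIFETIME_MS`, `issueToken` and `verifyToken` are made up for illustration, the 10-minute lifetime is an arbitrary assumption, and session IDs are assumed not to contain a dot.

```javascript
const crypto = require("node:crypto");

// Hypothetical server-side secret; in practice load it from configuration.
const SECRET = "replace-with-a-long-random-secret";
// Assumed token lifetime: 10 minutes.
const TOKEN_LIFETIME_MS = 10 * 60 * 1000;

// HMAC-SHA256 of the payload, hex encoded.
function sign(payload) {
  return crypto.createHmac("sha256", SECRET).update(payload).digest("hex");
}

// Called when the page is rendered; the token is embedded in the page.
function issueToken(sessionId, now = Date.now()) {
  const expires = now + TOKEN_LIFETIME_MS;
  const payload = `${sessionId}.${expires}`;
  return `${payload}.${sign(payload)}`;
}

// Called by every AJAX endpoint before it returns any data.
function verifyToken(token, sessionId, now = Date.now()) {
  const parts = String(token).split(".");
  if (parts.length !== 3) return false;
  const [tokenSession, expires, signature] = parts;
  const given = Buffer.from(signature);
  const expected = Buffer.from(sign(`${tokenSession}.${expires}`));
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (given.length !== expected.length) return false;
  if (!crypto.timingSafeEqual(given, expected)) return false;
  return tokenSession === sessionId && Number(expires) > now;
}
```

A scraper can still load the page and copy the token, so this does not make the data private. What it does is tie every call to a session and a time window, which gives the server something to count and throttle.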

2 solutions

#1


5  

You can't completely protect your AJAX web services. Even if you mangle your data and obfuscate your source code, it is trivial to just fire up a packet sniffer or debugging proxy, figure it out, and scrape from it.

What I would do is exactly what you propose... only users with an active session on the site can make calls. Then from there, throttle requests.

Even a busy normal user won't make more than a handful of requests per minute. You can analyze your logs to figure out what a good number would be. Even if you limited your service to 20 calls per minute, that kind of limitation makes it fairly useless for folks that want to duplicate all of your content.
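
As a rough illustration of that kind of throttle, here is a minimal in-memory sliding-window limiter. It is a sketch, not a production design: `MAX_CALLS`, `WINDOW_MS`, `recentCalls` and `allowRequest` are hypothetical names, the 20-per-minute figure is just the example number from above, and a real deployment would need shared storage and cleanup of idle keys.

```javascript
// Assumed limits, taken from the example figure above; tune them from your logs.
const MAX_CALLS = 20;
const WINDOW_MS = 60 * 1000;

// key (session ID, IP address, ...) -> timestamps of that key's recent calls
const recentCalls = new Map();

// Returns true if the call is allowed, false if the caller is over the limit.
function allowRequest(key, now = Date.now()) {
  const cutoff = now - WINDOW_MS;
  // Keep only the calls that are still inside the window.
  const calls = (recentCalls.get(key) || []).filter((t) => t > cutoff);
  const allowed = calls.length < MAX_CALLS;
  if (allowed) calls.push(now);
  recentCalls.set(key, calls);
  return allowed;
}
```

At 20 calls per minute a single client tops out at 28,800 calls per day, and a caller that exceeds the limit can simply be refused, for example with an HTTP 429 response.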

Don't limit just on session data either... keep an eye on IP addresses. It's entirely possible to fire off a request and get a new session at any time. Periodically check your logs to see if anything is getting through, and adjust your strategy accordingly.
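
The periodic log check could look something like the sketch below, which finds IP addresses whose busiest minute exceeds a threshold regardless of how many sessions they used. The entry shape (`{ ip, time }` with `time` in milliseconds) and the function names are assumptions for illustration; a real access log would need to be parsed into that shape first.

```javascript
// Each entry is assumed to look like { ip: "203.0.113.7", time: 1625383576000 }.
// Returns a Map of ip -> highest number of requests seen in any single minute.
function busiestMinutePerIp(entries) {
  const perMinute = new Map(); // "ip|minute" -> request count
  for (const { ip, time } of entries) {
    const key = `${ip}|${Math.floor(time / 60000)}`;
    perMinute.set(key, (perMinute.get(key) || 0) + 1);
  }
  const peak = new Map();
  for (const [key, count] of perMinute) {
    const ip = key.slice(0, key.lastIndexOf("|"));
    peak.set(ip, Math.max(peak.get(ip) || 0, count));
  }
  return peak;
}

// IPs whose busiest minute had more requests than the threshold, sorted.
function suspiciousIps(entries, threshold) {
  return [...busiestMinutePerIp(entries)]
    .filter(([, count]) => count > threshold)
    .map(([ip]) => ip)
    .sort();
}
```

Keep in mind that many legitimate users can share one IP address (offices, mobile carriers), so a list like this is a starting point for review rather than an automatic ban list.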

Finally, regularly search for your content. Google is a great tool for finding copyright infringers. If you use specific data, such as GPS coordinates, you can actually watermark the coordinates: put a value that only you know into the low-order decimal digits of each coordinate, the 'noise' that sits below any precision that matters on a map.
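
To make the watermarking idea concrete, here is one possible sketch. It keeps five decimal places of each coordinate (roughly a metre of latitude) and writes a two-digit tag, derived from a private key and the record ID, into decimal places six and seven. The scheme, the key and the names `tagFor`, `watermark` and `hasWatermark` are all assumptions made for this example, not an established technique from the answer.

```javascript
const crypto = require("node:crypto");

// Hypothetical private key; keep it out of anything sent to the browser.
const SECRET = "replace-with-a-private-key";

// Two-digit tag ("00" to "99") derived from the key and the record ID.
function tagFor(recordId) {
  const digest = crypto.createHmac("sha256", SECRET).update(String(recordId)).digest();
  return String(digest.readUInt16BE(0) % 100).padStart(2, "0");
}

// Rounds to 5 decimal places and appends the tag as decimal places 6 and 7.
// Returns a string so the digits are not disturbed by floating-point formatting.
function watermark(coordinate, recordId) {
  const units = Math.round(Math.abs(coordinate) * 1e5); // whole 0.00001-degree units
  const sign = coordinate < 0 ? "-" : "";
  const whole = Math.floor(units / 1e5);
  const fraction = String(units % 1e5).padStart(5, "0");
  return `${sign}${whole}.${fraction}${tagFor(recordId)}`;
}

// Checks whether a published coordinate carries the tag for this record.
function hasWatermark(coordinateText, recordId) {
  const fraction = String(coordinateText).split(".")[1] || "";
  return fraction.length >= 7 && fraction.slice(5, 7) === tagFor(recordId);
}
```

A single match proves little, since any coordinate has a 1 in 100 chance of carrying the right two digits by accident, but the same pattern repeated across many copied records is hard to explain away. A copier who rounds the coordinates will remove the mark.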

#2


1  

From what I hear, you want to protect the JavaScript side of the service. That is not possible: JavaScript is delivered to the browser as source code, so it is effectively open for anyone to read (even though it is not public domain).

Google offers a tool called Google Closure (the Closure Compiler) which can compact the script by removing whitespace and comments. It can also obfuscate the code to a degree by replacing variable and function names with short, meaningless ones. It is customizable, so you can choose how aggressively it rewrites your code. From what I can tell, Google uses it for their own sites (evident from viewing the source of their pages).
