Ruby on Rails: how to determine whether a request was made by a robot or a search engine spider?

Date: 2022-10-12 23:17:07

I have a Rails app that records the IP address of every request to a specific URL. In my IP database I have found Facebook block IPs such as 66.220.15.* and Google IPs (I assume they come from bots). Is there a way to determine whether a request was made by a robot or a search engine spider? Thanks.

3 answers

#1


13  

Robots are expected (by common sense and courtesy rather than by any kind of law) to send a User-Agent header with their requests. You can read it with request.env["HTTP_USER_AGENT"] and filter as you please.
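
As a minimal sketch of that kind of filtering, the helper below checks a User-Agent string against a short list of substrings. The method name `bot_user_agent?` and the `BOT_MARKERS` list are illustrative assumptions, not part of Rails, and treating a missing User-Agent as a bot is a choice made here rather than a rule.

```ruby
# Hypothetical list of substrings that commonly appear in crawler User-Agents.
# It is illustrative only and far from complete.
BOT_MARKERS = %w[bot crawler spider facebookexternalhit].freeze

# Returns true when the User-Agent looks like a crawler.
# A missing or empty User-Agent is treated as a bot here (an assumption).
def bot_user_agent?(user_agent)
  ua = user_agent.to_s.downcase
  return true if ua.empty?

  BOT_MARKERS.any? { |marker| ua.include?(marker) }
end
```

In a controller this could be called as `bot_user_agent?(request.env["HTTP_USER_AGENT"])`. Since the header is supplied by the client, it can be spoofed, so the result is best treated as a hint.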

#2


27  

Since well-behaved bots typically include a reference URI in the UA string they send, something like:

request.env["HTTP_USER_AGENT"].to_s.match(/\(.*https?:\/\/.*\)/)

is an easy way to see if the request is from a bot vs. a human user's agent. This seems to be more robust than trying to match against a comprehensive list.
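
The same check can be wrapped in a small helper and tried against sample strings. This is only a sketch: the names `URL_IN_PARENS` and `ua_with_reference_url?` are hypothetical, and the sample User-Agent strings in the usage note are examples rather than an exhaustive set.

```ruby
# Matches a parenthesised group that contains an http(s) URL, as in
# "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)".
URL_IN_PARENS = %r{\(.*https?://.*\)}.freeze

# Hypothetical helper: true when the User-Agent carries a reference URL.
# to_s guards against a missing header (nil).
def ua_with_reference_url?(user_agent)
  user_agent.to_s.match?(URL_IN_PARENS)
end
```

With this helper, a string such as "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" matches, while a typical desktop browser string with no URL in it does not. Bots that omit a reference URL will slip through, so this remains a heuristic.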

#3


11  

I think you can use the browser gem to check for bots.

if browser.bot?
  # code here
end

https://github.com/fnando/browser
