How can I throttle my site's API users?

Date: 2021-09-22 07:38:07

The legitimate users of my site occasionally hammer the server with API requests that cause undesirable results. I want to institute a limit of no more than say one API call every 5 seconds or n calls per minute (haven't figured out the exact limit yet). I could obviously log every API call in a DB and do the calculation on every request to see if they're over the limit, but all this extra overhead on EVERY request would be defeating the purpose. What are other less resource-intensive methods I could use to institute a limit? I'm using PHP/Apache/Linux, for what it's worth.

6 solutions

#1


49  

Ok, there's no way to do what I asked without any writes to the server, but I can at least eliminate logging every single request. One way is by using the "leaky bucket" throttling method, where it only keeps track of the last request ($last_api_request) and a ratio of the number of requests/limit for the time frame ($minute_throttle). The leaky bucket never resets its counter (unlike the Twitter API's throttle which resets every hour), but if the bucket becomes full (user reached the limit), they must wait n seconds for the bucket to empty a little before they can make another request. In other words it's like a rolling limit: if there are previous requests within the time frame, they are slowly leaking out of the bucket; it only restricts you if you fill the bucket.

This code snippet will calculate a new $minute_throttle value on every request. I specified the minute in $minute_throttle because you can add throttles for any time period, such as hourly, daily, etc... although more than one will quickly start to make it confusing for the users.

$minute = 60;
$minute_limit = 100; # users are limited to 100 requests/minute
$last_api_request = $this->get_last_api_request(); # get from the DB; in epoch seconds
$last_api_diff = time() - $last_api_request; # in seconds
$minute_throttle = $this->get_throttle_minute(); # get from the DB
if ( is_null( $minute_limit ) ) {
    $new_minute_throttle = 0;
} else {
    $new_minute_throttle = $minute_throttle - $last_api_diff;
    $new_minute_throttle = $new_minute_throttle < 0 ? 0 : $new_minute_throttle;
    $new_minute_throttle += $minute / $minute_limit;
    $minute_hits_remaining = floor( ( $minute - $new_minute_throttle ) * $minute_limit / $minute  );
    # can output this value with the request if desired:
    $minute_hits_remaining = $minute_hits_remaining >= 0 ? $minute_hits_remaining : 0;
}

if ( $new_minute_throttle > $minute ) {
    $wait = ceil( $new_minute_throttle - $minute );
    usleep( 250000 );
    throw new My_Exception ( 'The one-minute API limit of ' . $minute_limit 
        . ' requests has been exceeded. Please wait ' . $wait . ' seconds before attempting again.' );
}
# Save the values back to the database.
$this->save_last_api_request( time() );
$this->save_throttle_minute( $new_minute_throttle );
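
To make the arithmetic above easier to try out, here is a minimal sketch of the same leaky-bucket step as a pure function, with no database involved. The name leaky_bucket_step() and the returned array keys are made up for this illustration. The demo uses 120 requests per minute instead of 100 only because each request then adds exactly 0.5 seconds to the bucket, which keeps the floating-point sums exact.

```php
<?php
// Hypothetical helper (not part of the answer's class): one leaky-bucket step.
// $throttle is the stored bucket level in seconds, $elapsed is the number of
// seconds since the last accepted request.
function leaky_bucket_step(float $throttle, int $elapsed, int $period, int $limit): array
{
    // Let the bucket leak for the elapsed time, but never below empty.
    $level = max(0.0, $throttle - $elapsed);
    // Every request pours in its share of the period (0.5 s at 120 requests/minute).
    $level += $period / $limit;
    $allowed = $level <= $period;

    return [
        'allowed'   => $allowed,
        'throttle'  => $level,
        'wait'      => $allowed ? 0 : (int) ceil($level - $period),
        'remaining' => max(0, (int) floor(($period - $level) * $limit / $period)),
    ];
}

// Demo: 150 requests arriving within the same second against a 120/minute limit.
$throttle = 0.0;
$last     = 0;
$now      = 1000;
$accepted = 0;
for ($i = 0; $i < 150; $i++) {
    $step = leaky_bucket_step($throttle, $now - $last, 60, 120);
    if ($step['allowed']) {
        // As in the snippet above, state is only saved for accepted requests.
        $throttle = $step['throttle'];
        $last     = $now;
        $accepted++;
    }
}
echo "accepted: $accepted\n";
```

With these numbers the first 120 requests fill the bucket and the rest are rejected until some of it has leaked out again. A second throttle for another time period would just be a second call with its own stored level, $period and $limit.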

#2


7  

You can control the rate with the token bucket algorithm, which is comparable to the leaky bucket algorithm. Note that you will have to share the state of the bucket (i.e. the number of tokens) across processes (or whatever scope you want to control), so you might want to think about locking to avoid race conditions.

The good news is that I did all of that for you in bandwidth-throttle/token-bucket:

use bandwidthThrottle\tokenBucket\Rate;
use bandwidthThrottle\tokenBucket\TokenBucket;
use bandwidthThrottle\tokenBucket\storage\FileStorage;

$storage = new FileStorage(__DIR__ . "/api.bucket");
$rate    = new Rate(10, Rate::SECOND);
$bucket  = new TokenBucket(10, $rate, $storage);
$bucket->bootstrap(10);

if (!$bucket->consume(1, $seconds)) {
    http_response_code(429);
    header(sprintf("Retry-After: %d", floor($seconds)));
    exit();
}
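
For anyone curious what a token bucket does under the hood, here is a rough single-process sketch. It is not the library's implementation: SimpleTokenBucket and its methods are hypothetical names, the clock is passed in as a parameter so the behaviour is easy to follow, and the shared storage and locking mentioned above are left out entirely.

```php
<?php
// Hypothetical in-memory token bucket, for illustration only.
final class SimpleTokenBucket
{
    private int $capacity;      // maximum burst size
    private float $ratePerSec;  // tokens added per second
    private float $tokens;      // tokens currently available
    private float $updatedAt;   // time of the last refill, in seconds

    public function __construct(int $capacity, float $ratePerSec, float $now)
    {
        $this->capacity   = $capacity;
        $this->ratePerSec = $ratePerSec;
        $this->tokens     = (float) $capacity; // start with a full bucket
        $this->updatedAt  = $now;
    }

    // Returns 0.0 if $n tokens were consumed, otherwise the seconds to wait.
    public function consume(int $n, float $now): float
    {
        // Refill for the time that has passed, capped at the capacity.
        $elapsed = max(0.0, $now - $this->updatedAt);
        $this->tokens = (float) min($this->capacity, $this->tokens + $elapsed * $this->ratePerSec);
        $this->updatedAt = $now;

        if ($this->tokens >= $n) {
            $this->tokens -= $n;
            return 0.0;
        }
        // Not enough tokens: report how long until the shortfall is refilled.
        return ($n - $this->tokens) / $this->ratePerSec;
    }
}
```

The difference from the leaky bucket is mostly one of bookkeeping: here the counter is what the client may still spend, it refills at a steady rate, and the capacity decides how large a burst is tolerated.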

#3


4  

I don't know if this thread is still alive or not, but I would suggest keeping these statistics in an in-memory cache like memcached. This avoids the overhead of logging each request to the DB but still serves the purpose.
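
A minimal sketch of what that could look like, assuming a simple fixed-window counter (one counter per API key per time window). The function within_rate_limit() and the ArrayCache stand-in are hypothetical names; the only things taken from the real Memcached extension are the add() and increment() calls, and ArrayCache just mimics those two so the function can be tried without a running server.

```php
<?php
// Returns true while $apiKey has made at most $limit requests in the current
// window of $window seconds. $cache is expected to behave like PHP's
// Memcached class, i.e. to offer add() and increment().
function within_rate_limit($cache, string $apiKey, int $limit, int $window, int $now): bool
{
    // One counter per key per window; the index changes every $window seconds.
    $cacheKey = 'throttle:' . $apiKey . ':' . intdiv($now, $window);

    // add() only creates the counter if it is missing, so concurrent requests
    // cannot reset each other. The entry expires together with the window.
    $cache->add($cacheKey, 0, $window);
    $count = $cache->increment($cacheKey);

    // Assumption: if the cache is unavailable, let the request through
    // instead of blocking all traffic.
    if ($count === false) {
        return true;
    }
    return $count <= $limit;
}

// Stand-in with the same two methods, for experimenting without memcached.
final class ArrayCache
{
    private array $data = [];

    public function add(string $key, $value, int $expiration = 0): bool
    {
        if (array_key_exists($key, $this->data)) {
            return false;
        }
        $this->data[$key] = $value;
        return true;
    }

    public function increment(string $key, int $offset = 1)
    {
        if (!array_key_exists($key, $this->data)) {
            return false;
        }
        return $this->data[$key] += $offset;
    }
}
```

In production you would pass a connected Memcached instance instead of ArrayCache. A fixed window is cruder than the bucket approaches (a client can burst at the boundary between two windows), but it needs only one atomic increment per request.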

#4


3  

The simplest solution would be to give each API key a limited number of requests per 24 hours, and reset the counters at some known, fixed time.

If they exhaust their API requests (i.e. the counter reaches zero, or the limit, depending on the direction you're counting), stop serving them data until you reset their counter.

This way, it will be in their best interest to not hammer you with requests.
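
As a rough sketch of that idea, assuming the fixed reset time is midnight UTC and the counters live in a plain array (in practice they would be in your DB or cache): consume_daily_quota() and seconds_until_reset() are names invented for this example.

```php
<?php
// Hypothetical helper: $counters maps "apiKey|YYYY-MM-DD" to the number of
// requests already used on that UTC day. Counts upwards towards the limit.
function consume_daily_quota(array &$counters, string $apiKey, int $dailyLimit, int $now): bool
{
    // gmdate() yields the UTC calendar day, so every key resets at 00:00 UTC.
    $slot = $apiKey . '|' . gmdate('Y-m-d', $now);
    $used = $counters[$slot] ?? 0;

    if ($used >= $dailyLimit) {
        return false; // quota exhausted until the next reset
    }
    $counters[$slot] = $used + 1;
    return true;
}

// Seconds until the next reset at midnight UTC, e.g. for a Retry-After header.
function seconds_until_reset(int $now): int
{
    return 86400 - ($now % 86400);
}
```

Because the day is part of the counter key, nothing has to run at midnight to reset the counters; old entries simply stop being used and can be cleaned up whenever convenient.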

#5


1  

You say that "all this extra overhead on EVERY request would be defeating the purpose", but I'm not sure that's correct. Isn't the purpose to prevent hammering of your server? Logging each call is probably the way I would implement it, as it really only requires a quick read/write. You could even farm out the API server checks to a different DB/disk if you were worried about the performance.

However, if you want alternatives, you should check out mod_cband, a third-party Apache module designed to assist in bandwidth throttling. Despite being primarily for bandwidth limiting, it can throttle based on requests per second as well. I've never used it, so I'm not sure what kind of results you'd get. There was another module called mod-throttle as well, but that project appears to be closed now, and was never released for anything above the Apache 1.3 series.

#6


1  

In addition to implementing it from scratch, you can also take a look at API infrastructure like 3scale (http://www.3scale.net), which does rate limiting as well as a bunch of other stuff (analytics etc.). There's a PHP plugin for it: https://github.com/3scale/3scale_ws_api_for_php.

You can also put something like Varnish in front of the API and do the rate limiting there.
