https://dotat.at/@/2026-01-12-hqlr.html
A while back I wrote about the linear rate limit algorithms leaky bucket and GCRA. Since then I have been vexed by how common it is to implement rate limiting using complicated and wasteful algorithms (for example).
But linear (and exponential) rate limiters have a disadvantage: they can be slow to throttle clients whose request rate is above the limit but not super fast. And I just realised that this disadvantage can be unacceptable in some situations, when it's imperative that no more than some quota of requests is accepted within a window of time.
In this article I'll explore a way to enforce rate limit quotas more precisely, without undue storage costs, and without encouraging clients to oscillate between bursts and pauses. However I'm not sure it's a good idea.
( Read more... )