What Are The API Limits?

What is API rate limiting?

The basic principle of API rate limiting is fairly simple: if access to the API is unlimited, anyone (or anything) can use the API as much as they want at any time, potentially preventing other legitimate users from accessing it.

API rate limiting is, in a nutshell, restricting how often people (and bots) can access the API, based on the rules/policies set by the API’s operator or owner.

We can think of rate limiting as a form of both security and quality control. This is why rate limiting is integral for any API product’s growth and scalability. Many API owners would welcome growth, but high spikes in the number of users can cause a massive slowdown in the API’s performance. Rate limiting can ensure the API is properly prepared to handle this sort of spike.

An API’s processing limits are typically measured in a metric called Transactions Per Second (TPS), and API rate limiting essentially enforces a cap on TPS or on the quantity of data users can consume. That is, we limit either the number of transactions or the amount of data in each transaction.
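To make this concrete, here is a minimal sketch of one common way to enforce a TPS cap, a fixed-window counter kept in memory. The class name and limit are illustrative assumptions, not a standard library API; real deployments usually use a shared store (such as Redis) and a sliding-window or token-bucket algorithm instead.

```python
import time

# Minimal in-memory fixed-window limiter: allow at most `max_tps`
# transactions per one-second window. Illustrative sketch only.
class FixedWindowLimiter:
    def __init__(self, max_tps: int):
        self.max_tps = max_tps
        self.window_start = int(time.time())
        self.count = 0

    def allow(self) -> bool:
        now = int(time.time())
        if now != self.window_start:   # a new one-second window has started
            self.window_start = now
            self.count = 0
        if self.count < self.max_tps:
            self.count += 1
            return True
        return False                   # over the limit: reject or queue the request

limiter = FixedWindowLimiter(max_tps=100)
if limiter.allow():
    ...  # handle the request
else:
    ...  # reject it, e.g. with HTTP 429 Too Many Requests
```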

Why is API rate limiting necessary?

API rate limiting can be used as a defensive security measure for the API, as well as a quality-control method. As a shared service, the API must protect itself from excessive use in order to ensure an optimal experience for everyone who uses it.

Rate limiting on both the server side and the client side is extremely important for maximizing reliability and minimizing latency, and the larger the system or API, the more crucial rate limiting becomes.

Here are some key benefits of implementing API rate limiting:

Protecting Resource Usage

All APIs operate on finite resources, so rate limiting is essential for keeping the API service available to as many users as possible by preventing excessive resource usage. While resource starvation can be caused by attackers via DDoS attacks, many DoS incidents are actually caused by errors in software rather than outside attacks.

This is often called friendly-fire denial of service (DoS), and implementing rate limiting is crucial to avoid this issue.

Controlling Data Flow

This is especially important for APIs that process and transmit large volumes of data. Rate limiting can be used to control data flow, for example when many data streams are merged into a single service.

For example, we can distribute data more evenly between elements of the API by limiting the flow into each element. This prevents a single data processor from handling too many items while other processors sit idle, which is especially useful in complex APIs that involve multiple data streams.
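One simple way to picture this kind of flow control is a bounded queue sitting between a fast producer and slower workers. The sketch below is an illustrative assumption about how such a valve might look, not part of any specific API framework.

```python
import queue
import threading

# A bounded queue acts as a flow-control valve: the producer cannot push
# more than `maxsize` unprocessed items into the system at once.
work_queue: "queue.Queue[str]" = queue.Queue(maxsize=50)

def worker(name: str) -> None:
    while True:
        item = work_queue.get()   # blocks until an item is available
        # ... process the item ...
        work_queue.task_done()

# Two workers share the load; neither can be flooded beyond the queue's cap.
for n in ("processor-1", "processor-2"):
    threading.Thread(target=worker, args=(n,), daemon=True).start()

def ingest(item: str) -> None:
    # Blocks when the queue is full, throttling the upstream producer.
    work_queue.put(item)
```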

Maximizing Cost-Efficiency

Rate limiting can also be used to control cost by preventing excessive resource consumption. Every resource consumed generates a cost, so the more requests an API receives, the more cost it accumulates. Rate limiting can therefore be extremely important for ensuring the profitability of the API.

Controlling Quotas Between Users

When the capacity of an API’s service is shared among many users, rate limiting can (and should) be applied to each individual user’s usage to ensure fair use without disrupting other users’ access. We can do this by applying the rate limit over a certain time period (e.g. per day) or by capping the quantity of the resource where possible. These allocation limits are often referred to as quotas.
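A per-user daily quota can be tracked with a simple counter keyed by user and date. The sketch below is a hedged, in-memory illustration; the quota value and function names are assumptions, and a real deployment would keep these counts in a shared datastore so every API server sees the same totals.

```python
import datetime
from collections import defaultdict

DAILY_QUOTA = 10_000  # hypothetical per-user allowance of requests per day

# Usage counts keyed by (user, date); illustrative in-memory storage only.
usage: dict = defaultdict(int)

def consume_quota(user_id: str) -> bool:
    key = (user_id, datetime.date.today())
    if usage[key] >= DAILY_QUOTA:
        return False      # quota exhausted for today: reject the request
    usage[key] += 1
    return True           # request counted against today's quota
```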

How does API rate limiting work?

An API is a way to request specific functionality from a program. While APIs are invisible to most users, they are essential for the application to perform optimally.

For example, when we order a ride on a rideshare service, an API is invoked so that we, as users, get an accurate fare for the trip. We don’t interact directly with this API; through the rideshare app’s interface we are making a request to it, probably without realizing it.

Every time an API responds to a request, the owner of the API has to pay for resources. In the example above, the rideshare app’s API integration means the fare-calculation service pays for compute time whenever an app user requests a ride.

Thus, any service that offers an API to developers will implement a rate limit on how many API calls can be made. The limiting can be performed in various ways, such as limiting the number of API calls per hour, per day, or per unique user, or limiting the amount of data returned per call, among others.
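From the caller’s side, hitting such a limit is commonly signaled with an HTTP 429 Too Many Requests response. The sketch below, assuming the third-party requests library, shows one way a client might back off and retry; the exact status code, headers, and wait times depend on the specific API.

```python
import time
import requests  # third-party HTTP client, used here purely for illustration

def call_api_with_backoff(url: str, max_retries: int = 3):
    for attempt in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:   # not rate limited: return the result
            return response
        # Many APIs include a Retry-After header indicating how long to wait;
        # fall back to simple exponential backoff if it is absent.
        wait = int(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("Rate limit still exceeded after retries")
```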

API rate limiting can also help protect the API from malicious bot attacks and DDoS attacks. Bots can make repeated requests to an API to block its service from legitimate users, slow down its performance, or completely shut the API down for a time as a form of DDoS attack.