API Request Rate Limiting Guide

What is request limiting?

As an API user, you can only send a limited number of requests within a given period. If you exceed this limit:

  • Your request is rejected with the HTTP 429 Too Many Requests code
  • You receive information about how long to wait before trying again

The limits are as follows:

  • Maximum of 3 points/second
  • Temporary “burst” allowance of up to 10 points

What is a “burst”?

  • A burst is a temporary tolerance that lets you send several requests in rapid succession, above the average authorized rate of 3 points/second.
  • You can send up to 10 requests in quick succession without being blocked immediately.
  • The system then automatically brings the flow back down to the average of 3 points/second.
  • If the high rate continues, the limiting mechanism activates and returns an HTTP 429 error (or a SOAP fault for the SOAP API).
  • 1 point normally equals 1 request, but the cost may differ for certain requests depending on the resources they consume.

This flexibility, inspired by the Leaky Bucket algorithm, absorbs activity spikes without abruptly interrupting the service. For more details on the algorithm, see (in French): https://fr.wikipedia.org/wiki/Seau_perc%C3%A9
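
The burst behaviour described above can be illustrated with a small simulation. This is only a sketch of a generic leaky bucket used as a meter, assuming a drain rate of 3 points/second and a capacity of 10 points; the class name LeakyBucket and its parameters are hypothetical, and the server's actual implementation may differ.

```python
class LeakyBucket:
    """Leaky bucket used as a meter: requests add points, the level drains at a fixed rate."""

    def __init__(self, rate_per_second=3.0, capacity=10.0):
        self.rate = rate_per_second   # points drained per second (assumed: 3)
        self.capacity = capacity      # burst allowance in points (assumed: 10)
        self.level = 0.0              # points currently in the bucket
        self.last = 0.0               # timestamp of the previous check, in seconds

    def allow(self, now, cost=1.0):
        """Return True if a request costing `cost` points is accepted at time `now`."""
        # Drain the bucket for the time elapsed since the previous check.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.rate)
        self.last = now
        if self.level + cost > self.capacity:
            return False              # would overflow: the server would answer 429
        self.level += cost
        return True


bucket = LeakyBucket()
# Twelve requests at the same instant: the first 10 fit in the burst allowance.
burst = [bucket.allow(now=0.0) for _ in range(12)]
# One second later, 3 points have drained, so 3 more requests are accepted.
after_one_second = [bucket.allow(now=1.0) for _ in range(4)]
print(burst.count(True), after_one_second.count(True))  # prints "10 3"
```

In this model a client that keeps sending faster than 3 points/second is rejected as soon as the 10-point allowance is used up, which matches the behaviour described in the list above.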

Behaviours you will observe:

Authorized requests (below the limit):

HTTP/1.1 200 OK 
X-Rate-Limit-Remaining: 42 
Content-Type: application/json 
... 

The X-Rate-Limit-Remaining header indicates how many requests you can still make within the current time window.

Limit exceeded (HTTP 429):

HTTP/1.1 429 Too Many Requests 
X-Rate-Limit-Retry-After-Milliseconds: 5000 
Content-Type: application/json 
 
{"error": "Too many requests"} 
  

 

Response headers to watch

  • X-Rate-Limit-Remaining: number of requests remaining before rate limiting

  • X-Rate-Limit-Retry-After-Milliseconds: milliseconds to wait before retrying (in case of a 429)
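
As an illustration of reading these two headers, the sketch below extracts them from a status code and a plain dictionary of headers. The function name read_rate_limit_headers is hypothetical, and the 1000 ms fallback is an assumption borrowed from the client examples in this guide, not a documented server default.

```python
def read_rate_limit_headers(status_code, headers, default_wait_ms=1000):
    """Return (remaining, wait_ms); either value is None when it does not apply."""
    # HTTP header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}

    remaining = None
    raw_remaining = lowered.get("x-rate-limit-remaining")
    if raw_remaining is not None:
        try:
            remaining = int(raw_remaining)
        except ValueError:
            remaining = None  # unparseable value: treat as unknown

    wait_ms = None
    if status_code == 429:
        try:
            wait_ms = int(lowered.get("x-rate-limit-retry-after-milliseconds"))
        except (TypeError, ValueError):
            wait_ms = default_wait_ms  # header missing or malformed
    return remaining, wait_ms


print(read_rate_limit_headers(200, {"X-Rate-Limit-Remaining": "42"}))
# prints (42, None)
print(read_rate_limit_headers(429, {"X-Rate-Limit-Retry-After-Milliseconds": "5000"}))
# prints (None, 5000)
```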

Recommended client-side handling

JavaScript (fetch):

async function appellerAPI(url) {
  const response = await fetch(url);

  if (response.status === 429) {
    // Fall back to 1000 ms if the header is missing
    const attenteMs = parseInt(response.headers.get('X-Rate-Limit-Retry-After-Milliseconds') || '1000', 10);
    console.warn(`Limit reached. Waiting ${attenteMs} ms before retrying.`);
    await new Promise(resolve => setTimeout(resolve, attenteMs));
    return null; // The caller decides whether to retry
  }

  const restant = response.headers.get('X-Rate-Limit-Remaining');
  console.log(`Requests remaining: ${restant}`);

  return await response.json();
}
  

 

Python (requests):

import time
import requests

def appeler_api(url):
    response = requests.get(url)

    if response.status_code == 429:
        # Fall back to 1000 ms if the header is missing
        attente_ms = int(response.headers.get('X-Rate-Limit-Retry-After-Milliseconds', '1000'))
        print(f'Limit reached. Waiting {attente_ms} ms')
        time.sleep(attente_ms / 1000.0)
        return None  # The caller decides whether to retry

    restant = response.headers.get('X-Rate-Limit-Remaining')
    print(f'Requests remaining: {restant}')

    return response.json()
  

 

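C# (HttpClient):

using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Returns null after a 429; the caller decides whether to retry.
async Task<string> AppelerAPI(string url)
{
    // For brevity only: in production code, reuse a single HttpClient instance.
    using var client = new HttpClient();
    var response = await client.GetAsync(url);

    if ((int)response.StatusCode == 429)
    {
        // TryGetValues avoids the exception that GetValues throws when the header is missing
        var attenteMsStr = response.Headers.TryGetValues("X-Rate-Limit-Retry-After-Milliseconds", out var valeurs)
            ? valeurs.FirstOrDefault()
            : null;
        if (!int.TryParse(attenteMsStr, out var attenteMs))
        {
            attenteMs = 1000; // Fall back to 1000 ms if the header is missing or malformed
        }
        Console.WriteLine($"Limit reached. Waiting {attenteMs} ms");
        await Task.Delay(attenteMs);
        return null;
    }

    var restant = response.Headers.TryGetValues("X-Rate-Limit-Remaining", out var restants)
        ? restants.FirstOrDefault()
        : null;
    Console.WriteLine($"Requests remaining: {restant}");

    return await response.Content.ReadAsStringAsync();
}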

  

Recommended retry strategy

  • First 429: wait for the delay given in X-Rate-Limit-Retry-After-Milliseconds before retrying

  • Repeated 429s: add a gradually increasing delay (exponential backoff)

  • Monitor X-Rate-Limit-Remaining: if it’s low (≤ 3), space out your requests
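
One possible way to combine the first two rules is sketched below: the first 429 uses the server-provided delay as is, and each further consecutive 429 adds an extra delay that doubles every time. The function name next_delay_ms, the 500 ms base and the 30-second cap are illustrative assumptions, not values defined by the API.

```python
def next_delay_ms(server_wait_ms, consecutive_429s, base_ms=500, cap_ms=30_000):
    """Delay to wait before the next attempt, in milliseconds.

    server_wait_ms   -- value of X-Rate-Limit-Retry-After-Milliseconds
    consecutive_429s -- number of 429 responses received in a row (1 = first one)
    """
    if consecutive_429s <= 1:
        return server_wait_ms  # first 429: wait the indicated delay
    # Repeated 429s: add an extra delay that doubles each time, up to cap_ms.
    extra = min(cap_ms, base_ms * 2 ** (consecutive_429s - 2))
    return server_wait_ms + extra


for attempt in range(1, 5):
    print(attempt, next_delay_ms(5000, attempt))
# prints:
# 1 5000
# 2 5500
# 3 6000
# 4 7000
```

Capping the extra delay keeps a long series of failures from producing unbounded waits; the right cap depends on how long your application can afford to stall.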

Common mistakes to avoid

  • Ignoring HTTP 429 and continuing to send requests immediately

  • Failing to respect the indicated wait time

  • Sending bursts of parallel requests that instantly exhaust your quota

  • Assuming the limit is fixed (it can change)

Best practices

  • Caching: avoid redundant API calls

  • Spacing: don’t send all your requests at once

  • Monitoring: watch X-Rate-Limit-Remaining to anticipate throttling

  • Error handling: implement fallbacks in case of 429 responses
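
The "Spacing" practice can be sketched as a small client-side pacer that starts calls at a steady rate. This is a minimal sketch under stated assumptions: the Pacer class is hypothetical, the default of 3 calls/second assumes that each request costs 1 point, and the clock and sleep functions are injectable only so the behaviour can be checked without really waiting.

```python
import time


class Pacer:
    """Spaces calls so that at most `rate_per_second` calls start per second."""

    def __init__(self, rate_per_second=3.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate_per_second  # minimum gap between two calls
        self.clock = clock
        self.sleep = sleep
        self.next_slot = None                  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed to start."""
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval


# Usage: call pacer.wait() before each API request.
pacer = Pacer()
for _ in range(3):
    pacer.wait()
    # send_request()  # hypothetical placeholder for the actual API call
```

This pacer deliberately gives up the burst allowance in exchange for a steady flow that stays under the average limit. A client that needs bursts would have to track its own point budget instead.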

Error 429 response structure

The JSON response will always include at least:

{
  "error": "Too many requests"
}

Complete integration example

class APIClient {
  constructor(baseURL) {
    this.baseURL = baseURL;
    this.dernierAppel = 0;
    this.delaiMinimum = 100; // ms between calls
  }

  // essaisRestants: number of retries allowed after a 429 (one by default)
  async appel(endpoint, essaisRestants = 1) {
    // Enforce a minimum delay between calls
    const maintenant = Date.now();
    const tempsEcoule = maintenant - this.dernierAppel;
    if (tempsEcoule < this.delaiMinimum) {
      await new Promise(r => setTimeout(r, this.delaiMinimum - tempsEcoule));
    }

    const response = await fetch(`${this.baseURL}${endpoint}`);
    this.dernierAppel = Date.now();

    if (response.status === 429) {
      if (essaisRestants <= 0) {
        // Stop instead of retrying forever
        throw new Error(`Rate limit still exceeded for ${endpoint}`);
      }
      const attenteMs = parseInt(response.headers.get('X-Rate-Limit-Retry-After-Milliseconds') || '1000', 10);
      console.warn(`Rate limiting active. Waiting ${attenteMs} ms`);
      await new Promise(r => setTimeout(r, attenteMs));
      return this.appel(endpoint, essaisRestants - 1); // Retry once
    }

    const restant = response.headers.get('X-Rate-Limit-Remaining');
    if (restant && parseInt(restant, 10) < 5) {
      console.info('Quota low, slowing down calls');
      this.delaiMinimum = 500; // Slow down
    } else {
      this.delaiMinimum = 100; // Normal speed
    }

    return response.json();
  }
}

Summary

  • Monitor X-Rate-Limit-Remaining to anticipate throttling

  • On 429, wait for the duration specified in X-Rate-Limit-Retry-After-Milliseconds

  • Implement a retry strategy that backs off exponentially on repeated 429s

  • Space out your requests to avoid hitting rate limits

Application of rate limiting to the SOAP API

The request throttling mechanism also applies to the SOAP API, following the same logic and thresholds as for the REST API.

 

However, the way a limit exceedance is reported differs:

  • The SOAP API does not return an HTTP 429 status code, but instead a SOAP Fault.

  • The error message is encapsulated within the SOAP response, in a format consistent with the protocol.

Example of SOAP response when the limit is exceeded:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"> 
   <soap:Body> 
       <soap:Fault> 
           <faultcode>soap:Client</faultcode> 
           <faultstring>Too many requests - throttling limit reached</faultstring> 
           <detail> 
               <rateLimitRetryAfterMilliseconds>5000</rateLimitRetryAfterMilliseconds> 
           </detail> 
       </soap:Fault> 
   </soap:Body> 
</soap:Envelope> 

Expected client behavior

The SOAP client must catch the fault and read the value of rateLimitRetryAfterMilliseconds to determine how long to wait before making another call.

The retry and delay management logic is the same as described for the REST API.

The same thresholds (3 points per second, temporary burst up to 10) apply.
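
As a sketch of the expected client behavior, the function below parses a SOAP response with Python's standard xml.etree.ElementTree module and returns the wait time when the body contains a throttling fault. The function name retry_after_ms_from_fault and the 1000 ms fallback are assumptions; the element names are taken from the example response above.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def retry_after_ms_from_fault(xml_text, default_ms=1000):
    """Return the wait time in ms for a throttling fault, or None otherwise."""
    root = ET.fromstring(xml_text)
    fault = root.find(f".//{{{SOAP_NS}}}Fault")
    if fault is None:
        return None  # normal response, no fault
    node = fault.find("detail/rateLimitRetryAfterMilliseconds")
    if node is None:
        return None  # a fault, but not a throttling one
    try:
        return int(node.text)
    except (TypeError, ValueError):
        return default_ms  # element present but empty or malformed


sample = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Body>
       <soap:Fault>
           <faultcode>soap:Client</faultcode>
           <faultstring>Too many requests - throttling limit reached</faultstring>
           <detail>
               <rateLimitRetryAfterMilliseconds>5000</rateLimitRetryAfterMilliseconds>
           </detail>
       </soap:Fault>
   </soap:Body>
</soap:Envelope>"""

print(retry_after_ms_from_fault(sample))  # prints 5000
```

Most SOAP client libraries surface faults as exceptions rather than raw XML. In that case the same value has to be read from the exception's detail element, using whatever accessor the library provides.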


In summary

  • Same logic and limits as for the REST API

  • Only the way the error is reported differs: a SOAP Fault instead of an HTTP 429 status code

  • The same wait time (rateLimitRetryAfterMilliseconds) must be respected