Rate Limiting

Overview

The Terminal49 API implements rate limiting to ensure fair usage and maintain service quality for all users. The default limit is 100 requests per minute per API key/account, measured over a rolling 60-second window. Some high-volume or expensive endpoints use their own bucket.

Rate limit details

Bucket	Limit	Scope	Notes
Default API requests	100 requests per minute	Per API key/account	Applies unless an endpoint documents a separate bucket
Create Tracking Request	100 tracking requests per minute	Per API key/account	`POST /v2/tracking_requests`
Infer Tracking Number	200 requests per minute	Per API key	`POST /v2/tracking_requests/infer_number` has its own bucket
Refresh Container	10 requests per minute	Per API key/account	`POST /v2/containers/{id}/refresh`
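
As a rough illustration of how a client might work out which bucket an outgoing request counts against, here is a minimal sketch. The limits and paths are copied from the table above; the function name `bucket_for`, the bucket labels, and the regex matching are assumptions made for this example and are not part of the Terminal49 API.

```python
import re

# Fallback bucket: 100 requests per minute, as listed in the table above.
DEFAULT_BUCKET = ("default", 100)

# (method, path pattern, bucket label, requests per minute).
# The labels are hypothetical names used only for client-side bookkeeping.
BUCKETS = [
    ("POST", re.compile(r"^/v2/tracking_requests/infer_number$"), "infer_tracking_number", 200),
    ("POST", re.compile(r"^/v2/tracking_requests$"), "create_tracking_request", 100),
    ("POST", re.compile(r"^/v2/containers/[^/]+/refresh$"), "refresh_container", 10),
]


def bucket_for(method, path):
    """Return (bucket_label, requests_per_minute) for a request."""
    for bucket_method, pattern, label, limit in BUCKETS:
        if method.upper() == bucket_method and pattern.match(path):
            return label, limit
    return DEFAULT_BUCKET
```

For example, `bucket_for("POST", "/v2/containers/abc123/refresh")` resolves to the 10-per-minute refresh bucket, while any request not listed falls back to the default bucket.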

Rate limit response

When you exceed the rate limit, the API will return:

HTTP Status Code: 429 Too Many Requests

Response Headers:

Retry-After: Number of seconds to wait before making another request against the same rate-limit bucket

Retry-After is the response header clients should rely on for 429 handling. If your HTTP client also receives RateLimit-Limit, RateLimit-Remaining, or RateLimit-Reset headers, treat them as informational and continue to respect Retry-After on 429 responses.

Response Body:

{
  "errors": [
    {
      "status": "429",
      "title": "Too Many Requests",
      "detail": "Your account has exceeded its API rate limit. Please reduce request frequency or contact support to increase your limit. Consider using webhooks for real-time updates instead of polling."
    }
  ]
}
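
A client may want to pull the wait time and the human-readable message out of a 429 response in one place. The sketch below assumes the header and body shape shown above; the function name `parse_rate_limit_response` and the 60-second fallback are illustrative assumptions, not documented API behavior.

```python
import json


def parse_rate_limit_response(status_code, headers, body_text, default_wait=60):
    """Return (wait_seconds, detail) for a 429 response, or None otherwise.

    default_wait is an assumed fallback for a missing or non-numeric
    Retry-After header; it is not a documented API value.
    """
    if status_code != 429:
        return None

    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        wait_seconds = int(lowered.get("retry-after", default_wait))
    except (TypeError, ValueError):
        wait_seconds = default_wait

    # The body is expected to follow the error shape {"errors": [{...}]}.
    detail = ""
    try:
        errors = json.loads(body_text).get("errors", [])
        if errors:
            detail = errors[0].get("detail", "")
    except (ValueError, TypeError, AttributeError, KeyError, IndexError):
        pass  # Non-JSON or unexpected body: keep the empty detail.

    return wait_seconds, detail
```

With the `requests` library this could be called as `parse_rate_limit_response(response.status_code, response.headers, response.text)`.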

Best practices

1. Use webhooks instead of polling

The most effective way to avoid rate limits is to use webhooks for real-time updates instead of repeatedly polling the API:

Configure webhooks to receive push notifications when shipment data changes
Eliminates the need for frequent polling
Provides instant updates without consuming your rate limit
See the Webhooks section for setup instructions

2. Implement exponential backoff

If you receive a 429 response:

Check the Retry-After header
Wait for the specified number of seconds
Implement exponential backoff for subsequent failures
Don’t retry immediately, as this will consume your limit further
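
The steps above can be sketched as a small delay calculator that honors Retry-After and grows the wait on repeated failures. The function name `next_delay`, the 1-second base, the 60-second cap, and the jitter strategy are all assumptions chosen for illustration; tune them to your own workload.

```python
import random


def next_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    base, cap, and the jitter strategy are illustrative assumptions,
    not values documented by the API.
    """
    # Exponential growth: base, 2*base, 4*base, ... limited to cap.
    backoff = min(cap, base * (2 ** attempt))
    # Jitter: pick a point in the upper half of the window so that many
    # clients do not all retry at the same moment.
    backoff = backoff / 2 + rng() * backoff / 2

    if retry_after is not None:
        # Never wait less than the server asked for.
        return max(float(retry_after), backoff)
    return backoff
```

A retry loop would call `time.sleep(next_delay(attempt, retry_after))` after each 429, passing the parsed Retry-After value when the header is present.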

3. Batch your requests

Use list endpoints with filtering instead of multiple individual requests
Leverage the include parameter to fetch related resources in a single request
Cache responses when appropriate to reduce redundant calls
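
To make the caching suggestion concrete, here is a minimal in-memory cache with a time-to-live. The class name `TTLCache`, the 300-second default, and the `get_or_fetch` method are hypothetical; a production system might prefer a shared cache instead.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # Injectable so the cache can be tested.
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

A call such as `cache.get_or_fetch(url, lambda: requests.get(url, headers=headers).json())` then hits the API at most once per TTL for a given URL. Combining this with a list URL that carries an `include` query parameter keeps related resources in the same cached response; which include values are valid depends on the endpoint.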

4. Monitor your usage

Track your request patterns
Identify and optimize high-frequency operations
Consider spreading requests over time rather than bursting
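
One way to track request patterns on the client side is to count requests over the same rolling 60-second window the API uses. The sketch below is an estimate only, since the server's own count is authoritative; the class name `UsageTracker` and its methods are assumptions for illustration.

```python
import time
from collections import deque


class UsageTracker:
    """Count requests made in the last window_seconds (rolling window)."""

    def __init__(self, window_seconds=60, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock  # Injectable so the tracker can be tested.
        self._timestamps = deque()

    def record(self):
        """Call once for every request sent."""
        self._timestamps.append(self.clock())

    def count(self):
        """Number of requests recorded within the current window."""
        cutoff = self.clock() - self.window_seconds
        # Drop timestamps that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)

    def remaining(self, limit=100):
        """Estimated headroom against a per-minute limit."""
        return max(0, limit - self.count())
```

Logging `tracker.count()` periodically, or alerting when `tracker.remaining()` gets low, helps surface high-frequency operations before they trigger 429 responses.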

Need a higher limit?

If your use case requires a higher rate limit:

Evaluate webhook usage first: Most polling use cases can be replaced with webhooks
Contact support at support@terminal49.com
Provide details about your use case and expected request volume
Our team will work with you to find an appropriate solution

Example: handling rate limits

Here’s an example of how to properly handle rate limit responses in Python:

import time
import requests

def make_request_with_retry(url, headers, max_retries=3):
    """
    Make an API request with automatic retry on rate limit.

    Args:
        url: The API endpoint URL
        headers: Request headers including Authorization
        max_retries: Maximum number of attempts before giving up

    Returns:
        Response object if successful

    Raises:
        Exception: If max retries exceeded
    """
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code != 429:
            # Return response for any other status code
            return response

        # Only wait if another attempt will follow; sleeping before
        # raising would just delay the error.
        if attempt < max_retries - 1:
            # Get the retry-after value from header (default to 60 seconds)
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)

    raise Exception("Max retries exceeded")

# Example usage
headers = {
    'Authorization': 'Token YOUR_API_KEY'
}
response = make_request_with_retry(
    'https://api.terminal49.com/v2/shipments',
    headers
)

Tips for high-volume applications

If you’re building a high-volume application:

Design for webhooks from the start: Don’t rely on polling for data updates
Implement request queuing: Spread your requests evenly across the rate limit window
Use pagination efficiently: Fetch larger pages less frequently rather than small pages frequently
Cache aggressively: Store and reuse data that doesn’t change frequently
Monitor rate limit headers: Use Retry-After for 429 backoff. If RateLimit-* headers are present, log them for observability rather than making your retry logic depend on them.
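
The request-queuing tip can be sketched as a pacer that spaces calls evenly across the window, for example one request every 0.6 seconds for a limit of 100 per minute. The class name `RequestPacer` and its `wait` method are hypothetical, and this version is single-threaded; a multi-threaded client would need a lock around `wait`.

```python
import time


class RequestPacer:
    """Space calls so that at most `limit` happen per `window_seconds`."""

    def __init__(self, limit=100, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = window_seconds / limit
        self.clock = clock  # Injectable so the pacer can be tested.
        self.sleep = sleep
        self._next_slot = None  # Earliest time the next request may start.

    def wait(self):
        """Block until the next evenly spaced slot, then reserve it."""
        now = self.clock()
        if self._next_slot is not None and now < self._next_slot:
            self.sleep(self._next_slot - now)
            now = self._next_slot
        self._next_slot = now + self.interval
```

Calling `pacer.wait()` immediately before each request keeps the client under the limit without bursting. Pacing reduces 429 responses but does not replace Retry-After handling, since other clients may share the same API key.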
