This guide covers message delivery throughput. For API request limits, see API Rate Limiting.

Rate Limits

The following are the default rate limits applied by Telnyx for each message type and sender type.

Account

| Message Type | Default Rate Limit | Max Queue Length |
| --- | --- | --- |
| SMS | 50 messages/second | 720,000 |
| MMS | 15 messages/second | 216,000 |
| RCS | 1 message/second | 14,400 |

Sender

| Sender Type | Rate Limit | Per | Max Queue Length |
| --- | --- | --- | --- |
| Long Code | 0.1 MPS | Number | 1,440 |
| Toll-Free | 20 MPS | Number | 288,000 |
| Short Code | 1,000 MPS | Number | 14,400,000 |
| Alphanumeric | 0.1 MPS | Sender ID | 1,440 |

The default Long Code rate limit applies to non-US destinations. For US destinations, throughput is determined at the campaign level based on your 10DLC registration. See 10DLC for carrier-specific limits.
If you need an increased rate limit, contact Telnyx sales to discuss your options.

10DLC

When using US long codes for A2P messaging, throughput is determined by mobile network operators (MNOs) based on your registered 10DLC campaign. Each carrier uses a different throughput system.
AT&T assigns throughput per campaign based on “Message Class,” determined by use case type and vetting score.

| Message Class | Use Case Type | Vetting Score | SMS TPM | MMS TPM |
| --- | --- | --- | --- | --- |
| A | Standard (Dedicated) | 75-100 | 4,500 | 2,400 |
| B | Standard (Mixed/Marketing) | 75-100 | 4,500 | 2,400 |
| C | Standard (Dedicated) | 50-74 | 2,400 | 1,200 |
| D | Standard (Mixed/Marketing) | 50-74 | 2,400 | 1,200 |
| E | Standard (Dedicated) | 1-49 | 240 | 150 |
| F | Standard (Mixed/Marketing) | 1-49 | 240 | 150 |
| T | Low Volume Mixed | - | 75 | 50 |
| K | Political | - | 4,500 | 2,400 |
| P | Charity | - | 2,400 | 1,200 |
| S | Social | - | 9,000 | 2,400 |
| X | Emergency / Public Safety | - | 4,500 | 2,400 |
| W | Sole Proprietor | - | 15 | 50 |
| G | Proxy | - | 60/number | 50/number |
| N | Agents and Franchises | - | 60/number | 50/number |

TPM = Throughput Per Minute. For standard use cases, the vetting score from your 10DLC brand registration determines which message class (and throughput) your campaign receives. Special use cases have fixed throughput regardless of vetting score.
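
As a rough illustration, the standard-use-case rows of the AT&T table can be written as a lookup keyed by vetting score. The function name `att_standard_class` and its return shape are hypothetical and not part of any Telnyx or carrier API; the thresholds and TPM values are copied from the table.

```python
from typing import Tuple


def att_standard_class(vetting_score: int, dedicated: bool) -> Tuple[str, int, int]:
    """Return (message_class, sms_tpm, mms_tpm) for a standard use case.

    Hypothetical helper: the values mirror the AT&T table in this guide.
    """
    if not 1 <= vetting_score <= 100:
        raise ValueError("vetting score must be between 1 and 100")
    if vetting_score >= 75:
        classes, sms_tpm, mms_tpm = ("A", "B"), 4500, 2400
    elif vetting_score >= 50:
        classes, sms_tpm, mms_tpm = ("C", "D"), 2400, 1200
    else:
        classes, sms_tpm, mms_tpm = ("E", "F"), 240, 150
    # First class of each pair is Dedicated, second is Mixed/Marketing.
    message_class = classes[0] if dedicated else classes[1]
    return message_class, sms_tpm, mms_tpm


message_class, sms_tpm, mms_tpm = att_standard_class(80, dedicated=True)
# TPM is per minute, so divide by 60 to compare with per-second limits.
print(message_class, sms_tpm, f"{sms_tpm / 60:.0f} SMS/second")
```

Special use cases (classes T through N) are left out on purpose, since their throughput is fixed and does not depend on the vetting score.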
T-Mobile assigns daily message caps at the brand level, shared across all campaigns under that brand.

| Brand Tier | Vetting Score | Daily Cap |
| --- | --- | --- |
| Top | 75-100 | 200,000 |
| High Mid | 50-74 | 40,000 |
| Low Mid | 25-49 | 10,000 |
| Low | 1-24 | 2,000 |

Unvetted brands default to Low tier unless listed on the Russell 3000. Sole Proprietor campaigns have a 1,000 daily cap.
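
Because the T-Mobile cap is shared across every campaign under a brand, it can help to track a single brand-level counter in your application. The sketch below is one possible approach: `tmobile_daily_cap` and `DailyBudget` are hypothetical names, the Russell 3000 exception is not modeled, and the reset at the local calendar day is an assumption, since this guide does not state when the carrier's daily window resets.

```python
from datetime import date
from typing import Optional

# (minimum vetting score, tier name, daily cap), copied from the table above.
TMOBILE_TIERS = [
    (75, "Top", 200_000),
    (50, "High Mid", 40_000),
    (25, "Low Mid", 10_000),
    (1, "Low", 2_000),
]


def tmobile_daily_cap(vetting_score: Optional[int]) -> int:
    """Daily cap for a brand. None means unvetted, which defaults to Low tier."""
    if vetting_score is None:
        return 2_000
    for minimum, _tier, cap in TMOBILE_TIERS:
        if vetting_score >= minimum:
            return cap
    raise ValueError("vetting score must be at least 1")


class DailyBudget:
    """Counts messages against a brand-level cap shared by all campaigns."""

    def __init__(self, cap: int):
        self.cap = cap
        self.day = date.today()
        self.sent = 0

    def try_consume(self, count: int = 1, today: Optional[date] = None) -> bool:
        """Reserve `count` messages; return False if the cap would be exceeded."""
        today = today or date.today()
        if today != self.day:  # ASSUMPTION: the window resets at the local calendar day
            self.day, self.sent = today, 0
        if self.sent + count > self.cap:
            return False
        self.sent += count
        return True


budget = DailyBudget(tmobile_daily_cap(60))  # High Mid tier: 40,000 per day
print(budget.try_consume(100), budget.sent)
```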
Verizon has not published specific throughput limits but uses content filtering for 10DLC traffic.

Queuing

When you send messages faster than your rate limit allows, excess messages are automatically queued for delivery.

How Queuing Works

  1. Message submitted — Request validated against your Messaging Profile
  2. Rate limit check — Under limit: sent immediately. Over limit: queued
  3. Queue processing — Messages held up to 4 hours, released in FIFO order
  4. Delivery — Sent to carrier, webhook fired, visible in MDR search
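
The steps above can be approximated with a toy, second-by-second model of a single sender queue. This is only a sketch for building intuition, not how Telnyx implements queuing: `simulate_queue` and its parameters are hypothetical, and the queue capacity is taken to be the rate limit multiplied by the 4-hour hold time.

```python
from collections import deque


def simulate_queue(submit_mps: int, limit_mps: int, seconds: int,
                   hold_seconds: int = 4 * 60 * 60) -> dict:
    """Toy second-by-second model of a single sender queue (not Telnyx's code)."""
    max_queue = limit_mps * hold_seconds  # capacity: rate limit x hold time
    queue = deque()                       # FIFO: oldest submission is released first
    sent = rejected = 0
    for now in range(seconds):
        capacity = limit_mps              # messages that may go out this second
        while queue and capacity > 0:     # release older queued messages first
            queue.popleft()
            sent += 1
            capacity -= 1
        for _ in range(submit_mps):       # then handle this second's submissions
            if capacity > 0:              # under the limit: sent immediately
                sent += 1
                capacity -= 1
            elif len(queue) < max_queue:  # over the limit: queued
                queue.append(now)
            else:                         # queue full: rejected (error 40318)
                rejected += 1
    return {"sent": sent, "queued": len(queue), "rejected": rejected}


# One minute of submitting 50 MPS against a 20 MPS limit.
print(simulate_queue(submit_mps=50, limit_mps=20, seconds=60))
# {'sent': 1200, 'queued': 1800, 'rejected': 0}
```

Shortening `hold_seconds` makes the queue fill quickly, which is a convenient way to see rejections appear without simulating hours of traffic.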

Calculating Queue Size

Each sender type and message type combination has its own queue. The maximum queue length is:
Max Queue Length = Rate Limit (MPS) × 14,400 seconds (4 hours)
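
As a quick check, the formula reproduces the queue lengths listed in the rate limit tables. The helper name `max_queue_length` is illustrative only.

```python
QUEUE_WINDOW_SECONDS = 4 * 60 * 60  # 14,400 seconds (4 hours)


def max_queue_length(rate_limit_mps: float) -> int:
    """Max Queue Length = Rate Limit (MPS) x 14,400 seconds."""
    return round(rate_limit_mps * QUEUE_WINDOW_SECONDS)


for name, mps in [("Long Code", 0.1), ("Toll-Free", 20),
                  ("Short Code", 1000), ("Account SMS", 50)]:
    print(f"{name}: {max_queue_length(mps):,}")
# Long Code: 1,440
# Toll-Free: 288,000
# Short Code: 14,400,000
# Account SMS: 720,000
```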
The following examples illustrate how sender and account queues interact:
Acme Corp sends SMS from a single Toll-Free number. Their application submits messages at 50 MPS, but the Toll-Free rate limit is 20 MPS.

| Queue | Rate Limit | Max Queue Length |
| --- | --- | --- |
| Toll-Free #1 | 20 MPS | 288,000 segments |

Messages are delivered at 20 MPS while the remaining 30 MPS (50 - 20) accumulates in the queue. At that rate, the queue reaches its 288,000-segment limit after 9,600 seconds (2 hours 40 minutes) of sustained sending. Any additional messages return error 40318 (queue full).
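
To estimate how long a sustained overload can run before the queue fills, divide the queue capacity by the net accumulation rate. The function below is a hypothetical helper that assumes constant submit and delivery rates.

```python
from typing import Optional


def seconds_until_full(submit_mps: float, limit_mps: float,
                       max_queue: int) -> Optional[float]:
    """Time for an empty queue to fill, or None if it never fills."""
    backlog_rate = submit_mps - limit_mps  # net messages queued per second
    if backlog_rate <= 0:
        return None
    return max_queue / backlog_rate


seconds = seconds_until_full(submit_mps=50, limit_mps=20, max_queue=288_000)
print(f"{seconds:,.0f} s = {seconds / 3600:.2f} hours")
# 9,600 s = 2.67 hours
```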

Acme Corp sends SMS from 5 Toll-Free numbers simultaneously, each at 20 MPS.

| Queue | Rate Limit | Max Queue Length |
| --- | --- | --- |
| Toll-Free #1 | 20 MPS | 288,000 segments |
| Toll-Free #2 | 20 MPS | 288,000 segments |
| Toll-Free #3 | 20 MPS | 288,000 segments |
| Toll-Free #4 | 20 MPS | 288,000 segments |
| Toll-Free #5 | 20 MPS | 288,000 segments |
| Account SMS | 50 MPS | 720,000 segments |

Combined sender capacity is 100 MPS (5 × 20), but the account limit is 50 MPS. Messages exceeding the account limit queue at the account level. Once the account queue (720,000) fills, additional messages return error 40318.

Acme Corp sends SMS from 10 Long Code numbers simultaneously, each at 0.1 MPS.

| Queue | Rate Limit | Max Queue Length |
| --- | --- | --- |
| Long Codes (10 total) | 1 MPS combined | 1,440 segments each |
| Account SMS | 50 MPS | 720,000 segments |

Here, the sender limit (1 MPS combined) is well below the account limit (50 MPS), so the sender queues fill first. Each Long Code queue holds 1,440 segments. Once a queue is full, messages sent from that number return error 40318, even though the account has capacity.

When a queue is full, additional messages return error code 40318. See API Errors for details.
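
The multi-sender examples come down to comparing the combined sender limit with the account limit. The following sketch makes that comparison explicit. `throughput_bottleneck` is a hypothetical helper, and it simplifies by assuming the application submits faster than both limits.

```python
from typing import Sequence, Tuple


def throughput_bottleneck(sender_limits_mps: Sequence[float],
                          account_limit_mps: float) -> Tuple[float, str]:
    """Return (effective MPS, level whose queue backs up first)."""
    combined = sum(sender_limits_mps)
    if combined > account_limit_mps:
        return account_limit_mps, "account"
    return combined, "sender"


# Five Toll-Free numbers at 20 MPS each against a 50 MPS account limit.
print(throughput_bottleneck([20] * 5, 50))

# Ten Long Codes at 0.1 MPS each against the same account limit.
mps, level = throughput_bottleneck([0.1] * 10, 50)
print(round(mps, 3), level)
```

The first call reports the account as the bottleneck at 50 MPS; the second reports the senders at roughly 1 MPS.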

Monitoring Queued Messages

Queued messages return a queued status and won’t appear in MDR search until delivered. Monitor queue depth via the Mission Control Portal.
To avoid queue buildup, implement client-side rate limiting to match your throughput limits. See Client-Side Rate Limiting below.

Client-Side Rate Limiting

Implementing rate limiting in your application prevents queue buildup, avoids 40318 errors, and gives you control over message pacing. The example below shows a token bucket rate limiter in Python that works for any sender type.
import time
import threading
import os

try:
    from telnyx import Telnyx
except ImportError:
    Telnyx = None


class RateLimiter:
    """Token bucket rate limiter for SMS sending."""

    def __init__(self, rate: float, burst: int | None = None):
        """
        Args:
            rate: Messages per second (e.g., 0.1 for long code, 20 for toll-free).
            burst: Max burst size. Defaults to one second's worth of messages (minimum 1).
        """
        self.rate = rate
        self.burst = burst or max(1, int(rate))
        self.tokens = self.burst
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self, timeout: float = 30.0) -> bool:
        """Wait until a token is available. Returns False on timeout."""
        deadline = time.monotonic() + timeout
        while True:
            with self.lock:
                self._refill()
                if self.tokens >= 1:
                    self.tokens -= 1
                    return True
            wait_time = min(1.0 / self.rate, deadline - time.monotonic())
            if wait_time <= 0:
                return False
            time.sleep(wait_time)

    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_refill = now


# Usage: Toll-Free at 20 MPS
limiter = RateLimiter(rate=20)

if Telnyx:
    client = Telnyx(api_key=os.environ.get("TELNYX_API_KEY"))

recipients = ["+15551234567", "+15559876543"]  # your recipient list

for to_number in recipients:
    if not limiter.acquire(timeout=60):
        print(f"Rate limit timeout sending to {to_number}")
        continue
    if Telnyx:
        response = client.messages.send(
            from_="+15550001111",
            to=to_number,
            text="Hello from Telnyx!",
        )
        print(f"Sent to {to_number}: {response.data.id}")

Adapting for your sender type

Change the rate parameter to match your sender type:

| Sender Type | Rate Parameter | Example |
| --- | --- | --- |
| Long Code | 0.1 | RateLimiter(0.1) |
| Toll-Free | 20 | RateLimiter(20) |
| Short Code | 1000 | RateLimiter(1000) |
| Alphanumeric | 0.1 | RateLimiter(0.1) |

For Number Pool configurations, the effective rate is the per-number limit multiplied by the number of numbers in the pool. For example, 10 Long Codes at 0.1 MPS each gives an effective 1 MPS pool rate.
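
One way to keep these values in a single place is a small lookup that also handles the pool multiplication. The dictionary keys and function names below are hypothetical labels chosen for this sketch; the rates come from the table above.

```python
# Per-number rates from the table above, in messages per second (MPS).
SENDER_RATES_MPS = {
    "long_code": 0.1,
    "toll_free": 20,
    "short_code": 1000,
    "alphanumeric": 0.1,
}


def pool_rate(sender_type: str, pool_size: int = 1) -> float:
    """Effective rate for a pool: per-number limit x number of numbers."""
    return SENDER_RATES_MPS[sender_type] * pool_size


def send_interval(rate_mps: float) -> float:
    """Average seconds between messages at a given rate."""
    return 1.0 / rate_mps


rate = pool_rate("long_code", pool_size=10)
print(f"pool rate: {rate:g} MPS, one message every {send_interval(rate):g} s")
# pool rate: 1 MPS, one message every 1 s
```

The resulting value can be passed as the `rate` argument of the `RateLimiter` shown earlier. Keep in mind that the account-level limit still applies on top of the pool rate.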

Handling Rate Limit Errors

When your sending rate exceeds limits, the API returns specific error codes. Handle them gracefully with retry logic:
import time
import os

from telnyx import Telnyx

client = Telnyx(api_key=os.environ.get("TELNYX_API_KEY"))


def send_with_retry(to: str, from_: str, text: str, max_retries: int = 3):
    """Send a message with exponential backoff on rate limit errors."""
    for attempt in range(max_retries + 1):
        try:
            return client.messages.send(from_=from_, to=to, text=text)
        except Exception as e:
            # Assumes the SDK exception exposes the Telnyx error code as `code`;
            # str() lets the comparison work whether it is an int or a string.
            error_code = str(getattr(e, "code", ""))
            if error_code != "40318" or attempt == max_retries:
                # Not a queue-full error, or retries exhausted: re-raise
                # instead of sleeping once more before giving up.
                raise
            wait = min(2**attempt, 30)
            print(f"Queue full, retrying in {wait}s (attempt {attempt + 1})")
            time.sleep(wait)
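
To see the wait times the retry loop produces, the backoff schedule can be computed on its own. The sketch below uses the same doubling with a 30-second cap. The optional jitter is a common variation that is not part of the example above, and `backoff_delays` is a hypothetical helper.

```python
import random
from typing import Callable, List


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0,
                   jitter: bool = False,
                   rng: Callable[[], float] = random.random) -> List[float]:
    """Delays (in seconds) before each retry: base * 2^attempt, capped at `cap`."""
    delays = []
    for attempt in range(max_retries):
        delay = min(base * 2**attempt, cap)
        # Optional jitter: pick a random point in [0, delay) so that many
        # clients retrying at once do not all hit the queue at the same moment.
        delays.append(delay * rng() if jitter else delay)
    return delays


print(backoff_delays(6))
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

With the default `max_retries=3`, the retry loop waits 1, 2, and 4 seconds, so a queue that stays full for longer than about 7 seconds will still surface the error to the caller.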