Smart encoding automatically replaces Unicode characters with visually similar GSM-7 characters. This keeps your messages in the more efficient GSM-7 encoding, reducing segment counts and costs.

Why use smart encoding

SMS messages using GSM-7 encoding fit 160 characters per segment. When a message contains even one Unicode character outside GSM-7, the entire message switches to UTF-16 encoding, which fits only 70 characters per segment. A single smart quote (“) or em dash (U+2014) can double or even triple your messaging costs. Example:
| Message | Encoding | Segments | Cost impact |
| --- | --- | --- | --- |
| Hello, how are you? (150 chars) | GSM-7 | 1 | Base cost |
| Hello, how are you? (150 chars with smart quotes) | UTF-16 | 3 | 3x cost |
| Hello, how are you? (150 chars, smart encoding ON) | GSM-7 | 1 | Base cost |
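The segment counts in the table can be reproduced with a little arithmetic. The sketch below is illustrative only: `count_segments` is a hypothetical helper, the single-segment limits (160 and 70) come from the text above, and the multi-segment limits (153 and 67) are the usual values for concatenated SMS, which this page does not state.

```python
import math

# Per-segment capacity. The single-segment limits (160 and 70) come from the
# text above; the multi-segment limits (153 and 67) are the usual values for
# concatenated SMS, where a header takes space in every part. They are an
# assumption here, not something this page states.
LIMITS = {
    "GSM-7": {"single": 160, "multi": 153},
    "UTF-16": {"single": 70, "multi": 67},
}


def count_segments(length: int, encoding: str) -> int:
    """Return the number of SMS segments for a message of `length` units."""
    limits = LIMITS[encoding]
    if length <= limits["single"]:
        return 1
    return math.ceil(length / limits["multi"])


print(count_segments(150, "GSM-7"))   # 1
print(count_segments(150, "UTF-16"))  # 3
```

Under these assumptions, the same 150-character message costs one segment in GSM-7 and three in UTF-16, which matches the table.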

How it works

When smart encoding is enabled:
  1. Your message text is scanned for Unicode characters.
  2. Characters with GSM-7 equivalents are automatically replaced.
  3. The final encoding type (GSM-7 or UTF-16) is determined after transformation.
  4. Segment count is recalculated based on the transformed message.
  5. The API response includes smart encoding metadata.
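
The steps above can be sketched locally, for example to preview how a message might be transformed before sending it. This is a minimal sketch, not the service's implementation: `smart_encode` and `SUBSTITUTIONS` are hypothetical names, the substitution table is a tiny subset of the tables on this page (the double-quote mapping is assumed), and the 153/67 multi-segment limits are the usual concatenated-SMS values rather than figures from this page.

```python
# Illustrative subset of the substitution tables on this page; the real
# service uses a much larger table. The '"' replacement for smart quotes
# is an assumption.
SUBSTITUTIONS = {
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2014": "-",    # em dash
    "\u2013": "-",    # en dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
    "\u200b": "",     # zero width space (removed)
}

# GSM-7 basic character set and the extension set (2 units per character).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = set("^{}\\[~]|€")


def smart_encode(text: str) -> dict:
    # Steps 1-2: scan the text and replace characters that have equivalents.
    out = "".join(SUBSTITUTIONS.get(ch, ch) for ch in text)
    # Step 3: decide the final encoding after the transformation.
    is_gsm7 = all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in out)
    # Step 4: recalculate the segment count on the transformed text.
    if is_gsm7:
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in out)
        segments = 1 if units <= 160 else -(-units // 153)
    else:
        units = len(out.encode("utf-16-be")) // 2  # UTF-16 code units
        segments = 1 if units <= 70 else -(-units // 67)
    # Step 5: report metadata shaped like the fields described on this page.
    return {
        "text": out,
        "smart_encoding_applied": out != text,
        "final_encoding": "gsm7" if is_gsm7 else "ucs2",
        "segment_count": segments,
        "character_count": len(out),
        "length_change": len(out) - len(text),
    }


print(smart_encode("Hi \u201cthere\u201d\u2026"))
```

The field names in the returned dictionary mirror the response metadata only for readability; the values the API actually returns may be computed differently.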

API response metadata

When you send a message with smart encoding enabled, the API response includes metadata about the transformation:
{
  "data": {
    "id": "...",
    "encoding": "GSM-7",
    "parts": 1,
    "smart_encoding": {
      "smart_encoding_applied": true,
      "final_encoding": "gsm7",
      "segment_count": 1,
      "character_count": 155,
      "replaced_character_count": 3,
      "length_change": 2
    }
  }
}
| Field | Description |
| --- | --- |
| smart_encoding_applied | Whether any characters were replaced. |
| final_encoding | The encoding used after transformation (gsm7 or ucs2). |
| segment_count | Number of segments after smart encoding. |
| character_count | Message length after transformation. |
| replaced_character_count | Number of unique characters that were substituted. |
| length_change | Difference in length (positive if the message grew, e.g., when a horizontal ellipsis becomes three periods). |
The parts field in the API response reflects the segment count after smart encoding is applied, so you see the actual billing impact.
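
As a rough illustration of consuming this metadata, the snippet below parses the sample response shown above and prints a one-line summary. `summarize` is a hypothetical helper, and the sample body is copied from this page rather than from a live API call.

```python
import json

# Sample body copied from the response shown above (the elided id is left as-is).
sample = """
{
  "data": {
    "id": "...",
    "encoding": "GSM-7",
    "parts": 1,
    "smart_encoding": {
      "smart_encoding_applied": true,
      "final_encoding": "gsm7",
      "segment_count": 1,
      "character_count": 155,
      "replaced_character_count": 3,
      "length_change": 2
    }
  }
}
"""


def summarize(body: str) -> str:
    """Build a short human-readable summary of the smart encoding metadata."""
    data = json.loads(body)["data"]
    meta = data.get("smart_encoding", {})
    if not meta.get("smart_encoding_applied"):
        return f"no substitutions; {data['parts']} part(s)"
    return (
        f"{meta['replaced_character_count']} character(s) replaced, "
        f"length change {meta['length_change']:+d}, "
        f"{meta['segment_count']} segment(s) as {meta['final_encoding']}"
    )


print(summarize(sample))
```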

Enable smart encoding

Enable smart encoding on your messaging profile via the API or portal.
curl -X PATCH https://api.telnyx.com/v2/messaging_profiles/{id} \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "smart_encoding": true
  }'
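
The same request can be built from Python with only the standard library. This is a sketch under the assumption that the endpoint, headers, and body are exactly those of the curl command above; `build_enable_request` is a hypothetical helper, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_enable_request(profile_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request from the curl example."""
    url = f"https://api.telnyx.com/v2/messaging_profiles/{profile_id}"
    body = json.dumps({"smart_encoding": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_enable_request("YOUR_PROFILE_ID", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```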

Character substitutions

Smart encoding replaces 200+ Unicode characters with GSM-7 equivalents. The tables below show all supported substitutions grouped by category.

Quotation marks

| Unicode | Glyph | Description | Replacement |
| --- | --- | --- | --- |
| U+00AB | « | Left-pointing double angle quotation mark | " |
| U+00BB | » | Right-pointing double angle quotation mark | " |
| U+201C | “ | Left double quotation mark | " |
| U+201D | ” | Right double quotation mark | " |
| U+02BA | ʺ | Modifier letter double prime | " |
| U+02EE | ˮ | Modifier letter double apostrophe | " |
| U+201F | ‟ | Double high-reversed-9 quotation mark | " |
| U+275D | ❝ | Heavy double turned comma quotation mark ornament | " |
| U+275E | ❞ | Heavy double comma quotation mark ornament | " |
| U+301D | 〝 | Reversed double prime quotation mark | " |
| U+301E | 〞 | Double prime quotation mark | " |
| U+FF02 | ＂ | Fullwidth quotation mark | " |
| U+201E | „ | Double low quotation mark | " |

Apostrophes and single quotes

| Unicode | Glyph | Description | Replacement |
| --- | --- | --- | --- |
| U+2018 | ‘ | Left single quotation mark | ' |
| U+2019 | ’ | Right single quotation mark | ' |
| U+02BB | ʻ | Modifier letter turned comma | ' |
| U+02C8 | ˈ | Modifier letter vertical line | ' |
| U+02BC | ʼ | Modifier letter apostrophe | ' |
| U+02BD | ʽ | Modifier letter reversed comma | ' |
| U+02B9 | ʹ | Modifier letter prime | ' |
| U+201B | ‛ | Single high-reversed-9 quotation mark | ' |
| U+FF07 | ＇ | Fullwidth apostrophe | ' |
| U+00B4 | ´ | Acute accent | ' |
| U+02CA | ˊ | Modifier letter acute accent | ' |
| U+0060 | ` | Grave accent | ' |
| U+02CB | ˋ | Modifier letter grave accent | ' |
| U+275B | ❛ | Heavy single turned comma quotation mark ornament | ' |
| U+275C | ❜ | Heavy single comma quotation mark ornament | ' |
| U+0313 | ◌̓ | Combining comma above | ' |
| U+0314 | ◌̔ | Combining reversed comma above | ' |
| U+FE10 | ︐ | Presentation form for vertical comma | ' |
| U+FE11 | ︑ | Presentation form for vertical ideographic comma | ' |

Dashes and hyphens

UnicodeGlyphDescriptionReplacement
U+2014Em dash-
U+2013En dash-
U+23BCHorizontal scan line-7-
U+23BDHorizontal scan line-9-
U+2015Horizontal bar-
U+FE63Small hyphen-minus-
U+FF0DFullwidth hyphen-minus-
U+2010Hyphen-
U+2022Bullet-
U+2043Hyphen bullet-

Slashes and division

UnicodeGlyphDescriptionReplacement
U+00F7÷Division sign/
U+00BC¼Vulgar fraction one quarter1/4
U+00BD½Vulgar fraction one half1/2
U+00BE¾Vulgar fraction three quarters3/4
U+29F8Big solidus/
U+0337̷Combining short solidus overlay/
U+0338̸Combining long solidus overlay/
U+2044Fraction slash/
U+2215Division slash/
U+FF0FFullwidth solidus/

Backslashes

UnicodeGlyphDescriptionReplacement
U+29F9Big reverse solidus\
U+29F5Reverse solidus operator\
U+20E5Combining reverse solidus overlay\
U+FE68Small reverse solidus\
U+FF3CFullwidth reverse solidus\

Underscores

UnicodeGlyphDescriptionReplacement
U+0332̲Combining low line_
U+FF3F_Fullwidth low line_
U+2017Double low line_

Vertical lines

UnicodeGlyphDescriptionReplacement
U+20D2Combining long vertical line overlay|
U+20D3Combining short vertical line overlay|
U+2223Divides|
U+FF5CFullwidth vertical line|
U+23B8Left vertical box line|
U+23B9Right vertical box line|
U+23D0Vertical line extension|
U+239CLeft parenthesis extension|
U+239FRight parenthesis extension|

Symbols and punctuation

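| Unicode | Glyph | Description | Replacement |
| --- | --- | --- | --- |
| U+FE6B | ﹫ | Small commercial at sign | @ |
| U+FF20 | ＠ | Fullwidth commercial at sign | @ |
| U+FE69 | ﹩ | Small dollar sign | $ |
| U+FF04 | ＄ | Fullwidth dollar sign | $ |
| U+01C3 | ǃ | Latin letter retroflex click | ! |
| U+FE15 | ︕ | Presentation form for vertical exclamation mark | ! |
| U+FE57 | ﹗ | Small exclamation mark | ! |
| U+FF01 | ！ | Fullwidth exclamation mark | ! |
| U+203C | ‼ | Double exclamation mark | !! |
| U+FE5F | ﹟ | Small number sign | # |
| U+FF03 | ＃ | Fullwidth number sign | # |
| U+FE6A | ﹪ | Small percent sign | % |
| U+FF05 | ％ | Fullwidth percent sign | % |
| U+FE60 | ﹠ | Small ampersand | & |
| U+FF06 | ＆ | Fullwidth ampersand | & |
| U+2026 | … | Horizontal ellipsis | ... |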

Commas

UnicodeGlyphDescriptionReplacement
U+201ASingle low-9 quotation mark,
U+0326̦Combining comma below,
U+FE50Small comma,
U+3001Ideographic comma,
U+FE51Small ideographic comma,
U+FF0CFullwidth comma,
U+FF64Halfwidth ideographic comma,

Parentheses

UnicodeGlyphDescriptionReplacement
U+2768Medium left parenthesis ornament(
U+276AMedium flattened left parenthesis ornament(
U+FE59Small left parenthesis(
U+FF08Fullwidth left parenthesis(
U+27EEMathematical left flattened parenthesis(
U+2985Left white parenthesis(
U+2769Medium right parenthesis ornament)
U+276BMedium flattened right parenthesis ornament)
U+FE5ASmall right parenthesis)
U+FF09Fullwidth right parenthesis)
U+27EFMathematical right flattened parenthesis)
U+2986Right white parenthesis)

Brackets

UnicodeGlyphDescriptionReplacement
U+2774Medium left curly bracket ornament{
U+FE5BSmall left curly bracket{
U+FF5BFullwidth left curly bracket{
U+2775Medium right curly bracket ornament}
U+FE5CSmall right curly bracket}
U+FF5DFullwidth right curly bracket}
U+FF3BFullwidth left square bracket[
U+FF3DFullwidth right square bracket]

Asterisks

UnicodeGlyphDescriptionReplacement
U+204ELow asterisk*
U+2217Asterisk operator*
U+229BCircled asterisk operator*
U+2722Four teardrop-spoked asterisk*
U+2723Four balloon-spoked asterisk*
U+2724Heavy four balloon-spoked asterisk*
U+2725Four club-spoked asterisk*
U+2731Heavy asterisk*
U+2732Open center asterisk*
U+2733Eight spoked asterisk*
U+273ASixteen pointed asterisk*
U+273BTeardrop-spoked asterisk*
U+273COpen center teardrop-spoked asterisk*
U+273DHeavy teardrop-spoked asterisk*
U+2743Heavy teardrop-spoked pinwheel asterisk*
U+2749Balloon-spoked asterisk*
U+274AEight teardrop-spoked propeller asterisk*
U+274BHeavy eight teardrop-spoked propeller asterisk*
U+29C6Squared asterisk*
U+FE61Small asterisk*
U+FF0AFullwidth asterisk*

Math and comparison

UnicodeGlyphDescriptionReplacement
U+02D6˖Modifier letter plus sign+
U+FE62Small plus sign+
U+FF0BFullwidth plus sign+
U+FE64Small less-than sign<
U+FF1CFullwidth less-than sign<
U+0347͇Combining equals sign below=
U+A78AModifier letter short equals sign=
U+FE66Small equals sign=
U+FF1DFullwidth equals sign=
U+FE65Small greater-than sign>
U+FF1EFullwidth greater-than sign>
U+2039Single left-pointing angle quotation mark>
U+203ASingle right-pointing angle quotation mark<

Periods and colons

UnicodeGlyphDescriptionReplacement
U+3002Ideographic full stop.
U+FE52Small full stop.
U+FF0EFullwidth full stop.
U+FF61Halfwidth ideographic full stop.
U+02D0ːModifier letter triangular colon:
U+02F8˸Modifier letter raised colon:
U+2982Z notation type colon:
U+A789Modifier letter colon:
U+FE13Presentation form for vertical colon:
U+FF1AFullwidth colon:
U+204FReversed semicolon;
U+FE14Presentation form for vertical semicolon;
U+FE54Small semicolon;
U+FF1BFullwidth semicolon;
U+FE16Presentation form for vertical question mark?
U+FE56Small question mark?
U+FF1FFullwidth question mark?

Fullwidth digits

UnicodeGlyphDescriptionReplacement
U+FF10Fullwidth digit zero0
U+FF11Fullwidth digit one1
U+FF12Fullwidth digit two2
U+FF13Fullwidth digit three3
U+FF14Fullwidth digit four4
U+FF15Fullwidth digit five5
U+FF16Fullwidth digit six6
U+FF17Fullwidth digit seven7
U+FF18Fullwidth digit eight8
U+FF19Fullwidth digit nine9

Fullwidth and small capital letters

Fullwidth uppercase (U+FF21–U+FF3A):
| Unicode | Glyph | Replacement |
| --- | --- | --- |
| U+FF21–U+FF3A | Ａ–Ｚ | A–Z |
Fullwidth lowercase (U+FF41–U+FF5A):
| Unicode | Glyph | Replacement |
| --- | --- | --- |
| U+FF41–U+FF5A | ａ–ｚ | a–z |
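
Because the fullwidth letters sit at a fixed distance from their ASCII counterparts in Unicode, the two ranges above can be folded with a single offset instead of a per-character table. The sketch below shows that idea; `fold_fullwidth` is a hypothetical helper and is not necessarily how the service implements the mapping.

```python
# Distance between a fullwidth letter and its ASCII counterpart:
# U+FF21 (fullwidth A) - U+0041 (A) = 0xFEE0.
FULLWIDTH_OFFSET = 0xFEE0


def fold_fullwidth(text: str) -> str:
    """Replace fullwidth Latin letters with their ASCII equivalents."""
    out = []
    for ch in text:
        cp = ord(ch)
        # Fullwidth uppercase U+FF21-U+FF3A and lowercase U+FF41-U+FF5A.
        if 0xFF21 <= cp <= 0xFF3A or 0xFF41 <= cp <= 0xFF5A:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(fold_fullwidth("\uff28\uff45\uff4c\uff4c\uff4f"))  # Hello
```

The fullwidth digits U+FF10–U+FF19 are offset from 0–9 by the same amount, so the range check could be widened to cover them as well.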
Small capital letters:
UnicodeGlyphDescriptionReplacement
U+1D00Latin letter small capital AA
U+0299ʙLatin letter small capital BB
U+1D04Latin letter small capital CC
U+1D05Latin letter small capital DD
U+1D07Latin letter small capital EE
U+A730Latin letter small capital FF
U+0262ɢLatin letter small capital GG
U+029CʜLatin letter small capital HH
U+026AɪLatin letter small capital II
U+1D0ALatin letter small capital JJ
U+1D0BLatin letter small capital KK
U+029FʟLatin letter small capital LL
U+1D0DLatin letter small capital MM
U+0274ɴLatin letter small capital NN
U+1D0FLatin letter small capital OO
U+1D18Latin letter small capital PP
U+0280ʀLatin letter small capital RR
U+A731Latin letter small capital SS
U+1D1BLatin letter small capital TT
U+1D1CLatin letter small capital UU
U+1D20Latin letter small capital VV
U+1D21Latin letter small capital WW
U+028FʏLatin letter small capital YY
U+1D22Latin letter small capital ZZ

Greek letters

Greek capital letters that visually resemble Latin letters are substituted:
UnicodeGlyphDescriptionReplacement
U+0391ΑGreek capital letter AlphaA
U+0392ΒGreek capital letter BetaB
U+0395ΕGreek capital letter EpsilonE
U+0397ΗGreek capital letter EtaH
U+0399ΙGreek capital letter IotaI
U+039AΚGreek capital letter KappaK
U+039CΜGreek capital letter MuM
U+039DΝGreek capital letter NuN
U+039FΟGreek capital letter OmicronO
U+03A1ΡGreek capital letter RhoP
U+03A4ΤGreek capital letter TauT
U+03A7ΧGreek capital letter ChiX
U+03A5ΥGreek capital letter UpsilonY
U+0396ΖGreek capital letter ZetaZ

Special language support

UnicodeGlyphDescriptionReplacement
U+00C7ÇLatin capital letter C with cedillaÇ (GSM-7 native)

Tildes and circumflex

UnicodeGlyphDescriptionReplacement
U+02C6ˆModifier letter circumflex accent^
U+0302̂Combining circumflex accent^
U+FF3EFullwidth circumflex accent^
U+1DCDCombining double circumflex above^
U+02DC˜Small tilde~
U+02F7˷Modifier letter low tilde~
U+0303̃Combining tilde~
U+0330̰Combining tilde below~
U+0334̴Combining tilde overlay~
U+223CTilde operator~
U+FF5EFullwidth tilde~

Whitespace characters

These characters are replaced with a standard space or removed:
UnicodeDescriptionReplacement
U+00A0No-break space(space)
U+2000En quad(space)
U+2001Em quad(space)
U+2002En space(space)
U+2003Em space(space)
U+2004Three-per-em space(space)
U+2005Four-per-em space(space)
U+2006Six-per-em space(space)
U+2007Figure space(space)
U+2008Punctuation space(space)
U+2009Thin space(space)
U+200AHair space(space)
U+200BZero width space(removed)
U+202FNarrow no-break space(space)
U+205FMedium mathematical space(space)
U+3000Ideographic space(space)
U+FEFFZero width no-break space(removed)
U+2028Line separator(removed)
U+2029Paragraph separator(removed)
U+2060Word joiner(removed)
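
The whitespace table translates naturally into a character-level translation map. The following is a minimal sketch built only from the rows above; `normalize_whitespace` is a hypothetical helper used for illustration.

```python
# Code points replaced with a standard space, per the table above
# (U+2000 through U+200A plus four individual entries).
TO_SPACE = {0x00A0, 0x202F, 0x205F, 0x3000} | set(range(0x2000, 0x200B))
# Code points removed entirely, per the table above.
REMOVED = {0x200B, 0xFEFF, 0x2028, 0x2029, 0x2060}


def normalize_whitespace(text: str) -> str:
    """Apply the whitespace substitutions listed in the table."""
    table = {cp: " " for cp in TO_SPACE}
    table.update({cp: None for cp in REMOVED})  # None deletes the character
    return text.translate(table)


print(repr(normalize_whitespace("a\u00a0b\u200bc\u2003d")))  # 'a bc d'
```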

Control characters

These control characters are removed or transformed:
UnicodeDescriptionReplacement
U+0009Tab7 spaces
U+0000Null(removed)
U+0003End of text(removed)
U+0004End of transmission(removed)
U+0010Data link escape(removed)
U+0011Device control one(removed)
U+0012Device control two(removed)
U+0013Device control three(removed)
U+0014Device control four(removed)
U+0017End of transmission block(removed)
U+0019End of medium(removed)
U+0080C1 control codes(removed)
U+008DReverse line feed(removed)
U+0090Device control string(removed)
U+009BControl sequence introducer(removed)
U+009FApplication program command(removed)
Tab characters (U+0009) are converted to 7 spaces, which can significantly increase message length and affect segment count.

Edge cases

Smart encoding handles several edge cases:

Message length increases

Some substitutions increase message length. For example:
  • Horizontal ellipsis (U+2026) becomes three periods (...), which adds 2 characters.
  • Tab (U+0009) becomes 7 spaces, which adds 6 characters.
  • Vulgar fractions like ½ become 1/2, which adds 2 characters.
The segment count is calculated after these replacements, so a message near the 160-character limit may become multi-part after transformation.
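
A short worked example makes the boundary effect concrete. It is a sketch only: `expand` and `gsm7_segments` are hypothetical helpers, the expansions are the three listed above, and the 153-character limit for multi-part GSM-7 messages is the usual concatenated-SMS value, not a figure from this page.

```python
# The three length-increasing substitutions listed above.
EXPANSIONS = {
    "\u2026": "...",   # horizontal ellipsis -> three periods
    "\t": " " * 7,     # tab -> 7 spaces
    "\u00bd": "1/2",   # vulgar fraction one half
}


def expand(text: str) -> str:
    """Apply only the length-increasing substitutions."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in text)


def gsm7_segments(length: int) -> int:
    # 160 per single segment (from this page); 153 per part is assumed.
    return 1 if length <= 160 else -(-length // 153)


message = "x" * 159 + "\u2026"  # 160 characters before substitution
expanded = expand(message)
print(len(message), len(expanded), gsm7_segments(len(expanded)))
```

Here a 160-character message grows to 162 characters and no longer fits a single GSM-7 segment.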

Mixed replaceable and non-replaceable characters

If your message contains both replaceable Unicode characters and non-replaceable ones (like emojis), the replaceable characters are still substituted. However, the non-replaceable characters will still cause UTF-16 encoding.
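
To find out which characters would still force UTF-16 in a message, one option is to check the text against the GSM-7 character set. The sketch below assumes the standard GSM-7 basic and extension tables; `non_gsm7_characters` is a hypothetical helper and does not call the API.

```python
# GSM-7 basic character set and extension set (standard tables).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = set("^{}\\[~]|€")
GSM7_ALL = GSM7_BASIC | GSM7_EXTENDED


def non_gsm7_characters(text: str) -> list:
    """Return the distinct characters that cannot be encoded as GSM-7."""
    return sorted({ch for ch in text if ch not in GSM7_ALL})


# Text as it might look after substitution: quotes replaced, emoji left alone.
print(non_gsm7_characters('Thanks "Sam" \U0001F389'))
```

An empty result means the text fits GSM-7; anything listed would need to be removed or rewritten by hand to avoid UTF-16.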

Extended GSM-7 characters

The characters ~^|\{}[] are part of the GSM-7 extended set and count as 2 characters each when calculating segment length. Smart encoding accounts for this when determining final segment count.
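
The double counting can be sketched as follows. `gsm7_units` is a hypothetical helper that assumes the text is already valid GSM-7 and uses only the extended characters named above.

```python
# Extended characters named in the paragraph above; each costs 2 units.
GSM7_EXTENDED = set("~^|\\{}[]")


def gsm7_units(text: str) -> int:
    """Count GSM-7 units, charging 2 for each extended character."""
    return sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)


print(gsm7_units("{ok}"))  # 6
```

A message made of 80 square brackets therefore already fills a 160-unit segment.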

Zero-width characters

Zero-width characters (like U+200B zero-width space) are removed entirely from the message.

Empty message after transformation

If your message consists entirely of zero-width or control characters that get removed, the API will return an error. Messages cannot be empty after smart encoding transformation.

Limitations

  • Smart encoding applies to SMS only. MMS and RCS use UTF-8 encoding by default.
  • Not all Unicode characters have GSM-7 equivalents. Emojis and non-Latin scripts will still trigger UTF-16 encoding.
  • Substitutions may slightly alter the appearance of your message. Review the character tables above to understand what changes will occur.