Characters
Code Points
UTF-16 Units
UTF-8 Bytes
Try an example!
🧑🏾❤️💋🧑🏻
The most complex emoji in the current Unicode standard is composed of 10 code points, including skin color modifiers, zero-width joiners, and a variation selector.
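As a rough illustration of why the counters above differ, here is a minimal Python sketch. It assumes the emoji is the sequence "kiss: person, person, medium-dark skin tone, light skin tone"; the code points are written as escapes because the joiners are invisible and easily lost when text is copied. The variable names are illustrative only.

```python
# Assumed code point sequence for the emoji (not taken from the page itself):
# person, skin tone, ZWJ, heart, variation selector, ZWJ, kiss mark, ZWJ, person, skin tone
kiss = (
    "\U0001F9D1\U0001F3FE"  # PERSON + medium-dark skin tone modifier
    "\u200D"                # ZERO WIDTH JOINER
    "\u2764\uFE0F"          # HEAVY BLACK HEART + VARIATION SELECTOR-16
    "\u200D"
    "\U0001F48B"            # KISS MARK
    "\u200D"
    "\U0001F9D1\U0001F3FB"  # PERSON + light skin tone modifier
)

code_points = len(kiss)                           # Python strings count code points
utf16_units = len(kiss.encode("utf-16-le")) // 2  # two bytes per UTF-16 code unit
utf8_bytes = len(kiss.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)  # prints: 10 15 35
```

A user sees one character, yet the three counts all differ, and this is the mismatch that "length" functions in different languages expose.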
S̶t̶r̶i̶k̶e̶о𝘂𝘁
See how combining characters and misused unusual characters can create interesting text effects and homographs.
Å != Å
Learn about composing characters and normalized forms.
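A minimal sketch of the idea in Python, assuming the two strings are the precomposed U+00C5 and the letter A followed by U+030A COMBINING RING ABOVE (the example may use a different pair, such as U+212B ANGSTROM SIGN):

```python
import unicodedata

composed = "\u00C5"     # LATIN CAPITAL LETTER A WITH RING ABOVE (one code point)
decomposed = "A\u030A"  # LATIN CAPITAL LETTER A + COMBINING RING ABOVE (two code points)

# The strings look identical when rendered but compare unequal.
same_raw = composed == decomposed

# Normalizing both to the same form (NFC here) makes them comparable.
same_nfc = unicodedata.normalize("NFC", composed) == unicodedata.normalize("NFC", decomposed)

# NFD goes the other way and splits the precomposed character.
split = unicodedata.normalize("NFD", composed)

print(same_raw, same_nfc, len(split))  # prints: False True 2
```

Comparing or indexing user-supplied text usually works better after normalizing both sides to one agreed form.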
12345
This text renders backwards from the order of its characters using BIDI control code points. Inspired by https://trojansource.codes/.
↙ ~ ↙️ and 你好! ~ 你好!︁
Examples of an emoji variation sequence and an East Asian punctuation positional variant using variation selectors.
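The emoji half of this example can be sketched in Python. This assumes the arrow is U+2199 and that the emoji presentation is requested with U+FE0F; the East Asian punctuation variant is left out because its exact selector is not stated here.

```python
import unicodedata

text_arrow = "\u2199"         # arrow on its own
emoji_arrow = "\u2199\uFE0F"  # same arrow followed by a variation selector

# The selector is an extra, invisible code point, so the strings differ.
names = [unicodedata.name(ch) for ch in emoji_arrow]

print(len(text_arrow), len(emoji_arrow))  # prints: 1 2
print(names)  # prints: ['SOUTH WEST ARROW', 'VARIATION SELECTOR-16']
```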
Send me other interesting Unicode examples at @josh@joshdata.me on Mastodon.
About Unicode.run
Text is unexpectedly complicated. Use Unicode.run to debug text.
Here are some things you can do here:
- See each code point’s escape code in a variety of programming languages.
- See the “length” of the text as it would be reported in different programming languages.
- See when characters (technically “extended grapheme clusters”) are composed of multiple code points.
- Click code points in the debugger output to highlight them in the text. (In Firefox you can also select text to highlight the code points in the debugger output.)
- Switch between the text and its UTF-32 or UTF-16BE hex encodings at the top of the page.
- See where text changes direction in bidirectional text, and get warnings when text direction depends on where it is used. Mirrored glyphs in bidirectional text are also noted.
- Get warnings about hidden code points that can alter the display of the text (see https://trojansource.codes/), invalidly placed combining code points, invalid code points, and characters that are not in normalized form.
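The hidden code point warning in the last item can be approximated with a short scan. The following Python is only a sketch and not how Unicode.run itself works: the function name `find_bidi_controls` is hypothetical, and the set covers only the explicit bidirectional formatting characters.

```python
import unicodedata

# Explicit bidirectional control characters (marks, embeddings, overrides, isolates).
BIDI_CONTROLS = set(
    "\u061C\u200E\u200F"
    "\u202A\u202B\u202C\u202D\u202E"
    "\u2066\u2067\u2068\u2069"
)

def find_bidi_controls(text):
    """Return (index, 'U+XXXX', name) for each bidi control found in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]

# Digits wrapped in a right-to-left override, which display in reverse order.
sample = "\u202E12345\u202C"
for hit in find_bidi_controls(sample):
    print(hit)
# prints: (0, 'U+202E', 'RIGHT-TO-LEFT OVERRIDE')
# prints: (6, 'U+202C', 'POP DIRECTIONAL FORMATTING')
```

A scan like this can run over source files or user input to flag text whose displayed order may differ from its stored order.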
This is a project by JoshData.
Thanks to ucd-full (based on Unicode 15.1), stdlib-js/string-split-grapheme-clusters (based on Unicode 13), bidi-js (based on Unicode 13), html-entities, and the Inter Typeface.
Nikita Prokopov’s The Absolute Minimum Every Software Developer Must Know About Unicode in 2023 (Still No Excuses!) was inspiration for this project.