Pixlane

Converter · SOTA Unicode 16 + Intl.Segmenter

Unicode Inspector

Inspect any string at the Unicode codepoint level: names, General_Category, Script, Block, bidi class, and all four normalization forms (NFC/NFD/NFKC/NFKD). Grapheme cluster breakdown via Intl.Segmenter for emoji-accurate analysis.

How to Use Unicode Inspector in 3 Steps

  1. Configure. Paste any text: emoji, mixed scripts, invisible characters. The tool breaks it into codepoints, each shown with its U+XXXX notation, official name, and General_Category.
  2. Process. Compare the four normalization forms side by side. NFC, the canonically composed form, is what the W3C recommends for web content; NFKD additionally decomposes compatibility characters such as full-width letters and superscripts. A string that changes under one of the forms is a likely source of equality bugs.
  3. Export. View the Intl.Segmenter grapheme breakdown. The family emoji 👨‍👩‍👧 shows as 1 grapheme but 5 codepoints (8 UTF-16 code units), which explains why JS .length overcounts.
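The side-by-side comparison in step 2 can be approximated with the built-in String.prototype.normalize. The following is only a minimal sketch, not Pixlane's implementation; the helper names toCodepoints and normalizationTable are hypothetical and chosen for illustration.

```javascript
// List the codepoints of a string in U+XXXX notation.
function toCodepoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Build one row per normalization form (hypothetical helper, for illustration only).
function normalizationTable(text) {
  const forms = ["NFC", "NFD", "NFKC", "NFKD"];
  return forms.map((form) => {
    const normalized = text.normalize(form);
    return {
      form,
      codepoints: toCodepoints(normalized),
      changed: normalized !== text, // true when this form differs from the input
    };
  });
}

// "é" typed as e + U+0301 (combining acute accent), followed by the "ﬁ" ligature U+FB01.
const sample = "e\u0301\uFB01";
for (const row of normalizationTable(sample)) {
  console.log(row.form, row.codepoints.join(" "), row.changed ? "(differs from input)" : "");
}
```

For this sample only NFD leaves the input unchanged: NFC composes the accent into U+00E9, and the two compatibility forms split the ligature into f and i.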

Why Unicode Inspector on Pixlane

Unicode is full of invisible footguns: combining marks, zero-width joiners, homoglyphs, and normalization mismatches that make equal-looking strings compare unequal. Pixlane's Unicode Inspector shows every codepoint in a string with its official name (Unicode 16 data), General_Category, Script, and Block. The side-by-side NFC/NFD/NFKC/NFKD comparison surfaces normalization issues (a common source of equality bugs), and Intl.Segmenter (part of the ECMAScript Internationalization API, ECMA-402) produces accurate grapheme cluster views so ZWJ sequences and variation selectors are grouped as users perceive them.

Frequently Asked Questions

What is a codepoint?

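A number in the range 0 to 0x10FFFF that Unicode uses to identify a character or symbol; not every value in the range is assigned. Written as U+XXXX in hexadecimal, using four to six digits. A codepoint identifies an abstract character, not a glyph, which is the shape a font draws for it. A single user-perceived character (grapheme) may be made of multiple codepoints, especially for emoji, accented letters, and complex scripts.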
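As a rough illustration, JavaScript can list the codepoints of a string with a for...of loop, which walks codepoints rather than UTF-16 code units. The function name describeCodepoints is an assumption made for this sketch; looking up official names would need Unicode character data that is not built into the language.

```javascript
// Describe each codepoint: U+ notation plus the number of UTF-16 code units it occupies.
function describeCodepoints(text) {
  const rows = [];
  for (const ch of text) {              // string iteration yields whole codepoints
    const value = ch.codePointAt(0);
    rows.push({
      char: ch,
      notation: "U+" + value.toString(16).toUpperCase().padStart(4, "0"),
      utf16Units: ch.length,            // 1 inside the BMP, 2 for a surrogate pair
    });
  }
  return rows;
}

// "A", "é" (U+00E9), and the emoji U+1F600.
console.log(describeCodepoints("A\u00E9\u{1F600}"));
```

The third row reports U+1F600 with two UTF-16 code units, because codepoints above U+FFFF are stored as a surrogate pair.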

What's the difference between NFC and NFKD?

NFC is canonical composition: canonically equivalent sequences are combined into precomposed characters where those exist, and it is the form the W3C recommends for web content. NFKD is compatibility decomposition: it also breaks apart visual variants, such as ⅓ → 1⁄3 and full-width Ａ → A. Use NFC for equality, NFKD for matching/searching.
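The equality and matching advice can be sketched as two small helpers built on String.prototype.normalize. The names canonicallyEqual and looseIncludes are hypothetical, and this is an illustration under those assumptions rather than a complete search implementation.

```javascript
// Equality: compare after NFC so composed and decomposed spellings match.
function canonicallyEqual(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

// Search: fold compatibility variants with NFKD on both sides before matching.
function looseIncludes(haystack, needle) {
  return haystack.normalize("NFKD").includes(needle.normalize("NFKD"));
}

const composed = "caf\u00E9";     // é as a single codepoint
const decomposed = "cafe\u0301";  // e followed by U+0301 combining acute accent
console.log(composed === decomposed);                    // false
console.log(canonicallyEqual(composed, decomposed));     // true

console.log("\u2153".normalize("NFKD"));                 // 1, U+2044 FRACTION SLASH, 3
console.log(looseIncludes("\uFF21\uFF22\uFF23", "ABC")); // true: full-width ＡＢＣ folds to ABC
```

NFKD does not fold case or remove accents, so a search that should also ignore those needs extra steps on top of this.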

Why does my string length look wrong?

JavaScript's .length counts UTF-16 code units (not codepoints, not graphemes). An emoji like 😀 is 2 code units. A family emoji like 👨‍👩‍👧 is 8 code units and 5 codepoints but 1 grapheme. Use Intl.Segmenter for correct counting; Pixlane's Text Counter does this.
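The three ways of counting can be compared directly. This is a minimal sketch that assumes a runtime with Intl.Segmenter available (recent browsers and Node.js); countUnits is a hypothetical helper name, not part of any Pixlane API.

```javascript
// Count a string three ways: UTF-16 code units, codepoints, and grapheme clusters.
function countUnits(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: text.length,                                // what .length reports
    codepoints: Array.from(text).length,                   // string iteration walks codepoints
    graphemes: Array.from(segmenter.segment(text)).length, // user-perceived characters
  };
}

// Man + ZWJ + woman + ZWJ + girl, written with escapes so the joiners are visible.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";

console.log(countUnits("\u{1F600}")); // codeUnits 2, codepoints 1, graphemes 1
console.log(countUnits(family));      // codeUnits 8, codepoints 5, graphemes 1
```

Each of the three emoji in the family sequence takes two code units and each zero-width joiner takes one, which accounts for the total of eight.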

Is this tool free?

Yes. Unicode Inspector on Pixlane is completely free with no signup required.
