A Simple Character Entity Chart

This article was originally posted on the evolt.org web site.

For those who use characters in their copy that don’t normally appear on the keyboard, it’s always been a hit-and-miss game of tracking down the ISO character entity and choosing between the named and numeric value. In fact, so many books on HTML, as well as online resources, have provided the wrong entities for so long that few knew it until the W3C validator started throwing them back as errors. You can’t imagine how many em-dashes I had to find and convert from &#151; to the correct &#8212;.

The W3C character entity reference is certainly definitive, but not practical. It offers no display of how these entities might be rendered, leaving you to copy and paste based on descriptions until you find the right one.

Well, I got sick of guessing, so I took the W3C documentation and turned it into a handy chart for my staff. Now it sits on our intranet for all to enjoy. Oh… and here… now… as well.

For those who might not normally use character entities, or who’ve never heard of The Elements of Typographic Style, you might not have been exposed to reasons why you would even care about characters not on your keyboard. I’d suggest you surf over to the A List Apart article The Trouble With EM ‘n EN.

By the way, if you need to know which browsers support which character entities, just view this page and look for the characters in the tables. One column shows the character as called by its numeric entity, and one shows it as called by its named entity. This is the chart for HTML 4.x, but it also applies to XHTML 1.x.

Character entity references in HTML 4

The charts below will allow you to copy and paste the appropriate character and numeric entities for your documents. To be sure a particular browser supports the entities (both named and numeric), simply open this page in that browser and view the charts. If the character you want doesn’t appear in the target browser, it doesn’t work (simple, huh?).
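If you would rather look up a pair than scan the charts, here is a minimal Python sketch of the same idea. It leans on the standard library's `html.entities.codepoint2name` table, which covers the HTML 4 names; the helper name `references_for` is made up for this example.

```python
from html import unescape
from html.entities import codepoint2name


def references_for(char):
    """Return (named, numeric) references for one character.

    The named form is None when HTML 4 defines no entity name for it.
    """
    code_point = ord(char)
    name = codepoint2name.get(code_point)
    named = f"&{name};" if name else None
    numeric = f"&#{code_point};"
    return named, numeric


named, numeric = references_for("\u2014")  # em dash
print(named, numeric)

# Both forms decode back to the same character.
assert unescape(named) == unescape(numeric) == "\u2014"
```

This only tells you what the references are; whether a given browser draws the glyph is still something you have to check by eye.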

What you’ll find below is the copy of the character entity specification from the W3C with my tabled versions of the entities interspersed.

24.1 Introduction to character entity references

A character entity reference is an SGML
construct that references a character of the document
character set.

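HTML 4.x, as well as XHTML 1.x, supports several sets of character entity
references: ISO 8859-1 (Latin-1) characters; symbols, mathematical symbols,
and Greek letters; and markup-significant and internationalization characters.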

The following sections present the complete lists of character entity
references.

ISO 8859-1

24.2 Character entity references for ISO 8859-1 characters

The character entity references in this section produce characters whose
numeric equivalents should already be supported by conforming HTML 2.0 user
agents. Thus, the character entity reference &divide; is a more convenient
form than &#247; for obtaining the division sign (÷).

To support these named entities, user agents need only recognize the entity
names and convert them to characters that lie within the repertoire of
[ISO88591].
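As a rough illustration of that repertoire, the Python sketch below pulls the names in the 160 to 255 range out of the standard library's HTML 4 table (`html.entities.name2codepoint`) and confirms each one encodes as ISO 8859-1. The variable name `latin1_names` is just for this example.

```python
from html import unescape
from html.entities import name2codepoint

# Entity names whose code points fall in the upper half of ISO 8859-1,
# sorted by code point.
latin1_names = sorted(
    (name for name, cp in name2codepoint.items() if 160 <= cp <= 255),
    key=name2codepoint.get,
)

print(len(latin1_names))
print(latin1_names[0], latin1_names[-1])

# Named and numeric forms of the division sign decode identically.
assert unescape("&divide;") == unescape("&#247;") == "\u00f7"

# Every one of these characters can be encoded in ISO 8859-1;
# encode() would raise UnicodeEncodeError otherwise.
for name in latin1_names:
    chr(name2codepoint[name]).encode("latin-1")
```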

Character 65533 (FFFD hexadecimal) is the last valid character in UCS-2.
65534 (FFFE hexadecimal) is unassigned and reserved as the byte-swapped version
of ZERO WIDTH NON-BREAKING SPACE for byte-order detection purposes. 65535 (FFFF
hexadecimal) is unassigned.
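The byte-order remark is easier to see with the bytes in front of you. This is a small Python sketch, assuming UTF-16 as the two-byte encoding; it is only meant to show why FFFE is kept unassigned.

```python
BOM = "\ufeff"  # ZERO WIDTH NON-BREAKING SPACE, used as the byte-order mark

big_endian = BOM.encode("utf-16-be")
little_endian = BOM.encode("utf-16-le")
print(big_endian.hex(), little_endian.hex())

# Reading the little-endian bytes as if they were big-endian yields FFFE,
# which is never a character, so a reader can tell the order is swapped.
misread = int.from_bytes(little_endian, "big")
assert misread == 0xFFFE == 65534
```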

24.2.1 The list of characters

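| Character Entity | Numeric Entity | Rendered (Character) | Rendered (Numeric) | Name |
| --- | --- | --- | --- | --- |
| &nbsp; | &#160; | | | non-breaking space |
| &iexcl; | &#161; | ¡ | ¡ | inverted exclamation mark |
| &cent; | &#162; | ¢ | ¢ | cent sign |
| &pound; | &#163; | £ | £ | pound sign |
| &curren; | &#164; | ¤ | ¤ | currency sign |
| &yen; | &#165; | ¥ | ¥ | yen sign |
| &brvbar; | &#166; | ¦ | ¦ | broken bar |
| &sect; | &#167; | § | § | section sign |
| &uml; | &#168; | ¨ | ¨ | diaeresis |
| &copy; | &#169; | © | © | copyright sign |
| &ordf; | &#170; | ª | ª | feminine ordinal indicator |
| &laquo; | &#171; | « | « | left-pointing double angle quotation mark |
| &not; | &#172; | ¬ | ¬ | not sign |
| &shy; | &#173; | | | soft hyphen |
| &reg; | &#174; | ® | ® | registered sign |
| &macr; | &#175; | ¯ | ¯ | macron |
| &deg; | &#176; | ° | ° | degree sign |
| &plusmn; | &#177; | ± | ± | plus-minus sign |
| &sup2; | &#178; | ² | ² | superscript two |
| &sup3; | &#179; | ³ | ³ | superscript three |
| &acute; | &#180; | ´ | ´ | acute accent |
| &micro; | &#181; | µ | µ | micro sign |
| &para; | &#182; | ¶ | ¶ | pilcrow sign |
| &middot; | &#183; | · | · | middle dot |
| &cedil; | &#184; | ¸ | ¸ | cedilla |
| &sup1; | &#185; | ¹ | ¹ | superscript one |
| &ordm; | &#186; | º | º | masculine ordinal indicator |
| &raquo; | &#187; | » | » | right-pointing double angle quotation mark |
| &frac14; | &#188; | ¼ | ¼ | vulgar fraction one quarter |
| &frac12; | &#189; | ½ | ½ | vulgar fraction one half |
| &frac34; | &#190; | ¾ | ¾ | vulgar fraction three quarters |
| &iquest; | &#191; | ¿ | ¿ | inverted question mark |
| &Agrave; | &#192; | À | À | latin capital letter A with grave |
| &Aacute; | &#193; | Á | Á | latin capital letter A with acute |
| &Acirc; | &#194; | Â | Â | latin capital letter A with circumflex |
| &Atilde; | &#195; | Ã | Ã | latin capital letter A with tilde |
| &Auml; | &#196; | Ä | Ä | latin capital letter A with diaeresis |
| &Aring; | &#197; | Å | Å | latin capital letter A with ring above |
| &AElig; | &#198; | Æ | Æ | latin capital letter AE |
| &Ccedil; | &#199; | Ç | Ç | latin capital letter C with cedilla |
| &Egrave; | &#200; | È | È | latin capital letter E with grave |
| &Eacute; | &#201; | É | É | latin capital letter E with acute |
| &Ecirc; | &#202; | Ê | Ê | latin capital letter E with circumflex |
| &Euml; | &#203; | Ë | Ë | latin capital letter E with diaeresis |
| &Igrave; | &#204; | Ì | Ì | latin capital letter I with grave |
| &Iacute; | &#205; | Í | Í | latin capital letter I with acute |
| &Icirc; | &#206; | Î | Î | latin capital letter I with circumflex |
| &Iuml; | &#207; | Ï | Ï | latin capital letter I with diaeresis |
| &ETH; | &#208; | Ð | Ð | latin capital letter ETH |
| &Ntilde; | &#209; | Ñ | Ñ | latin capital letter N with tilde |
| &Ograve; | &#210; | Ò | Ò | latin capital letter O with grave |
| &Oacute; | &#211; | Ó | Ó | latin capital letter O with acute |
| &Ocirc; | &#212; | Ô | Ô | latin capital letter O with circumflex |
| &Otilde; | &#213; | Õ | Õ | latin capital letter O with tilde |
| &Ouml; | &#214; | Ö | Ö | latin capital letter O with diaeresis |
| &times; | &#215; | × | × | multiplication sign |
| &Oslash; | &#216; | Ø | Ø | latin capital letter O with stroke |
| &Ugrave; | &#217; | Ù | Ù | latin capital letter U with grave |
| &Uacute; | &#218; | Ú | Ú | latin capital letter U with acute |
| &Ucirc; | &#219; | Û | Û | latin capital letter U with circumflex |
| &Uuml; | &#220; | Ü | Ü | latin capital letter U with diaeresis |
| &Yacute; | &#221; | Ý | Ý | latin capital letter Y with acute |
| &THORN; | &#222; | Þ | Þ | latin capital letter THORN |
| &szlig; | &#223; | ß | ß | latin small letter sharp s |
| &agrave; | &#224; | à | à | latin small letter a with grave |
| &aacute; | &#225; | á | á | latin small letter a with acute |
| &acirc; | &#226; | â | â | latin small letter a with circumflex |
| &atilde; | &#227; | ã | ã | latin small letter a with tilde |
| &auml; | &#228; | ä | ä | latin small letter a with diaeresis |
| &aring; | &#229; | å | å | latin small letter a with ring above |
| &aelig; | &#230; | æ | æ | latin small letter ae |
| &ccedil; | &#231; | ç | ç | latin small letter c with cedilla |
| &egrave; | &#232; | è | è | latin small letter e with grave |
| &eacute; | &#233; | é | é | latin small letter e with acute |
| &ecirc; | &#234; | ê | ê | latin small letter e with circumflex |
| &euml; | &#235; | ë | ë | latin small letter e with diaeresis |
| &igrave; | &#236; | ì | ì | latin small letter i with grave |
| &iacute; | &#237; | í | í | latin small letter i with acute |
| &icirc; | &#238; | î | î | latin small letter i with circumflex |
| &iuml; | &#239; | ï | ï | latin small letter i with diaeresis |
| &eth; | &#240; | ð | ð | latin small letter eth |
| &ntilde; | &#241; | ñ | ñ | latin small letter n with tilde |
| &ograve; | &#242; | ò | ò | latin small letter o with grave |
| &oacute; | &#243; | ó | ó | latin small letter o with acute |
| &ocirc; | &#244; | ô | ô | latin small letter o with circumflex |
| &otilde; | &#245; | õ | õ | latin small letter o with tilde |
| &ouml; | &#246; | ö | ö | latin small letter o with diaeresis |
| &divide; | &#247; | ÷ | ÷ | division sign |
| &oslash; | &#248; | ø | ø | latin small letter o with stroke |
| &ugrave; | &#249; | ù | ù | latin small letter u with grave |
| &uacute; | &#250; | ú | ú | latin small letter u with acute |
| &ucirc; | &#251; | û | û | latin small letter u with circumflex |
| &uuml; | &#252; | ü | ü | latin small letter u with diaeresis |
| &yacute; | &#253; | ý | ý | latin small letter y with acute |
| &thorn; | &#254; | þ | þ | latin small letter thorn |
| &yuml; | &#255; | ÿ | ÿ | latin small letter y with diaeresis |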

Symbols, Mathematical Symbols, and Greek Letters

24.3 Character entity references for symbols, mathematical symbols, and Greek letters

The character entity references in this section produce characters that may
be represented by glyphs in the widely available Adobe Symbol font, including
Greek characters, various bracketing symbols, and a selection of mathematical
operators such as gradient, product, and summation symbols.

To support these entities, user agents may support full [ISO10646] or use
other means. Display of glyphs for these characters may be obtained by being
able to display the relevant [ISO10646] characters or
by other means, such as internally mapping the listed entities, numeric
character references, and characters to the appropriate position in some font
that contains the requisite glyphs.

When to use Greek entities. This entity set contains
all the letters used in modern Greek. However, it does not include Greek
punctuation, precomposed accented characters nor the non-spacing accents
(tonos, dialytika) required to compose them. There are no archaic letters,
Coptic-unique letters, or precomposed letters for Polytonic Greek. The entities
defined here are not intended for the representation of modern Greek text and
would not be an efficient representation; rather, they are intended for
occasional Greek letters used in technical and mathematical works.
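To make the gap concrete, here is a hedged Python sketch that checks which Greek characters have an HTML 4 entity name, using the standard library's `html.entities.codepoint2name` table. The helper `has_entity` and the range bounds are choices made for this example, not part of the spec.

```python
from html.entities import codepoint2name

# The basic Greek letters sit between U+0391 (capital alpha)
# and U+03C9 (small omega).
greek = {cp: name for cp, name in codepoint2name.items() if 0x391 <= cp <= 0x3C9}
print(len(greek))


def has_entity(char):
    """True when HTML 4 defines a named entity for this character."""
    return ord(char) in codepoint2name


print(has_entity("\u03b1"))  # small alpha, unaccented
print(has_entity("\u03ac"))  # small alpha with tonos, precomposed
```

Accented letters such as alpha with tonos fall inside the same code point range but have no name, so they need a numeric reference or the literal character.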

24.3.1 The list of characters

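| Character Entity | Numeric Entity | Rendered (Character) | Rendered (Numeric) | Name |
| --- | --- | --- | --- | --- |
| &fnof; | &#402; | ƒ | ƒ | latin small f with hook |
| &Alpha; | &#913; | Α | Α | greek capital letter alpha |
| &Beta; | &#914; | Β | Β | greek capital letter beta |
| &Gamma; | &#915; | Γ | Γ | greek capital letter gamma |
| &Delta; | &#916; | Δ | Δ | greek capital letter delta |
| &Epsilon; | &#917; | Ε | Ε | greek capital letter epsilon |
| &Zeta; | &#918; | Ζ | Ζ | greek capital letter zeta |
| &Eta; | &#919; | Η | Η | greek capital letter eta |
| &Theta; | &#920; | Θ | Θ | greek capital letter theta |
| &Iota; | &#921; | Ι | Ι | greek capital letter iota |
| &Kappa; | &#922; | Κ | Κ | greek capital letter kappa |
| &Lambda; | &#923; | Λ | Λ | greek capital letter lambda |
| &Mu; | &#924; | Μ | Μ | greek capital letter mu |
| &Nu; | &#925; | Ν | Ν | greek capital letter nu |
| &Xi; | &#926; | Ξ | Ξ | greek capital letter xi |
| &Omicron; | &#927; | Ο | Ο | greek capital letter omicron |
| &Pi; | &#928; | Π | Π | greek capital letter pi |
| &Rho; | &#929; | Ρ | Ρ | greek capital letter rho |
| &Sigma; | &#931; | Σ | Σ | greek capital letter sigma |
| &Tau; | &#932; | Τ | Τ | greek capital letter tau |
| &Upsilon; | &#933; | Υ | Υ | greek capital letter upsilon |
| &Phi; | &#934; | Φ | Φ | greek capital letter phi |
| &Chi; | &#935; | Χ | Χ | greek capital letter chi |
| &Psi; | &#936; | Ψ | Ψ | greek capital letter psi |
| &Omega; | &#937; | Ω | Ω | greek capital letter omega |
| &alpha; | &#945; | α | α | greek small letter alpha |
| &beta; | &#946; | β | β | greek small letter beta |
| &gamma; | &#947; | γ | γ | greek small letter gamma |
| &delta; | &#948; | δ | δ | greek small letter delta |
| &epsilon; | &#949; | ε | ε | greek small letter epsilon |
| &zeta; | &#950; | ζ | ζ | greek small letter zeta |
| &eta; | &#951; | η | η | greek small letter eta |
| &theta; | &#952; | θ | θ | greek small letter theta |
| &iota; | &#953; | ι | ι | greek small letter iota |
| &kappa; | &#954; | κ | κ | greek small letter kappa |
| &lambda; | &#955; | λ | λ | greek small letter lambda |
| &mu; | &#956; | μ | μ | greek small letter mu |
| &nu; | &#957; | ν | ν | greek small letter nu |
| &xi; | &#958; | ξ | ξ | greek small letter xi |
| &omicron; | &#959; | ο | ο | greek small letter omicron |
| &pi; | &#960; | π | π | greek small letter pi |
| &rho; | &#961; | ρ | ρ | greek small letter rho |
| &sigmaf; | &#962; | ς | ς | greek small letter final sigma |
| &sigma; | &#963; | σ | σ | greek small letter sigma |
| &tau; | &#964; | τ | τ | greek small letter tau |
| &upsilon; | &#965; | υ | υ | greek small letter upsilon |
| &phi; | &#966; | φ | φ | greek small letter phi |
| &chi; | &#967; | χ | χ | greek small letter chi |
| &psi; | &#968; | ψ | ψ | greek small letter psi |
| &omega; | &#969; | ω | ω | greek small letter omega |
| &thetasym; | &#977; | ϑ | ϑ | greek small letter theta symbol |
| &upsih; | &#978; | ϒ | ϒ | greek upsilon with hook symbol |
| &piv; | &#982; | ϖ | ϖ | greek pi symbol |
| &bull; | &#8226; | • | • | bullet |
| &hellip; | &#8230; | … | … | horizontal ellipsis |
| &prime; | &#8242; | ′ | ′ | prime, minutes |
| &Prime; | &#8243; | ″ | ″ | double prime |
| &oline; | &#8254; | ‾ | ‾ | overline |
| &frasl; | &#8260; | ⁄ | ⁄ | fraction slash |
| &weierp; | &#8472; | ℘ | ℘ | script capital P |
| &image; | &#8465; | ℑ | ℑ | blackletter capital I |
| &real; | &#8476; | ℜ | ℜ | blackletter capital R |
| &trade; | &#8482; | ™ | ™ | trade mark sign |
| &alefsym; | &#8501; | ℵ | ℵ | alef symbol |
| &larr; | &#8592; | ← | ← | leftwards arrow |
| &uarr; | &#8593; | ↑ | ↑ | upwards arrow |
| &rarr; | &#8594; | → | → | rightwards arrow |
| &darr; | &#8595; | ↓ | ↓ | downwards arrow |
| &harr; | &#8596; | ↔ | ↔ | left right arrow |
| &crarr; | &#8629; | ↵ | ↵ | downwards arrow with corner leftwards |
| &lArr; | &#8656; | ⇐ | ⇐ | leftwards double arrow |
| &uArr; | &#8657; | ⇑ | ⇑ | upwards double arrow |
| &rArr; | &#8658; | ⇒ | ⇒ | rightwards double arrow |
| &dArr; | &#8659; | ⇓ | ⇓ | downwards double arrow |
| &hArr; | &#8660; | ⇔ | ⇔ | left right double arrow |
| &forall; | &#8704; | ∀ | ∀ | for all |
| &part; | &#8706; | ∂ | ∂ | partial differential |
| &exist; | &#8707; | ∃ | ∃ | there exists |
| &empty; | &#8709; | ∅ | ∅ | empty set |
| &nabla; | &#8711; | ∇ | ∇ | nabla |
| &isin; | &#8712; | ∈ | ∈ | element of |
| &notin; | &#8713; | ∉ | ∉ | not an element of |
| &ni; | &#8715; | ∋ | ∋ | contains as member |
| &prod; | &#8719; | ∏ | ∏ | n-ary product |
| &sum; | &#8721; | ∑ | ∑ | n-ary summation |
| &minus; | &#8722; | − | − | minus sign |
| &lowast; | &#8727; | ∗ | ∗ | asterisk operator |
| &radic; | &#8730; | √ | √ | square root |
| &prop; | &#8733; | ∝ | ∝ | proportional to |
| &infin; | &#8734; | ∞ | ∞ | infinity |
| &ang; | &#8736; | ∠ | ∠ | angle |
| &and; | &#8743; | ∧ | ∧ | logical and |
| &or; | &#8744; | ∨ | ∨ | logical or |
| &cap; | &#8745; | ∩ | ∩ | intersection |
| &cup; | &#8746; | ∪ | ∪ | union |
| &int; | &#8747; | ∫ | ∫ | integral |
| &there4; | &#8756; | ∴ | ∴ | therefore |
| &sim; | &#8764; | ∼ | ∼ | tilde operator |
| &cong; | &#8773; | ≅ | ≅ | approximately equal to |
| &asymp; | &#8776; | ≈ | ≈ | almost equal to |
| &ne; | &#8800; | ≠ | ≠ | not equal to |
| &equiv; | &#8801; | ≡ | ≡ | identical to |
| &le; | &#8804; | ≤ | ≤ | less-than or equal to |
| &ge; | &#8805; | ≥ | ≥ | greater-than or equal to |
| &sub; | &#8834; | ⊂ | ⊂ | subset of |
| &sup; | &#8835; | ⊃ | ⊃ | superset of |
| &nsub; | &#8836; | ⊄ | ⊄ | not a subset of |
| &sube; | &#8838; | ⊆ | ⊆ | subset of or equal to |
| &supe; | &#8839; | ⊇ | ⊇ | superset of or equal to |
| &oplus; | &#8853; | ⊕ | ⊕ | circled plus |
| &otimes; | &#8855; | ⊗ | ⊗ | circled times |
| &perp; | &#8869; | ⊥ | ⊥ | up tack |
| &sdot; | &#8901; | ⋅ | ⋅ | dot operator |
| &lceil; | &#8968; | ⌈ | ⌈ | left ceiling |
| &rceil; | &#8969; | ⌉ | ⌉ | right ceiling |
| &lfloor; | &#8970; | ⌊ | ⌊ | left floor |
| &rfloor; | &#8971; | ⌋ | ⌋ | right floor |
| &lang; | &#9001; | 〈 | 〈 | left-pointing angle bracket |
| &rang; | &#9002; | 〉 | 〉 | right-pointing angle bracket |
| &loz; | &#9674; | ◊ | ◊ | lozenge |
| &spades; | &#9824; | ♠ | ♠ | black spade suit |
| &clubs; | &#9827; | ♣ | ♣ | black club suit |
| &hearts; | &#9829; | ♥ | ♥ | black heart suit |
| &diams; | &#9830; | ♦ | ♦ | black diamond suit |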

Markup-Significant and Internationalization Characters

24.4 Character entity references for markup-significant and internationalization characters

The character entity references in this section are for escaping
markup-significant characters (these are the same as those in HTML 2.0 and
3.2) and for denoting spaces and dashes. Other characters in this section apply
to internationalization issues such as the disambiguation of bidirectional text
(see the section on bidirectional text for details).
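For the markup-significant characters, most languages already ship an escaping helper. Here is a minimal Python sketch using the standard library's `html.escape` and `html.unescape`; the sample string is invented for illustration.

```python
from html import escape, unescape

snippet = 'if (a < b && name == "x") { }'

# quote=True also escapes double and single quotes, which matters
# when the text ends up inside an attribute value.
escaped = escape(snippet, quote=True)
print(escaped)

# Unescaping gets the original text back.
assert unescape(escaped) == snippet
```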

Entities have also been added for the remaining characters occurring in
CP-1252 which do not occur in the HTMLlat1 or HTMLsymbol entity sets. These all
occur in the 128 to 159 range within the CP-1252 charset. These entities permit
the characters to be denoted in a platform-independent manner.
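The 128 to 159 range is where most wrong numeric entities come from, because those numbers are Windows code page positions rather than Unicode code points. The Python sketch below shows one way to translate them; the function name `cp1252_to_numeric_reference` is hypothetical, and it relies on Python's built-in `cp1252` codec.

```python
def cp1252_to_numeric_reference(byte_value):
    """Map a CP-1252 byte (128 to 159) to a Unicode numeric reference.

    Returns None for the few byte values CP-1252 leaves undefined.
    """
    try:
        char = bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None
    return f"&#{ord(char)};"


# Euro sign, en dash, em dash, trade mark sign.
for value in (128, 150, 151, 153):
    print(value, cp1252_to_numeric_reference(value))
```

So a page that uses 151 for an em dash should be using 8212, or the named form, instead.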

To support these entities, user agents may support full [ISO10646] or use
other means. Display of glyphs for these characters may be obtained by being
able to display the relevant [ISO10646] characters or
by other means, such as internally mapping the listed entities, numeric
character references, and characters to the appropriate position in some font
that contains the requisite glyphs.

24.4.1 The list of characters

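| Character Entity | Numeric Entity | Rendered (Character) | Rendered (Numeric) | Name |
| --- | --- | --- | --- | --- |
| &quot; | &#34; | " | " | quotation mark |
| &amp; | &#38; | & | & | ampersand |
| &lt; | &#60; | < | < | less-than sign |
| &gt; | &#62; | > | > | greater-than sign |
| &OElig; | &#338; | Œ | Œ | latin capital ligature OE |
| &oelig; | &#339; | œ | œ | latin small ligature oe |
| &Scaron; | &#352; | Š | Š | latin capital letter S with caron |
| &scaron; | &#353; | š | š | latin small letter s with caron |
| &Yuml; | &#376; | Ÿ | Ÿ | latin capital letter Y with diaeresis |
| &circ; | &#710; | ˆ | ˆ | modifier letter circumflex accent |
| &tilde; | &#732; | ˜ | ˜ | small tilde |
| &ensp; | &#8194; | | | en space |
| &emsp; | &#8195; | | | em space |
| &thinsp; | &#8201; | | | thin space |
| &zwnj; | &#8204; | | | zero width non-joiner |
| &zwj; | &#8205; | | | zero width joiner |
| &lrm; | &#8206; | | | left-to-right mark |
| &rlm; | &#8207; | | | right-to-left mark |
| &ndash; | &#8211; | – | – | en dash |
| &mdash; | &#8212; | — | — | em dash |
| &lsquo; | &#8216; | ‘ | ‘ | left single quotation mark |
| &rsquo; | &#8217; | ’ | ’ | right single quotation mark |
| &sbquo; | &#8218; | ‚ | ‚ | single low-9 quotation mark |
| &ldquo; | &#8220; | “ | “ | left double quotation mark |
| &rdquo; | &#8221; | ” | ” | right double quotation mark |
| &bdquo; | &#8222; | „ | „ | double low-9 quotation mark |
| &dagger; | &#8224; | † | † | dagger |
| &Dagger; | &#8225; | ‡ | ‡ | double dagger |
| &permil; | &#8240; | ‰ | ‰ | per mille sign |
| &lsaquo; | &#8249; | ‹ | ‹ | single left-pointing angle quotation mark |
| &rsaquo; | &#8250; | › | › | single right-pointing angle quotation mark |
| &euro; | &#8364; | € | € | euro sign |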
