Entity reference


An entity in XML is a named body of data, usually text. HTML5 replaces entities with named character references.


Overview

Entities are often used to represent single characters that cannot easily be entered on the keyboard; they are also used to represent pieces of standard ("boilerplate") text that occur in many documents, especially if there is a need to allow such text to be changed in one place only.

Special characters can be represented either by entity references or by numeric character references. An example of a numeric character reference is "&#x20AC;", which refers to the Euro symbol, €, by its Unicode code point in hexadecimal, as indicated by the x preceding the number. A decimal number can be used instead by leaving out the x: "&#8364;" refers to the same character.
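As a minimal sketch of the two numeric forms, the following Python snippet builds the hexadecimal and decimal references for a character and decodes them again with the standard library's html.unescape. The helper name numeric_refs is made up for this illustration.

```python
import html

def numeric_refs(ch):
    """Return the (hexadecimal, decimal) numeric character references for one character."""
    cp = ord(ch)  # Unicode code point of the character
    return f"&#x{cp:X};", f"&#{cp};"

hex_ref, dec_ref = numeric_refs("€")
print(hex_ref)  # &#x20AC;
print(dec_ref)  # &#8364;

# Both forms decode back to the same character.
print(html.unescape(hex_ref) == html.unescape(dec_ref) == "€")  # True
```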

An entity reference is a placeholder that represents that entity. It consists of the entity's name preceded by an ampersand ("&") and followed by a semicolon (";").
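The ampersand, name, semicolon pattern can be illustrated with a small regular expression. This is only a sketch: the pattern ENTITY_REF and the helper find_entity_names are hypothetical names, and the character class is a simplified ASCII-only approximation of what XML allows in a name.

```python
import re

# Simplified pattern: "&", then an ASCII name, then ";".
# Real XML names may also contain many non-ASCII characters.
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")

def find_entity_names(text):
    """Return the names of all entity references found in text."""
    return ENTITY_REF.findall(text)

sample = "AT&amp;T &lt;b&gt; &#160; & done;"
print(find_entity_names(sample))  # ['amp', 'lt', 'gt']
```

The numeric reference "&#160;" and the bare "&" are not matched, because neither has a name directly after the ampersand.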

DOCTYPE

The various versions of XML and HTML (XHTML) have predefined entities. To ensure that the browser recognizes the correct full set, include a line at the top of the file that identifies the version in use. For example:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

This declaration defines a set of predeclared entities as well as attribute support.

XML entities

XML has five predeclared entities:

  • &amp; (& or "ampersand")
  • &lt; (< or "less than")
  • &gt; (> or "greater than")
  • &apos; (' or "apostrophe")
  • &quot; (" or "quotation mark")
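As a rough illustration of the five predeclared entities in use, the sketch below escapes a string with Python's xml.sax.saxutils.escape and then lets an XML parser expand the references again. By default escape handles only &, < and >, so the two quote characters are passed in through its optional mapping; the names XML_QUOTES and xml_escape are invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() always replaces &, < and >; the quote characters are added here.
XML_QUOTES = {"'": "&apos;", '"': "&quot;"}

def xml_escape(text):
    """Replace all five special characters with the predeclared XML entities."""
    return escape(text, XML_QUOTES)

original = 'AT&T <"quoted"> it\'s'
escaped = xml_escape(original)
print(escaped)  # AT&amp;T &lt;&quot;quoted&quot;&gt; it&apos;s

# An XML parser expands the five references without any DTD.
parsed = ET.fromstring("<p>" + escaped + "</p>").text
print(parsed == original)  # True
```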

HTML entities

HTML 4 has 252 predefined entities, including 4 from the list above (apos is missing). XHTML technically has 253, since apos is included. The 5 entities mentioned above are the only named entities for ASCII characters.
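One way to check these statements is against the HTML 4 entity table that ships with Python as html.entities.name2codepoint. The sketch below looks up the five XML names and lists every entity whose code point falls in the ASCII range; the variable names are illustrative only.

```python
from html.entities import name2codepoint

XML_PREDECLARED = ["amp", "lt", "gt", "apos", "quot"]

# Which of the five XML entities also appear in the HTML 4 table?
in_html4 = {name: name in name2codepoint for name in XML_PREDECLARED}

# Named entities whose code point is an ASCII character (below 128).
ascii_names = sorted(n for n, cp in name2codepoint.items() if cp < 128)

print(len(name2codepoint))  # size of the HTML 4 table
print(in_html4)
print(ascii_names)          # ['amp', 'gt', 'lt', 'quot']
```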

For a list of predefined entities, see HTML entities or Character Entities. Additional articles in this wiki that show these special entities include Windows-1252, ISO-8859-1 and special characters, primarily the last two. The entity numbers are the same as the corresponding Unicode code points.

Entity creation

Entities can be defined once and then used over and over. Each one is defined with the ENTITY keyword followed by the entity name and its expanded definition in quotes. Here is the syntax for creating an ENTITY:

<!ENTITY greeting1 "Hello world">
<!ENTITY nbsp "&#160;">
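To see such declarations take effect, they can be placed in the internal DTD subset of a document and parsed. The sketch below uses Python's xml.etree.ElementTree, which expands internally declared entities. The root element name note and the sample text are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# The two declarations from above, placed in an internal DTD subset.
# The element name "note" is a made-up example.
DOC = """<!DOCTYPE note [
<!ENTITY greeting1 "Hello world">
<!ENTITY nbsp "&#160;">
]>
<note>&greeting1;&nbsp;!</note>"""

root = ET.fromstring(DOC)

# The parser substitutes each reference with its definition.
print(repr(root.text))  # 'Hello world\xa0!'
```

Changing the definition of greeting1 in the DTD changes every place where &greeting1; is used, which is what makes entities useful for boilerplate text.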

Entity conversion

An online tool, Unicode/html entities converter, is available to convert from Unicode to HTML entities and vice versa.
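The same kind of conversion can be sketched offline with Python's standard library. This is not how the online tool works, only an assumed equivalent: the hypothetical helper to_entities uses the HTML 4 table html.entities.codepoint2name for named entities and falls back to decimal numeric references for other non-ASCII characters, while html.unescape handles the reverse direction.

```python
import html
from html.entities import codepoint2name

def to_entities(text):
    """Convert a Unicode string to ASCII text using HTML entities."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &euro;
        elif cp > 127:
            out.append(f"&#{cp};")                 # decimal numeric reference
        else:
            out.append(ch)                         # plain ASCII stays as-is
    return "".join(out)

encoded = to_entities("café € <b>")
print(encoded)                 # caf&eacute; &euro; &lt;b&gt;
print(html.unescape(encoded))  # café € <b>
```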
