From MobileRead

RTF (Rich Text Format) is a word processor interchange format developed by Microsoft to facilitate the exchange of documents between different tools and different operating systems.


Description

Files in this format usually have an .rtf extension. The first RTF reader and writer shipped in 1987 as part of Microsoft Word 3.0 for Macintosh. RTF is the native save format for WordPad. Some eBook readers can read files in this format, but it is fairly verbose, so file sizes will be larger than with more compact formats.

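RTF is an example of a markup language. It uses plain text throughout, even for graphics, which are stored by encoding the binary data as text. It is capable of representing fairly complicated layouts, and the markup can be viewed in any text editor. Returns (line breaks) stored in the file are ignored; paragraph breaks are marked explicitly, as with \par in the examples below.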
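As a rough illustration of how binary data can be carried as plain text, the sketch below wraps image bytes in a \pict group as hexadecimal digits. The function name pict_group and the line width are hypothetical choices made for this example, and a real writer would normally add picture dimensions and other properties that are omitted here.

```python
def pict_group(png_bytes, line_width=64):
    """Return an RTF \\pict group holding PNG data as hexadecimal text.

    Hypothetical helper for illustration only: real writers usually add
    size and scaling control words that this sketch leaves out.
    """
    hex_text = png_bytes.hex()
    # Split the hex digits into short lines. Line breaks in the file are
    # ignored by readers, so this only keeps the file readable.
    lines = [hex_text[i:i + line_width]
             for i in range(0, len(hex_text), line_width)]
    return "{\\pict\\pngblip\n" + "\n".join(lines) + "\n}"


# The eight-byte PNG signature, used here as stand-in image data.
sample = b"\x89PNG\r\n\x1a\n"
print(pict_group(sample))
```

Each byte becomes two text characters, which is one reason RTF files with images are much larger than the images themselves.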

When an eBook reader claims to read RTF, it does not mean that it supports all of the fonts or all of the available constructions. There have been many versions of RTF over the years. Generally, if a construction is not understood it will be ignored, and it will not be preserved if the file is saved.

An RTF file has two sections, a header and a body. The header contains default information for the file, such as the default font size, and may contain metadata. If the file is loaded into an editor and then saved, some or all of the metadata may be lost. WordPad, for example, saves files natively in RTF format but does not support metadata.

There have been many versions of RTF; see the Microsoft Knowledge Base for complete details. Modern versions, with full exchange of documents including images, begin with 1.5, which was released to support MS Word 97.

  • 1.4 for Word 95 (Word 7)
  • 1.5 for Word 97 (Word 8)
  • 1.6 for Word 2000
  • 1.7 for Word 2002
  • 1.8 for Word 2003
  • 1.9 for Word 2007 - has DOCX additions.

Each version adds control words for the new features in the corresponding version of Word. Older documents are fully supported. Older RTF readers will ignore control words they don't understand and will remove them if the file is written using an old editor. The specification for the latest version identifies, for each control word, the version that first supported it.
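The way an older reader treats newer markup can be sketched by listing the control words in a file that fall outside the set the reader knows. This is only an illustration: the OLD_READER set and the function name are invented for the example and do not describe any real product, and the regular expression is a simplification that ignores binary data sections.

```python
import re

# A control word is a backslash, letters, and an optional numeric parameter.
# The second alternative consumes control symbols such as \\ or \{ so that an
# escaped backslash is never mistaken for the start of a control word.
TOKEN = re.compile(r"\\([A-Za-z]+)(-?\d+)? ?|\\(.)", re.DOTALL)


def unknown_control_words(rtf, known):
    """List control words in rtf that are not in the known set, in order."""
    seen = []
    for match in TOKEN.finditer(rtf):
        word = match.group(1)
        if word is not None and word not in known and word not in seen:
            seen.append(word)
    return seen


# Hypothetical vocabulary of a very small, old reader.
OLD_READER = {"rtf", "ansi", "pard", "par", "b", "f", "fs"}

sample = r"{\rtf1\ansi\pard\f0\fs24 This is \b bold\b0 \insrsid4476289 text.\par}"
print(unknown_control_words(sample, OLD_READER))  # ['insrsid']
```

A reader with this vocabulary would display the text correctly but would silently drop \insrsid, so the revision information would be gone after a save.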

Tools

Just about every word processor on the planet can support RTF files. However, there are differences in the way the files are produced.

  • WordPad uses RTF as its native format. It does not support images or metadata. It has a very compact footprint for files. Anything WordPad does not understand will be removed from the file when it is saved.
  • MS Word uses RTF as an exchange format. It supports all features but produces a larger file than some others because it includes information about every change to the file. Every time you save an RTF file in MS Word, the file size will increase.

Cut and Paste - Cut and paste from many word processors is actually captured in RTF to preserve the character formatting. Some web browsers, such as those from Microsoft, also support the Cut operation in RTF and HTML formats, while others only support HTML. Paste operations that don't support the cut format will paste just the text itself and lose the formatting.

Conversion tools

Syntax

RTF files include data and control structures. The control structures include:

  • Control Words - begin with a \ and consist of case-sensitive letters followed by a delimiter. A delimiter can be a space (the space itself will be ignored), a digit or a hyphen (-) that starts a numeric parameter for the control word, or any other character, which will start a new sequence.
  • Control Symbols - begin with a \ and consist of one special character. An example is \~ which is the symbol for a non-breaking space.
  • Groups - are delimited with { and } and are used to specify the text and the attributes of that text.
  • Destinations are special control words that specify that the text goes somewhere else in the document. An example is \footnote. The footnote itself will be defined as a group.
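The structures listed above can be told apart with a small scanner. The following is a minimal sketch rather than a complete RTF parser: the tokenize function and its token names are made up for this example, it does not handle binary data sections, and destinations appear simply as control words at the start of a group.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)


def tokenize(rtf):
    """Yield (kind, value) pairs for a string of RTF markup."""
    i, n = 0, len(rtf)
    while i < n:
        ch = rtf[i]
        if ch == "{":
            yield ("group_open", ch)
            i += 1
        elif ch == "}":
            yield ("group_close", ch)
            i += 1
        elif ch == "\\" and i + 1 < n and rtf[i + 1] in LETTERS:
            # Control word: a backslash followed by letters.
            j = i + 1
            while j < n and rtf[j] in LETTERS:
                j += 1
            word = rtf[i + 1:j]
            # Optional numeric parameter, possibly negative.
            k = j
            if k + 1 < n and rtf[k] == "-" and rtf[k + 1] in DIGITS:
                k += 1
            while k < n and rtf[k] in DIGITS:
                k += 1
            param = rtf[j:k] or None
            # A single space after the control word is a delimiter, not text.
            if k < n and rtf[k] == " ":
                k += 1
            yield ("control_word", (word, param))
            i = k
        elif ch == "\\" and i + 1 < n:
            # Control symbol: a backslash and one non-letter character.
            yield ("control_symbol", rtf[i:i + 2])
            i += 2
        else:
            # Plain text runs until the next backslash or brace.
            j = i + 1
            while j < n and rtf[j] not in "\\{}":
                j += 1
            # Returns stored in the file are ignored.
            text = rtf[i:j].replace("\r", "").replace("\n", "")
            if text:
                yield ("text", text)
            i = j


for token in tokenize(r"{\b bold} text\~here"):
    print(token)
```

Keeping the parameter separate from the control word name makes it easy to see that \b and \b0 are the same control word with different parameters.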

Examples

  • Here is an example of an RTF document using legal syntax.
{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
This is some {\b bold} text.\par
}

This would be displayed as:

This is some bold text.

Bold could also have been achieved using \b bold\b0
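A document equivalent to the first example can also be produced programmatically. The helper names below (escape_rtf, bold, document) are hypothetical, and the sketch assumes the text is plain ASCII; it only escapes the three characters that have special meaning in RTF.

```python
def escape_rtf(text):
    """Escape the characters that RTF treats as markup."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def bold(text):
    """Wrap text in a group so that bold ends when the group closes."""
    return "{\\b " + escape_rtf(text) + "}"


def document(body):
    """Wrap a body in a minimal header with one font and one paragraph."""
    header = "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}\\f0\\pard\n"
    return header + body + "\\par\n}"


print(document(escape_rtf("This is some ") + bold("bold") + escape_rtf(" text.")))
```

Using a group for the bold run means no explicit \b0 is needed, since formatting set inside a group does not leak out of it.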

  • Here is the same document saved in WordPad:
{\fonttbl{\f0\froman\fprq2\fcharset0 Times New Roman;}}
\viewkind4\uc1\pard\f0\fs24 This is some \b bold\b0  text.\par}
  • Here is a similar document saved in MS Word 2002:
{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}
{\f36\froman\fcharset238\fprq2 Times New Roman CE;}
{\f37\froman\fcharset204\fprq2 Times New Roman Cyr;}{\f39\froman\fcharset161
\fprq2 Times New Roman Greek;}{\f40\froman\fcharset162\fprq2 Times New Roman Tur;}
{\f41\froman\fcharset177\fprq2 Times New Roman (Hebrew);}
{\f42\froman\fcharset178\fprq2 Times New Roman (Arabic);}{\f43\froman\fcharset186
\fprq2 Times New Roman Baltic;}{\f44\froman\fcharset163\fprq2 Times New Roman (Vietnamese);}}
\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\ql \li0\ri0\widctlpar 
\fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \snext0 Normal;}{\*\cs10 \additive 
\ssemihidden Default Paragraph Font;}
\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap
\fs20\lang1024\langfe1024\cgrid\langnp1024\langfenp1024 \snext11 \ssemihidden Normal Table;}}
{\*\rsidtbl \rsid604983\rsid1011638\rsid2916272\rsid3999307\rsid4476289
{\*\generator Microsoft Word 10.0.6846;}{\info{\title This word is bold}{\author Author1}
{\operator Author1}{\creatim\yr2008\mo9\dy17\hr12\min45}{\revtim\yr2008\mo9\dy17\hr12\min45}
{\version2}{\edmins2}{\nofpages1}{\nofwords2}{\nofchars17}{\*\company  }
{\nofcharsws18}{\vern16393}{\*\password 00000000}}{\*\xmlnstbl }
\paperw12240\paperh15840\margl1800\margr1800\margt1440\margb1440\gutter0 \widowctrl\ftnbj
\snaptogridincell\allowfieldendsel\wrppunct\asianbrkrule\rsidroot15999791 \fet0
{\*\wgrffmtfilter 013f}\sectd \linex0\endnhere\sectlinegrid360\sectdefaultcl\sftnbj 
{\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl2\pnucltr\pnstart1
\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang 
{\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang {\pntxta )}}
{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}
{\*\pnseclvl6\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}
{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}
{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}
{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}
\pard\plain \ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0
\lin0\itap0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 
{\insrsid16668571 This }{\b\insrsid16668571\charrsid16668571 word}{\insrsid16668571  is bold.}
{\insrsid4476289 \par }}

It says: This word is bold. Note that some metadata is included. It includes revision data and will get longer every time it is saved.

  • Here is similar text saved in AbiWord. It is a little cleaner and also supports metadata:
{\f0\froman\fcharset0\fprq2\fttruetype Times New Roman;}
{\f1\fswiss\fcharset0\fprq2\fttruetype Arial;}
{\f2\fnil\fcharset0\fprq2\fttruetype Dingbats;}
{\f3\froman\fcharset0\fprq2\fttruetype Symbol;}
{\f4\fmodern\fcharset0\fprq1\fttruetype Courier New;}}
{\s1\fi-431\li720\sbasedon28\snext28Contents 1;}
{\s2\fi-431\li1440\sbasedon28\snext28Contents 2;}
{\s3\fi-431\li2160\sbasedon28\snext28Contents 3;}
{\s8\fi-431\li720\sbasedon28Lower Roman List;}
{\s5\tx431\sbasedon24\snext28Numbered Heading 1;}
{\s6\tx431\sbasedon25\snext28Numbered Heading 2;}
{\s7\fi-431\li720Square List;}
{\*\cs11\sbasedon28Endnote Text;}
{\s4\fi-431\li2880\sbasedon28\snext28Contents 4;}
{\s9\fi-431\li720Diamond List;}
{\s10\fi-431\li720Numbered List;}
{\*\cs12\fs20\superEndnote Reference;}
{\s13\fi-431\li720Triangle List;}
{\s14\tx431\sbasedon26\snext28Numbered Heading 3;}
{\s15\fi-431\li720Dashed List;}
{\s16\fi-431\li720\sbasedon10Upper Roman List;}
{\s17\sb440\sa60\f1\fs24\b\sbasedon28\snext28Heading 4;}
{\s18\fi-431\li720Heart List;}
{\s34\fi-431\li720Box List;}
{\s20\fi-431\li720\sbasedon10Upper Case List;}
{\s21\fi-431\li720Bullet List;}
{\s22\fi-431\li720Hand List;}
{\*\cs23\fs20\sbasedon28Footnote Text;}
{\s24\sb440\sa60\f1\fs34\b\sbasedon28\snext28Heading 1;}
{\s25\sb440\sa60\f1\fs28\b\sbasedon28\snext28Heading 2;}
{\s19\qc\sb240\sa120\f1\fs32\b\sbasedon28\snext28Contents Header;}
{\s27\fi-431\li720Tick List;}
{\s26\sb440\sa60\f1\fs24\b\sbasedon28\snext28Heading 3;}
{\s29\fi-431\li720\sbasedon10Lower Case List;}
{\s30\li1440\ri1440\sa120\sbasedon28Block Text;}
{\s36\f4\sbasedon28Plain Text;}
{\s32\tx1584\sbasedon5\snext28Section Heading;}
{\s33\fi-431\li720Implies List;}
{\s35\fi-431\li720Star List;}
{\*\cs31\fs20\superFootnote Reference;}
{\s37\tx1584\sbasedon5\snext28Chapter Heading;}}
{\info\uc1{\title Title1}{\author Author1}{\company publisher1}{\subject Subject1}
{\doccomm This is a description}{\category cat1}}\deftab720\viewkind1\paperw12240
\abinodiroverride\ltrch This }{\f0\fs24\b\lang1033{\*\listtag0}word}{\f0\fs24\lang1033
{\* \listtag0} is bold.}{\f0\fs24\lang1033{\*\listtag0}\par}}

It says: This word is bold.
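The metadata in an \info group like the one in the AbiWord example can be pulled out with a simple pattern. This is a sketch under the assumption that each field is a flat group containing only text; the function name info_fields is invented for the example, and fields that contain nested groups or escapes are not handled.

```python
import re

# Matches flat groups such as {\title Title1}. Only the fields that appear
# in the AbiWord example are listed here.
FIELD = re.compile(
    r"\{\\(title|author|company|subject|doccomm|category) ([^{}\\]*)\}"
)


def info_fields(rtf):
    """Return a dict of simple metadata fields found in the RTF text."""
    return dict(FIELD.findall(rtf))


sample = (
    r"{\info\uc1{\title Title1}{\author Author1}{\company publisher1}"
    r"{\subject Subject1}{\doccomm This is a description}{\category cat1}}"
)
for name, value in info_fields(sample).items():
    print(name, "=", value)
```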

Note that all files begin with {\rtf1, and this is followed by the character encoding, for example \ansi. Unicode is supported as well as ANSI. Many readers will stop reading the file if another \rtf1 is encountered; AbiWord, however, will concatenate the sections together into one document.
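A quick check of the opening of a file, and one way of writing non-ASCII characters, might look like the sketch below. The function names are hypothetical. The escape function assumes the \u control word with a signed 16-bit decimal value followed by a single "?" fallback character, which is how Unicode text is commonly written in RTF 1.5 and later; it does not escape backslashes or braces.

```python
import re

# Character set keywords that may follow \rtfN at the start of a file.
HEADER = re.compile(r"\{\\rtf(\d+)\s*\\(ansi|mac|pc|pca)(?![A-Za-z])")


def rtf_header(text):
    """Return (version, charset) if text starts like an RTF file, else None."""
    match = HEADER.match(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def unicode_escape(text):
    """Replace non-ASCII characters with \\uN? escapes (N is signed 16-bit)."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # Work in UTF-16 code units so that characters outside the Basic
        # Multilingual Plane become two escapes (a surrogate pair).
        data = ch.encode("utf-16-le")
        for k in range(0, len(data), 2):
            unit = int.from_bytes(data[k:k + 2], "little")
            if unit > 32767:
                unit -= 65536
            out.append("\\u%d?" % unit)
    return "".join(out)


print(rtf_header(r"{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}}"))
print(unicode_escape("café"))
```

The "?" after each escape is what a reader without Unicode support would show in place of the character.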

For more information
