Accented characters

Accented characters in LaTeX can be produced using commands such as \"a. The precise effect of such a command depends on the font encoding being used. When the font encoding contains the accented character as an individual glyph (as the T1 encoding does for \"a), words that contain it can be hyphenated automatically. When the font encoding does not contain the requested glyph (as with the OT1 encoding), the command invokes typesetting instructions that build the accented character from a base glyph and a diacritical mark in the font. In most cases this involves a call to the TeX primitive \accent. Glyphs constructed as composites in this way inhibit hyphenation of the current word; this is one reason why the T1 encoding is preferable to the original TeX font encoding OT1.
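
As a minimal sketch of the T1 case, the following document selects the T1 encoding with the standard fontenc package. The sample word is only an illustration and is not taken from the text above.

  \documentclass{article}
  \usepackage[T1]{fontenc}   % make T1 the default font encoding
  \begin{document}
  % In T1 the notation \"a maps to a single glyph, so TeX can
  % consider the whole word for hyphenation.
  M\"archenerz\"ahler
  \end{document}

Removing the \usepackage line leaves the class default in force, which is assumed here to be OT1; in that case \"a is built with \accent as described above.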

It is important to understand that commands like \"a in LaTeX2e represent just a name for a single glyph (in this case `umlaut a') and contain no information about how to typeset that glyph; thus \"a does not mean `put two dots on top of the character a'. The decision as to which typesetting routine to use depends on the encoding of the current font, and so it is taken at the last minute. Indeed, the same input may be typeset in more than one way in the same document: for example, text in section headings may also appear in the table of contents and in running heads, and each of these may use a font with a different encoding.
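
The late decision can be observed by typesetting the same input under two encodings in one document. This is a sketch that assumes fonts for both OT1 and T1 are installed; the sample word is arbitrary.

  \documentclass{article}
  \usepackage[OT1,T1]{fontenc}   % load both encodings; the last one, T1, becomes the default
  \begin{document}
  % Same input, two encodings: the typesetting routine is chosen
  % only when the encoding of the current font is known.
  {\fontencoding{OT1}\selectfont M\"archen}  % composite built with \accent
  {\fontencoding{T1}\selectfont M\"archen}   % single glyph from the font
  \end{document}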

For this reason the notation \"a is not equivalent to:

  \newcommand \chara {a}
  \"\chara

In the latter case, LaTeX does not expand the macro \chara but simply compares the notation (the string \"\chara) to its list of known composite notations in the current encoding; when it fails to find \"\chara it does the best it can and invokes the typesetting instructions that put the umlaut accent on top of the expansion of \chara. Thus, even if the font actually contains `ä' as an individual glyph, it will not be used.
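
The list of known composite notations is built from declarations in the encoding definition files. The fragment below sketches what such declarations look like, modelled on those in the standard T1 and OT1 definition files. The slot numbers shown are believed to match those files but should be checked against the installed t1enc.def and ot1enc.def.

  \DeclareTextAccent{\"}{OT1}{127}       % OT1: the umlaut accent is slot 127
  \DeclareTextAccent{\"}{T1}{4}          % T1: the umlaut accent is slot 4
  \DeclareTextComposite{\"}{T1}{a}{228}  % T1: \" followed by a is slot 228

Since only the literal notation \"a is declared as a composite, \"\chara finds no match and falls back to the accent declaration.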

The low-level accent commands in LaTeX are defined in such a way that it is possible to combine a diacritical mark from one font with a glyph from another font; for example, \"\textparagraph will produce a paragraph sign with an umlaut above it. The umlaut here is taken from the OT1 encoded font cmr10 whilst the paragraph sign is from the OMS encoded font cmsy10. (This example may be typographically silly but better ones would involve font encodings like OT2 (Cyrillic) that might not be available at every site.)

There are, however, restrictions on the font-changing commands that will work within the argument to such an accent command. These are TeXnical in the sense that they follow from the way that TeX's \accent primitive works, allowing only a special class of commands between the accent and the accented character.

The following are examples of commands that will not work correctly, because the accent will then appear above a space: the font commands with text arguments (\textbf{...} and friends); all the font size declarations (\fontsize and \Large, etc.); \usefont and declarations that depend on it, such as \normalfont; box commands (e.g. \mbox{...}).

The lower-level font declarations that set the attributes family, series and shape (such as \fontshape{sl}\selectfont) will produce correct typesetting, as will the default declarations such as \bfseries.
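
The two cases can be contrasted in a short fragment. This is a sketch assuming the OT1 encoding is current, so that the accent is built with \accent; the comments restate the behaviour described above rather than a tested result.

  % Declarations that only set family, series or shape are allowed
  % between the accent and the accented character:
  \"{\fontshape{sl}\selectfont a}
  \"{\bfseries a}

  % Font commands with text arguments and box commands are not;
  % the accent ends up above a space:
  \"{\textbf{a}}
  \"{\mbox{a}}

When a bold accented letter is wanted, placing the whole accent command inside the font command, as in \textbf{\"a}, avoids the restriction.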

Rainer Schoepf
Thu Jul 31 16:42:26 MEST 1997