The e-TeX Short Reference Manual
NTS team
October 1996
Derived from a paper originally presented as:
Philip Taylor, "e-TeX: a 100%-compatible successor to TeX"
(Following humbly in the footsteps of the Grand Wizard)
in: Proceedings of the Ninth European TeX Conference EuroTeX'95, September 4-8, 1995, Arnhem, The Netherlands, pp. 359-370.
e-TeX is the first concrete result of an international research & development project, the NTS Project, which was established under the ægis of DANTE e.V. during 1992. The aims of the project are to perpetuate and develop the spirit and philosophy of TeX, whilst respecting Knuth's wish that TeX should remain frozen.
The group were very concerned that unless there existed some evolutionary flexibility within which TeX could react to changing needs and environments, it might all too soon become eclipsed by more modern yet less sophisticated systems. Accordingly they agreed to investigate a possible successor or successors to TeX, successors which would enshrine and encapsulate all that was best in TeX whilst being freed from the evolutionary constraints which Knuth had placed on TeX itself. To avoid any suggestion that it was TeX which the group sought to develop against Knuth's wishes, a working title of NTS (for New Typesetting System) was chosen for the project.
During the initial meetings of the NTS group, it became clear that there were two possible approaches to developments based on TeX: an evolutionary path, which would simply continue where Knuth had left off and which would use as its basis the source code of TeX itself (i.e. TeX.Web); and a revolutionary path, which would be based on a completely new implementation of TeX, using a modern rapid-prototyping language which could allow individual components of the system to be modified or replaced in a simple and straightforward manner. The group agreed that the latter (revolutionary) approach had much greater potential, but were aware that the re-implementation would be non-trivial, and would require external funding to bring it to fruition in finite time; accordingly they agreed to concentrate their initial efforts on the former (evolutionary) path, and set to work to specify and implement a direct derivative of TeX which became known as e-TeX. The e of e-TeX may be read as extended, enhanced, evolutionary or European at will(!), and is also an acknowledgement of the parallel developments which have led the LaTeX 3 team to modify their initial goal and to release an interim LaTeX, LaTeX2e, which is directly derived from the LaTeX sources.
The group took as the starting point for the development of e-TeX the many contributions which had been made on NTS-L (the open mailing list on which discussions pertinent to e-TeX & NTS take place), together with the extremely interesting list of ideas which Knuth gives at the end of TeX82.Bug, and which he describes as `Possibly nice ideas that will not be implemented' (and which he contrasts with `Bad ideas that will not be implemented'!). Individual members of the group also contributed ideas of their own which had not necessarily been discussed publicly. All proposals were then subjected to a rigorous vetting procedure to ensure that they conformed to the e-TeX philosophy, which may be summarised as follows:
e-TeX will in all ways demonstrate its affinity to, and derivation from, Knuth's TeX; it will be implemented as a change-file to TeX.Web, and will not exploit features which could only be achieved by using a particular implementation, operating system or language; it will be capable of being used successfully on a machine as small as an 80286-based PC or similar.
At format-generation time, a user will have the option of generating either a TeX-compatible format or an e-TeX format; if the TeX-compatible format is subsequently used in conjunction with e-TeX, the result will be Trip-compatible (i.e. indistinguishable from TeX proper). If an e-TeX format is generated and used in conjunction with e-TeX, then provided that none of the new e-TeX primitives are used, the results will be identical to those which would be produced using TeX proper. If an e-TeX format is used in conjunction with e-TeX and if one or more of the new e-TeX primitives are used, then those portions of the document which are affected by the new primitive(s) may be processed in a manner unique to e-TeX; other portions of the document will be processed in a manner identical to that of TeX proper. Only if an e-TeX format is used in conjunction with e-TeX and if an explicit assignment is made to one of the enhanced-mode variables to enable that particular enhanced mode will e-TeX behave in a manner which may be distinguishable from that of TeX even if no other reference to an e-TeX primitive occurs anywhere in the document. (These modes of operation are referred to as compatibility-mode, extended-mode and enhanced-mode respectively.)
All new e-TeX primitives will be syntactically identical to existing TeX primitives: that is, they will be either control-words or control-symbols within a normal category code régime. Where an analogous primitive exists within TeX, the corresponding e-TeX primitive(s) will occupy the same syntactic niche. Every effort will be made to ensure that new e-TeX primitives fit into the existing set of TeX datatypes; no new datatype will be introduced unless it is absolutely essential.
In brief, this implies that e-TeX will follow the principle of least surprise: an existing TeX user, on using e-TeX for the first time, should not be surprised by e-TeX's behaviour, and should be able to take advantage of new e-TeX features without having either to unlearn some aspects of TeX or to learn some new e-TeX philosophy.
It is intended that e-TeX be available ready-compiled for those systems for which pre-compiled binaries are the norm (e.g. MS-DOS, VMS, ...); for other systems such as Unix(TM), e-TeX is supplied as a change-file which will need to be applied to TeX.Web in the normal way. However, since there will already be an implementation-specific change-file for the system of interest, some means will be required of merging TeX.Web with not one but (at least) two change-files; possibilities include PatchWeb, Tie, etc., but if none of these are available then WebMerge, a TeX script, is supplied and can be used as a slower but satisfactory alternative. In practice, two or three change-files will be needed: the e-TeX system-independent change-file, the TeX system-dependent change-file, and perhaps a small e-TeX system-dependent change-file. The system-independent e-TeX change-file is supplied as part of the e-TeX kit, and sample system-dependent e-TeX change-files are also supplied which may be used as a guide to those places at which system-dependent interactions are to be expected: an experienced implementor should have little difficulty in modifying one of these to produce an e-TeX system-dependent change-file for the system of interest. Once e-TeX has been tangled and woven, it should be compiled and linked in the normal way.
Once a working binary (or binaries, for those systems which have separate executables for IniTeX and VirTeX) has been acquired or produced, the next step will be to generate a suitable format file or files. Whilst e-TeX can be used in conjunction with Plain.TeX to produce a Plain e-format, it is better to use the supplied etex.src file which supplements the e-TeX primitives with additional useful control sequences.
When generating the format file, and regardless of the format source used, one fundamental decision must be made: is e-TeX to generate a compatibility mode format, or an extended mode format? If the former, all e-TeX extensions and enhancements will be disabled, the format will contain only the TeX-defined set of primitives, and any subsequent use of the format in conjunction with e-TeX will result in completely TeX-compatible behaviour and semantics, including compatibility at the level of the Trip test. If the latter option, however, is selected, then all extensions present in e-TeX will automatically be activated, and the format file will contain not only the TeX-defined set of primitives but also those defined by e-TeX itself; any subsequent use of such a format in conjunction with e-TeX will result in e-TeX operating in extended mode. Documents which contain no references to any of the e-TeX-defined primitives will continue to generate results identical to those which would have been produced were the document processed by TeX, but compatibility at the Trip-test level can no longer be accomplished, and of course any document which makes reference to an e-TeX primitive will generate results which could not have been accomplished using TeX. It should be noted that neither a compatibility mode format nor an extended mode format may be used in conjunction with TeX itself; they are only suitable for use in conjunction with e-TeX, since formats are not in general portable.
Finally it should be emphasised that even if an extended mode format is generated, any document processed using such a format but not referencing any e-TeX-defined primitive will produce results identical to those which would have been produced had the same document been processed using TeX; only if the document makes an explicit assignment to one of the enhanced mode state variables (\TeXXeTstate is the only instance of these in V1 of e-TeX) will compatibility with TeX be compromised: e-TeX is then said to be operating in enhanced mode rather than extended mode.
The choice between generating a compatibility mode format and an extended mode format is made at the point of specifying the format source file: assuming that the operating system supports command-line entry with parameters, then a normal TeX format-generation command would probably resemble:
initex plain \dump
or if the more verbose interactive form is preferred:
initex
**plain
*\dump
With e-TeX, exactly the same command will achieve exactly the same effect, and the format generated will be a compatibility-mode format; thus assuming that the Ini-version of e-TeX is invoked with the command einitex, the following will both generate compatibility-mode formats:
einitex plain \dump
and
einitex
**plain
*\dump
In order to generate an extended mode format, the file-specification for the format source file must be preceded by an asterisk (*); whilst this may seem an inelegant mechanism, it has the great advantage that it avoids almost all system dependencies (Graphical user interface (GUI) systems excepted, of course), and the asterisk as a component element of a filename is a very remote possibility (most filing systems reserve the asterisk as a `wild card' character, which can therefore not form a part of a real file name per se). Thus to generate an extended mode Plain format, the following dialogue may be used:
einitex *plain \dump
or
einitex
***plain
*\dump
and to generate an extended mode etex.src format, the following instead:
einitex *etex.src \dump
or
einitex
***etex.src
*\dump
Once suitable formats have been generated, they can then be used in conjunction both with e-IniTeX and e-VirTeX without further formality: in particular, no asterisk is needed (nor should be used!) if a format is specified, since the format implicitly defines (depending on its mode of generation) in which mode (compatibility or extended) e-TeX will operate. Thus, for example, if a plain format had been generated in compatibility mode, and an etex format had been generated in extended mode, then both:
einitex &plain
and
evirtex &plain
will cause e-TeX to process any subsequent commands in compatibility mode. On the other hand, both
einitex &etex
and
evirtex &etex
will cause e-TeX to process any subsequent commands in extended mode, but only because the etex format was generated in extended mode: it is not the name of the format, nor is it the contents of the source of the format, which determine the mode of operation -- it is the mode of operation which was used when the format was generated. Any format generated in compatibility mode will cause e-TeX to operate in compatibility mode whenever it is used, whilst the equivalent format, built from the same source but generated in extended mode, will cause e-TeX to operate in extended mode whenever it is used.
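A document or macro file can test which mode is in force before relying on any extension. The following is only a minimal sketch: it assumes that \eTeXversion and \eTeXrevision are among the primitives present in an extended mode format (and are therefore undefined under a compatibility mode format or under TeX proper), and that \undefined has not been given a meaning.

```tex
% Sketch: report the mode in which the current format was generated.
% Assumptions: \eTeXversion and \eTeXrevision exist only in extended mode;
% \undefined is genuinely undefined.
\ifx\eTeXversion\undefined
  \message{Compatibility mode (or TeX proper): no e-TeX primitives available}
\else
  \message{Extended mode: e-TeX version \the\eTeXversion\eTeXrevision}
\fi
```

Because the test looks only at the meaning of a control sequence, it yields the same answer whatever the format is called, which matches the point made above: the mode is fixed when the format is generated, not by its name.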
Although e-TeX is completely TeX-compatible, and there is therefore no real reason why any system should need both TeX and e-TeX, it is anticipated that until complete confidence exists in the compatibility of e-TeX many sites and users will prefer to retain instances of each. For this reason it is intended that change-files and binaries should ensure that both TeX and e-TeX can happily co-exist on any system by a careful choice of name-spaces. In the case of the reference VMS implementation, for example, this is accomplished by using the prefix "etex_" for each logical name which defines the e-TeX environment, in contrast to the prefix "tex_" which defines the analogous TeX environment; the "etex_*" logical names are defined as search lists which first reference an e-TeX specific location followed by the analogous location for TeX.
Bearing in mind the constraints outlined in the introduction, the group identified 35 new primitives which they believed would give added functionality to e-TeX without compromising its compatibility with TeX; of the 35 new primitives, 29 are extensions (which by definition do not affect the semantics of existing TeX documents), whilst just six (all concerned with the implementation of TeX--XeT) are associated with an enhancement. In addition to the new primitives, additional functionality was added to some existing primitives, and TeX's behaviour in some unusual boundary conditions was made more robust (this last has been subsumed in the most recent version of TeX, so this is no longer e-TeX-specific).
The new features are listed and briefly described below, clustered together to indicate related functionality. The technical terms used below to describe syntax entities are as defined in The TeXbook.
TeX--XeT was developed by Peter Breitenlohner based on the original TeX-XeT of Donald Knuth and Pierre MacKay; whereas TeX-XeT generated non-standard DVI files, TeX--XeT generates perfectly normal DVI files which can therefore be processed by standard DVI drivers (assuming, of course, that the necessary fonts are available). Both systems permit the direction of typesetting (conventionally left-to-right in Western documents) to be reversed for part or all of a document, which is particularly useful when setting languages such as Hebrew or Arabic.
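As an illustration of the reversal described above, the following minimal sketch assumes an extended mode format and uses \beginR and \endR, two of the TeX--XeT primitives, to delimit right-to-left material. The Latin text is a placeholder; real Hebrew or Arabic would additionally require suitable fonts, which are not shown here.

```tex
% Sketch only: requires e-TeX running with an extended mode format.
\TeXXeTstate=1  % explicit assignment: enables TeX--XeT (enhanced mode)
This text runs left to right, \beginR this part is set right to left\endR,
and this text runs left to right again.
\bye
```

The assignment to \TeXXeTstate must come first: without it the direction primitives are not enabled, and e-TeX remains in extended rather than enhanced mode.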
\everyeof tokens are not inserted if the end-of-file is forced through the use of \endinput.
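A minimal sketch of the distinction, assuming an extended mode format; the file name chapter1 is hypothetical and serves only as an illustration:

```tex
% Sketch only: requires e-TeX running with an extended mode format.
\everyeof{\message{[end of file reached]}}
\input chapter1  % hypothetical file name
% If chapter1.tex is read through to its physical end, the \everyeof
% tokens are inserted there and the message appears. If chapter1.tex
% instead stops early by means of \endinput, the tokens are not inserted.
```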
(Put on the WWW by Bernd Raichle, Member of the NTS group; subsequently updated by Philip Taylor, with corrections by Peter Breitenlohner.)