troff-to-latex (available as
support/troff-to-latex), written by Kamal Al-Yahya at Stanford
University (California, USA), assists in the translation of a
troff document into LaTeX format. It recognises most
-ms and -man macros, plus most eqn and some
tbl preprocessor commands. Anything fancier needs to be
done by hand. Two style files are provided. There is also a man page
(which converts very well to LaTeX...). The program is
copyrighted but free. An enhanced version of this program,
tr2latex, is available in support/tr2latex.
The DECUS TeX distribution (see
sources of software)
also contains a program which converts troff to TeX.
WordPerfect
wp2latex (available as
support/wp2latex) has recently been much improved, and is now
available either for MS-DOS or for Unix systems, thanks to its
current maintainer Jaroslav Fojtik.
PC-Write
pcwritex.arc, from support/pcwritex, is a
print driver for PC-Write that ``prints'' a PC-Write
V2.71 document to a TeX-compatible disk file. It was written by Peter
Flynn at University College, Cork, Republic of Ireland.
runoff
Peter Vanroose's (vanroose@esat.kuleuven.ac.be)
conversion program is written in VMS Pascal.
The sources and a VAX executable are available from
support/rnototex.
refer/tib
There are a few programs for converting bibliographic
data between BibTeX and refer/tib formats.
They are in biblio/bibtex/utils/refer-tools.
In spite of the directory name, it also contains a shell script to
convert BibTeX to refer as well. The collection
is not maintained.
RTF
A program for converting Microsoft's Rich Text Format to
TeX is available in support/rtf2tex, which was written and is
maintained by Robert Lupton (rhl@astro.princeton.edu).
There is also a converter to LaTeX by Erwin Wechtl, in
support/rtf2latex.
Translation to RTF may be done (for a somewhat
constrained set of LaTeX documents) by TeX2RTF, which
can produce ordinary RTF and Windows Help RTF (as well as
HTML; see conversion to HTML).
TeX2RTF is supported on various Unix platforms and under
Windows 3.1; it is available from support/tex2rtf.
Microsoft Word
A rudimentary program for converting MS-Word to
LaTeX is wd2latex, for MS-DOS (dviware/wd2latex); a better
idea, however, is to convert the document to RTF format and use one
of the RTF converters mentioned above.
A group at Ohio State University (USA) is working on
a common document format based on SGML, with the ambition that any
format could be
translated to or from this one. FrameMaker provides
``import filters'' to aid translation from alien formats
(presumably including TeX) to FrameMaker's own.
The aim here is to emulate the Unix nroff, which formats
text as best it can for the screen, from the same
input as the Unix typesetting program troff.
Ralph Droms (droms@bucknell.edu) has a style file and a
program that provide the LaTeX equivalent of nroff,
though it doesn't do a good job with tables and mathematics. The
software is available in support/txt; the original
dvi2tty often does an acceptable job and is available in
dviware/dvi2tty.
Another possibility is to use screen.sty (available as
macros/latex209/contrib/misc/screen.sty). Use a dvi2tty program of some kind;
you might try dviware/crudetype as well. Another possibility is to
use the LaTeX-to-ASCII conversion program, l2a
(support/l2a), although this is really more of a de-TeXing
program.
The canonical de-TeXing program is detex
(support/detex), which removes all comments and control sequences
from its input before writing it to its output. Its original purpose
was to prepare input for a dumb spelling checker.
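As a rough illustration of what de-TeXing involves, here is a minimal Python sketch. It is not the detex program itself: the function name detex and the regular expressions are assumptions made for this example, and it only drops comments, control sequences and grouping braces.

```python
import re


def detex(source: str) -> str:
    """Crude de-TeXing: drop comments and control sequences, keep the text."""
    # A comment runs from an unescaped % to the end of the line.
    no_comments = re.sub(r"(?<!\\)%.*", "", source)
    # Replace control words (\emph, \section*) and control symbols (\%, \\)
    # with a space so that neighbouring words do not run together.
    no_controls = re.sub(r"\\([A-Za-z]+\*?|.)", " ", no_comments)
    # Drop grouping braces, then collapse runs of spaces and tabs.
    no_braces = no_controls.replace("{", "").replace("}", "")
    return re.sub(r"[ \t]+", " ", no_braces).strip()
```

A sketch like this makes no attempt to handle verbatim material, mathematics, or a comment that directly follows a \\ line break.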
SGML is a very important system for document storage and interchange,
but it has no formatting features; its companion ISO standard
DSSSL
(http://www.jclark.com/dsssl/) is designed for writing
transformations and formatting,
but this has not yet been widely implemented. Some SGML authoring
systems (e.g., SoftQuad Author/Editor) have formatting
abilities, and
there are high-end specialist SGML typesetting systems (e.g., Miles33's
Genera). However, the majority of SGML users probably transform
the source to an existing typesetting system when they want to print.
TeX is a good candidate for this. There are three approaches to writing a
translator:
Write a free-standing translator in the traditional way, with
tools like yacc and lex; this is hard, in
practice, because of the complexity of SGML.
Use a specialist language designed for SGML transformations; the
best known are probably Omnimark and Balise.
They are expensive, but powerful, incorporating SGML query and
transformation abilities as well as simple translation.
Build a translator on top of an existing SGML parser. By far
the best-known (and free!) parser is James Clark's
nsgmls, and this produces a much simpler output format,
called ESIS, which can be parsed quite straightforwardly (one also
has the benefit of an SGML parse against the DTD). Two
good public domain packages use this method:
Both of these allow the user to write `handlers' for every SGML
element, with plenty of access to attributes, entities, and
information about the context within the document tree.
If these packages don't meet your needs for an average SGML
typesetting job, you need the big commercial stuff.
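The handler-per-element idea can be sketched in a few lines of Python. This is an illustration only, not the code of any of the packages mentioned: the function esis_to_latex, the handler tables START and END, and the element names TITLE, PARA and EMPH are all invented for the example. It assumes the usual ESIS record types in nsgmls output ("(" starts an element, ")" ends one, "-" carries data, "A" announces an attribute of the next element to start) and decodes only the \n escape in data.

```python
def esis_to_latex(esis_lines, start_handlers, end_handlers):
    """Translate ESIS records to LaTeX text using per-element handlers."""
    output = []
    pending_attrs = {}  # attributes announced for the next element to start
    open_elements = []  # names of the elements currently open, outermost first
    for raw in esis_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":  # attribute record: name, type, then value
            name, _, remainder = rest.partition(" ")
            kind, _, value = remainder.partition(" ")
            if kind != "IMPLIED":
                pending_attrs[name] = value
        elif code == "(":  # start of element
            open_elements.append(rest)
            handler = start_handlers.get(rest)
            if handler:
                output.append(handler(pending_attrs, tuple(open_elements)))
            pending_attrs = {}
        elif code == ")":  # end of element
            handler = end_handlers.get(rest)
            if handler:
                output.append(handler(tuple(open_elements)))
            open_elements.pop()
        elif code == "-":  # character data; only the \n escape is decoded here
            output.append(rest.replace("\\n", "\n"))
        # every other record type is ignored in this sketch
    return "".join(output)


# Hypothetical handlers for an invented DTD with TITLE, PARA and EMPH elements.
START = {
    "TITLE": lambda attrs, context: "\\section{",
    "PARA": lambda attrs, context: "\n",
    "EMPH": lambda attrs, context: "\\emph{",
}
END = {
    "TITLE": lambda context: "}\n",
    "PARA": lambda context: "\n",
    "EMPH": lambda context: "}",
}
```

Each start handler receives the element's attributes and the stack of open elements, which is the "context within the document tree" that such packages expose. A real translator would also need to escape characters in the data that are special to TeX.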
Since HTML is simply an example of SGML, we do not need a specific
system for HTML. However, Nathan Torkington
(Nathan.Torkington@vuw.ac.nz) developed
html2latex from the HTML parser in NCSA's
Xmosaic package.
The program takes an HTML file and generates a LaTeX file from it.
The conversion code is subject to NCSA restrictions, but the whole
source is available as support/html2latex.
Michel Goossens and Janne Saarela published a very useful summary of
SGML, and of public domain tools for writing and manipulating it, in
TUGboat 16(2).
TeX is a typesetting language, not a markup system, so converting arbitrary TeX to HTML is hard.
With properly-used LaTeX, you may be luckier,
but don't expect a free lunch. Remember that a) if you want a really
good Web document, you had better redesign it from scratch, and b) HTML
(even HTML3) has pretty poor `typesetting' facilities, and anything
beyond the trivial will probably need to end up a graphic.
LaTeX2HTML (support/latex2html) is a package (mostly of Perl scripts)
by Nikos Drakos that breaks up a LaTeX document
into one or more components, and links them together so that they can
be read over the World-Wide Web as a hypertext document.
It defines a mapping between LaTeX intra-document
references and hyperlinks, and extends the
mechanisms to permit reference to other (possibly remote) documents
and other Internet resources. It translates LaTeX accented and
other characters (as best it can) to things that World-Wide Web
browsers can display, and translates mathematics
(and other things that browsers can't deal with) to
images that can be loaded in-line into the hypertext document.
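To give a feel for the character translation step, here is a small Python sketch that maps a few LaTeX accent commands to HTML character entities. It is not taken from LaTeX2HTML (which is written in Perl); the names ACCENTS and accents_to_entities are invented for the example, and only five accent commands applied to single letters are covered.

```python
import re
from html.entities import name2codepoint

# LaTeX accent command character -> suffix of the HTML entity name.
ACCENTS = {"'": "acute", "`": "grave", "^": "circ", '"': "uml", "~": "tilde"}

# Matches \'e as well as \'{e}; the letter is in group 2 or group 3.
_ACCENT_RE = re.compile(r"""\\(['`^"~])\s*(?:\{([A-Za-z])\}|([A-Za-z]))""")


def accents_to_entities(text: str) -> str:
    """Replace simple accent commands with HTML entities where one exists."""
    def replace(match):
        letter = match.group(2) or match.group(3)
        name = letter + ACCENTS[match.group(1)]
        # Leave the input alone if HTML has no entity of that name.
        return "&%s;" % name if name in name2codepoint else match.group(0)
    return _ACCENT_RE.sub(replace, text)
```

Combinations with no named entity are passed through unchanged, which is one way of doing "as best it can".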
LaTeX2HTML needs Perl, the PBM utilities,
dvips, Ghostscript, and other sundries; it
assumes it is running on a Unix system.
Michel Goossens and Janne Saarela published a detailed discussion of
LaTeX2HTML, and how to tailor it, in TUGboat 16(2).
There are two alternative strategies:
Free-standing LaTeX to HTML translations. Hard, but
not impossible. Julian Smart's latex2rtf (available from
support/latex2rtf) does a plausible job on a subset of LaTeX;
Writing an HTML-output backend in LaTeX itself. See
Sebastian Rahtz' paper in TUGboat 16(3) for a discussion of how
to go about this for the general case of SGML.
Alternatively, consider other ways of publishing the document:
Rewrite your document using Texinfo
(see Texinfo macro package), and
convert that to HTML;
Look at Adobe Acrobat, an electronic delivery system guaranteed
to preserve your typesetting perfectly.
See Making Acrobat documents from LaTeX;
Invest in the HyperTeX conventions (standardised \special
commands); there are supporting macro packages for plain TeX and
LaTeX.
The HyperTeX project aims to extend the functionality of all the
LaTeX cross-referencing commands (including the table of contents)
to produce \special commands which are parsed by DVI processors
conforming to the HyperTeX guidelines;
it provides general hypertext links, including those
to external documents.
The HyperTeX specification says that conformant viewers/translators
must recognize the following set of \special commands:
href:       html:<a href = "href_string">
name:       html:<a name = "name_string">
end:        html:</a>
image:      html:<img src = "href_string">
base_name:  html:<base href = "href_string">
The href, name and end commands are used to do
the basic hypertext operations of establishing links between sections
of documents.
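The recognition step that a conformant DVI processor has to perform can be sketched as follows. This is an illustration in Python, not code from xdvi or dvips: the function classify_special and the pattern table are invented for the example, and the tolerance for spacing and letter case is an assumption rather than something taken from the specification.

```python
import re

# One pattern per command in the HyperTeX set, applied to the text after "html:".
_PATTERNS = [
    ("href", re.compile(r'<a\s+href\s*=\s*"([^"]*)"\s*>', re.IGNORECASE)),
    ("name", re.compile(r'<a\s+name\s*=\s*"([^"]*)"\s*>', re.IGNORECASE)),
    ("end", re.compile(r"</a\s*>", re.IGNORECASE)),
    ("image", re.compile(r'<img\s+src\s*=\s*"([^"]*)"\s*>', re.IGNORECASE)),
    ("base_name", re.compile(r'<base\s+href\s*=\s*"([^"]*)"\s*>', re.IGNORECASE)),
]


def classify_special(special: str):
    """Return (command, argument) for a HyperTeX \\special string, else None."""
    if not special.startswith("html:"):
        return None  # some other driver's \special; not our business
    body = special[len("html:"):].strip()
    for command, pattern in _PATTERNS:
        match = pattern.fullmatch(body)
        if match:
            # The "end" pattern has no capture group, hence no argument.
            return (command, match.group(1) if match.groups() else None)
    return None
```

A viewer would typically open a link on href, record a target on name, and close whichever of the two is open on end.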
Further details are available on http://xxx.lanl.gov/hypertex/; there
are two commonly-used implementations of the specification, a
modified xdvi and (recent releases of) dvips. Output from the
latter may be used in recent releases of Ghostscript or Acrobat Distiller.
There are three general routes to Acrobat output: Adobe's original
`distillation' route (via PostScript output), conversion of a
DVI file, and the use of a direct PDF generator such as
PDFTeX (see the PDFTeX project) or
MicroPress's VTeX (see
commercial TeX implementations).
For simple documents (with no hyper-references), you can do any of the following:
process the document in the normal way, produce PostScript
output and distill it;
(on a Windows or Macintosh machine with Acrobat Exchange installed)
pass the output through the PDFwriter in place of a printer driver.
This route is a dead end: the PDFwriter cannot create hyperlinks;
process the document in the normal way and process the
DVI with dvipdfm (available from
dviware/dvipdfm, and on the latest TeX-live disc); or
process the document direct to PDF with PDFTeX or
VTeX. PDFTeX has
the advantage of availability for a wide range of platforms; VTeX
(available commercially for Windows, or free of charge for Linux from
systems/linux/micropress) has wider graphics capability, dealing with
encapsulated PostScript and some in-line PostScript.
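The command-line routes above can be summarised as command sequences. The Python sketch below only builds the commands; it does not run them. The function name pdf_route_commands and the route labels are invented for the example, and it assumes that the programs are installed under the names latex, dvips, dvipdfm and pdflatex, with Ghostscript's ps2pdf script standing in for the distilling step.

```python
from pathlib import Path


def pdf_route_commands(tex_file: str, route: str):
    """Return the list of commands for one route from LaTeX source to PDF."""
    stem = Path(tex_file).stem
    if route == "distill":
        # LaTeX to DVI, DVI to PostScript, then distill the PostScript.
        return [
            ["latex", tex_file],
            ["dvips", "-o", stem + ".ps", stem + ".dvi"],
            ["ps2pdf", stem + ".ps", stem + ".pdf"],
        ]
    if route == "dvipdfm":
        # LaTeX to DVI, then convert the DVI file directly.
        return [["latex", tex_file], ["dvipdfm", stem + ".dvi"]]
    if route == "pdftex":
        # Direct PDF generation.
        return [["pdflatex", tex_file]]
    raise ValueError("unknown route: " + route)
```

Each command list could be passed to subprocess.run in turn; documents with cross-references would need the LaTeX step repeated until the references settle.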
To translate all the LaTeX cross-referencing into Acrobat
links, you need a LaTeX package to suitably redefine
the internal commands. There are two of these for LaTeX, both
capable of conforming to the HyperTeX specification
(see Making hypertext documents from TeX):
Sebastian Rahtz's hyperref (available from
macros/latex/contrib/supported/hyperref), and Michael Mehlich's
hyper (available from macros/latex/contrib/supported/hyper). Hyperref
uses a configuration file to determine how it will generate hypertext;
it can operate using PDFTeX primitives, the HyperTeX
\specials, or DVI driver-specific \special commands.
Either dvips or Y&Y's DVIPSONE may be used
to translate the DVI into PostScript acceptable to
Distiller.
There is no free implementation of all of Adobe Distiller's
functionality, but Ghostscript (version 4.00 onwards)
provides some restricted distilling capability (note the restrictions
on the fonts it can use). However, Distiller itself is now
remarkably cheap (for academics at least).
For viewing (and printing) the resulting files, Adobe's
Acrobat Reader is available for a wide
range of platforms (see
ftp://ftp.adobe.com/pub/adobe/acrobatreader). For those
platforms for which Adobe's reader is unavailable,
Ghostscript (versions 3.51 onwards) can display
and print PDF files.