This paper explains how to set up TeX4ht, Eitan Gurari’s TeX/LaTeX-to-hypertext translation system, when the underlying TeX system is MiKTeX for Win32 platforms. Thanks to Eitan for answering lots of TeX4ht questions and to Steve Mayer for extensive tests and valuable suggestions regarding htrun.
This section lists changes and updates to the instructions, for easy reference.
The current TeX4ht distribution (wfiles.zip) and the glyphgif.bat in the MiKTeX utilities available below both use -transparent, which matches ImageMagick version 5.4.3 or later.
Finally, note that it is an especially good idea to upgrade TeX4ht itself from the Bug Fixes page of TeX4ht (as recommended in the detailed instructions below). There are a couple of issues connected with LaTeX itself that have been resolved.
These instructions assume that you have already installed MiKTeX 2.1 to its default location, c:\texmf. In addition, the instructions here assume that you told the MiKTeX install routine to add MiKTeX’s bin directory to your path.
Starting with version 2.0, the procedure for updating MiKTeX’s database is slightly different. You can do it by:
MiKTeX 2.0 was something of an anomaly because it installed by default to c:\Program Files\miktex, and because this directory name contains a space, installation of TeX4ht was more complicated. As noted above, version 2.1 reverts to the standard c:\texmf location of earlier versions; but you can, if you want, still install to a directory whose name contains a space. In this case, however, when you set up TeX4ht, you will need to use the “short form” of these directories, as explained below.
These instructions support the TeX4ht version released February 18, 2001 (or later). If you have an earlier version, you should upgrade.
These instructions apply to version 5.20+ only. If you have an earlier version, you should upgrade.
Except for the default installation directory, these instructions will support GhostScript versions 6.0+.
Run the distribution’s executable, gs700w32.exe, which contains a built-in setup routine. You are asked where to install: I recommend that you accept the defaults, which will put the executables in c:\gs\gs7.00\bin. Do not uncheck the box for GhostScript’s fonts: you may need them. Do not install GhostScript to a directory whose path contains a space character, because TeX4ht will be unable to process it.
Unzip the distribution to c:\ preserving subdirectories. This currently places the executables in c:\ImageMagick. Do not install ImageMagick to a directory whose path contains a space character — TeX4ht will be unable to process it.
In order for ImageMagick to find its delegates file, you need to set the environment variable magick_delegate_path to point to the directory containing delegates.mgk, currently c:\ImageMagick. You must also set the variable magick_module_path to the location of modules.mgk, which will typically be the same place as delegates.mgk. You can let the batch files described in section 13 do it for you, or you can do it directly, as described in Appendix A.
Unzip wfiles.zip to c:\tex4ht preserving folders (subdirectories). This places the files in c:\tex4ht and subdirectories. If you retrieved the TeX4ht updates, unzip them to c:\tex4ht where the new files will overwrite the old ones.
Starting in about May of 2001, the update files are distributed with their read-only attribute set, and this can cause problems. However, some unzippers (e.g. WinZip) ignore this attribute, so you may not have to do anything. But you should check, as follows:
The TeX4ht style files must now be moved into MiKTeX’s LaTeX tree. (Plain TeX will also find them.)
Next, you need to configure tex4ht.env. If you installed MiKTeX 2.0 to its default location, or any version of MiKTeX to a directory whose name contains a space, skip to section 6.2.1.
Otherwise, open tex4ht.env in a text editor and:
Note that the c:\Localtexmf\fonts directory may not yet exist, if you’ve never needed to make additional metric files beyond those installed with MiKTeX. (You can force it to be made by running a short document at 12 points through LaTeX). Nonetheless, if c:\Localtexmf\ exists, you should add this t entry even if the actual directory referred to does not yet exist.
Save tex4ht.env.
Next, add TeX4ht’s directory (i.e. c:\tex4ht) to your path. You can do this permanently from the Windows NT Control Panel or by editing autoexec.bat in Windows 95/98 (note that if you edit autoexec.bat you will need to reboot your computer for it to take effect), or you can let the batch files described in section 13 do it for you on the fly.
Finally, copy the .tab files (ht, htlatex, httex, and httexi) to files with the extension .bat and move them to some directory in your path. Since MiKTeX’s \bin directory will normally already be in your path, that’s one possible choice.
This section provides instructions for setting up TeX4ht when the MiKTeX installation directories contain a space. In this case you must always use the “short form” of these directory names when setting up TeX4ht. You can check the short form of any file or directory by starting a DOS session and doing dir /p in Win 95/98/ME or dir /x/p in Win NT4/Win2k.
Open tex4ht.env in a text editor and:
Note that the c:\Local TeXMF\fonts directory may not yet exist, if you’ve never needed to make additional metric files beyond those installed with MiKTeX. (You can force it to be made by running a short document at 12 points through LaTeX). Nonetheless, if c:\Local TeXMF\ exists, you should add this t entry even if the actual directory referred to does not yet exist.
Save tex4ht.env.
Next, add TeX4ht’s directory (i.e. c:\tex4ht) to your path. You can do this permanently from the Windows NT Control Panel or by editing autoexec.bat in Windows 95/98 (note that if you edit autoexec.bat you will need to reboot your computer for it to take effect), or you can let the batch files described in section 13 do it for you on the fly.
Finally, copy the .tab files (ht, htlatex, httex, and httexi) to files with the extension .bat and move them to some directory in your path. Since MiKTeX’s \bin directory will normally already be in your path, that’s one possible choice.
When TeX4ht needs to represent a single character not present in the default viewing font, it produces a gif image using GhostScript and ImageMagick. These single-character images — call them “glyph-gifs” — never vary with the document being processed, and TeX4ht is intelligent enough to know that if they are present in the current directory, there’s no reason to re-generate them.
TeX4ht also provides a way to use different programs to create glyph-gifs and other gifs, which opens up the possibility of more sophisticated handling of glyph-gifs. You do this by setting up an F script in tex4ht.env. If you do this, the G script (entries beginning with G) will control other gifs, while the F script will be invoked for glyph-gifs.
The simplest way to set up an F script is to have it call a batch file, say glyphgif.bat to do the work. To do this, just add a line
to tex4ht.env, being sure to use doubled %’s.
The next step is to create glyphgif.bat itself in c:\tex4ht. A sample glyphgif.bat, which reflects the set-up described in this document, is included in the TeX4ht-MiKTeX supplemental utilities, which you can download from here. As you can see, what it does is:
To set up this facility:
Warning: There is one downside to using the cache setup. Suppose you need to generate a glyph-gif, but for some reason the gif generation (via ImageMagick + GhostScript) fails. You now have an incorrect (typically empty) glyph-gif file in your cache, and TeX4ht will continue to use it until you delete it. In fact, you must delete not only the incorrect gif in the cache, but also the version in your source directory, because the first thing glyphgif.bat will do upon finding the gif in your source directory is put it back into the cache, and you’re back where you started. This can be very confusing unless you’re alert to what can happen. You can (temporarily) stop the F script from running by opening tex4ht.env and putting a space before the character F on the line that calls the F script. This forces all glyph-gifs to be made in the normal way, and the cache is not consulted or updated.
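The clean-up described in the warning can be sketched in Python. This is only an illustration, not part of the TeX4ht or supplemental distributions: the function name is made up, and the cache and source directories are passed in as parameters because their locations depend on your own setup.

```python
import os

def purge_bad_glyph(gif_name, cache_dir, source_dir):
    """Delete a bad glyph-gif from BOTH the cache and the source directory.

    Removing only the cached copy is not enough: according to the warning,
    glyphgif.bat would copy the bad file from the source directory back
    into the cache on the next run.
    Returns the list of paths that were actually removed.
    """
    removed = []
    for directory in (cache_dir, source_dir):
        path = os.path.join(directory, gif_name)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

After purging, re-running TeX4ht should cause the glyph-gif to be generated afresh, provided the underlying GhostScript/ImageMagick failure has been fixed.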
Section 13 explains how to obtain and use some utilities which may simplify your use of TeX4ht. You may want to glance at that now. In addition, you might want to take a look at Steve Mayer’s TeXConverter, a GUI interface to TeX4ht and other conversion programs. It is available from here. The TeXConverter assumes you have a working MiKTeX+TeX4ht setup, so you should check out the installation as described below, before getting it.
The TeX4ht distribution contains two test files. Each creates an HTML file containing a single line of text, one single-character gif (a “glyph-gif”), and one multi-character gif. To run these tests, assuming that you’ve set up all environment and path variables correctly:
The TeX compiler will run, and there will be two calls to ImageMagick/GhostScript to generate gifs.
The TeX compiler will run, but because the glyph-gif will already be present, there will be only one call to ImageMagick/GhostScript.
If something goes wrong, see the next section.
If TeX4ht seems unable to find a .tfm file and you are using MiKTeX 2.0, did you remember to use the short filenames when you configured the t entries of tex4ht.env? See section 6.2.
Under Win95/98/ME, LaTeX can crash if it is called from a batch file with Unix line-endings, and the supplied batch files, such as htlatex.bat (which you copied from htlatex.tab), may have them. The symptom is a message beginning This program has performed an illegal operation and will be shut down. The fix is to convert these into files with Windows/DOS line-endings. The utility unix2win.exe, available here in unix2win.zip, can do this for you. (The .zip archive is also included in the supplemental utilities, described below.) To use this, in the case of htlatex:
The batch file htlatex.bat should now be OK.
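If you do not have unix2win.exe to hand, the same kind of conversion can be sketched in a few lines of Python. This is an illustration of the idea only, not the implementation of unix2win.exe; the function names are invented for this example.

```python
def to_dos_line_endings(data):
    """Return the bytes in `data` with every line ending turned into CR+LF."""
    # Normalise existing CR+LF pairs to LF first, so they are not doubled.
    unix = data.replace(b"\r\n", b"\n")
    return unix.replace(b"\n", b"\r\n")

def convert_file(path):
    """Rewrite `path` in place with DOS line endings; return True if it changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    converted = to_dos_line_endings(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
    return converted != original
```

For example, convert_file("htlatex.bat") would rewrite that batch file only if it actually contained Unix line-endings.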
Other things to check include:
If you think that the post-processing (GhostScript + ImageMagick) isn’t getting the right parameters, you can check this by adding lines to the G script, something like the following:
Gecho parameter 1 is: %%1 parameter 2 is: %%2
Gecho parameter 3 is: %%3 parameter 4 is: %%4
(This also illustrates that you can make these scripts do almost anything you like).
If all else fails, I’m willing to try to help get the TeX4ht system set up. (This is distinct from general TeX4ht support: for that you should contact Eitan Gurari). Here’s what you need to do.
where xht is the full TeX4ht command, beginning with the name of the appropriate batch file. You should not see any output. If you get a message saying The VDM redirector is already loaded, then repeat the command including the full path to redir (e.g. c:\mydir\redir -o viton.txt -eo xht).
This creates the file viton.txt which contains (most of) the output generated by TeX4ht and its helper applications.
This section contains details of what needs to be changed if you upgrade parts of the system.
The following files contain references to GhostScript’s location:
If you upgrade ImageMagick, the following files contain references to ImageMagick’s location:
To install an upgrade file, unzip the upgrade distribution to c:\tex4ht, preserving subdirectories. Then examine this directory.
The file tex4ht-miktex.zip contains several supplemental utilities which may help you to run TeX4ht more easily or debug a setup that is not performing correctly. Note that except for htcmd.exe these are not part of Eitan Gurari’s distribution. If they appear to be going wrong, you should contact me, and not Eitan Gurari. You can download the utilities here. Unzip the files to a temporary directory.
The archive redir.zip contains an aid to debugging the installation; its use is discussed in section 9. Extract redir.exe to some directory in your path. The documentation is in redir.doc; you can delete redir.c.
glyphgif.bat is a sample batch file used in conjunction with the F scripts, discussed in section 6.3. Place it in c:\tex4ht.
htcmd.exe is by Eitan Gurari, and handles certain problems associated with passing directory names to the TeX4ht system on the command line. It is used in all the batch files in this archive, so you should move it to your TeX4ht directory, c:\tex4ht.
unix2win.zip contains a program which will convert a file with Unix line-endings to one with DOS/Windows line-endings: see section 8.
The batch files in the TeX4ht distribution assume that TeX4ht is in your path and that the necessary environment variables are already set up. I’ve created an alternative set of batch files which change your path, set up the appropriate environment variables, run TeX4ht, and then put everything back as it was.
The batch files have the same syntax as the distributed files, and have names ending in “m”, as in the following table:
Original batch file | Replacement | For Processing
htlatex.bat | htlatexm.bat | LaTeX without explicit TeX4ht package
httex.bat | httexm.bat | Plain-TeX
httexi.bat | httexim.bat | TeXInfo
ht.bat | htm.bat | LaTeX with explicit TeX4ht package
One possible drawback to these batch files is that they use up environment space, and under Windows 95/98/ME you may exceed the size of the area that the operating system reserves for these strings (this will never happen under Windows NT/2K). The symptom is a message Out of environment space: see Appendix A for help on dealing with this.
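The idea behind the replacement batch files (extend the path, set the ImageMagick variables, run TeX4ht, leave everything as it was) can also be sketched in Python. This is not how the supplied batch files are written; it is a minimal illustration that assumes the directories used in this document (c:\tex4ht and c:\ImageMagick), and the function names are made up. Because the child process receives a modified copy of the environment, nothing has to be restored afterwards.

```python
import os
import subprocess

def tex4ht_environment(base, tex4ht_dir=r"c:\tex4ht", magick_dir=r"c:\ImageMagick"):
    """Return a copy of `base` with the TeX4ht path and ImageMagick variables added."""
    env = dict(base)  # work on a copy; the caller's environment is untouched
    old_path = env.get("PATH", "")
    env["PATH"] = tex4ht_dir + (";" + old_path if old_path else "")
    # Variable names are case-insensitive on Windows; upper case is used here.
    env["MAGICK_DELEGATE_PATH"] = magick_dir
    env["MAGICK_MODULE_PATH"] = magick_dir
    return env

def run_with_tex4ht(command):
    """Run a TeX4ht command line (a string) in the temporary environment."""
    return subprocess.call(command, shell=True, env=tex4ht_environment(os.environ))
```

A call such as run_with_tex4ht("htlatex myfile"), where myfile is a hypothetical source file, would then run the distributed batch file without any permanent change to your path or environment.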
To set up the batch files:
You can test the new files by starting a DOS session, switching to the tex4ht directory, and repeating the tests from section 7 as follows:
The results should be the same as with the previous tests.
Previous versions of TeX4ht required that you explicitly load the tex4ht package in your source. The latest version of TeX4ht makes this unnecessary: the ht*.bat (or ht*m.bat) batch files take care of the details for you. However, you have to run a different batch file for each “type” of source — LaTeX, plain-TeX, TeXinfo, etc. It would be convenient to be able to call the appropriate batch file automatically, depending on what type of document your source is, and this is what htrun.exe tries to do. The program reads your source, deduces from the contents what kind of file it is, and then calls the appropriate batch file.
htrun can also help if you want to use TeX4ht to produce alternative forms of hypertext, such as XML and/or MathML. See Appendix B.1.
The htrun setup consists of two files, htrun.exe and htrun.ini, plus additional batch files, as follows:
Batch File | For Processing
htlatexm.bat | LaTeX without explicit TeX4ht package
httexm.bat | Plain-TeX
httexim.bat | TeXInfo
htlatexpm.bat | LaTeX with explicit TeX4ht package
htlatexsm.bat | SWP LaTeX without explicit TeX4ht package (see note below)
htswpm.bat | SWP LaTeX with explicit TeX4ht package (see note below)
htlatex2e.bat | LaTeX2e document without explicit TeX4ht package
htlatex209.bat | LaTeX-2.09 document without explicit TeX4ht package
Note: The two SWP batch files are for processing Scientific Word/WorkPlace/Notebook documents, and require special setups; see Appendix B.2.
To install htrun:
If you need to process Scientific Word/WorkPlace/Notebook documents, go to Appendix B.2. Once this is done, you can use it in exactly the same way you’d use any of the batch files, namely
where the [optionsn] are as described on the TeX4ht web site, and may be omitted if you want standard processing. You can get some information on what the program’s doing by running htrun with no arguments. See also Appendix B.3.
Once htrun and all batch files are in your path you can repeat the tests: start a DOS session, switch to c:\tex4ht and run
These should give the same results as before, and you should see a message on your screen telling you which batch file is being run, based on what htrun has decided about your source. You can set up htrun to pause after displaying this information; see Appendix B for details.
This appendix explains two aspects of dealing with the Windows environment: how to set environment variables, and how to increase the size of the environment.
For purposes of illustration, we assume that you want to set the environment variable delegate_path to the value c:\ImageMagick\ImageMagick.
If you run out of environment space (the symptom is the message Out of Environment Space) you will need to increase the environment size. Note that this happens only with Windows 95/98/ME: it will never happen under Windows NT or Win2K. Here’s how to fix the problem.
TeX4ht is an extremely flexible system, and supports translation of TeX and LaTeX to a variety of other output formats beyond HTML. htrun can support these, too. The distribution comes with the following sets of batch files for additional translations, each in its own archive (entries in the Prefix column are discussed later):
Output type | Archive | Prefix
XHTML+bitmap math | xhtml.zip | xh
XHTML+MathML | mathml.zip | xhm
XHTML+MathML on Mozilla | mozilla.zip | mz
Each zip file contains a version of htrun.ini which is identical to the one supplied with the main (HTML) setup. To enable any of these translations:
To generate hypertext in any of these other formats, you must specify a “Translation Prefix” on the htrun command line. You do this with an entry of the form -xx, which must appear on your command line before the name of the source file. For example:
Command Line | Result
htrun myfile | translate myfile to HTML (this is the default)
htrun -ht myfile | same as above
htrun -mz myfile | translate myfile to XHTML+MathML/Mozilla output
The effect of the Translation Prefix is that we run the batch file listed in htrun.ini, but with the file name prefixed by the Prefix. For example, if the .ini file contains an entry to run latex.bat, and your Prefix was -mz, we would actually run mzlatex.bat. If you do not provide a Prefix, we default to ht, which translates to HTML. The Prefixes with associated batch files are shown in the table at the beginning of the sub-section; however to enable another Prefix all you need to do is supply the appropriately named batch file.
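The prefix rule is simple enough to sketch in Python. This is an illustration of the naming scheme as described here, not htrun’s actual code; the function names are invented, and the sketch assumes that only the first argument can carry a Translation Prefix.

```python
DEFAULT_PREFIX = "ht"  # no prefix on the command line means HTML output

def split_prefix(args):
    """Split htrun-style arguments into (prefix, remaining arguments)."""
    args = list(args)
    # An entry of the form -xx before the source file name selects the prefix.
    if args and args[0].startswith("-") and len(args[0]) > 1:
        return args[0][1:], args[1:]
    return DEFAULT_PREFIX, args

def batch_file_for(ini_entry, prefix=DEFAULT_PREFIX):
    """Prepend the prefix to the batch file name taken from htrun.ini."""
    return prefix + ini_entry
```

With these definitions, the arguments of htrun -mz myfile yield the prefix mz, and an .ini entry of latex.bat becomes mzlatex.bat, matching the example in the text.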
There is one last consideration to bear in mind. Suppose you’re using the package form of TeX4ht with the html option specified — that is, your source file contains a line like
You can certainly run this file through the version of htrun configured for one of the alternative output translators but you will still get HTML output (because that’s what the package options request). If you are seriously interested in alternative formats, clearly the no-package form of your source document is the one to use. Steve Mayer’s TeXConverter provides a nice alternative solution: if you request an output format with the “wrong” package options, the Converter will offer to (temporarily) remove the entire package line for you. Your file will then be processed by a batch file which does not expect an explicit package, and all will be well.
Support for Mackichan Software’s programs Scientific Word, Scientific WorkPlace or Scientific Notebook (all referred to here as SWP) is provided via two batch files, htlatexsm.bat and htswpm.bat. (They are also supported with other translations, as xxlatexsm.bat and xxswpm.bat.) As distributed, they simply report that SWP is not installed and quit. It is simple to enable at least basic SWP support: this will permit you to translate Scientific Notebook documents, and at least some documents produced by the other systems. However, to process many Scientific Word/WorkPlace documents (for example, documents which use the SWP Style Editor) you will need to obtain other LaTeX packages. Some of these are available at Mackichan Software’s FTP site, see below.
To enable basic SWP support:
Now open the batch files htlatexsm.bat and htswpm.bat
htrun needs two kinds of information in order to do its work: a set of document recognition strings, which it uses to determine the source type, and a list of batch files to run. Many of the strings, and all the batch files, are set up in the file htrun.ini and may be configured by the user. As distributed, htrun is set up as follows:
If we find (=default) | We decide that the source is | And call the batch file named in htrun.ini by
\begin{document} (*) and the string named in the .ini file by: | |
T4PACK (default ={tex4ht}) (**) | LaTeX w/ explicit TeX4ht package | LatexPackName
SWPPACK (default ={swpht}) (**) | SWP w/ explicit package | SWPackName
SWPINPUT (default =tcilatex) (***) | SWP w/o explicit package | SWPName
(none of the above) | LaTeX w/o explicit TeX4ht package | LatexName
The string named in the .ini file by: | |
TEXIINPUT (default =texinfo) (***) | TeXInfo | TexiName
CSNAME (default =tex4ht\endcsname) (****) | plain-TeX, suitable for processing | PlainName
(End of file, none of the above) | plain-TeX | (report an error)
Notes: (*) Not user-changeable. (**) Braces required. (***) As part of an \input statement. (****) Must be entered exactly as written.
Note that as distributed, the system assumes that you will be using the “replacement” batch files (names end in “m”). You can change any of these by editing the .ini file and altering the text to the right of the equal signs. Before running the selected batch file, htrun looks for it in your path, and will refuse to run if the file isn’t found.
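The recognition rules in the table can be summarised as a short Python sketch. This is a reading of the table, not htrun’s source code: it assumes that the whole file is searched for plain substrings, that the checks are made in the order the table lists them, and that an \input statement may or may not use braces. The function names are invented; the returned strings are the .ini entry names from the table.

```python
import re

# Default recognition strings, as listed in the table.
DEFAULTS = {
    "T4PACK": "{tex4ht}",
    "SWPPACK": "{swpht}",
    "SWPINPUT": "tcilatex",
    "TEXIINPUT": "texinfo",
    "CSNAME": "tex4ht\\endcsname",
}

def has_input(source, name):
    """True if `name` appears as the argument of an \\input statement."""
    pattern = r"\\input\s*\{?\s*" + re.escape(name)
    return re.search(pattern, source) is not None

def classify(source, strings=DEFAULTS):
    """Return the htrun.ini entry naming the batch file to run, or None."""
    if "\\begin{document}" in source:  # LaTeX-like source
        if strings["T4PACK"] in source:
            return "LatexPackName"
        if strings["SWPPACK"] in source:
            return "SWPackName"
        if has_input(source, strings["SWPINPUT"]):
            return "SWPName"
        return "LatexName"
    if has_input(source, strings["TEXIINPUT"]):
        return "TexiName"
    if strings["CSNAME"] in source:
        return "PlainName"
    return None  # plain-TeX without the CSNAME string: htrun reports an error
```

Because this sketch, like the strategy it illustrates, does no real TeX parsing, a recognition string that appears inside verbatim-like material is matched just like any other occurrence.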
The following options are supported in the [integers] section of htrun.ini (any omitted statement in this series is taken to be “off”, equivalent to =0):
Finally, note that htrun’s document-recognition strategy isn’t foolproof. Consider a plain-TeX file containing \input texinfo in a verbatim-like environment. This will be mistakenly recognized as a TeX4ht-processable TeXInfo file, instead of a plain-TeX file requiring some additional code (the string in CSNAME) before being processed by TeX4ht. Short of building a mini TeX parser, I don’t see how I can cope with this. Of course, LaTeX files, with their well-defined notion of a preamble, are much less susceptible to this sort of thing.