Saturday, August 1, 2009

From LaTeX to MS Word

LaTeX is not necessarily a standard format; for example, the working paper series of many institutions require MS Word documents. There are good and expensive converters (LaTeX to MS Word and vice versa) as well as a host of free tools (see here). One approach that I have found to work particularly well is to convert first from LaTeX to HTML, and then from HTML to MS Word, or to the word processor of your choice. Here is how:

1. From LaTeX to HTML: Use TeX4ht (LaTeX and TeX for Hypertext). TeX4ht is part of the MiKTeX distribution, so chances are that it's already on your machine. Just run a command like this:

> htlatex filename "html,word" "symbol/!" "-cvalidate"

If it's the first time, it will install itself and then start the conversion. This command is tuned towards MS Word: the output relies on bitmaps for mathematical formulas. To convert formulas and pictures to bitmaps correctly, TeX4ht needs ImageMagick (which, by the way, is very useful for converting between different formats). I had to modify tex4ht.env in the following way:

#Gconvert -trim +repage -density 110x110 -transparent "#FFFFFF" zz%%4.ps %%3
Gc:"\Program Files\ImageMagick-6.5.1-Q16\convert" -trim +repage -density 110x110 -transparent "#FFFFFF" zz%%4.ps %%3
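
If you run the conversion often, the htlatex call from step 1 can be wrapped in a small shell function. This is only a sketch, assuming a POSIX shell (on Windows, something like Cygwin); the function name latex_to_word_html is made up for illustration and is not part of TeX4ht. It accepts the file name with or without the .tex extension and refuses to run if the source file is missing.

```shell
#!/bin/sh
# Hypothetical helper (not part of TeX4ht): run htlatex with the
# Word-oriented options used in step 1.
# Usage: latex_to_word_html paper.tex   (or just: paper)
latex_to_word_html() {
    base="${1%.tex}"                  # accept "paper" or "paper.tex"
    if [ ! -f "$base.tex" ]; then
        echo "error: $base.tex not found" >&2
        return 1
    fi
    # Same arguments as the command in step 1.
    htlatex "$base" "html,word" "symbol/!" "-cvalidate"
}
```

Calling latex_to_word_html paper.tex from the directory that holds paper.tex should then behave like typing the original command by hand.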


2. From HTML to MS Word: Open your HTML document in MS Word and be careful to follow these instructions to be sure that your images (math and figures, basically) get embedded. If you do not, they will be lost when you keep only the .doc file. Don't blame that on me.
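
Before opening the HTML file in Word, it may be worth checking that every image it refers to is actually on disk. The following is a rough sketch, not something TeX4ht provides: missing_images is a hypothetical name, and the code assumes that images appear as src="..." attributes in double quotes with paths relative to the HTML file. It needs a grep that supports -o (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper: list images referenced by an HTML file that are
# missing on disk. Prints one path per missing image, nothing otherwise.
missing_images() {
    html="$1"
    dir=$(dirname "$html")
    # Extract the value of every src="..." attribute, one per line,
    # then test each path relative to the HTML file's directory.
    grep -o 'src="[^"]*"' "$html" | sed 's/^src="//; s/"$//' |
    while IFS= read -r img; do
        [ -f "$dir/$img" ] || echo "$img"
    done
}
```

If missing_images filename.html prints nothing, all referenced bitmaps were found next to the HTML file.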

3. For Ubuntu users:
$ sudo apt-get install tex4ht
$ sudo apt-get install dvipng
Use the same command as above:
$ htlatex filename "html,word" "symbol/!" "-cvalidate"

And step 2 is unchanged.
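
If the conversion fails on Ubuntu, a quick sanity check is to confirm that both tools ended up on your PATH. Here is a minimal sketch; check_tools is a hypothetical helper name, and it assumes that the two packages above provide commands called htlatex and dvipng.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are not on PATH.
# Prints one line per missing command; returns 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed command names for the tex4ht and dvipng packages.
check_tools htlatex dvipng || echo "Install the packages listed above first."
```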

1 comment:

  1. I would just like to mention that I appreciate the additional Linux instructions :)
