Word HTML Cleaner

A tool that strips proprietary Microsoft tags and other cruft from Word HTML documents, leaving basic formatting intact. File sizes are greatly reduced, and the returned HTML is easier to read, revise and employ.

This tool is intended for simply styled text documents; there is no support for notes, sectioning, ‘widow’ and ‘orphan’ control, etc. Typographic quotes, proper dashes and other special characters, if present, are converted to HTML entities so that they display consistently across browsers and platforms. Links, tables and image references should come through intact. Everything else is stripped.
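To illustrate the kind of cleanup described above, here is a minimal Python sketch. It is not this tool's actual code: the function name `clean_word_html`, the `ENTITIES` table and the specific patterns removed are assumptions chosen for illustration. The sketch drops Office conditional comments, namespaced tags such as `<o:p>`, and `class`, `style` and `lang` attributes, then replaces a handful of typographic characters with named HTML entities.

```python
import re

# Hypothetical character-to-entity table; the real tool's list is not documented here.
ENTITIES = {
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote / apostrophe
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2013": "&ndash;",   # en dash
    "\u2014": "&mdash;",   # em dash
    "\u2026": "&hellip;",  # ellipsis
}


def clean_word_html(html: str) -> str:
    """Remove common Word cruft and encode typographic characters (illustrative only)."""
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Drop namespaced Office tags such as <o:p> and </o:p>, keeping any inner text
    html = re.sub(r"</?[a-zA-Z]+:[^>]*>", "", html)
    # Drop class, style and lang attributes (quoted or unquoted values)
    html = re.sub(
        r"\s+(?:class|style|lang)=(?:\"[^\"]*\"|'[^']*'|[^\s>]+)", "", html
    )
    # Replace typographic characters with named entities
    for char, entity in ENTITIES.items():
        html = html.replace(char, entity)
    return html


sample = (
    '<p class=MsoNormal style="margin:0"><o:p></o:p>'
    "\u201cHi\u201d \u2014 it\u2019s <b>bold</b></p>"
)
print(clean_word_html(sample))
```

Regular expressions keep the sketch short, but they can also match attribute-like text inside the document body; a real cleaner would more safely work on a parsed document tree.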

How to Use
In Word, save the document ‘as Web Page’ to your hard drive (the cleaner will not work with ordinary Word files).

Select the HTML file:


Notes regarding privacy