For years, I used Luke Francl’s Word Unmunger to strip the gunk out of Word-generated HTML files (an occupational hazard when you work in higher ed). I fell out of the habit when I started working more on application interfaces than on static files. Recently, though, I had another batch of files that needed cleaning, and I discovered that the Python script no longer worked on Leopard. Oh noes!
Fortunately, Luke is still awesome. Mere moments after I emailed him, he posted an updated version. I’ve updated my Automator script as well, adding a Growl notification and compatibility with the latest version of Automator.
I still haven’t found a better tool for dealing with Word HTML than Luke’s script. The Automator script just makes it a little easier to use if you aren’t comfortable with the command line. Enjoy!