Stephanie Leary

Writer and WordPress consultant

Automator script for Word HTML cleanup

November 22, 2005 Stephanie Leary 3 Comments

So, the other Steph and I were kvetching earlier about the lousy Word HTML we have to clean up all day… and I remembered something. A long time ago, I’d tried to use AppleScript to make the Word Unmunger’s batch mode easier to use. At the time, AppleScript defeated me… but now it’s Automator, and it’s a lot better.

Voilà… the Word Unmunger Automator script. You’ll need to grab the Unmunger itself, of course, and edit the workflow so it points to wherever you saved the script. (I had renamed mine fix.py because I was constantly typing the file name in Terminal.)

Now if only Dreamweaver’s commands were available to Automator. See, the Unmunger sometimes can’t handle HTML from Word files created on a Mac, and running it through Dreamweaver’s Clean Up Word HTML command first solves the problem.

Oh well. This is still going to make my professional life a lot easier.

Update, July 2009

Luke has kindly updated the Unmunger to work in newer versions of Python, which means the script now works in Leopard. I’ve also added a Growl notification to let you know when the files are done.

So, here’s how to use this thing. Download the Automator script and the Unmunger. Open up the script in Automator and adjust the path in the last step so it points to wherever you saved the Unmunger. Save the workflow as an application. Run your new application, and choose the Word-created HTML file(s) you want to clean up. It will save over the originals, so make copies first if you want to keep the Word versions.
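If you'd rather skip Automator, the same batch run can be roughed out in a few lines of Python. This is only a sketch, not what the workflow itself contains. It assumes the Unmunger takes the HTML file's path as its single argument and rewrites the file in place, which you should check against the Unmunger's own usage notes. The `UNMUNGER` path, the `.bak` copies, and the function names are all made up for illustration.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Hypothetical location: point this at wherever you saved the Unmunger.
UNMUNGER = Path.home() / "bin" / "fix.py"


def html_files(paths):
    """Keep only the paths that look like HTML files."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in {".html", ".htm"}]


def build_command(script, html_file, python=sys.executable):
    """Assumed invocation: <python> <script> <file>.

    The Unmunger may need a different interpreter than the one
    running this sketch; pass it in via `python` if so.
    """
    return [str(python), str(script), str(html_file)]


def clean(paths, script=UNMUNGER, backup=True):
    """Run the Unmunger over each HTML file, optionally keeping a copy."""
    for html_file in html_files(paths):
        if backup:
            # Copy first, since the cleanup saves over the original.
            shutil.copy2(html_file, html_file.with_name(html_file.name + ".bak"))
        subprocess.run(build_command(script, html_file), check=True)


if __name__ == "__main__":
    clean(sys.argv[1:])
```

Saved as, say, batch_fix.py, it would be run from Terminal with the files to clean listed after the script name. The backup copy is there because the cleanup overwrites whatever you feed it.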

If you just need to fix one or two files, or you can’t run Python scripts, wordoff.org is a great alternative.

Macs, Techy Goodness, Web Design

Comments

  1. Phil Freo says

    December 20, 2008 at 7:00 pm

    The .zip file to the script is a 404. I’d love to see this once the link is updated.

  2. Stephanie says

    December 20, 2008 at 9:22 pm

    I seem to have misplaced the file entirely. I’ll keep looking….

  3. Stephanie says

    February 10, 2009 at 9:34 pm

    Found it! I’ve fixed the link.


Copyright © 2022 Stephanie Leary