[personal profile] ffutures
While my aim with the Tooth and Claw RPG is to publish it primarily as a PDF, I also want to do an HTML release. But I'd really prefer not to go to the hassle of coding it by hand. Currently it's a Word document, and getting the PDF is really easy: I just print to Acrobat. Getting HTML appears to be a nightmare by comparison, at least if I use Word.

Is there an alternative that'll give me nice clean HTML without the huge amount of crud that Word puts in, e.g. micromanaging the position of every letter? But without mangling the page layout too badly?

Just tried OpenOffice - it is definitely NOT the answer to this one; the layout was awful.

I have a very vague memory of a program called Stripper that did something like this. Does anyone know if it's still around?

Date: 2007-10-11 08:20 pm (UTC)
From: [identity profile] pengshui-master.livejournal.com
I used abiword to generate HTML the last time I needed to do this, but I didn't need anything special from the layout.

There was still a bit of crud - but it was quite easy to strip out. Unfortunately I don't think abiword runs on Windows.

How many files are there? It might be possible to do it by hand/custom perl relatively easily.
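
As a rough illustration of the kind of custom clean-up script suggested above, here is a minimal sketch in Python rather than Perl. The function name `strip_word_crud` is made up for this example, and the patterns only cover a few common kinds of Word-generated markup (conditional comments, `o:`-style namespaced tags, `Mso*` classes, and `mso-` inline styles). It assumes double-quoted attributes and will not preserve complex layout such as drop caps or sidebars.

```python
import re


def strip_word_crud(html: str) -> str:
    """Remove some common Word-specific markup from an HTML string.

    Hypothetical helper for illustration only; regexes are a blunt tool
    and a real document may need extra rules or a proper HTML parser.
    """
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Drop namespaced tags like <o:p> and </o:p>, keeping any text inside
    html = re.sub(r"</?[a-z]+:[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop class attributes that Word generates (MsoNormal, MsoBodyText, ...)
    html = re.sub(r'\s+class="?Mso[^"\s>]*"?', "", html, flags=re.IGNORECASE)
    # Drop inline style attributes that contain mso-* declarations
    html = re.sub(r'\s+style="[^"]*mso-[^"]*"', "", html, flags=re.IGNORECASE)
    return html


if __name__ == "__main__":
    sample = ('<p class="MsoNormal" style="mso-pagination:none">'
              'Hello<o:p></o:p></p>')
    print(strip_word_crud(sample))
```

Note that the last rule throws away the whole `style` attribute whenever it contains an `mso-` declaration, so any ordinary CSS in the same attribute is lost too. That is a deliberate simplification; a gentler version would remove only the `mso-` declarations.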

Date: 2007-10-11 08:25 pm (UTC)
From: [identity profile] ffutures.livejournal.com
I can do it by hand OK, it's just a bit of a pain. I have abiword on my iBook, but it's a big document and the layout is quite complex - drop caps, sidebars, etc. - so I suspect it won't be very simple.

Date: 2007-10-11 11:12 pm (UTC)
From: [identity profile] vincentursus.livejournal.com
I have used a utility called 'Tidy' to do that in the past. http://www.w3.org/People/Raggett/tidy/

Date: 2007-10-12 06:28 am (UTC)
From: [identity profile] ffutures.livejournal.com
Thanks, I'll take a look tonight.

Date: 2007-10-12 12:00 am (UTC)
From: [identity profile] jgracio.livejournal.com
Have you tried saving it as "Web Page, Filtered"?

Still leaves some crud, but all Office-specific tags should be removed, leaving fairly standard HTML.

Date: 2007-10-12 06:27 am (UTC)
From: [identity profile] ffutures.livejournal.com
Yes - the results were NOT good. But I may use this for the first pass.

Date: 2007-10-12 12:27 pm (UTC)
From: [identity profile] akicif.livejournal.com
The original version of 1stpage from evrsoft has a very nice implementation of Dave Raggett's Tidy that degrots Office HTML and does a nice conversion to using stylesheets....

Date: 2007-10-12 02:15 pm (UTC)
From: [identity profile] ffutures.livejournal.com
Thanks, I'll check it out.
