From:  Fabrice Estiévenart <fe@cetic.be>
Date:  15 Mar 2005 17:35:44 Hong Kong Time
Newsgroup:  news.mozilla.org/netscape.public.mozilla.editor
Subject:  Re: HTML 4.01 formatter

NNTP-Posting-Host:  cetic.customer.charleroi.belnet.net

Could you please give me some tips on writing and building a stand-alone 
program that converts HTML files into well-formed XML files using the 
same converter as Mozilla?

Fabrice

Akkana wrote:
> Fabrice writes:
> 
>>The editor includes a nice HTML 4.01 formatter (much better than 
>>Tidy). How can I use this module independently (and automatically) on 
>>many ugly HTML files? (A kind of batch mode that would take a list of 
>>HTML documents as input and return the files reformatted and renamed.)
> 
> 
> Take a look at the serializer tests in
> parser/htmlparser/tests/outsinks.  You may be able to write a
> standalone program similar to Convert.cpp in that directory,
> which reads in an html file then prints it out with the formatting
> flag nsIDocumentEncoder::OutputFormatted set (OutputFormatted is 2 --
> see content/base/public/nsIDocumentEncoder.h for the available flags
> and their meanings).
> 
> In fact, the TestOutput program in that directory might already do
> what you want, using a command like:
> 
>   TestOutput -i text/html -o text/html -f 2 infile.html
> 
> 	...Akkana
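
For the batch mode asked about earlier in the thread, a small wrapper
around the TestOutput command quoted above might be enough. What follows
is only a sketch under stated assumptions: that a TestOutput binary built
from the Mozilla tree is on the PATH, that it prints the reformatted
document on standard output, and that it returns a non-zero exit code on
failure. None of this is confirmed in the thread. The function names and
the ".formatted" renaming scheme are hypothetical; only the -i, -o and -f
arguments come from the quoted command.

```python
import subprocess
import sys
from pathlib import Path


def build_command(infile, flags=2, testoutput="TestOutput"):
    """Build the argument list for one run, mirroring the quoted command.

    flags=2 is nsIDocumentEncoder::OutputFormatted, as stated in the thread.
    """
    return [testoutput, "-i", "text/html", "-o", "text/html",
            "-f", str(flags), str(infile)]


def output_name(infile, suffix=".formatted"):
    """Hypothetical renaming scheme: page.html -> page.formatted.html."""
    path = Path(infile)
    return path.with_name(path.stem + suffix + path.suffix)


def reformat_all(infiles, run=subprocess.run):
    """Run TestOutput on each file and return the paths that were written."""
    written = []
    for infile in infiles:
        # Assumption: the reformatted document arrives on stdout.
        result = run(build_command(infile), capture_output=True)
        if result.returncode != 0:
            # Assumption: a non-zero exit code signals failure.
            print(f"skipped {infile}: exit code {result.returncode}",
                  file=sys.stderr)
            continue
        target = output_name(infile)
        target.write_bytes(result.stdout)
        written.append(target)
    return written


if __name__ == "__main__":
    # Usage (hypothetical script name): python reformat_batch.py a.html b.html
    reformat_all(sys.argv[1:])
```

The originals are left untouched and each result is written next to its
source file, so a bad run can simply be deleted and repeated.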