From:  Fabrice Estiévenart
Date:  15 Mar 2005 17:35:44 Hong Kong Time

Re: HTML 4.01 formatter


Could you please give me some tips on writing and building a stand-alone 
program that converts HTML files into well-formed XML files using the 
same converter as Mozilla?


Akkana wrote:
> Fabrice writes:
>>The editor includes a nice HTML 4.01 formatter (much better than 
>>Tidy). Can I use this module independently (and automatically) on 
>>many ugly HTML files? (A kind of batch mode that would take a list of 
>>HTML documents as input and return the files reformatted and renamed.)
> Take a look at the serializer tests in
> parser/htmlparser/tests/outsinks.  You may be able to write a
> standalone program similar to Convert.cpp in that directory,
> which reads in an html file then prints it out with the formatting
> flag nsIDocumentEncoder::OutputFormatted set (OutputFormatted is 2 --
> see content/base/public/nsIDocumentEncoder.h for the available flags
> and their meanings).
> In fact, the TestOutput program in that directory might already do
> what you want, using a command like:
>   TestOutput -i text/html -o text/html -f 2 infile.html
> 	...Akkana
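
For the batch case asked about above, a small wrapper around the quoted
TestOutput command might be enough. The following is only a minimal sketch
under several assumptions: that a built TestOutput binary is on the PATH,
that it writes the serialized document to standard output, and that
renaming "page.html" to "page.formatted.html" is acceptable. The function
names and the ".formatted" suffix are made up for illustration and are not
part of Mozilla.

```python
import subprocess
import sys
from pathlib import Path

# Value of nsIDocumentEncoder::OutputFormatted as given in the reply above.
OUTPUT_FORMATTED = 2


def build_command(infile, binary="TestOutput", flags=OUTPUT_FORMATTED):
    """Return the argument list for one TestOutput run (HTML in, HTML out)."""
    return [binary, "-i", "text/html", "-o", "text/html",
            "-f", str(flags), str(infile)]


def output_name(infile, suffix=".formatted"):
    """Hypothetical renaming scheme: page.html -> page.formatted.html."""
    path = Path(infile)
    return path.with_name(path.stem + suffix + path.suffix)


def reformat_all(infiles, binary="TestOutput"):
    """Run the serializer on each input file and save what it prints."""
    for infile in infiles:
        # Assumption: TestOutput prints the reformatted document to stdout.
        result = subprocess.run(build_command(infile, binary),
                                capture_output=True, check=True)
        output_name(infile).write_bytes(result.stdout)


if __name__ == "__main__":
    # Every command-line argument is treated as one HTML file to reformat.
    reformat_all(sys.argv[1:])
```

Saved under a name such as reformat_batch.py (also hypothetical), it could be
run as "python reformat_batch.py *.html". Because check=True is set, the run
stops at the first file for which TestOutput exits with an error, so a bad
input does not silently produce an empty output file.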