From:  Emile Kroeger <flammifer@gmail.com>
Date:  12 Nov 2005 11:04:57 Hong Kong Time
Newsgroup:  news.mozilla.org/netscape.public.mozilla.layout
Subject:  Re: Unicode in the DOM ?

NNTP-Posting-Host:  rheet.mozilla.org

> > So what are you doing exactly, to get something in a different encoding
> > than one of those?  ;)
>
> Sounds like XMLHttpRequest to me, which does some *really* dodgy
> character encoding stuff.

Nono, I was just reading off a normal webpage - greasemonkey-like stuff :P

I read the text from the page, sent it to Python, and Python said "Wah
! It's not unicode !"

Since I noticed that I'd get different strings when I selected
different encodings for the page (which was originally in Unicode), I
assumed that the charset was being converted from Unicode to whatever
was convenient for the display.

Turns out I was wrong, and that the JavaScript strings were indeed in
two-byte Unicode (at least, when I had selected the correct encoding
in the browser), but that the strings were being cast into one-byte
strings when I sent them to Python, because I was using the wrong
character type ("string" instead of "wstring") in the XPIDL interface
definitions. I guess that's what I get when I work with two
dynamically typed languages with a glue layer of a statically typed
language in between :P (And when my JavaScript skills are not that
great)
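
For anyone who hits the same thing, here is a rough Python sketch of what
the narrowing does to the data. It is only an illustration, not the actual
XPConnect code: it assumes the one-byte conversion simply keeps the low
byte of each two-byte code unit, and the function names narrow_to_bytes
and wide_round_trip are made up for this example.

```python
def narrow_to_bytes(text: str) -> bytes:
    """Simulate passing a two-byte string through a one-byte parameter.

    Assumption: the narrowing keeps only the low byte of each UTF-16
    code unit and silently drops the high byte.
    """
    utf16 = text.encode("utf-16-le")  # two bytes per code unit, low byte first
    return bytes(utf16[i] for i in range(0, len(utf16), 2))


def wide_round_trip(text: str) -> str:
    """Simulate a two-byte parameter: every code unit survives intact."""
    return text.encode("utf-16-le").decode("utf-16-le")


sample = "caf\u00e9 \u4e2d"  # "cafe" with an acute accent, a space, one CJK character

# Characters above U+00FF lose their high byte and turn into something else.
print(narrow_to_bytes(sample))

# The wide path returns exactly what went in.
print(wide_round_trip(sample) == sample)
```

Pure ASCII text comes out of both paths unchanged, which is why this kind
of bug can go unnoticed until the page contains non-Latin characters.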

So, it had nothing to do with Gecko at all, sorry for the disturbance.

Emile