Charsets; I hate them :(
So the project I am working on uses a number of technologies, including JSTL and FreeMarker.
Somebody had entered the character "è", which displayed perfectly in JSP, but in the FreeMarker output it was replaced with "?".
Of course, I thought it must be FreeMarker's fault :) I had been meticulously careful never to convert bytes into a String with the String constructor unless the charset was specified, but I never realised you have to do the same in the other direction as well :( A String doesn't remember the encoding its bytes were decoded with, so calling getBytes() with no arguments encodes using the platform default charset…. Why?
So the following will do bad things:
byte[] bytes = "some string with a funny character è".getBytes(encoding);
new String(new String(bytes, encoding).getBytes(), encoding); // getBytes() uses the platform default, not encoding
The problem was also a little more interesting because on our Windows machines the platform default encoding can represent this character (as far as I can tell it is Unicode-based), whereas on Solaris it can't, so the problem only showed up on Solaris.
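To see the effect without needing two different machines, here is a rough sketch that passes the "platform default" in explicitly. The class name CharsetRoundTrip and the roundTrip helper are made up for illustration, and using US-ASCII as a stand-in for a default that cannot represent "è" is my assumption, not necessarily what our Solaris boxes were actually using.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Hypothetical helper: decodes with `encoding`, re-encodes with `platformDefault`
    // (standing in for whatever the no-argument getBytes() would pick),
    // then decodes with `encoding` again.
    static String roundTrip(String text, Charset encoding, Charset platformDefault) {
        byte[] original = text.getBytes(encoding);
        String decoded = new String(original, encoding);
        byte[] reencoded = decoded.getBytes(platformDefault); // the lossy step
        return new String(reencoded, encoding);
    }

    public static void main(String[] args) {
        // \u00e8 is "è"; the escape keeps the example independent of the source file encoding.
        String text = "funny character \u00e8";
        Charset encoding = StandardCharsets.UTF_8;

        // Assumed stand-in for a default that cannot represent the character.
        System.out.println(roundTrip(text, encoding, StandardCharsets.US_ASCII).equals(text)); // false

        // Passing the same charset in both directions keeps the text intact.
        System.out.println(roundTrip(text, encoding, encoding).equals(text)); // true

        // What the no-argument getBytes() would use on this machine.
        System.out.println(Charset.defaultCharset());
    }
}
```

The fix in real code is the same idea: always call getBytes(encoding) with the charset you actually want, rather than relying on whatever the JVM was started with.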
I have (I am ashamed to say) never really delved into the joy of charsets and text encoding, instead preferring to stick my head in the sand. Luckily Chris May sits at the desk next to me :)