
Google Base and Unicode

Tuesday, August 11th, 2009

For quite some time, Google Base feeds for some of my projects were either partially ingested or rejected outright with the message "Required attribute missing". I ran xmllint and several online validation tools and found nothing. Thanks to a Mac blog, I finally figured it out. It ...

QuickBase and Unicode Support

Monday, October 27th, 2008

Some quick notes on QuickBase and Unicode: QuickBase stores Unicode data natively on the backend. Unicode encoding must be set as the default in the browser. Any QuickBase functionality that relies on JavaScript or AJAX support DOES NOT work with Unicode. The last point is due to two issues: 1. The bug with UTF-8 encoding ...

XmlHttpRequest, Unicode and Firefox

Monday, October 27th, 2008

Just a quick post about a problem I ran into earlier today. It seems that when using JavaScript in Firefox 3 with document.implementation.createDocument to create XML documents (for XmlHttpRequest), the encoding stays as ISO-8859-1 instead of Unicode (UTF-8). What it really should be doing is making the encoding the same as ...

Fixing Malformed UTF-8 via Regex

Wednesday, June 21st, 2006

I have been struggling with a weird problem on one of my sites that prevents the site from functioning. One of the XML files used for this site is supposed to come in as UTF-8, but unfortunately it had some extra characters that were not encoded properly. After looking at ...

Removing Vowels from Hebrew Unicode Text

Friday, June 3rd, 2005

One of the questions that recently came up is how to remove vowels from Hebrew text in Unicode (or from text in any similar language). A quick look at the Hebrew Unicode chart shows that the vowels are all located between 0x0591 (1425) and 0x05C7 (1479). With this and JavaScript's charCodeAt function, it ...
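
The range check described in this excerpt could be sketched roughly as follows. This is only an illustration built from the two code points quoted above and charCodeAt; the function name stripHebrewMarks is hypothetical and does not come from the original post, whose actual code is not shown here. Note that the quoted range also covers cantillation marks and a few punctuation characters such as maqaf (U+05BE), so this sketch removes those as well.

```javascript
// Hypothetical helper (name not from the original post): drops every
// UTF-16 code unit that falls inside the range quoted in the excerpt.
function stripHebrewMarks(text) {
  const RANGE_START = 0x0591; // 1425, first code point of the quoted range
  const RANGE_END = 0x05C7;   // 1479, last code point of the quoted range
  let result = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    // Keep the character only if it lies outside the vowel/mark range.
    if (code < RANGE_START || code > RANGE_END) {
      result += text.charAt(i);
    }
  }
  return result;
}

// Usage: "shalom" written with vowel points, given as escapes so the
// example does not depend on the file's own encoding.
const pointed = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD";
console.log(stripHebrewMarks(pointed));
```

Because every character in the Hebrew block sits in the Basic Multilingual Plane, comparing individual charCodeAt values is sufficient here; text containing characters outside that plane passes through unchanged, since surrogate code units fall well outside the checked range.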

Great Unicode Charts

Saturday, January 22nd, 2005

I just ran across these great Unicode charts from Matt Corks.