Removing Vowels from Hebrew Unicode Text

Posted June 3, 2005 – 4:28 pm by Yakov Shafranovich in Politics, Programming

One question that recently came up is how to remove the vowels from Hebrew text in Unicode (or from text in any similar language). A quick look at the Hebrew Unicode chart shows that the vowel points all lie between 0x0591 (1425) and 0x05C7 (1479), a range that also holds the cantillation marks and a few punctuation characters such as maqaf (0x05BE). With this and JavaScript’s charCodeAt function, it is trivial to strip them out:

function stripVowels(rawString)
{
	var newString = '';
	// Keep every character outside the range 0x0591 (1425) to 0x05C7 (1479).
	for(var j=0; j<rawString.length; j++) {
		var code = rawString.charCodeAt(j);
		if(code<1425 || code>1479)
		{ newString = newString + rawString.charAt(j); }
	}
	return(newString);
}
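
The same idea can also be written with a regular expression. The following is only a sketch, not part of the original code: the names stripVowelsRegex and stripMarksKeepPunctuation are made up for illustration. The first function removes the same 0x0591 to 0x05C7 range as the loop. The second is an assumed, narrower variant that removes only the combining marks in that range and leaves the punctuation characters (maqaf 0x05BE, paseq 0x05C0, sof pasuq 0x05C3, nun hafukha 0x05C6) in place.

```javascript
// Same range as the loop-based version: U+0591 through U+05C7.
var HEBREW_MARK_RANGE = /[\u0591-\u05C7]/g;

// Hypothetical name; intended to behave like the charCodeAt loop.
function stripVowelsRegex(rawString) {
	return rawString.replace(HEBREW_MARK_RANGE, '');
}

// Assumed variant: only the combining marks (vowel points, dagesh,
// cantillation), skipping U+05BE, U+05C0, U+05C3 and U+05C6.
var HEBREW_COMBINING_MARKS = /[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]/g;

// Hypothetical name; keeps punctuation such as maqaf.
function stripMarksKeepPunctuation(rawString) {
	return rawString.replace(HEBREW_COMBINING_MARKS, '');
}

// "shalom" with points: shin, shin dot, qamats, lamed, vav, holam, final mem.
var pointed = '\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD';
console.log(stripVowelsRegex(pointed)); // the four bare letters: שלום
```

Which variant is appropriate depends on the text: for words joined by a maqaf, the full-range version also deletes the maqaf and runs the two words together, while the narrower one keeps it.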


2 Responses to “Removing Vowels from Hebrew Unicode Text”

  1. Hello,
    I’d very much like to use your code to strip the vowels from either a Unicode file or an Excel spreadsheet with multiple Hebrew words. How can I make it work? Thanks so much.
    Best,
    Lance

    By Lance Laytner on Aug 31, 2008

  2. Thank you! This code is incredible, even if I haven’t any idea how to actually use it on my Mac... So I’ll be using your page to remove the vowels when I need to. I hope you’ll keep it right here.

    By Jeff on Oct 4, 2009
