Entitize Strings with Groovy

July 30, 2010 by Anthony Scotti

Not sure if ‘entitize’ is the right term, but it’s what people at my old job called the process of taking high-level characters (anything outside 7-bit ASCII) and replacing them with numeric character references built from their Unicode code points. For example, € would be replaced with &#8364;. This was needed for some in-house applications, and when writing output to HTML/XML, to make sure users saw the right character. I needed to do this a while ago, so I looked up some old Python code I had written and rewrote it in Groovy. Here is the code snippet.

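def entitize(dirty_string) {
	def clean_string = ""
	// each() visits the string one character at a time
	dirty_string.each { it ->
		def ordcode = it.codePointAt(0)
		// anything above 7-bit ASCII becomes a decimal character reference
		if (ordcode > 127) {
			clean_string += "&#${ordcode};"
		} else {
			clean_string += it
		}
	}
	return clean_string
}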

So, this

println entitize("Test Test")
println entitize("€ 1250")

Should output this,

Test Test
&#8364; 1250
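
One edge case worth noting: characters outside the Basic Multilingual Plane (most emoji, for example) are stored as two surrogate chars, so a char-by-char loop would emit one reference per half, such as &#55357;&#56832;, instead of a single &#128512;. Below is a rough sketch of a variant that walks code points instead. The name `entitizeCodePoints` is made up for illustration, and this is just one possible approach, not a definitive implementation.

```groovy
// Hypothetical variant of entitize() that steps through Unicode code points
// rather than individual chars, so surrogate pairs stay together.
def entitizeCodePoints(String dirty) {
    def clean = new StringBuilder()
    int i = 0
    while (i < dirty.length()) {
        int cp = dirty.codePointAt(i)
        if (cp > 127) {
            // non-ASCII: emit a decimal numeric character reference
            clean.append("&#${cp};")
        } else {
            // plain ASCII passes through unchanged
            clean.appendCodePoint(cp)
        }
        // advance by 1 char for BMP characters, 2 for a surrogate pair
        i += Character.charCount(cp)
    }
    return clean.toString()
}

println entitizeCodePoints("Test Test")
println entitizeCodePoints("\u20AC 1250")
println entitizeCodePoints("\uD83D\uDE00")
```

The `StringBuilder` also avoids creating a new string on every `+=`, which only really matters for long inputs.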

I’m still trying to learn Groovy, so if anyone knows a better way of doing this, please share. Feedback and questions are also welcome!
