Sunday, 12 October 2008

IDN use and abuse

JWZ blogged about the Unicode snowman. If you're running a proper browser, take a close look at the domain name:
☃.net

A brief, two-sentence overview: For any domain label which begins with xn-- followed by some gobbledygook, certain clients like web browsers can interpret the gobbledygook as a Punycode representation of some Unicode string. So the snowman's real domain name is xn--n3h.net.
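If you want to see the mapping for yourself, here is a rough sketch using the punycode codec that ships with Python. The helper names (label_to_ascii, domain_to_unicode and so on) are made up for this post, and the sketch skips the normalisation and validity checks that a real IDN client performs before encoding, so treat it as an illustration rather than a proper implementation.

```python
# Sketch only: converts between Unicode labels and their xn-- form using
# Python's built-in "punycode" codec. Real IDN handling also normalises and
# validates each label first; that step is deliberately omitted here.

ACE_PREFIX = "xn--"  # marks a label as Punycode-encoded


def label_to_ascii(label: str) -> str:
    """Encode one dot-separated label; pure-ASCII labels pass through."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def label_to_unicode(label: str) -> str:
    """Decode one label if it carries the xn-- prefix."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def domain_to_ascii(domain: str) -> str:
    return ".".join(label_to_ascii(part) for part in domain.split("."))


def domain_to_unicode(domain: str) -> str:
    return ".".join(label_to_unicode(part) for part in domain.split("."))


if __name__ == "__main__":
    print(domain_to_ascii("☃.net"))
    print(domain_to_unicode("xn--n3h.net"))
```

Run as a script, it prints the snowman's domain in both forms. The same helpers work on subdomains, since each label is converted separately.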

I was quite excited for a while since many of these Unicode dingbats and symbols are unregistered in combinations of two or more, but then I found that the killjoys at the IETF had put a stop to that with RFC 4690. So while the snowman registration can be continued, no new dingbats can be registered.

Nevertheless, we can still have fun abusing the simpler Chinese characters. For a laugh I registered 丄.com and 丿乀.com. These might not be active when you read this, and to be honest I'm not quite sure where I'll point them at the moment. The first looks like bottom, the symbol for non-terminating programs. Hmmm, maybe that'd be good for some insightful blog about functional programming? The second is a total abuse of two characters together, but looks like the number 8 in Japanese (IETF rules forbid registering actual numbers, even non-Arabic ones).

Someone should do the world a favour and register 丅丨丅.com (xn--9gqa8h.com).

Update

Subdomains are, of course, not regulated by the IETF jobsworths. Here's another, prettier Unicode snowman: http://☃.earthlingsoft.net/, and I can have http://☆☆☆.annexia.org/

1 comment:

alexander.bostrom.net said...

Non-IDN abuse can be fun too. :)

http://www.bostrom.net/ʇǝu˙ɯoɹʇsoq˙ʍʍʍ//:dʇʇɥ