making bbPress (and WordPress) work better!

Posts tagged “unicode”

How to fix: Can’t see Unicode (UTF8) in Notepad++ on Windows XP

This is a little late to help most people, because they have moved on from Windows XP to newer versions. However, there are still some die-hards sticking with it until 2019 using the simple POSReady registry tweak.

If you have a full Unicode font such as Symbola installed on Windows XP, you may still not see the proper characters in applications like Notepad++, and instead get pairs of empty boxes in their place.

Punycode to Unicode Converter in simple PHP

Today I needed a simple way to convert Punycode internationalized domain names to Unicode for proper display in UTF-8. I was hoping for some easy iconv magic, but no such luck: PHP can’t even do part of it directly.

After Googling for a bit, I was only able to find one existing class that did this in pure PHP, but it was well over 100 KB in size, which was disturbing for my simple needs.

So I whittled it down to 50 lines or so and made some tweaks: (download)

It can now handle, in a single function, “multi-part” domains that have Punycode in the sub-domain, domain and/or TLD.

For instance, all the examples here work:

So xn--r8jz45g.xn--zckzah is properly converted to 例え.テスト
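To give a feel for what such a conversion involves, here is a minimal sketch of Punycode decoding as described in RFC 3492, applied label by label to a host name. This is not the downloadable code from this post: the function names (puny_digit, puny_adapt, puny_utf8, puny_decode_label, puny_host_to_unicode) are made up for illustration, it assumes PHP 7 or later, and it skips the overflow and validity checks a production decoder would need.

```php
<?php
// Illustrative sketch of RFC 3492 decoding; all function names are hypothetical.

// Map one Punycode digit to its value: a-z => 0..25, 0-9 => 26..35.
function puny_digit(string $ch): int
{
    $c = ord($ch);
    if ($c >= 0x30 && $c <= 0x39) return $c - 0x30 + 26; // '0'..'9'
    if ($c >= 0x41 && $c <= 0x5A) return $c - 0x41;      // 'A'..'Z'
    if ($c >= 0x61 && $c <= 0x7A) return $c - 0x61;      // 'a'..'z'
    throw new InvalidArgumentException("Invalid Punycode digit: $ch");
}

// Bias adaptation (RFC 3492 section 6.1) with base 36, tmin 1, tmax 26,
// skew 38 and damp 700.
function puny_adapt(int $delta, int $numPoints, bool $first): int
{
    $delta = $first ? intdiv($delta, 700) : intdiv($delta, 2);
    $delta += intdiv($delta, $numPoints);
    $k = 0;
    while ($delta > 455) { // ((36 - 1) * 26) / 2
        $delta = intdiv($delta, 35);
        $k += 36;
    }
    return $k + intdiv(36 * $delta, $delta + 38);
}

// Encode a single Unicode code point as UTF-8 bytes.
function puny_utf8(int $cp): string
{
    if ($cp < 0x80) return chr($cp);
    if ($cp < 0x800) return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

// Decode one label whose "xn--" prefix has already been removed.
function puny_decode_label(string $input): string
{
    $n = 128;      // first non-ASCII code point
    $i = 0;
    $bias = 72;
    $points = [];  // decoded code points, in output order

    // Everything before the last hyphen is copied through as plain ASCII.
    $b = strrpos($input, '-');
    $b = ($b === false) ? 0 : $b;
    for ($j = 0; $j < $b; $j++) {
        $points[] = ord($input[$j]);
    }

    $in = $b > 0 ? $b + 1 : 0;
    $len = strlen($input);
    while ($in < $len) {
        $oldi = $i;
        $w = 1;
        // Read one variable-length integer into $i.
        for ($k = 36; ; $k += 36) {
            if ($in >= $len) {
                throw new InvalidArgumentException('Truncated Punycode input');
            }
            $digit = puny_digit($input[$in++]);
            $i += $digit * $w;
            $t = $k <= $bias ? 1 : ($k >= $bias + 26 ? 26 : $k - $bias);
            if ($digit < $t) break;
            $w *= 36 - $t;
        }
        $count = count($points) + 1;
        $bias = puny_adapt($i - $oldi, $count, $oldi === 0);
        $n += intdiv($i, $count);
        $i %= $count;
        array_splice($points, $i, 0, [$n]); // insert code point $n at position $i
        $i++;
    }
    return implode('', array_map('puny_utf8', $points));
}

// Convert every "xn--" label of a host name; other labels pass through untouched.
function puny_host_to_unicode(string $host): string
{
    $labels = explode('.', $host);
    foreach ($labels as $idx => $label) {
        if (strncasecmp($label, 'xn--', 4) === 0) {
            $labels[$idx] = puny_decode_label(substr($label, 4));
        }
    }
    return implode('.', $labels);
}

echo puny_host_to_unicode('xn--r8jz45g.xn--zckzah'), "\n"; // prints 例え.テスト
```

Splitting on dots first is what lets one call deal with Punycode in any position of the host name, since each label is encoded independently.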

it also works with mixed domains, ie.

(You can only pass it the host part of the URL. Do not pass it the full URL with http or slashes or it will fail; use PHP’s parse_url to get just the host.)
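As a rough illustration of that step, the sketch below pulls only the host out of a full URL with parse_url. The helper name host_from_url and the sample URL are made up for this example. Note that parse_url only reports a host when the URL has a scheme (or starts with //), so a bare domain name comes back as null here.

```php
<?php
// Hypothetical helper: return only the host part of a URL, or null if none is found.
function host_from_url(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST); // string, null, or false on a malformed URL
    return (is_string($host) && $host !== '') ? $host : null;
}

$host = host_from_url('http://xn--r8jz45g.xn--zckzah/index.html');
// $host now holds "xn--r8jz45g.xn--zckzah", which is safe to hand to a converter.
```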

Note that this does not do any sanitizing or other thorough checks or fixes. If you need that functionality (e.g. for raw user input from unknown sources), you’ll probably need the original full class over here: