In search of better tag cloud algorithms
Tag clouds (aka text clouds or word clouds) first appeared on the web around 2004-2005, most notably on Flickr. It took me a little while to warm up to them, but once I accepted the idea, I found them useful for spotting patterns that hadn’t struck me in linear form.
(Digg used to have a great tag cloud. If anyone knows how to get it back, let me know!)
WordPress and bbPress have halfway decent tag clouds, but they are missing some additional features I’ve seen on other clouds. Aditya Naik did a nice hack to show related tags by highlighting them simultaneously in his Enhanced Tag Heat Map (demo). I decided to add some color to the map by making the more popular tags grow “warmer” in color (example).
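To make the "warmer" coloring concrete, here is a minimal sketch of one way it could work. It is not the code behind the heat map linked above: the function names, the log scaling, the blue-to-red color endpoints and the font size range are all assumptions chosen for illustration.

```python
import math


def tag_weight(count, min_count, max_count):
    """Normalize a tag count to the range 0..1 on a log scale.

    Assumes every count is at least 1. Log scaling (an assumption here)
    keeps one very popular tag from flattening all the others.
    """
    if max_count <= min_count:
        return 0.0
    span = math.log(max_count) - math.log(min_count)
    return (math.log(count) - math.log(min_count)) / span


def heat_color(weight, cool=(0x33, 0x66, 0x99), warm=(0xCC, 0x33, 0x00)):
    """Blend from a cool color (weight 0) to a warm one (weight 1)."""
    channels = [round(c + (w - c) * weight) for c, w in zip(cool, warm)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


def font_size(weight, smallest=8, largest=22):
    """Linear font size in points; the 8..22 range is an arbitrary choice."""
    return round(smallest + (largest - smallest) * weight)
```

Each tag would then get something like `style="font-size: {font_size(w)}pt; color: {heat_color(w)}"`, where `w` is its `tag_weight`.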
But there’s one thing I really don’t like in the average WordPress/bbPress tag cloud, and that’s the poor clustering and arrangement of the words. The positioning seems to be first come, first placed, and that doesn’t scale well. In a better tag cloud, I think the more popular (i.e. larger) terms should be distributed between the center and the edges, rather than all on the edges or all in the center. In very large clouds with many low-frequency terms, this can become a noticeable problem. I also wish it would lose its rigid, fixed-line placement of the words and have a looser arrangement, like you usually see on Flash-based tag clouds.
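As a rough illustration of the spreading idea, the sketch below reorders the tags so the biggest ones land at evenly spaced positions in the output, with the rest filling the gaps alphabetically. This is only my guess at a simple starting point, not how WordPress or bbPress order their clouds; `spread_tags` and `top_fraction` are made-up names, and it only handles a cloud that flows line by line, not free 2D placement.

```python
def spread_tags(counts, top_fraction=0.25):
    """Return tag names ordered so the most popular are evenly spaced.

    counts maps tag name -> usage count. The top `top_fraction` of tags
    (by count) are treated as "big"; each is placed at the middle of its
    own equal-sized segment of the sequence.
    """
    # Rank by count descending, breaking ties alphabetically.
    ranked = sorted(counts, key=lambda tag: (-counts[tag], tag))
    total = len(ranked)
    n_big = max(1, round(total * top_fraction)) if ranked else 0

    big = ranked[:n_big]
    small = sorted(ranked[n_big:])  # the remaining tags stay alphabetical

    # Slot the i-th big tag at the centre of the i-th of n_big segments.
    # Segments are at least one slot wide, so no two big tags collide.
    big_slots = {int((i + 0.5) * total / n_big): tag
                 for i, tag in enumerate(big)}

    small_iter = iter(small)
    return [big_slots[pos] if pos in big_slots else next(small_iter)
            for pos in range(total)]
```

For example, with eight tags where `e` and `h` are the popular ones, `spread_tags({"a": 1, "b": 1, "c": 1, "d": 1, "e": 9, "f": 1, "g": 1, "h": 8})` puts them in the third and seventh positions instead of wherever they happened to be added.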
I have a project I am working on now which is bringing all this up. I discovered my Top 1000 list for bbPress forums makes a very nice tag cloud. But the word distribution within the cloud sucks. This page seems to have some nice theories and examples so I’ll probably start my research there.
I guess I have a big weekend project ahead!