By popular request (my post was even featured on Debian Weekly News), I re-ran my previous query on the changelog files in Debian packages. This time, however, I retrieved not just 40 random package release names but "all" of them, for unknown values of "all". I didn't analyze some of the files (missing permissions), and maybe I missed one or two because my query sucked, but I think I've got most of them.
I ran a slightly more complicated query than last time, using the data from gluck:/org/lintian.debian.org/laboratory/. I don't have the slightest idea how old the files in that archive are, but there are ca. 10,000 packages in there, which is more than enough, if you ask me.
The results (78 KB) this time are in alphabetical order, and include the package names where the strings were found. There's a total of 1408 strings.
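For readers who want to try something similar, here is a minimal sketch of how such release-name lines could be pulled out of changelog text. It is not the query I actually ran; the regular expression, the name `RELEASE_NAME` and the function `extract_release_names` are assumptions made up for illustration, based only on what the strings in the results look like.

```python
import re

# Hypothetical pattern (an assumption, not the original query): a changelog
# bullet of the form  * The "<name>" release  or  * The `<name>' upload,
# optionally with one extra word (e.g. "package") before release/upload.
RELEASE_NAME = re.compile(
    r"^[ \t]*\*\s+The\s+[\"`](?P<name>.+)[\"']\s+(?:\w+\s+)?(?:release|upload)\b",
    re.IGNORECASE | re.MULTILINE,
)


def extract_release_names(changelog_text):
    """Return the quoted release names found in one changelog's text."""
    return [m.group("name") for m in RELEASE_NAME.finditer(changelog_text)]


if __name__ == "__main__":
    sample = (
        "gdb (6.4-1) unstable; urgency=low\n\n"
        '  * The "Ahhhhhhhhhhhhhhhhhh!" Release.\n'
        "  * Some unrelated change.\n"
    )
    for name in extract_release_names(sample):
        print(name)
```

A pattern like this will certainly miss unusual spellings, so treat it as a starting point rather than a complete solution.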
Here are 20 randomly chosen strings, for some more fun:
gdb: * The "Ahhhhhhhhhhhhhhhhhh!" Release.
glibc: * The "Fuck Me Harder" release.
abiword: * The "Foolin' Myself" release.
opensc: * The "RTFM" release.
directory-administrator: * The "On Train" release
xchat: * The "Merry Christmas, mine beloved Xchat users!" release.
apache: * The "Yes, we know there is a new upstream release" upload.
mmm-mode: * The "But I'm Not Dead Yet!" Release
mozilla-firefox: * The "becoming more and more an iceweasel" release.
nano: * The "Marbella, ciudad hermanada con Benidorm" release.
thy: * The `Empty Spaces' release.
glibc: * The "Chainsaw Psycho" release.
sam: * The `Minime' release.
xchat: * The "Binary only" release.
tellico: * The "pbuider and buildds are not the same" package release
pingus: * The "All you pingus are belong to blendi" release
xchat: * The "Ok, wrong patch, excuse me guys :)" release.
cappuccino: * The "It's time for the upload" release
abiword: * The "Got A Good Thing Goin'" release.
firefox: * The "what he taketh, he giveth back" release.
I also created a small statistic this time. Here are the top 20 packages (the ones with the most release names):
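A ranking like this can be computed by counting lines per package. The sketch below assumes each line of the results file has the form `package: string`, as in the samples above; the function name `top_packages` is hypothetical and this is not necessarily how I produced the numbers.

```python
from collections import Counter


def top_packages(result_lines, n=20):
    """Count release-name lines per package and return the n most frequent.

    Assumes (hypothetically) that each line looks like 'package: string'.
    Lines without a colon or without a package name are skipped.
    """
    counts = Counter()
    for line in result_lines:
        package, sep, _ = line.partition(":")
        package = package.strip()
        if sep and package:
            counts[package] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    lines = [
        'xchat: * The "Binary only" release.',
        'glibc: * The "Chainsaw Psycho" release.',
        'xchat: * The "Ok, wrong patch, excuse me guys :)" release.',
    ]
    for package, count in top_packages(lines):
        print(package, count)
```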
Feel free to grab the whole results file for more reading fun during boring hours of the day.
If you do any further processing or analysis of any kind with the data, please post a comment and let us all know ;-)
Update 2006-05-23: Enrico Zini has done some interesting things with the data...
Frederico Oliveira talks about some interesting issues regarding the information overload most of us are experiencing. He unsubscribed from several RSS feeds in order to cut down the mass of information.
I'm going to do the same thing. My first step was to reduce the amount of daily email, though. I have just finished unsubscribing from 16 mailing lists which I don't really read very often (this automatically reduces the amount of spam I get, too). I'm left with ca. 80 mailing lists now. I'll probably remove some more and concentrate on those which I really need and/or read regularly.
The next step is to unsubscribe from several RSS feeds, but that's not that much of an issue. I find tracking RSS feeds easier and more manageable than tracking mailing lists (not sure why). For the statistics freaks, I currently subscribe to ca. 320 feeds, but I read way more of them regularly than I do with mailing lists...
You can use this to find out where the visitors of your site come from (similar to GEOLOC and HitsMaps/ClustrMaps). The cool thing about it is that it uses Google Maps to display the visitors' position, which means you can zoom in and out as you see fit, and you get real satellite maps of the area. Now, is this cool or what?
See my stats for an example.