More than two weeks ago I blogged about my server being down. After multiple emails, phone calls, and even a fax trying to reach the support team, the server is still dead. But at least I know (a little bit) more now.
I managed to get someone from support on the phone, and he fixed the system at least to the point where I could ssh into it again. I was able to pull a complete backup of the system, including a database dump.
That means that Unmaintained Free Software and all other sites hosted on the server will eventually return, and no data will be lost.
After I created the backup, I wanted to reinstall the whole system and then restore the backup to bring all services back. As it turned out, the (automatic) reboot-and-reinstall script they use is obviously broken: I cannot reach the server anymore after I initiated the reinstall. This is probably something more serious, as other people seem to be affected, too.
I don't have the slightest idea what the hell happened on the server. Something really, really strange was going on there. An example:
# ls -l /usr/bin/traceroute
-rw-rw---- 1 mysql mysql 310872 Jun 21 03:21 traceroute
Why the hell is traceroute not executable, and why does it belong to user/group mysql? There are several other anomalies:
/usr/share/doc/apt is not a directory as it is supposed to be, but a Perl script.
/usr/bin/id is a directory. Multiple system tools (awk, sed, ...) are not executable, and some are even directories with strange stuff in them. What gives?
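For the record, something along these lines should turn up most of the damage in one go (just a rough sketch; it assumes debsums and GNU find are present and still working on the box, which is far from certain in this state). debsums compares every installed file against the MD5 sums shipped with its Debian package, and the find call lists directories and non-executable regular files sitting directly in /usr/bin:
# debsums -c 2>/dev/null    # report installed files whose MD5 sum no longer matches the package database
# find /usr/bin -mindepth 1 -maxdepth 1 \( -type d -o -type f ! -perm -111 \)    # directories and non-executable files in /usr/bin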
One possible explanation is that the server was hacked and some rootkit wreaked havoc on it. After a quick glance at the logs, I couldn't find any hints of a successful break-in, though. Another possibility is that the hard drive simply died and/or the filesystem got (heavily) corrupted. I don't know...
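If I ever get a shell on that box again, these are the kinds of checks I would run to narrow it down (sketch only: chkrootkit and smartmontools may well not be installed there, and /dev/sda resp. /dev/sda1 are just placeholders for the real disk and partition):
# chkrootkit    # scan for known rootkit signatures and tampered binaries
# smartctl -a /dev/sda    # print the drive's SMART health status and error counters
# fsck -n /dev/sda1    # read-only filesystem check, changes nothing on disk
Neither would be conclusive, of course: chkrootkit only knows about known rootkits, and SMART doesn't catch every dying drive, but it would at least be a start.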
Has anybody ever seen anything like this? Please enlighten me as to what could have happened...