Build idiot-proof systems!

A few days ago, I was asked to evaluate a few possibilities for a new project. I wanted to check out the Jetpack plugin, but neither we at work nor I personally had a WordPress installation on any server. To save time, I came directly to this site and tried to install Jetpack 1.9. It did not let me because I was running an older version of WordPress, probably 3.5.2. I was in a hurry, so I downloaded XCloner, a backup utility with a good rating, and took a full backup. Despite all those “verify your backup before you proceed” warnings, I tried to update to the newer version directly.

As you’ve guessed, my site went down halfway through the update! Not to worry, I was smart, I had taken a backup! I started googling how to restore an XCloner backup. It seemed more complex than I had anticipated, but it was not rocket science. I started the restore process, but it came back with a warning in red: my extracted files did not have the expected file sizes. This is when it struck me that I had been an ass not to take a proper backup and verify it. I had probably lost all of my articles. I am a lazy man. It took me one year to motivate myself to start a personal blog, one more year to set it up and one more year to actually start writing! I am not the type to keep the faith and start over. So I mentally prepared myself for the worst. I was feeling like an idiot!

When I came back from the office, I started to dig in. I looked at the source files on my server. A lot of files were showing zero bytes. I looked at the XCloner backup files and the database script. The backup contained only a few files, so I must have done something wrong while setting it up. The database script also broke off halfway through an INSERT statement, and a lot of tables were missing. In summary, the backup wasn’t usable at all.
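Both symptoms above (zero-byte files and a database script cut off mid-statement) are things a quick script could have flagged before I touched the update button. Here is a minimal sketch in Python; the function names are made up for illustration, and the dump check is only a rough heuristic (it assumes a plain-text SQL dump whose statements end with a semicolon), not a real SQL parser.

```python
import os


def find_empty_files(root):
    """Return the paths of all zero-byte files under root, sorted."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # isfile() skips broken symlinks, which getsize() would choke on.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


def dump_looks_complete(sql_path):
    """Heuristic: the last non-blank, non-comment line should end with ';'.

    A dump that stops in the middle of an INSERT fails this check.
    Passing it does NOT prove that every table is present.
    """
    last = ""
    with open(sql_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("--"):
                last = stripped
    return last.endswith(";")
```

Running find_empty_files on the extracted backup and dump_looks_complete on the database script takes seconds. It is no substitute for a real test restore, but it would have caught my broken backup.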

I decided to delete everything and go for a fresh installation. I downloaded the latest version of WordPress. While downloading, I noticed that there is a process for manual upgrades. I also noticed that it’s not uncommon for people to crash their site during a version upgrade and then have to switch to a manual upgrade. I started by finding all the files with a size of zero bytes and replacing them with the files from the downloaded distro. To my great surprise, the site started showing up! Well, not all of it. I could read the articles and see at least some of the pictures, but I could not use the backend. The UI was broken, the site was missing CSS styles, and the scripts were not working. The Firebug console was full of warnings and errors. So, instead of replacing only some files, I followed the WordPress manual upgrade guide and replaced all files except wp-config.php, .htaccess and the wp-content folder with those from the downloaded archive of the new WordPress 3.8.2. After copying everything, I tried to log into the backend again. It automatically took me to the database upgrade screen. This time, I took a MySQL backup from cPanel before upgrading, but the upgrade went smoothly. My site was back up and running in less than an hour!
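For the curious, the file replacement step I did by hand can be sketched in a few lines of Python. This mirrors what I described above, not the official guide word for word; the function name and the two directory arguments are hypothetical, and it assumes Python 3.8 or newer (for dirs_exist_ok) and an already extracted copy of the new WordPress release.

```python
import os
import shutil

# Top-level entries that hold site-specific data and must not be overwritten.
PRESERVE = {"wp-config.php", ".htaccess", "wp-content"}


def replace_core_files(new_release_dir, site_dir, preserve=PRESERVE):
    """Copy every top-level entry of the new release over the site,
    skipping the preserved names. Returns the names that were copied."""
    copied = []
    for entry in sorted(os.listdir(new_release_dir)):
        if entry in preserve:
            continue
        src = os.path.join(new_release_dir, entry)
        dst = os.path.join(site_dir, entry)
        if os.path.isdir(src):
            # Merge into the existing directory, overwriting files in place.
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
        copied.append(entry)
    return copied
```

Note that this only overwrites and adds files. It never deletes stale files left over from the old version, so treat it as an illustration of the idea and take a verified backup first.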

I was awestruck by the simplicity of the upgrade process. As a developer, there are times when you hate the users, but you should also expect them to make very unusual mistakes. When they cause disasters even after being specifically instructed on how to avoid them, they will come to you, and at the end of the day you should be able to find a way around the problem. Just because you told them not to push that red button does not mean nobody will push it. Maybe it was accidental, maybe they were in a bad mood, maybe something else happened, but your system should be solid enough to handle catastrophes. Deleting a file or a record, a power failure during a transaction, changing a password and not remembering it: the list goes on and on. We live in a world of automated systems where, for every system, multiple alternatives exist. These small bits of idiot-proofing will put your system ahead of the others. No matter how good your system is, if it’s not idiot-proof, it’s simply not good enough.