26 Oct 2014

Removing complexity: hosting static websites

The incident

One of my product websites was hit with the Google Viagra Hack. This hack injects different content into your website when it is fetched by Google.

It’s subtle because you might not be aware of it for weeks. When you visit your site everything seems fine. If you take a look at your source files, everything is fine. However, when Google crawls your site it only finds Viagra links and related content.

When you google your site, you’ll find that it displays a scary message:

This site may be hacked

And the descriptions are altered with medicine ads.

The investigation

Ok, so someone somehow managed to change the content of my website, at least as Google sees it. After the initial panic, I used Google Webmaster Tools to see my website through Google’s eyes. It was a mess.

I investigated this hack a bit, and it seems it’s just a script that takes advantage of some PHP plugin or misconfiguration. In my website’s root folder there was an obfuscated PHP file called stats.php.

This website, like many others I have to maintain, is just static content: HTML, CSS and client-side JS. Some of these websites are hosted on Dreamhost, together with some old WP blogs and old PHP websites, and those are likely the door this hack came through.

My resolution: simplify

I’ve been migrating my websites away from WordPress for some months. First I discovered Jekyll, but now I’m using Hugo. It just feels right to have my content as version-controlled files instead of a full PHP framework running with a MySQL database behind it.

In my view the next logical step is to get away from traditional hosting altogether. There are several alternatives, like using Dropbox+GitHub or using Amazon S3.

I chose the latter.

Hosting static websites in Amazon S3

The setup is very simple.

Basically, you create a bucket named after your website address, put your files there, configure the bucket to serve the website, and update the DNS records using Amazon Route 53.
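If you prefer to script it, here’s a minimal sketch of those steps using Python and the boto3 AWS SDK. The bucket name, file paths and error page are placeholders, and the last step (pointing Route 53 at the bucket) is only outlined in a comment:

import boto3

s3 = boto3.client("s3")
bucket = "www.example.com"  # placeholder: the bucket name must match the website address

# 1. Create the bucket named after the site address
s3.create_bucket(Bucket=bucket)

# 2. Configure the bucket to serve a static website
s3.put_bucket_website(
    Bucket=bucket,
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "404.html"},
    },
)

# 3. Upload the generated files (in practice, loop over Hugo's output directory)
s3.upload_file(
    "public/index.html", bucket, "index.html",
    ExtraArgs={"ContentType": "text/html"},
)

# 4. DNS: in Route 53, point an alias/CNAME record for www.example.com
#    at the bucket's website endpoint, something like
#    www.example.com.s3-website-us-east-1.amazonaws.com

In practice I just upload the whole Hugo output folder instead of individual files, but the steps are the same: bucket, website configuration, files, DNS.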