Backing up to Amazon Glacier

I built my business on the premise that data is valuable. My most profitable websites are the result of hundreds of hours of gathering and organizing data, so backups are important to me. I have a daily cron job that gathers all my databases, configuration files, keys, backup scripts, and a few other random bits of information into a compressed archive. The compression ratio is pretty good: about 1.8GB of data is compressed to about 112MB.
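The archive step can be sketched in a few lines. This is only an illustration, not the actual backup script: the function names, the gzip format, and the choice of Python are all assumptions, since the post does not say how the archive is built.

```python
import tarfile
from pathlib import Path


def build_archive(sources, archive_path):
    """Pack every path in `sources` into one gzip-compressed tar archive.

    Hypothetical helper: the post does not show the real backup script.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for src in sources:
            src = Path(src)
            # Store each entry under its final path component only, so the
            # archive does not carry absolute paths. Directories are added
            # recursively.
            tar.add(src, arcname=src.name)
    return Path(archive_path)


def compression_ratio(original_size, compressed_size):
    """How many times smaller the archive is than the input (same units)."""
    return original_size / compressed_size
```

Treating 1.8GB as 1800MB, `compression_ratio(1800, 112)` comes out at roughly 16, which matches the "pretty good" ratio described above.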

For the past couple of years I’ve been using Dropbox to store the backups. I have the command line version of Dropbox installed so the backup script just has to copy the backup archive into a folder, then Dropbox uploads it and pushes it out to my home computers. This has been working great for the most part, but it has a few downsides. First, it is sort of expensive. I can’t fit all the backups I want plus my own personal files into a free Dropbox account, so I’ve been paying $99/year for Dropbox Pro ($8.25 per month). Second, it isn’t terribly secure. Dropbox employees have access to your data, which isn’t really unexpected but it still isn’t something I’m thrilled about. Last, it makes all my home computers download a large 100MB+ archive every morning which is particularly annoying on my laptop while traveling.

So I decided to switch to Amazon Glacier. They charge $0.01 per GB per month for long term storage. The pricing model is very wonky, with lots of random little fees here and there depending on what you do, but as best I can tell I’ll be paying less than $0.40 per month to store a rolling 100 days of backups.
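For a rough sense of the numbers, here is a back-of-the-envelope sketch. It covers the storage fee only and assumes one 112MB archive per day, 1GB treated as 1000MB, and none of the request or retrieval fees mentioned above, so it is a lower bound rather than a bill.

```python
def monthly_storage_cost(archive_mb, days_retained, price_per_gb_month=0.01):
    """Storage-only estimate for a rolling window of daily archives.

    Assumes 1 GB = 1000 MB and ignores every fee other than storage.
    """
    stored_gb = archive_mb * days_retained / 1000
    return stored_gb * price_per_gb_month


glacier = monthly_storage_cost(archive_mb=112, days_retained=100)
dropbox = 99 / 12  # Dropbox Pro at $99/year

print(f"Glacier storage: ${glacier:.2f}/month")  # prints $0.11/month
print(f"Dropbox Pro:     ${dropbox:.2f}/month")  # prints $8.25/month
```

Under these assumptions storage alone is about $0.11 per month for roughly 11.2GB, which leaves room under the $0.40 estimate for the other fees.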

To do the actual heavy lifting I decided to use glacier-cli, a Python tool that provides a pretty easy interface to Glacier and is easy to use in a cron job or bash script. It requires Python 2.7 but my server only had Python 2.6, so I followed the instructions at Too Much Data to install Python 2.7 next to my existing installation. Next, I followed the installation instructions for glacier-cli to clone the required scripts. I also set up environment variables to hold my AWS access key ID and secret access key. Finally, I was ready to go. Or so I thought. I ran a simple command to list my vaults:

/usr/local/bin/python2.7 /root/glacier-cli/ vault list

Egh. It didn’t work. I was missing some required Python packages. I then spent a fair amount of time installing everything I needed using the easy_install-2.7 command. I was also missing the sqlite headers, which I installed using yum.

I then tried again and it worked! Yay! I created a vault, uploaded a file, and finally modified my backup script to use Glacier instead of Dropbox.
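A backup script could wrap those steps along the following lines. This is a hedged sketch, not the author's script: the helper names are made up, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are assumed to be the environment variables in use (they are the names boto reads), and the path to the glacier-cli script is left as a parameter because the post does not give it in full. Only `vault list` appears in the post; `vault create` and `archive upload` are the subcommands as I recall them from the glacier-cli README, so check them there before relying on this. The helper itself is written for Python 3 and merely builds the command line that would be run under Python 2.7.

```python
import os

# Assumed variable names; the post only says "environment variables".
REQUIRED_ENV = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None):
    """Return the names of required credential variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def glacier_command(script_path, *args, python="/usr/local/bin/python2.7"):
    """Build the argv list for one glacier-cli call without running it.

    `script_path` is a placeholder for wherever glacier-cli was cloned.
    """
    return [python, script_path, *args]


# Hypothetical usage: vault and file names are illustrative only.
script = "/path/to/glacier-cli/script"
list_vaults = glacier_command(script, "vault", "list")
create_vault = glacier_command(script, "vault", "create", "backups")
upload = glacier_command(script, "archive", "upload", "backups", "backup.tar.gz")
```

Each list can be handed to `subprocess.run(..., check=True)` once `missing_credentials()` returns an empty list, which turns a silent cron failure into a visible one.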

Anyways, I’m pretty excited. I’ve canceled my Dropbox Pro subscription and completely removed Dropbox from my server. It has freed up a pile of space on my computer, saved me some unneeded bandwidth usage, and I’m now storing backups longer than I ever have before. It is so inexpensive that I’ll likely start a weekly backup of all the files for my websites, too (about 7.4GB uncompressed if I leave out Fake Name Generator order files). This data is almost entirely in git on at least one other server, so I’m not terribly worried about losing it, but it doesn’t hurt to have redundancies in backups.

Switching from Route 53 to DNS Made Easy

I discovered recently that, sometimes and for some people, my Amazon Route 53 DNS is quite slow. I started to dig into this and discovered that, from several different servers I have access to, Route 53 is slower than the free DNS I get from SoftLayer. Since I am paying for Route 53 and I care a lot about how fast my websites are, I decided it was time to switch to something better.

My hunt for better DNS brought me to DNS Made Easy. I did some research and tested sites already using DNS Made Easy, and discovered that they are drastically faster than both Route 53 and the free DNS I get from SoftLayer. For $5/month you get up to 25 domains and up to 10 million queries. Additional queries can be purchased at a discount up front, or can be automatically billed in the event you unexpectedly go over. They also provide some awesome features like vanity name servers and DNS failover.

Anyways, it took me about 3 minutes to get everything switched over and now I’m just waiting for everything to propagate.

Ich bin ein Berliner

Today I completed my first website designed entirely for a foreign language. Wegwerf-eMail-Adresse is a clone of my Fake Mail Generator site, but has been tailored for German visitors.

It was a lot of fun to work on. Most of the work was straightforward (replacing English with German), but there were a few interesting bits having to do with date/time formatting, time zones, and making sure the URLs were all in German.
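To give a flavor of the date/time part, here is a small sketch of formatting a UTC timestamp the way a German visitor would expect to read it. It is an assumption-laden illustration, not code from the site: the function name, the exact output format, and the month table are mine, and the time zone is passed in by the caller.

```python
from datetime import datetime, timedelta, timezone

GERMAN_MONTHS = [
    "Januar", "Februar", "März", "April", "Mai", "Juni",
    "Juli", "August", "September", "Oktober", "November", "Dezember",
]


def format_german(dt_utc, tz):
    """Convert an aware datetime to `tz` and render it as e.g. '3. Oktober 2012, 14:05 Uhr'."""
    local = dt_utc.astimezone(tz)
    month = GERMAN_MONTHS[local.month - 1]
    return f"{local.day}. {month} {local.year}, {local:%H:%M} Uhr"


# Fixed +02:00 offset used here only to keep the example self-contained.
summer_time = timezone(timedelta(hours=2))
stamp = datetime(2012, 10, 3, 12, 5, tzinfo=timezone.utc)
print(format_german(stamp, summer_time))  # 3. Oktober 2012, 14:05 Uhr
```

In practice a named zone such as `zoneinfo.ZoneInfo("Europe/Berlin")` (Python 3.9+) is a better choice than a fixed offset, because it handles the switch between summer and winter time.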

Why you shouldn’t have fake pages on your site

I’m in the market for some rack space in a colocation facility. I’ve been running the numbers and it looks like I could save some substantial cash and add redundancy to my websites by buying a couple of servers instead of renting from SoftLayer. But where to colocate?

Ideally it’d be somewhere I either already live, am moving to, or near someone I visit often. I don’t plan on living in Connecticut any longer than I have to but I have no idea where I’m going to move, so that leaves me with the option of near someone I visit often. My parents have a goal of moving overseas so that leaves Becca’s parents in the Roanoke, VA area.

So I search for “roanoke, va colocation” and lucky me! The first result is a Roanoke colo from a company called Coloco! I check the pricing, spend time crunching numbers, checking my bandwidth usage to see what I need, pricing servers, etc. Becca then asks where the colo is actually located. I search their site but can’t find an address. Weird. They give the addresses of other locations. The page definitely says Roanoke. Where in the world is this colo?

And then I realize what is happening. This company has flooded Google with fake pages that say whatever city name you are looking for. To test my theory, I visit:’s%20basement.HTM

Sure enough, I’m greeted with this entirely convincing sales pitch (emphasis added):

Grrr. I’ve just wasted 30-45 minutes evaluating a spammy company that is at least 3 hours from where I want to host my servers. I’m not sure which misguided individual at their company decided it’d be a good idea to introduce their company to the world using blatant lies, but I’m definitely not going to host with these guys.

I decided to thank them for wasting my time by offering them some free SEO services. I’ve submitted a few of their URLs to Google that were missing before, including your mom’s basement, the ghetto, and the ball pit at your local McDonald’s. I sure hope it gives them some extra traffic.

How iptables earned me an extra $500 per year

A few weeks ago I started taking a more active role in monitoring the traffic going to my server. I discovered that lots of people were scraping my sites, or in other words, they were writing programs to extract the data off of my sites without actually browsing them in something like Chrome or Firefox. Very rude.

So I started using iptables, a Linux program that lets you configure the kernel firewall, to block IP addresses that were obviously abusing my services.

One of these scrapers was very persistent. They were scraping my ABA Number Lookup site instead of using the very inexpensive API that I provided. As soon as I blocked an IP, a new one started up. I probably would have let them get away with it but their programming was atrocious. Within the space of a few minutes they were looking up the same routing numbers dozens of times instead of looking up unique routing numbers. So I kept blocking their IPs until apparently they ran out, and the scraping stopped.
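The pattern described above, one address requesting the same page over and over, is easy to spot mechanically. The sketch below is not the author's tooling: it assumes access logs in the common Apache/nginx format, a made-up threshold of 10 repeats, and hypothetical function names. The generated `iptables -A INPUT -s <ip> -j DROP` lines use standard iptables options for appending a rule that drops all traffic from one source address.

```python
import re
from collections import Counter

# Matches the client IP at the start of the line and the requested path
# inside the quoted request, e.g. "GET /some/page HTTP/1.1".
LOG_LINE = re.compile(r'^(\S+) .*?"(?:GET|POST|HEAD) (\S+)')


def repeat_offenders(log_lines, max_repeats=10):
    """Return sorted IPs that requested any single path more than `max_repeats` times."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:
            hits[match.groups()] += 1  # key is the (ip, path) pair
    return sorted({ip for (ip, _path), count in hits.items() if count > max_repeats})


def drop_rules(ips):
    """Build one iptables command per IP; printing them leaves the final call to a human."""
    return [f"iptables -A INPUT -s {ip} -j DROP" for ip in ips]
```

Printing the commands rather than running them keeps a person in the loop, which matters when a bad rule can lock out a paying customer, or yourself.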

A few days later I was hanging out with my family when my cell phone started ringing on my business line. I answered the phone and was greeted by an individual who needed help signing up for the API. I gave him the information he needed, and then he bashfully asked if I could unblock their IP addresses. Ah hah! This was the man that was hammering my server! Turns out he works for a finance-related company on Wall Street and instead of paying the measly $1 per thousand look-ups he was scraping my site.

So now they are using the API like they should have been the whole time, and I’m making an extra $500 per year. Yay!

Moral of the story: Sometimes it pays to check your logs.

SoftLayer hardware firewall is awful

I’ve been having a problem lately with people hitting my server more than I’d like. I’ve been using iptables to drop requests from these IPs, but I wanted something that took the load completely off of my server, and could be bypassed in case I put in a bad rule that locked me out. So I decided to try the SoftLayer hardware firewall.

This feature is expensive: $49 per month for a 10Mbps hardware firewall. That is a lot, but I figured it would be worth it to have the added protection for my server. Sadly, I was wrong. The interface to manage the firewall is garbage.

It shows a basic form listing all the firewall rules. You select the rule priority by numbering the rules 1 through however many rules you have. There is, however, no way to add a new rule to the top of the list (or the middle of the list) unless you re-number every single rule. If you have 20 rules, this can be tedious. If you have 100 rules, this can be extremely frustrating. Even worse, if you make a mistake and have a duplicate priority number then the page refreshes with all the rules set to a priority of “1”. So now you have to start all over.
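For contrast, inserting a rule at a given priority and renumbering everything after it is a small operation. The sketch below is purely illustrative and has nothing to do with SoftLayer's actual implementation; the function name and the rule strings are hypothetical. iptables offers the same idea directly with `iptables -I <chain> <rulenum>`, which inserts a rule at that position and shifts the rest down.

```python
def insert_rule(rules, new_rule, position):
    """Insert `new_rule` at 1-based `position` and return (priority, rule) pairs.

    `rules` is a list already ordered by priority, highest priority first.
    """
    if not 1 <= position <= len(rules) + 1:
        raise ValueError(f"position must be between 1 and {len(rules) + 1}")
    ordered = rules[:position - 1] + [new_rule] + rules[position - 1:]
    # Renumber from 1 so priorities stay unique and contiguous.
    return list(enumerate(ordered, start=1))


existing = ["allow tcp 22", "allow tcp 80"]
for priority, rule in insert_rule(existing, "deny 203.0.113.5", 1):
    print(priority, rule)
```

Because the priorities are derived from list order, duplicates cannot occur, which sidesteps the "everything resets to 1" failure described above.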

Unlike iptables, there is very little help available for the firewall. SoftLayer provides a handful of knowledge base articles, but none of them include screenshots or advanced examples.

Within a few hours of using the service, I realized that it wasn’t going to work out and I’d be better off with iptables. I started a chat with the billing department, and was told they’d create a ticket to review crediting me for the service, and then they gave me instructions on how to cancel. I followed the instructions and the service was promptly removed from my server.

Sadly, I was informed that the terms of service prevented them from giving me a refund, and they said that the billing department only said they’d look into crediting me for the service, not that they actually would. How deceptive! Do they really expect me to believe that their own billing department doesn’t know what they claim is the standard SoftLayer refund policy? I likely still would have canceled the service, but I feel pretty ripped off having used the service for just a few hours, experiencing several issues with it, being told (from my perspective) that I’d get a refund, and then being stuck with the bill for a full month of service. Egh.

Overall, SoftLayer is awesome. Quality servers at great prices. In this situation though, complete fail. The firewall is garbage and they handled the situation very poorly. I’m not disappointed enough to start immediately hunting for a new host, but I’ll definitely be considering other options for my future server needs.
