Wednesday, May 19, 2010

linode: a paradigm of indifference

my linode was unavailable for about 2 hours yesterday, from 11:50am to 1:47pm. the linode itself stayed up, but its connection to the internet was down. the newark datacenter it's hosted in went down, and though i could apparently reach its console, i could not connect to it any other way than through the control panel linode gives its users to admin their nodes. obviously they had some kind of connection established between their different datacenters if i could get to the console.

so i asked if i could migrate the node to another datacenter... not while the internet was down, apparently. and i couldn't do it without submitting a ticket and waiting for a response. to me this is aggravating. why is it that open-source and commercial VPS solutions can automatically migrate nodes from one dom0 to another, but linode can't? a question that will probably never be answered.

what troubled me more than the downtime and the lack of actual information or an ETA was the response in the IRC support channel. lots of people were joining the channel, asking if there was a problem, what the cause was, and when it would be resolved. for a service we all pay a minimum of about $24 a month for, probably the highest amount you can pay for a VPS with the same specs, this should be a completely acceptable set of questions. some people pay a lot more than that, btw. you'd think linode would want to be courteous and attentive to its customers' questions, right? or, i dunno, put the current status in the fucking topic of the channel?

they didn't update the topic. most of the people coming in were treated with sarcasm, and no effort was made to silence those who were in the channel merely to be dicks - which is common on irc, but not a good idea for an official support channel for a paid service. most people who were there out of concern that their services were unavailable were treated with general indifference, and in many cases as if they were simply pests and whiners. i guess most customers figured a 2 hour downtime in the middle of the day was something worth complaining about. i guess we were wrong.

there is still no explanation for the issue on the linode status page. at first we were told in the channel it was an issue with level 3, and that the london datacenter was also down. then it was only the newark datacenter. by repeatedly trying console access and traceroute/tracepath i could see the connection dropping right at the router just before my linode. eventually all of the newark datacenter's routes stopped responding, but they soon came back. sometime during this console access was cut off, so clearly there had been some real traffic getting to linode inside the DC until then.
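for anyone who wants to do the same kind of check, here's a rough python sketch of what i was doing by eye: read traceroute output and report the last hop that answered. it's only a sketch. the function name is made up, it assumes the usual linux traceroute layout where a silent hop shows up as `* * *`, and the sample output uses documentation-only addresses, not the real routes from the outage.

```python
def last_responding_hop(traceroute_output):
    """Return (hop_number, host) for the last hop that replied, or None."""
    last = None
    for line in traceroute_output.splitlines():
        tokens = line.split()
        # hop lines start with the hop number; skip the header and blank lines
        if not tokens or not tokens[0].isdigit():
            continue
        # a "*" means that probe timed out; keep only tokens that are replies
        answers = [t for t in tokens[1:] if t != "*"]
        if answers:
            last = (int(tokens[0]), answers[0])
    return last


# hypothetical sample output (RFC 5737 documentation addresses)
SAMPLE = """\
traceroute to 203.0.113.10 (203.0.113.10), 30 hops max, 60 byte packets
 1  192.0.2.1 (192.0.2.1)  1.1 ms  1.0 ms  0.9 ms
 2  198.51.100.1 (198.51.100.1)  5.1 ms  5.0 ms  5.2 ms
 3  * * *
 4  * * *
"""

print(last_responding_hop(SAMPLE))
```

if the last responding hop is the router right in front of your node, the problem is most likely at or past that router rather than somewhere on your side.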

because the datacenter's routers were inaccessible via traceroute for a short time, i assume they were somehow convinced there may indeed be a problem with their routers, and so i'm not ready to completely blame linode. but certainly they had some connectivity and could have offered to migrate our nodes to another DC so that, for example, we might have had only a 1-hour downtime instead of 2 hours. no such attempt was made. the estimated refund for this downtime is something like $0.08, which is of course nothing compared to the business lost by the linode users who suffered the downtime (SLAs never reimburse anywhere near the amount you lose, so you shouldn't count on them that much). not that it would have made a difference, but the heartless way we as customers were dealt with makes me really dislike this company. now i know that if i have a problem in the future, nobody's really going to try to help me. apparently they don't need me as a customer. and that's OK; i can find cheaper hosts with bigger caps elsewhere.
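for a rough sense of where a refund that small comes from, here's a sketch that assumes a plain pro-rata credit: the monthly fee times the fraction of the month spent down. that formula is an assumption for illustration, not linode's published SLA terms, and the function name is made up.

```python
def prorated_credit(monthly_fee, downtime_hours, days_in_month=31):
    """Credit owed if downtime is refunded strictly pro rata (assumed formula)."""
    hours_in_month = days_in_month * 24
    return monthly_fee * downtime_hours / hours_in_month


# figures from this post: about $24 a month, about 2 hours down, in may (31 days)
credit = prorated_credit(24.00, 2)
print(f"${credit:.2f}")
```

with those numbers the credit comes out to well under a dime, the same ballpark as the estimate above.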

in the end this will be helpful. i already had a secondary VPS i paid about $5 a month for, whose billing i let lapse out of laziness. this event will help motivate me to move back to that host and a couple of others and have truly redundant services for the same cost as the one node i'd been paying for at linode. sure, their web interface is fancy and you have a good deal of freedom. but considering the availability and the bad customer service? i think i'll go with the cheap guys.

if you want a cheap VPS, check out Special VPS and Low End Box. they review low-end VPS providers and give promo codes for them. by reading their reviews you can learn how to spot shady and unreliable hosts. do your research!

1 comment:

  1. Disclaimer: I am not a Linode employee and have no official ties to them. I'm just a very satisfied customer.

    First of all, migrating between DCs requires time and a lot of *stable* bandwidth. Second, there would still be a lot of downtime and site unavailability for customers because the IP(s) would change since they are not portable between DCs. Since the average TTL is around a day this would actually *increase* the downtime, not decrease it.

    I have a node in the Newark DC and did have clients complaining to me. If it had been enough of a concern I would have restored a backup of their site to a server in another DC, but two hours in two years is not enough to yell and scream about the way you are.

    I'm one of those "dicks" you commented about. However, I was only a "dick" to people like you who yelled and screamed about it in the *community* support channel (not official) but wouldn't contact the *official* support people to work through the problem. Linode is very responsive to customers, but *you* need to contact them to start the process.

    You may also want to go back and read the logs, then correct the totally inaccurate claims regarding the reason(s) for the outage. Level3 was blamed by customers having unrelated outages and nothing more.