Tuesday, January 29, 2013

VPC Migration: Post Mortem

All done! Every last one of the servers is running inside of Amazon's VPC. For the most part, everything went as expected. There are just a few loose ends I'd like to note.

Friday, January 25, 2013

OpenVPN with Amazon VPC

Although I did not initially plan to set up a VPN between Lucidchart's office and the newly created VPC, I changed my mind before I even migrated the first server.

The reasoning is simple. I don't want our services to be publicly accessible; however, our office needs access to those services. The services I'm talking about include git, chef, apt, jenkins, and more. 

These services are not the only issue. Imagine a problem in production that requires manual debugging. I would have to tunnel through the NAT instance manually just to debug the problem server. When I'm having any issue in production, the last thing I want is an extra step.

Thursday, January 24, 2013

VPC Migration: DNS

Shortly after starting the migration to VPC, I ran into an unexpected issue with DNS. I hope to give a more informed view than I received prior to starting the migration.

Wednesday, January 23, 2013

VPC Migration: NATs & Bandwidth Bottleneck

I ran into an unexpected issue during the migration to VPC over the weekend. The NAT instances, all of which are t1.micro size, could not handle the network traffic between the web servers and the backend servers. Our traffic backed up to the point that requests started timing out. The disastrous result was downtime.

Tuesday, January 22, 2013

VPC Migration: Setup

In my last post, I gave reasons for and against the move to VPC. I have now set up my VPC and hope to help some trouble-bound soul avoid making the same mistakes.

The first thing to do in a migration to Amazon's Virtual Private Cloud is to set up the subnets, routes, gateways, and NAT instances. My intention in this post is to lay out the steps and general principles behind our VPC setup. The setup that I chose is not perfect for all situations. I assume you know how to operate the AWS console and are familiar with basic EC2 and networking concepts.
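
As a rough illustration of the subnet-planning step, here is a minimal sketch that carves a VPC address range into equally sized subnets using Python's standard `ipaddress` module. The `10.0.0.0/16` range, the `/24` subnet size, the subnet names, and the `plan_subnets` helper are all assumptions made for the example, not the layout used at Lucidchart.

```python
import ipaddress

# Hypothetical VPC address range; the post does not state the real one.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")

# AWS reserves five addresses in every VPC subnet (the first four and the last).
AWS_RESERVED_PER_SUBNET = 5


def plan_subnets(vpc, new_prefix, names):
    """Assign the first len(names) subnets of the given prefix length to names."""
    blocks = vpc.subnets(new_prefix=new_prefix)
    return {name: next(blocks) for name in names}


# Hypothetical naming scheme: one public and one private subnet per zone.
plan = plan_subnets(VPC_CIDR, 24, ["public-a", "private-a", "public-b", "private-b"])

for name, net in plan.items():
    usable = net.num_addresses - AWS_RESERVED_PER_SUBNET
    print(f"{name}: {net} ({usable} usable addresses)")
```

Sketching the address plan before creating anything in the console makes it easier to spot overlapping ranges or subnets that are too small for the instances they need to hold.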

Saturday, January 19, 2013

VPC Migration: Planning

I'm looking at moving all of Lucidchart's servers into Amazon's VPC. This is no small task, nor should it be approached without a good plan and collective knowledge.

I will be documenting the migration to VPC as it happens. As part of that, here are the advantages and disadvantages of moving to VPC, and my plan for doing it.

Saturday, January 5, 2013

Disk Failures and Service Interruptions

I am currently employed as the chief architect at Lucidchart. In my spare time (literally) I am also the ops guy. All of our servers are running on Amazon's EC2 cloud. Using the cloud is amazing and frustrating at the same time. Managing hardware, using tape drives, and dealing with co-location facilities are all nightmares; on the other hand, so are service outages, network failures, and ephemeral storage drives.

As the CTO of Amazon, Werner Vogels, says, "Everything fails all the time." I would like to give a report of one such failure: how it happened, what was affected, how we got through it, what I did because of it, and how I'll never have to deal with it again.