Wednesday, October 23, 2013

How to Use LVM and LUKS with EBS Volumes

A while back, I posted my findings on encryption at rest using LUKS. Circling back, here's the procedure I used. Although I was operating on Ubuntu 12.04 and EBS volumes, the same procedure can be used in many different scenarios.

Overview

Here's what the end goal looks like for each mountpoint. If you have multiple mountpoints, this entire structure is duplicated once per mountpoint.


  1. Four EBS volumes
  2. One LVM physical volume on each EBS volume
  3. One LVM volume group with four physical volumes
  4. One LVM logical volume taking 100% of the free space from the volume group
  5. One LUKS partition on the logical volume
  6. One filesystem on the LUKS partition
  7. One mountpoint for the filesystem

Block Devices

If you've already got the disks in /dev/, then you can skip this section.

The first thing you need to do is establish your block devices. For me, that meant creating and attaching a number of EBS volumes. I found that using 4 striped block devices yielded the fastest I/O, so that's what I used and what I recommend. These disks don't have to be EBS volumes; they could be physical disks attached to your backup machine, ephemeral storage on AWS instances, or a couple of thumb drives. Except for EBS volumes, I'm going to assume that you know how to attach block devices to your server.

There are a couple ways to create and attach EBS volumes. You could use the command line tools or the AWS console. If you don't know how to do this step, AWS has great documentation. Go with the AWS console and perhaps even ask for help in their forums.

As for device names, I use /dev/xvdf1, /dev/xvdf2, /dev/xvdf3, /dev/xvdg1, /dev/xvdg2, /dev/xvdg3, etc. I typically will give each separate mountpoint its own device letter. Just for reference, AWS uses sdf1 instead of xvdf1.

Each block device should be of the same size as all the others. We'll be using LVM to create a striped RAID array. If one disk is larger than the others, you will run into errors when creating logical volumes.
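A quick pre-flight check can catch a size mismatch before LVM does. This is only a sketch: `same_size` is a hypothetical helper written for this post, and the device names in the usage comment are the example names from above. `blockdev --getsize64` prints a device's size in bytes.

```shell
# same_size SIZE...: succeed only if every size passed as an argument is identical.
# (Hypothetical helper for illustration; not part of LVM.)
same_size() {
    first="$1"
    for size in "$@"; do
        [ "$size" = "$first" ] || return 1
    done
    return 0
}

# Usage with real devices (needs root; device names are examples):
#   same_size $(for d in /dev/xvdf1 /dev/xvdf2 /dev/xvdf3 /dev/xvdf4; do
#                   sudo blockdev --getsize64 "$d"; done) \
#       && echo "sizes match" || echo "sizes differ"
```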

EBS Security Warning

When an EBS volume is newly created, it is full of zeros. LUKS will only fill the disk with encrypted data as it needs to. So, unless you want to give a hacker clues about which portions of the disk are full of "real" data, you would be smart to sanitize the disk. One way to do that is to fill it with data from /dev/urandom. The way that I've found to be the fastest is to create a LUKS partition with a very small key, write zeros to it, and then delete the LUKS partition.
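Here is one way the throwaway-LUKS trick could look. Treat it as a sketch under assumptions: the mapping name `scrub`, the key path `/tmp/scrub.key`, and the use of a short random key file as the "very small key" are my choices for illustration, and the final 4 MiB overwrite assumes a default LUKS1 header fits in that space. The script only prints the commands unless `RUN=sudo` is set, because running it destroys everything on the device.

```shell
# Dry run by default: each command is printed instead of executed.
# Set RUN=sudo to run for real. THIS DESTROYS ALL DATA ON THE DEVICE.
RUN="${RUN:-echo}"

# scrub_device DEVICE: fill DEVICE with ciphertext via a throwaway LUKS mapping.
scrub_device() {
    dev="$1"
    key="/tmp/scrub.key"   # arbitrary path; the key never needs to be kept
    $RUN dd if=/dev/urandom of="$key" bs=32 count=1
    $RUN cryptsetup --batch-mode luksFormat "$dev" "$key"
    $RUN cryptsetup luksOpen --key-file "$key" "$dev" scrub
    # Zeros written to the mapping land on disk encrypted, so they look random.
    # dd ends with "No space left on device" once the disk is full; that is expected.
    $RUN dd if=/dev/zero of=/dev/mapper/scrub bs=1M
    $RUN cryptsetup luksClose scrub
    # Overwrite the leftover LUKS header at the start of the device.
    $RUN dd if=/dev/urandom of="$dev" bs=1M count=4
    $RUN shred -u "$key"
}

scrub_device /dev/xvdf1
```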

LVM Setup

Now that we have block devices, we're going to create physical volumes, a volume group, and a logical volume. Before you can do that, you'll need to install the LVM tools. On Ubuntu 12.04, the command is:

`sudo apt-get install lvm2`

Physical Volumes

For each block device, create a new physical volume. This writes LVM headers to the disk. The command is very simple.

`sudo pvcreate /dev/xvdf1`
`sudo pvcreate /dev/xvdf2`
...

You can now use `sudo pvscan` to display all the physical volumes. `sudo pvdisplay` will show additional details.
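With several devices, a small loop saves typing. The sketch below assumes the example device names from this post; `create_pvs` is a hypothetical wrapper, and by default it only prints the commands it would run.

```shell
# Dry run by default: commands are printed. Set RUN=sudo to execute them.
RUN="${RUN:-echo}"

# create_pvs DEVICE...: write an LVM physical volume header to each device.
create_pvs() {
    for dev in "$@"; do
        $RUN pvcreate "$dev" || return 1
    done
}

create_pvs /dev/xvdf1 /dev/xvdf2 /dev/xvdf3 /dev/xvdf4
```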

Volume Group

Create exactly one volume group with all the physical volumes.

`sudo vgcreate myvg /dev/xvdf1 /dev/xvdf2 ...`

You can now use `sudo vgscan` to display all the volume groups. `sudo vgdisplay` will show additional details.

Logical Volume

Create exactly one logical volume on the volume group. The logical volume will take up 100% of the volume group space.

`sudo lvcreate -i4 -I4 -l100%VG -nmylv myvg`

You can now use `sudo lvscan` to display all the logical volumes. `sudo lvdisplay` will show additional details.
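The short flags in the `lvcreate` command are terse, so here is the same call spelled with long options. This is a sketch: `create_striped_lv` is a hypothetical wrapper, and it prints the command unless `RUN=sudo` is set.

```shell
# Dry run by default: commands are printed. Set RUN=sudo to execute them.
RUN="${RUN:-echo}"

# create_striped_lv VG LV STRIPES: one logical volume spanning the whole group.
#   --stripes N      spread data across N physical volumes   (-i)
#   --stripesize 4   stripe size, in KiB by default          (-I)
#   --extents 100%VG use all of the volume group's space     (-l)
#   --name LV        name of the new logical volume          (-n)
create_striped_lv() {
    $RUN lvcreate --stripes "$3" --stripesize 4 --extents 100%VG --name "$2" "$1"
}

create_striped_lv myvg mylv 4
```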

Each volume group supports multiple logical volumes, but in practice, I like to associate a set of physical volumes with a single mountpoint and a single application. The only exception that I've found is ephemeral storage. AWS provides zero to many ephemeral volumes and I use LVM with RAID 0 to create 20G swap, 20G /tmp, and XG /mnt partitions.

A logical volume is presented as a block device. At this point, you should be able to see /dev/myvg/mylv on the file system. If you want to skip LUKS setup, do it. This block device can be used for XFS, EXT3, or any other file system you desire.

LUKS Setup

First, let me urge you to think about your keys. Linux Unified Key Setup (LUKS) partitions are unrecoverable if the keys are lost. You don't want to find yourself in a situation like that of tech startup Recurly. They lost their encryption keys due to a hardware failure, and it cost them a lot of time, money, and customers to recover. It's very important that you keep your keys in redundant safe locations. At Lucid, we have 2 keys per LUKS partition, and each key is stored in at least 2 different secure locations. I highly recommend you do something similar.

Before we create the LUKS partition, we need to install LUKS. On Ubuntu 12.04:

`sudo apt-get install cryptsetup`

Create Keys

Any file or passphrase can be used as a key for LUKS. I recommend using two different 4kB random files. To generate these, run these commands.

`dd if=/dev/urandom of=luks1.key bs=4k count=1`
`dd if=/dev/urandom of=luks2.key bs=4k count=1`
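If other users share the machine where you generate the keys, it may be worth creating the files with owner-only permissions. A minimal sketch, where `make_key` is a hypothetical helper and the temporary directory stands in for wherever you actually keep keys:

```shell
# make_key FILE: create a 4 KiB random key file readable only by its owner.
make_key() {
    (
        umask 077   # new files get mode 600; the subshell keeps the umask local
        dd if=/dev/urandom of="$1" bs=4k count=1 2>/dev/null
    )
}

workdir="$(mktemp -d)"          # stand-in location for this example
make_key "$workdir/luks1.key"
make_key "$workdir/luks2.key"
ls -l "$workdir"                # both files should be 4096 bytes, mode -rw-------
```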

Create Backups of Keys

Reminder! Now is a good time to create backups of your keys. Make sure they are in multiple locations. Again, you don't want to lose these keys.
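One way to confirm later that a backup copy is still intact is to record a checksum of each key when you create it. This is a suggestion rather than part of the original procedure; `record_checksums` and `verify_checksums` are hypothetical helpers around the standard `sha256sum` tool.

```shell
# record_checksums OUTFILE KEY...: write SHA-256 checksums of the key files.
record_checksums() {
    out="$1"; shift
    sha256sum "$@" > "$out"
}

# verify_checksums SUMFILE: succeed only if every listed file still matches.
# Paths are stored as given, so verify from the same directory layout.
verify_checksums() {
    sha256sum --check --quiet "$1"
}

# Usage:
#   record_checksums luks-keys.sha256 luks1.key luks2.key
#   verify_checksums luks-keys.sha256 && echo "keys intact"
```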

Linux Modules

To help LUKS perform at its peak, we need to enable some kernel modules. The modules we need are:
  1. dm-crypt
  2. aes
  3. rmd160
Run these commands to load each of these modules.

`sudo modprobe <module>`
`echo <module> | sudo tee -a /etc/modules`
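The two commands can be looped over all three modules. The sketch below also avoids duplicate entries if it is run more than once. `ensure_line` is a hypothetical helper, and the script writes to a temporary stand-in file unless `MODULES_FILE` points at the real `/etc/modules`.

```shell
# Dry run by default: modprobe is printed. Set RUN=sudo to execute it.
RUN="${RUN:-echo}"

# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Stand-in file for the example; for real use, run as root with
# MODULES_FILE=/etc/modules.
modules_file="${MODULES_FILE:-$(mktemp)}"

for mod in dm-crypt aes rmd160; do
    $RUN modprobe "$mod"
    ensure_line "$modules_file" "$mod"
done
```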

Create LUKS Partition

We're going to create the LUKS partition with one of the keys, and then add the second key. This is possible because your keys are not actually the encryption keys. LUKS will encrypt the actual encryption key using your key files. The actual encryption key is stored multiple times (once per key file) on the LUKS partition itself.

`sudo cryptsetup luksFormat --cipher aes-cbc-essiv:sha256 --hash ripemd160 --key-size 256 /dev/myvg/mylv luks1.key`
`sudo cryptsetup luksAddKey --key-file luks1.key /dev/myvg/mylv luks2.key`
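To confirm that both keys were registered, you can inspect the header with `cryptsetup luksDump`. The helper below is a sketch that assumes the LUKS1 dump format, which prints one `Key Slot N: ENABLED` or `DISABLED` line per slot; `count_enabled_slots` is a hypothetical name.

```shell
# count_enabled_slots: read `cryptsetup luksDump` output on stdin and print
# how many key slots are in use (assumes the LUKS1 "Key Slot N: ENABLED" format).
count_enabled_slots() {
    grep -c '^Key Slot [0-7]: ENABLED'
}

# Usage (needs root); after adding the second key this should print 2:
#   sudo cryptsetup luksDump /dev/myvg/mylv | count_enabled_slots
```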

Open LUKS Partition

The LUKS partition exists on /dev/myvg/mylv, but there is no way to access the data in its unencrypted format right now. We need to open the LUKS partition. We can do this using either key file.

`sudo cryptsetup luksOpen --key-file <key file> /dev/myvg/mylv myencryptedlv`

There is a new block device waiting for you in /dev/mapper/myencryptedlv. When data is written to this block device, LUKS will encrypt it and write it to the logical volume we set up earlier.

XFS Setup

Your volume does not have to use XFS; any file system will work. I chose XFS because I'm familiar with it.

`sudo mkfs.xfs /dev/mapper/myencryptedlv`

Mounting

The new XFS filesystem can be mounted now, but it won't be available at boot time. We'll add 'noauto' to the fstab to make sure the server doesn't get stuck in booting state waiting for a drive to show up.

`sudo mkdir /mnt/mymountpoint`
`echo "/dev/mapper/myencryptedlv /mnt/mymountpoint xfs rw,noatime,nouuid,noauto" | sudo tee -a /etc/fstab`
`sudo mount /mnt/mymountpoint`

You can verify your new setup using `df` or `mount`. You have done it!

Common Scenarios

Detaching EBS Volumes

If you ever want to detach an EBS volume while the instance is still running, you'll need to perform a few operations first. We need to unmount the filesystem, close LUKS, and disable LVM.
  1. `sudo umount /mnt/mymountpoint`
  2. `sudo cryptsetup luksClose myencryptedlv`
  3. `sudo vgchange -an myvg`
You can now detach the EBS volumes from the instance.
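Because the order matters, the three steps above can be wrapped so that a failure stops the sequence. A sketch, with `release_volume` as a hypothetical wrapper that prints the commands unless `RUN=sudo` is set:

```shell
# Dry run by default: commands are printed. Set RUN=sudo to execute them.
RUN="${RUN:-echo}"

# release_volume MOUNTPOINT MAPPING VG: unmount, close LUKS, deactivate LVM.
# Each step runs only if the previous one succeeded.
release_volume() {
    $RUN umount "$1" &&
    $RUN cryptsetup luksClose "$2" &&
    $RUN vgchange -an "$3"
}

release_volume /mnt/mymountpoint myencryptedlv myvg
```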

Shutdown & Reboot

No special procedure or commands are needed to reboot or shut down the machine. Operate as you normally would.

`sudo reboot`
`sudo shutdown -h now`
Stop / Terminate from the AWS console

Startup

The encrypted volume will not show up when you boot a machine, because it can't - it doesn't have the keys. Hopefully you were smart enough to not store the encryption keys on the box. After you get one of the keys on the server, run these commands.

`sudo cryptsetup luksOpen --key-file <key file> /dev/myvg/mylv myencryptedlv`
`sudo mount /mnt/mymountpoint`
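Since these commands are easy to forget under pressure, it can help to keep them in a small script on the server (without the key). The sketch below is one possible shape: `unlock_and_mount` is a hypothetical name, the `vgchange -ay` step is an extra I added in case the volume group is not active yet, and the key copy is shredded whether or not the unlock succeeds.

```shell
# Dry run by default: commands are printed. Set RUN=sudo to execute them.
RUN="${RUN:-echo}"

# unlock_and_mount KEYFILE: activate LVM, open LUKS, mount, then scrub the key.
unlock_and_mount() {
    key="$1"
    $RUN vgchange -ay myvg &&
    $RUN cryptsetup luksOpen --key-file "$key" /dev/myvg/mylv myencryptedlv &&
    $RUN mount /mnt/mymountpoint
    status=$?
    # Remove the local key copy regardless of the outcome above.
    $RUN shred -u "$key"
    return "$status"
}

# Usage: unlock_and_mount /path/to/copied/luks1.key
```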

Increase Disk Space by Adding More Disks

If you already have an existing encrypted LVM/LUKS setup and want to increase the size by adding more disks, follow this procedure. The new disks don't have to be the same size as previously added disks, but they should be the same size as all the other disks being added right now. First, attach the new EBS volumes to the instance. Then, run these commands:

`sudo pvcreate /dev/xvdf5`
`sudo pvcreate /dev/xvdf6`
...

`sudo vgextend myvg /dev/xvdf5`
`sudo vgextend myvg /dev/xvdf6`
...

`sudo lvextend -i4 -I4 -l100%VG /dev/myvg/mylv`

`sudo cryptsetup resize /dev/mapper/myencryptedlv`

`sudo xfs_growfs /mnt/mymountpoint`

Verify using `df`.
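The whole growth procedure can be chained so that each layer is resized only if the layer beneath it succeeded. This is a sketch: `grow_volume` is a hypothetical wrapper, it keeps the `-i4 -I4` striping from the command above (so it assumes four new devices), and it prints the commands unless `RUN=sudo` is set.

```shell
# Dry run by default: commands are printed. Set RUN=sudo to execute them.
RUN="${RUN:-echo}"

# grow_volume VG LV MAPPING MOUNTPOINT NEWDEV...
# Extends, in order: volume group, logical volume, LUKS mapping, XFS filesystem.
grow_volume() {
    vg="$1"; lv="$2"; mapping="$3"; mnt="$4"; shift 4
    for dev in "$@"; do
        $RUN pvcreate "$dev" || return 1
    done
    $RUN vgextend "$vg" "$@" &&      # vgextend accepts several devices at once
    $RUN lvextend -i4 -I4 -l100%VG "/dev/$vg/$lv" &&
    $RUN cryptsetup resize "/dev/mapper/$mapping" &&
    $RUN xfs_growfs "$mnt"
}

grow_volume myvg mylv myencryptedlv /mnt/mymountpoint \
    /dev/xvdf5 /dev/xvdf6 /dev/xvdf7 /dev/xvdf8
```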

Thinking Ahead

LVM and LUKS have been a tremendous help in managing Lucid's infrastructure. They have also presented their own set of problems and challenges. Allow me to shed some light on the fun you'll have in the near future.

Snapshots

Think about taking snapshots, deleting unused snapshots, restoring from snapshots, migrating snapshots to other regions, and anything else you do with snapshots. Introducing more EBS volumes means introducing more complexity in their snapshots. This is nothing that a short script won't handle, but be aware that it can get painful. For ease of management, I recommend that each logical snapshot contains a UUID and a count of EBS volumes used to create it.
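As an illustration of that recommendation, the sketch below stamps every EBS snapshot in a set with a shared ID and a "part N of M" marker in its description. The `lvm-set=... part=N/M` format, the function names, and the volume IDs are all made up for this example; `aws ec2 create-snapshot` with `--volume-id` and `--description` is the AWS CLI call. The commands are printed unless `RUN` is set to empty.

```shell
# Dry run by default: commands are printed. Set RUN= (empty) to execute them.
RUN="${RUN-echo}"

# snapshot_description SET_ID INDEX TOTAL: build the description string.
snapshot_description() {
    printf 'lvm-set=%s part=%s/%s' "$1" "$2" "$3"
}

# snapshot_set SET_ID VOLUME_ID...: snapshot every volume in one logical set.
snapshot_set() {
    set_id="$1"; shift
    total="$#"
    index=0
    for vol in "$@"; do
        index=$((index + 1))
        $RUN aws ec2 create-snapshot --volume-id "$vol" \
            --description "$(snapshot_description "$set_id" "$index" "$total")"
    done
}

# Usage (placeholder IDs; a UUID from `uuidgen` would work as the set ID):
#   snapshot_set 6f1c2a9e vol-aaaa1111 vol-bbbb2222 vol-cccc3333 vol-dddd4444
```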

Server Reboots

With an encrypted disk, every time you reboot a server, manual intervention is required to bring it back up fully. That means somebody with access to an encryption key must be available 24/7. Also, starting a server takes an additional minute or so, because of the time needed to scp the key file, remember the cryptsetup command, and mount the disk.

Key File Scrubbing

You don't want to leave your key files anywhere but secure locations. Every place you scp the file needs to be scrubbed. We use this simple command.

`shred -u luks1.key`

It overwrites the keyfile with random junk, and then deletes the file.

9 comments:

  1. Good review. I was wondering how does the system work with automatic snapshots? For instance system could be in the middle of writing to encrypted volume when snapshot is taken. Would this render snapshot useless?

    Replies
    1. My snapshot process looks like this:

      1. flush the database
      2. lock the database
      3. lock the filesystem
      4. sync
      5. initiate the snapshot
      6. unlock everything

      I haven't had any problems with consistency or uselessness of snapshots.

  2. Nice post, can you clarify what is the purpose of having two luks.keys if they can both open the partition?

    Replies
    1. Redundancy. The last thing I want is to lose a key or find out it's corrupted.

  3. Very nice. Can you explain how to restore file-system from ebs snapshots? each ebs snapshot should be attached and created as a pv, but the pvs of the original volumes are still attached to the system, so how do you create logical volume, replacing the existing one without erasing any data?

    Replies
    1. The easiest way is to remove the original volumes. LVM uses device UUIDs to track the EBS volumes. If you restore a snapshot of the same volume to the same machine while the original volume is still attached, there will be 2 volumes with the same device UUID. This will confuse LVM. See http://serverfault.com/questions/313034/lvm-regenerate-a-uuid for changing device UUIDs.

  4. This comment has been removed by the author.

  5. I wonder if you have automated your reboot by, let's say, downloading the LUKS key from S3 on boot using cloud-init or EC2 instance roles, so manual intervention is not needed? Obviously the same process would have to remove the key when finished.

    Although by now you might be using encrypted EBS volumes instead, I thought it might be worth asking.

    Replies
    1. No, the encrypted volumes typically house MySQL databases. Because of a complicated MySQL server replication configuration, we choose not to automate the reboots.

      Since I wrote this post, AWS has announced encrypted EBS volumes. These use a more secure mechanism for reboot persistence of keys, but come with the downside that you have to re-encrypt the data for an off-site disaster recovery solution.

      https://aws.amazon.com/about-aws/whats-new/2014/05/21/Amazon-EBS-encryption-now-available/
