Friday, June 21, 2013

Northeast PHP Conference 2013 Sessions

In addition to the strangeloop workshop, I've been accepted to speak at the 2013 Northeast PHP Conference. The Northeast PHP Conference is a web developer conference focused on PHP, Web Technology, and UX. The conference is in Boston, Massachusetts, on Aug 16-18, 2013.

Wednesday, June 19, 2013

Encryption at Rest using LUKS

At past companies, encryption at rest was done at the application layer. Only part of the data had to be encrypted, so code was inserted into the model that would encrypt the sensitive data before inserting into the database and decrypt after retrieval. This approach worked, and had no impact on the database - the hardest layer to scale.

At Lucidchart, we have failed to close large sales due to lack of encryption. Large companies want to make sure that their proprietary information is transmitted and stored using industry standard encryption. I took on the task to find a method of encryption that made the most sense for our use case, and had little overhead on our systems. After a lot of testing, benchmarking, and evaluating, I came to the conclusion that encrypting the disks on our database servers using LUKS was, and still is, the best solution.

Thursday, May 23, 2013

Strange Loop 2013 Workshop

I've been accepted as a speaker at strangeloop, a programmer conference that focuses on emerging languages, concurrent and distributed systems, new database technologies, front-end web, and mobile apps. The conference is in St Louis, Missouri, on September 18-20, 2013.


My workshop is entitled Your First Scala Web Application using Play! 2.1. Here's the quick blurb I submitted.

Wednesday, May 8, 2013

Failover with PHP & MySQL using IPTables

Not too long ago, I had to replace a MySQL server in our production environment. Its replacement would have the same CNAME, and so no configuration change was needed. This particular server is in a master-master replication set with another database server. The application is set up with automated failover between the two servers. While the failover was automated, I didn't want to just shut down the database server because its slave may miss some writes. I needed a way to fake shutdown to the web servers, but keep communication open between database servers.

Looking back, it would have been a lot easier to change configuration, wait 30 minutes for it to propagate, replace the MySQL instance, change the configuration again, and wait 30 minutes for it to come back up. At the time, however, I thought of a simpler solution. I would use iptables to reject all traffic from web servers to this one database server. Once the connection was rejected, the failover would take place, and the replication would continue to work.

Not only did my plan not work, it caused about 5 minutes of downtime on our production servers.

Monday, April 22, 2013

Storing UUIDs with Anorm

I am not a big fan of Object Relational Mappers (ORMs). It is too easy to turn a simple query into a massive CPU-eating hydra - complete with 7 outer joins and no index usage. When I write code, I try to write it in such a way that the worst things to do are the hardest. That way, when I'm in a rush, I still write good code.

Enough about my rant: enter Anorm. I've been developing software on the Scala / Play stack for over a year now. Anorm is currently the supported database connector. It is a thin layer on top of JDBC that takes out all of the ugliness, but leaves the manual SQL writing goodness.

One of the things that Anorm does not do is read and write UUIDs. In 200 lines of copy-paste code, you'll have Anorm reading and writing UUIDs like a pro. A side effect of this code is that Anorm will also be able to handle byte arrays.

Wednesday, April 17, 2013

"In" Clause with Anorm

At Lucidchart, two coding styles have been adopted because Anorm doesn't support the 'in' clause.

  1. Code that allows for, or is conducive to, SQL injection.

    Example:
    SQL(""" SELECT * FROM users WHERE id IN (%s) """.format(userIds.mkString(",")))

  2. Code that is hard to follow and duplicate.

    Example
    val params = List(1, 2, 3) 
    val paramsList = for ( i <- 0 until params.size ) yield ("userId" + 
    i) // results in List("userId0", "userId1", "userId2") 
    SQL(""" SELECT * FROM users WHERE id IN ({%s}) """.format(paramsList.mkString("},{")) // 
    produces "id in ({userId0},{userId1},{userId2})" 
    .on(paramsList.zip(params)) // produces Map("userId0" -> 1, 
    "userId1" -> 2, ...)
Ugly much? The madness ends here with code samples.

Wednesday, March 27, 2013

UUIDs in MySQL

Auto incremented numbers can be so nice for unique IDs in a database; they're guaranteed to be unique, can range to fit your needs (tinyint to unsigned bigint), and easy to use. When working in a sharded environment, they may not fit the bill anymore; you either lose global uniqueness or you lose ease of use (depending on your sharding mechanism). One very good alternative is to use UUIDs.

UUIDs solve the global uniqueness and ease of use problem at the same time, but introduces another problem in the database layer - memory footprint. A UUID is a 36 byte ASCII string of characters representing a 16 byte value. I've seen them stored as CHAR(36), VARCHAR(36), etc. The problem with storing them in their ASCII form is that the index will grow beyond the capacity of the database server faster than when they're stored in a binary form. Memory will fill faster, queries will hit the disk more frequently, and it will ultimately cost more money to save the exact same information. It's not worth it.