June 2009
May 2009
April 2009
March 2009
February 2009
January 2009
December 2008
November 2008
October 2008
September 2008
August 2008
July 2008
We’ve been busy over the past several months working hard on what we consider a fundamental piece of infrastructure that the network has been lacking for quite some time. From “ping server for APIs” to “message bus”, we’ve been called a lot of things; and we are actually all of them rolled into one. I want to provide some insight into what our backend architecture looks like as systems like this generally don’t get a lot of fanfare, they just have to “work.” Another title for this blog post could have been “The Glamorous Life of a Plumbing Company.”
First, some production numbers.
Second, our approach.
Simplicity wins. These production transaction rate numbers, while solid, are not earth shattering. We have however, achieved much higher rates in load tests. We optimized for activity retrieval (outbound) as opposed to delivery into Gnip (inbound). That means every outbound POST/GET, is moving static data off of disk; no math gets done. Every inbound activity results in processing to ensure proper Filtration and distribution; we do the “hard” work on delivery.
We view our core system as handling ephemeral data. This has allowed us, thus far, to avoid having a database in the environment. That means we don’t have to deal with traditional database bottlenecks. To be sure, we have other challenges as a result, but we decided to take on those as opposed to have the “database maintenance and administration” ball and chain perpetually attached. So, in order to share contentious state across multiple VMs, across multiple machine instances, we use shared memory in the form of TerraCotta. I’d say TerraCotta is “easy” for “simple” apps, but challenges emerge when you start dealing with very large data sets in memory (multiple giga-bytes). We’re investing real energy in tuning our object graph, access patterns, and object types to keep things working as Gnip usage increases. For example, we’re in the midst of experimenting with pageable TerraCotta structures that ensure smaller chunks of memory can be paged into “cold” nodes.
When I look at the architecture we started with, compared to where we are now, there are no radical changes. We chose to start clustered, so we could easily add capacity later, and that has worked really well. We’ve had to tune things along the way (split various processes to their own nodes when CPU contention got too high, adjust object graphs to optimize for shared memory models, adjust HTTP timeout settings, and the like), but our core has held strong.
Our Stack
High-Level Core Diagram
Gnip owes all of this to our team & our customers; thanks!
« Newer Post Older Post »