Spread All Over: AppScale as a Distributed System

Posted by Raj Chohan on 8/18/14 5:21 PM


Google App Engine (GAE) operates at massive scale because multiple independent components work together to provide a highly available, fault-tolerant platform.

The technologies underlying GAE are well established inside Google, and each one operates at massive scale. Each API is backed by a Google technology: the High Replication Datastore by Megastore, memcache by a proprietary implementation of memcached, blobstore by Google Cloud Storage, and the XMPP and Channel APIs by Google Talk. Using these technologies, GAE hosts over 4 million applications with high uptime and little management burden on its customers. The restrictions it imposes exist for specific reasons and follow from three first-order principles: security, usability, and scalability.

 

Distributed Large Data

AppScale emulates Google App Engine and provides large-scale hosting for multiple applications using robust, scalable open source software. Of all the APIs, the datastore is arguably the most important because it stores data persistently. To implement its transaction semantics and querying capabilities, AppScale relies on ZooKeeper and Cassandra, two technologies that have improved tremendously over the years in both robustness and performance. Both have their roots in distributed computing, where data must remain fault tolerant and consistent across multiple replicas (yes, Cassandra can be strongly consistent). AppScale enforces the same limitations as Google App Engine in order to achieve high goodput on datastore operations. One such limitation is that a query can have only one inequality filter. As you dig into how the datastore was implemented, you’ll find that the App Engine team squeezed a tremendous amount of querying capability out of this model: results come from a single range query, or a few, without merges or joins (with the exception of zig-zag merge join [4]). This means that query cost is, on average, O(n) or better. With its emulation, AppScale provides this same powerful datastore API in open source.
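To see why a single inequality filter keeps queries cheap, here is a minimal sketch in which index entries are held in sorted order, as they would be in a row-ordered store. The index layout and the name `range_query` are assumptions made for illustration; they are not AppScale's actual index format.

```python
import bisect

# Hypothetical index: sorted (property_value, entity_key) pairs.
index = sorted([
    (17, "user:carol"),
    (25, "user:alice"),
    (31, "user:bob"),
    (42, "user:dave"),
    (58, "user:erin"),
])

def range_query(index, low, high):
    """Return entity keys whose value v satisfies low <= v < high.

    Two binary searches find the ends of one contiguous slice, so the
    cost is O(log n) for the lookups plus the size of the result.
    """
    start = bisect.bisect_left(index, (low,))
    end = bisect.bisect_left(index, (high,))
    return [key for _, key in index[start:end]]

# One inequality range (25 <= age < 50) maps to one contiguous scan.
result = range_query(index, 25, 50)
print(result)
```

A second inequality on a different property would no longer correspond to one contiguous slice of this index, which is the intuition behind the restriction.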

AppScale runs REST-based servers that receive application-level requests and convert them into the appropriate ZooKeeper calls and Cassandra reads, writes, scans, and deletes. Each such server runs on a datastore node alongside Cassandra. Each server receives a list of Cassandra nodes to which it can connect, which provides fault tolerance if one of the nodes pauses for garbage collection or runs into trouble (lack of disk space, memory issues, disk corruption). ZooKeeper stores metadata about transactions; this metadata ensures that all transactions adhere to ACID semantics.
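The benefit of handing each server a list of nodes can be sketched as a simple failover loop. This is an illustration under stated assumptions, not AppScale's client code: `NodeUnavailable`, `read_with_failover`, `toy_fetch`, and the IP addresses are all hypothetical, and a real Cassandra client handles this internally.

```python
class NodeUnavailable(Exception):
    """Raised when a node cannot serve a request (hypothetical)."""

def read_with_failover(nodes, key, fetch):
    """Try each node in turn and return the first successful read."""
    last_error = None
    for node in nodes:
        try:
            return fetch(node, key)
        except NodeUnavailable as err:
            last_error = err  # remember the failure, try the next node
    raise NodeUnavailable(f"no node could serve {key!r}") from last_error

# Toy backend: only the second node is healthy.
HEALTHY = {"10.0.0.2": {"greeting": "hello"}}

def toy_fetch(node, key):
    if node not in HEALTHY:
        raise NodeUnavailable(node)
    return HEALTHY[node][key]

value = read_with_failover(["10.0.0.1", "10.0.0.2"], "greeting", toy_fetch)
print(value)
```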

 

Large Scale Caching

For the memcache API, AppScale uses the open source memcached library, which maps very well to the actual GAE memcache library. The servers running memcached can be viewed as a Python-like dictionary; the intelligence about where to store keys and values lives in the client. All clients in the distributed deployment must agree on the hashing algorithm and on the set of servers storing the data. The only tricky item we found was that values are stored in protocol buffer format, so implementing the memcache increment feature correctly required CAS (check and set). The protocol buffer has to be deserialized, the integer incremented, and the result reserialized. CAS then guards against a race condition by ensuring that no other operation has modified the same key in the meantime.
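The deserialize, increment, reserialize, and CAS cycle described above can be sketched as a retry loop. To keep the example self-contained it makes two assumptions: `FakeCache` is an in-memory stand-in for a memcached server with `gets`/`cas` style semantics, and JSON stands in for the protocol buffer encoding. Neither is AppScale's actual code.

```python
import json

class FakeCache:
    """In-memory stand-in for one memcached server (hypothetical)."""

    def __init__(self):
        self._data = {}  # key -> (value_bytes, version)

    def set(self, key, value):
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def gets(self, key):
        """Return (value, version), or (None, None) if the key is absent."""
        return self._data.get(key, (None, None))

    def cas(self, key, value, version):
        """Store value only if the key is unchanged since `version`."""
        current = self._data.get(key)
        if current is None or current[1] != version:
            return False
        self._data[key] = (value, version + 1)
        return True

def increment(cache, key, delta=1, max_attempts=10):
    """Increment a serialized counter, retrying if another writer wins."""
    for _ in range(max_attempts):
        raw, version = cache.gets(key)
        if raw is None:
            return None                      # nothing to increment
        record = json.loads(raw)             # deserialize
        record["value"] += delta             # increment the integer
        updated = json.dumps(record).encode()  # reserialize
        if cache.cas(key, updated, version):
            return record["value"]
        # CAS failed: the key changed in the meantime, so read it again.
    raise RuntimeError("too many concurrent updates")

cache = FakeCache()
cache.set("hits", json.dumps({"value": 41}).encode())
new_value = increment(cache, "hits")
print(new_value)
```

Without the version check, two clients that read the same value at the same time would each write back the same incremented number, and one update would be lost.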

 

Background Workers: Task Queue

The taskqueue in AppScale uses two powerful and resilient technologies: RabbitMQ and Celery. Celery provides a Python library for managing queues and tasks, while RabbitMQ is the broker that delivers the tasks. You can specify multiple nodes to handle tasks, which gives the taskqueue API fault tolerance when nodes fail. Each application has a Celery worker that fetches the paths provided when the task is enqueued. Each taskqueue server, like the datastore, has a REST server that receives protocol buffer requests from application servers. These protocol buffers are deserialized and converted into tasks carrying the configuration needed for retry logic. The Celery worker then uses this configuration to decide how to act when a URL fetch fails: re-enqueue or revoke. The work done by Celery is relatively lightweight: only a remote fetch and then an update to the taskqueue state, which is kept in the datastore.
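The re-enqueue or revoke decision can be sketched as a small pure function. Everything here is an assumption for illustration: the `RetryConfig` fields, the `next_action` helper, and the exponential backoff policy are hypothetical, and the sketch does not use Celery's API.

```python
from dataclasses import dataclass

@dataclass
class RetryConfig:
    """Hypothetical per-task retry settings carried with each task."""
    max_retries: int
    min_backoff_sec: float
    max_backoff_sec: float

def next_action(config, attempts_so_far, status_code):
    """Decide what to do after a URL fetch returns `status_code`.

    Returns an (action, delay_seconds) pair, where action is one of
    "done", "re-enqueue", or "revoke".
    """
    if 200 <= status_code < 300:
        return ("done", 0.0)
    if attempts_so_far >= config.max_retries:
        return ("revoke", 0.0)  # retries exhausted, give up on the task
    # Assumed policy: double the delay each attempt, up to a ceiling.
    delay = min(config.min_backoff_sec * 2 ** attempts_so_far,
                config.max_backoff_sec)
    return ("re-enqueue", delay)

config = RetryConfig(max_retries=3, min_backoff_sec=1.0, max_backoff_sec=10.0)
print(next_action(config, 0, 500))
print(next_action(config, 3, 500))
```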

 

Stateless AppServers

Application servers run the user's code, and this code is what calls the fault-tolerant API services. These servers can run on many nodes, spreading out the load. AppScale has two modes for determining how many servers run on a node: manual scale and auto scale. Manual scale is what it sounds like: you specify the number of application servers at initialization, and each node gets that same number regardless of machine resources. Auto scale, on the other hand, starts with one application server and scales up based on how many requests are queued. When seven or more requests are queued, a new application server is spawned. This differs from GAE, where scaling is based on a minimum and maximum wait time.
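The auto scale rule above can be sketched as a single check. The threshold of seven comes from the text; the function name and the `max_servers` cap are assumptions added so the sketch has a bound, and the real scaling logic may consider more than queue depth.

```python
SCALE_UP_THRESHOLD = 7  # from the text: seven or more queued requests

def servers_after_check(current_servers, queued_requests, max_servers):
    """Return the server count after one scaling check.

    `max_servers` is a hypothetical cap, not a documented AppScale setting.
    """
    if queued_requests >= SCALE_UP_THRESHOLD and current_servers < max_servers:
        return current_servers + 1  # spawn one more application server
    return current_servers

# Auto scale starts with a single application server.
servers = 1
for queued in [2, 7, 9, 3]:
    servers = servers_after_check(servers, queued, max_servers=4)
print(servers)
```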

Some major complaints about GAE revolve around its auto scaling. The restriction against direct access to the file system makes sense once you understand that hosting is done via auto scaling: your application server must not carry state with it as it is scaled up or down. In a highly dynamic environment, persistent data should be stored and retrieved through clearly established APIs. The whitelisted libraries have also been scrutinized to make sure they neither violate security measures nor provide a means of breaking out of the container environment. In AppScale you can use whichever libraries you want, and even write to the file system, but as Peter Parker’s uncle told him, “with great power comes great responsibility.”

 
