Membase In The Cloud: Part 1
Membase is an open-source, high-performance, highly available, distributed NoSQL key-value database management system (see NoSQL In The Cloud) optimized for persisting data behind modern SaaS applications. Companies using Membase include Zynga, PayPal, Vodafone and Microsoft.
Membase processes client requests with low (sub-millisecond) latency and high sustained throughput, with the ability to scale from a single Membase server to a cluster of potentially thousands of servers.
Memcached compatibility
Memcached is a distributed memory caching technology and a core infrastructure component behind 18 of the top 20 most heavily-trafficked websites (including Google, Facebook, Twitter and Wikipedia). It is often deployed alongside relational databases, caching data and objects in RAM and reducing the number of times the relational database must be read.
Membase is a drop-in replacement for Memcached. On-the-wire compatibility means that existing Memcached code works as is. Membase is also:
Simple
Everything about Membase is designed to be easy and low maintenance: getting, installing, managing, expanding and using it. As a NoSQL database, it has no schemas to create and manage, and there is no need to normalize, shard or tune the database.
Fast
Membase is arguably the lowest latency, highest throughput NoSQL database technology available.
Elastic
Scaling a Membase cluster in the cloud is as easy as starting up (or terminating) virtual machine instances and hitting the rebalance button. Data is automatically re-distributed across the cluster, increasing or reducing aggregate I/O and storage capacity. From a high availability perspective, there are no single points of failure in a Membase cluster.
Membase Concepts
You can interact with and administer a Membase cluster through a web console, a command line interface or a REST API.
Cluster Manager
The Cluster Manager runs on each Membase Server and provides the following services:
- Cluster Management
- Node startup and shutdown
- Node monitoring
- Gathering of statistics
- Logging
- Security
Client applications access these services via the admin port (8091) and data port (11211).
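As a rough sketch of what talking to the admin port from code might look like, the Python below builds (but does not send) an HTTP request aimed at port 8091. The host name, the credentials and the helper name build_admin_request are all made up for illustration, and I am assuming that "/pools/default" is the path that returns cluster details; check the REST API documentation for your version before relying on it.

```python
import base64
import urllib.request

ADMIN_PORT = 8091   # admin port named in this article
DATA_PORT = 11211   # data port named in this article (not used by the REST API)


def build_admin_request(host, path, user, password):
    """Build, without sending, an HTTP GET request for the admin port.

    Hypothetical helper: it only prepares a request with HTTP Basic
    authentication so the shape of the call is visible.
    """
    url = f"http://{host}:{ADMIN_PORT}{path}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Assumed host, path and credentials, for illustration only.
req = build_admin_request("membase.example.com", "/pools/default",
                          "Administrator", "secret")
print(req.full_url)

# Against a live cluster the request could then be sent with:
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       print(resp.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and headers before pointing the code at a real cluster.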
Node Memory and Disk Management
Membase automatically manages the placement of objects between memory and disk. For performance reasons, object metadata is always kept in memory.
A memory quota is set in the node's configuration and should not be more than 80% of the total physical RAM on the node. The quota set for the first node in the cluster is inherited by all nodes that subsequently join the cluster. Membase automatically migrates data from memory to disk when the quota is reached.
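The 80% guideline is easy to turn into a quick sanity check before you configure the first node. The following is a minimal sketch; the function names and the choice of megabytes as the unit are my own assumptions, not part of any Membase tool.

```python
def max_memory_quota_mb(physical_ram_mb, ceiling_percent=80):
    """Largest quota (in MB) that stays within the recommended ceiling.

    Integer arithmetic is used so the result is always rounded down.
    """
    return physical_ram_mb * ceiling_percent // 100


def quota_is_within_guideline(quota_mb, physical_ram_mb, ceiling_percent=80):
    """True if the proposed quota does not exceed the ceiling."""
    return quota_mb * 100 <= physical_ram_mb * ceiling_percent


# Example: a node with 8 GB (8192 MB) of physical RAM.
print(max_memory_quota_mb(8192))
print(quota_is_within_guideline(7000, 8192))
```

Because every node that joins later inherits the first node's quota, it makes sense to run this check against the node with the least RAM in the planned cluster.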
Data Buckets
Data management services are provided through virtual data containers called buckets. A bucket is a logical grouping of physical resources within a cluster.
Membase bucket types:
- Memcached
Distributed, in-memory, key-value cache. Memcached buckets are designed to be used alongside relational databases, caching frequently-used data, thereby reducing database server load.
- Membase
Highly-available and horizontally scalable data storage. Membase buckets are 100% protocol compatible with Memcached.
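To make the protocol compatibility a little more concrete, here is a small sketch of the Memcached text protocol, which is what a client would send to the data port. It only builds and parses the raw bytes; the helper names are my own, and the send_command function at the end is an untested convenience that assumes a server is listening on the host and port you give it.

```python
import socket


def build_set(key, value, flags=0, exptime=0):
    """Encode a Memcached text-protocol 'set' command."""
    data = value if isinstance(value, bytes) else str(value).encode("utf-8")
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def build_get(key):
    """Encode a Memcached text-protocol 'get' command for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw):
    """Return the value bytes from a single-key 'get' reply, or None on a miss."""
    if raw.startswith(b"END"):
        return None  # the server answered with no VALUE line: cache miss
    header, _, rest = raw.partition(b"\r\n")
    parts = header.split()  # expected: VALUE <key> <flags> <bytes>
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError(f"unexpected reply: {header!r}")
    return rest[:int(parts[3])]


def send_command(host, port, payload):
    """Send raw bytes and return the first chunk of the reply (needs a live server)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
        return conn.recv(4096)


print(build_set("greeting", "hello"))
print(parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n"))
```

In practice you would use an existing Memcached client library rather than hand-rolled sockets; the point is simply that the same bytes work against either bucket type.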
Membase bucket-type capabilities:
- Persistence
Data objects can be persisted asynchronously to hard-disk resources from memory.
- Replication
Replica servers can receive copies of data objects in the bucket. A replica server can be promoted if the host server fails, providing a highly available cluster via fail-over.
- Rebalancing
Rebalancing enables dynamic addition or removal of buckets and servers in the cluster.
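To give a feel for what rebalancing has to achieve, here is a toy model in Python. It is a simplified illustration and not Membase's actual algorithm: the partition count of 64, the CRC32 hash and all of the function names are assumptions chosen to keep the example short. Keys hash to a fixed set of partitions, partitions are owned by servers, and a rebalance moves only as many partitions as are needed to even out the load.

```python
import zlib

NUM_PARTITIONS = 64  # assumed value, for illustration only


def partition_for(key):
    """Map a key to a partition using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def assign_partitions(servers):
    """Initial layout: spread partitions round-robin over the servers."""
    return {p: servers[p % len(servers)] for p in range(NUM_PARTITIONS)}


def rebalance(old_map, servers):
    """Return a new partition map for `servers`, moving as little as possible."""
    base, extra = divmod(NUM_PARTITIONS, len(servers))
    # The first `extra` servers take one additional partition each.
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}
    new_map, homeless = {}, []
    for p in range(NUM_PARTITIONS):
        owner = old_map.get(p)
        if owner in quota and quota[owner] > 0:
            new_map[p] = owner      # partition stays where it is
            quota[owner] -= 1
        else:
            homeless.append(p)      # owner was removed or is over its share
    for s in servers:
        while quota[s] > 0:         # hand leftovers to under-loaded servers
            new_map[homeless.pop()] = s
            quota[s] -= 1
    return new_map


before = assign_partitions(["node-a", "node-b"])
after = rebalance(before, ["node-a", "node-b", "node-c"])
moved = sum(1 for p in range(NUM_PARTITIONS) if before[p] != after[p])
print(moved, "of", NUM_PARTITIONS, "partitions moved")
```

The same function handles a server being removed: its partitions become homeless and are handed to the remaining servers, which is the shrinking half of the elasticity story.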
In Part 2 I will look at installing Membase on Amazon EC2.












