Category Archives: Database

Voldemort backups

At VZnet we use a number of different storage technologies, including both relational and NoSQL databases. One database that powers a number of our services is Voldemort, an open source key-value store developed at LinkedIn. Among the features we really like are its very high scalability and its automatic partitioning and replication of data across storage nodes.

When I talk about backing up Voldemort, many might question whether that is really needed. Data is already replicated across nodes; if one node dies or becomes corrupted, the data should safely exist somewhere else in the system. This is true: with data replication you don’t need to worry about backing up to protect yourself against hardware failure. However, that’s not the only reason to back up.

All software contains bugs, and our services at VZnet are no exception. Neither is Voldemort itself. We can do our best to eliminate bugs through testing, but they will always pop up, and we need to be prepared to deal with that. One type of bug that could occur is one that corrupts data. What if we pushed an update that, though it passed the tests, contained a race condition that under load associated the wrong data with the wrong keys? Replication won’t help you there: Voldemort will, by design, replicate the corruption your code has caused across its nodes. There are many other scenarios where replication won’t help, such as all your storage nodes sharing a SAN that overheats and destroys its disks, or a dangerously under-skilled business owner deciding to query the database er… key-value store and in the process deleting all your data. But potential corruption due to bugs in our own code was the issue we were most worried about.

So, how do we go about backing up a Voldemort datastore? The first question that needs answering is how Voldemort stores its data. Voldemort has pluggable back-end storage support, the two primary back-ends being MySQL and the Java implementation of Berkeley DB (BDB JE). We’re using BDB, so the next question is how to back up a BDB database. Some answers on forums around the web point out that since BDB is an append-only database, backups can be done safely by just copying its files. This is only partially true. BDB is indeed append-only, but it also runs a cleaner thread, which periodically scans the database, removes outdated entries and compacts the files. This stops the database from growing indefinitely as records are updated. It also means that if you copy the files while the cleaner is running, you risk ending up with a corrupt backup.

Fortunately, the Java BDB implementation comes with a tool that can assist with backups. In short, the tool pauses the cleaner thread, finishes the log file currently being appended to, and then gives you a list of files that are safe to copy. It supports incremental backups too, which is great for large databases.
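To make this concrete, here is a minimal sketch of how that helper is driven, assuming BDB JE’s com.sleepycat.je.util.DbBackup class; the environment and backup paths are made up for illustration:

    // A minimal sketch (not our production code) of a backup using DbBackup.
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    import com.sleepycat.je.Environment;
    import com.sleepycat.je.EnvironmentConfig;
    import com.sleepycat.je.util.DbBackup;

    public class BdbBackupSketch {
        public static void main(String[] args) throws Exception {
            File envHome = new File("/var/voldemort/bdb");    // illustrative path
            Path backupDir = Paths.get("/backups/voldemort"); // illustrative path

            // This has to run inside the process that owns the environment;
            // an external process can't open it while the owner holds the lock.
            Environment env = new Environment(envHome, new EnvironmentConfig());
            DbBackup backup = new DbBackup(env);

            // Pauses the cleaner thread and finishes the current log file, so
            // the files in the backup set stay immutable while we copy them.
            backup.startBackup();
            try {
                for (String name : backup.getLogFilesInBackupSet()) {
                    Path source = envHome.toPath().resolve(name);
                    Files.copy(source, backupDir.resolve(source.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            } finally {
                backup.endBackup(); // always resume the cleaner, even on failure
            }
            env.close();
        }
    }

For incremental backups, backup.getLastFileInBackupSet() returns the number of the last log file in the backup set; passing that value to the DbBackup constructor on the next run restricts the backup set to files created since.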

My first naive attempt to use this tool was to write a small Java program that ran independently of Voldemort. This failed: an external process can’t pause the cleaner thread, and it can’t even open the database, because Voldemort holds it locked. So the feature needed to be built into Voldemort itself. The most sensible place to control it is the Voldemort admin tool, which I found very easy to extend. The result can be found in the bdb-backup branch of my fork of Voldemort on GitHub. I’ve initiated a pull request, and hopefully the folks at LinkedIn will be happy to accept it.

The feature is used by passing a new --native-backup command line option; information about how to use it is built into the admin client’s help. With this new feature you can create guaranteed-consistent backups of your BDB Voldemort nodes, and rest assured that you can recover from the many data corruption issues that replication won’t save you from.
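Invoking a single-node backup might look something like the following sketch; apart from --native-backup itself, the arguments shown (store name, backup directory, admin URL and node id) are assumptions on my part, so consult the admin client’s help for the exact syntax:

    bin/voldemort-admin-tool.sh --native-backup my-store \
        --backup-dir /backups/voldemort --url tcp://localhost:6666 --node 0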

VZ-Networks @ Qcon London 2010

This year the fourth annual Qcon London conference takes place from March 10th to 12th. VZ-Networks is participating with the following presentation:


Social networks and the Richness of Data: Getting distributed webservices done with NoSQL

Social networks by their nature deal with large amounts of user-generated data that must be processed and presented in a time-sensitive manner. Much more write-intensive than previous generations of websites, social networks have been on the leading edge of non-relational persistence technology adoption. This talk presents how Germany’s leading social networks schuelerVZ, studiVZ and meinVZ are incorporating Redis and Project Voldemort into their platform to run features like activity streams.
Jodok Batlogg, Fabrizio Schmidt

More Information

Third December Hadoop Get Together video online

VZnet Netzwerke Ltd. is a proud sponsor of the Hadoop Get Together, which takes place in Berlin on a regular basis.
In the following video, taken at the last Hadoop Get Together in Berlin, Jörg Möllenkamp explains why Hadoop is interesting for Sun – and why Sun hardware might be a good fit for Hadoop applications:

Hadoop Nikolaus Pohle from Isabel Drost on Vimeo.

Hadoop Richard Hutton from Isabel Drost on Vimeo.

Hadoop Jörg Möllenkamp from Isabel Drost on Vimeo.

In a blog post published after the event, Jörg gives more details on the idea of Parasitic Hadoop that he introduced at the meetup.

Apache Hadoop Get Together Berlin

We would like to announce the December 2009 Hadoop Get Together at the newthinking store in Berlin.

When: December 16, 2009, 5:00 pm
Where: newthinking store, Tucholskystr. 48, Berlin, Germany

As always there will be slots of 20 minutes each for talks on your Hadoop topic, and after each talk there will be plenty of time for discussion. You can order drinks directly at the bar in the newthinking store, and if you like, you can order pizza. After the event we will go to Cafe Aufsturz for some beer and something to eat.

Talks scheduled so far:

Richard Hutton (nugg.ad): “Moving from five days to one hour.” – This talk explains how we made data processing scalable at nugg.ad. The company’s core business is online advertisement targeting. Our servers receive 10,000 requests per second, resulting in 100 GB of data per day.

As the classical data warehouse solution reached its limits, we moved to a framework built on top of Hadoop to make analytics speedy, data mining detailed and all of our lives easier. We will give an overview of our solution, covering file system structures, scheduling, messaging and programming languages from the future.

Jörg Möllenkamp (Sun): “Hadoop on Sun”
Abstract: Hadoop is a well-known technology inside of Sun. This talk wants to show some interesting use cases of Hadoop in conjunction with Sun technologies. The first use case demonstrates how Hadoop can be used to load a massive multicore system, with up to 256 threads in a single system, to the max. The second use case shows how several mechanisms integrated into Solaris can ease the deployment and operation of Hadoop, even in non-dedicated environments. The last use case shows the combination of the Sun Grid Engine and Hadoop. The talk may contain command-line demonstrations ;).

Nikolaus Pohle (nurago): “M/R for MR – Online Market Research powered by Apache Hadoop. Enable consultants to analyze online behavior for audience segmentation, advertising effects and usage patterns.”

We would like to invite you, the visitor, to tell your own Hadoop story as well; if you like, you can bring slides – there will be a projector. Thanks to Isabel Drost for organizing this event and to the newthinking store for providing the space. VZnet Netzwerke is sponsoring the video recording of the talks.

Registration:

  • http://upcoming.yahoo.com/event/4842528/
  • https://www.xing.com/events/apache-hadoop-berlin-426025

N✮SQL Berlin Roundup

Key-value stores and other non-relational databases are a hot topic right now, as it increasingly turns out that the traditional (= relational) approach is difficult to scale horizontally. In other words, it may be time – as Bob Ippolito put it – to “Drop ACID and think about data”.

If you couldn’t make it to the first “NoSQL Meetup Berlin”, which recently took place at the newthinking store: all the talks have been recorded and are on Vimeo now.

Heise has a good summary (in German).

Here at StudiVZ we are following this topic very closely and are actively investigating some of the alternatives. At the moment, Cassandra looks very promising to us.

Some more links: