Here’s some interesting NoSQL stuff, guys: a presentation by Kevin Weil (@kevinweil), Analytics Lead at Twitter, on how Twitter uses NoSQL for analytics.
About the presentation
Collecting data (Scribe)
Storing and Analyzing data (Hadoop)
Rapid Learning over Big Data (Pig)
For those who are not familiar with these technologies:
Scribe – Log collection framework over Thrift, built and open sourced by Facebook
Hadoop – An Apache software framework that supports data-intensive distributed applications
Pig – A platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.
Cassandra – An open source distributed database management system
HBase – An open source, non-relational, distributed database modeled after Google’s BigTable and written in Java
FlockDB – An open source, distributed, fault-tolerant graph database for managing data at web scale
Over time, Twitter has been very open about the technologies they are using, and they have also open sourced some of the tools they have created. If you check this timeline, http://nosql.mypopescu.com/tagged/twitter, you’ll notice not only what Twitter has been using over time, but also how things evolved. In 2010, I invited Twitter’s Ryan King to give a presentation about the various NoSQL projects they’ve been using or experimenting with. You can watch it at http://www.infoq.com/presentations/NoSQL-at-Twitter-by-Ryan-King
Hope you’ll enjoy these.
Many large social networks (like Facebook) have moved workloads off MySQL onto Cassandra or other NoSQL databases. We can consider this change a big boost for Cassandra, an open-source data store originally developed at Facebook to solve the inbox search problem. It was designed to be fast and reliable and to handle read and write requests at the same time.
source: Why does large Social Network projects switch to use Cassandra instead of Mysql?